SQL Server - Using aggregate functions in WHERE clause

I am working on a SQL query for a transport business. When executed, the query should return information about the drivers who received a five-star (5*) rating from their customers on more than 20% of their trips in the last 30 days, with a minimum of 5 trips.
For example, if a driver completed 100 trips in the last 30 days and received 30 five-star (5*) ratings, then this driver and all of his five-star trips should be returned by the query, because more than 20% of his trips were rated 5 stars.
select tr.[TripId], tr.[DriverId], tr.[Rating], dr.[DriverName]
from tblTripInfo tr
left outer join tblDriver dr
on tr.[DriverId] = dr.[DriverId]
where tr.[Rating] = 5 and tr.[TripDate] >= GetDate() - 30
The above query gets the trip and driver information for every trip that got a 5* rating in the last 30 days. I want only the drivers for whom at least 20% of their total trips are 5* trips, with a minimum of 5 trips.
Initially I wanted to get only the DriverIds that met the above condition, and the query below worked:
select DriverId,
count(case when Rating = 5 then DriverId end) as TotalStars,
100.0 * avg(case when Rating = 5 then 1.0 else 0 end) as Average5Stars
from tblTripInfo
where TripDate >= GetDate() - 30
group by DriverId
having
count(case when Rating = 5 then DriverId end) > 10
and
100.0 * avg(case when Rating = 5 then 1.0 else 0 end) > 25
But now I want to get all the information for those 5* trips as well, such as TripId, DriverName and trip date.

You need something along the lines of this:
WITH TotalTrips AS (
    -- total trips per driver in the last 30 days
    SELECT DriverId,
        COUNT(*) AS TotalTrips
    FROM tblTripInfo
    WHERE TripDate >= GETDATE() - 30
    GROUP BY DriverId
)
SELECT t1.DriverId,
    COUNT(CASE WHEN t1.Rating = 5 THEN t1.DriverId END) AS Total5StarTrips,
    100.0 * AVG(CASE WHEN t1.Rating = 5 THEN 1.0 ELSE 0 END) AS Average5Stars
FROM tblTripInfo t1
JOIN TotalTrips t2
    ON t1.DriverId = t2.DriverId
    AND t2.TotalTrips > 5 -- more than 5 trips
WHERE t1.TripDate >= GETDATE() - 30
GROUP BY t1.DriverId, t2.TotalTrips
-- multiply by 1.0 to avoid integer division
HAVING 1.0 * COUNT(CASE WHEN t1.Rating = 5 THEN t1.DriverId END) / t2.TotalTrips > 0.2 -- more than 20% 5-star trips
There is no need for complicated logic if a subquery or CTE keeps things simple.
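
The question also asks for the details of the 5* trips themselves, not just the driver IDs. One way to get there is to compute the qualifying drivers first and then join back to the trip table. The sketch below is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for SQL Server so that it runs anywhere, it replaces GETDATE() - 30 with a fixed cutoff date, and the sample rows, the driver names and the `Qualified` CTE name are all made up. The thresholds (at least 5 trips, more than 20%) are one reading of the question.

```python
import sqlite3

# In-memory SQLite stands in for SQL Server (assumption, for runnability only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tblDriver (DriverId INTEGER PRIMARY KEY, DriverName TEXT);
CREATE TABLE tblTripInfo (TripId INTEGER PRIMARY KEY, DriverId INTEGER,
                          Rating INTEGER, TripDate TEXT);
INSERT INTO tblDriver VALUES (1, 'Asha'), (2, 'Ben');
""")

# Hypothetical data: driver 1 has 6 recent trips, 2 of them 5* (33%);
# driver 2 has 10 recent trips, 1 of them 5* (10%).
trips = [(1, 5), (1, 5), (1, 3), (1, 4), (1, 2), (1, 4)] + [(2, 5)] + [(2, 3)] * 9
conn.executemany(
    "INSERT INTO tblTripInfo (DriverId, Rating, TripDate) VALUES (?, ?, '2024-01-20')",
    trips,
)

sql = """
WITH Qualified AS (
    -- drivers with at least 5 trips and more than 20% five-star trips
    SELECT DriverId
    FROM tblTripInfo
    WHERE TripDate >= :since
    GROUP BY DriverId
    HAVING COUNT(*) >= 5
       AND 100.0 * SUM(CASE WHEN Rating = 5 THEN 1 ELSE 0 END) / COUNT(*) > 20
)
SELECT tr.TripId, tr.DriverId, dr.DriverName, tr.TripDate
FROM tblTripInfo tr
JOIN Qualified q ON q.DriverId = tr.DriverId
JOIN tblDriver dr ON dr.DriverId = tr.DriverId
WHERE tr.Rating = 5 AND tr.TripDate >= :since
ORDER BY tr.TripId
"""
# Fixed cutoff instead of GETDATE() - 30, so the result is reproducible.
rows = conn.execute(sql, {"since": "2023-12-21"}).fetchall()
print(rows)
```

Only the two 5* trips of driver 1 come back. Driver 2 has enough trips but only 10% of them are 5*, so none of his trips are listed.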

Related

SQL Server - Calculate AVG() using Joins

I have a cab transport application.
Each driver has trips, and for each trip there can be multiple customers (cab pooling) giving their feedback.
Now I want to get the IDs of those drivers who got at least 10 five-star (5*) ratings and more than 20% five-star ratings out of the total ratings received from their customers.
Let's say a driver got a total of 40 feedback entries in the last 30 days, of which 16 are 5-star ratings. This driver meets the criteria of at least 10 five-star ratings and more than 20% five-star ratings, so this driver's ID should be fetched.
SELECT TR.[DriverId]
,100.0 * AVG(CASE
WHEN FE.[Rating] = 5
THEN 1.0
ELSE 0
END) AS Percentage
FROM tblFeedback FE
LEFT OUTER JOIN tblTrip TR ON FE.TripId = TR.TripId
WHERE FE.DATE >= GETDATE() - 30
AND FE.Rating = 5
GROUP BY DriverId
HAVING COUNT(CASE
WHEN FE.[Rating] = 5
THEN DriverId
END) >= 10
AND 100 * AVG(CASE
WHEN FE.[Rating] = 5
THEN 1.0
ELSE 0
END) > 20
The above query shows the Percentage as 100.000 for every driver whose ID is fetched. Even drivers whose actual percentage is 18% are fetched, and their percentage is shown as 100%.
This query has completely broken my report.

Try this. You need to include all the ratings in order to calculate the percentage:
SELECT r.[DriverId], 100.0*r.five_stars/r.total_ratings AS Percentage
FROM (
    SELECT TR.[DriverId],
        SUM(CASE WHEN FE.Rating = 5 THEN 1 ELSE 0 END) AS five_stars,
        COUNT(*) AS total_ratings -- all ratings, not only the five-star ones
    FROM tblFeedback FE
    INNER JOIN tblTrip TR ON FE.TripId = TR.TripId
    WHERE FE.DATE >= GETDATE() - 30
    GROUP BY TR.DriverId) r
WHERE r.five_stars >= 10
AND 100.0*r.five_stars/r.total_ratings > 20.0;

I think the issue is in your WHERE clause. This line in particular:
AND FE.Rating = 5
This is forcing the tblFeedback table to only return records that have a five-star rating, and therefore, only the five-star ratings are used in the calculation. Try taking that line out and see if the calculations are any closer to what you expect.
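
To see the effect in isolation, here is a small sketch. It assumes Python's built-in sqlite3 module as a stand-in for SQL Server, a simplified `tblFeedback` that carries `DriverId` directly (no join to `tblTrip`), and one made-up driver with 2 five-star ratings out of 10. The same percentage expression is evaluated with and without the `Rating = 5` filter.

```python
import sqlite3

# In-memory SQLite stands in for SQL Server (assumption, for runnability only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tblFeedback (DriverId INTEGER, Rating INTEGER)")

# Hypothetical driver 7: 2 five-star ratings and 8 three-star ratings.
conn.executemany(
    "INSERT INTO tblFeedback VALUES (7, ?)", [(5,)] * 2 + [(3,)] * 8
)

# Same percentage expression as in the question.
pct = "100.0 * AVG(CASE WHEN Rating = 5 THEN 1.0 ELSE 0 END)"

# With the filter, only five-star rows reach AVG, so every row counts as 1.0.
filtered = conn.execute(
    f"SELECT {pct} FROM tblFeedback WHERE Rating = 5 GROUP BY DriverId"
).fetchone()[0]

# Without the filter, all ten rows are averaged.
unfiltered = conn.execute(
    f"SELECT {pct} FROM tblFeedback GROUP BY DriverId"
).fetchone()[0]

print(filtered, unfiltered)
```

The filtered version reports 100 for a driver whose real share is 20%, which matches the symptom described in the question.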

Calculate Bounce Rate SQL Server 2008

I'm trying to calculate the Bounce Rate of pages in SQL Server in a table with Audit Data from Sharepoint.
ItemId  UserId  DocLocation  Occurred
1       1       Home.aspx    2016-08-02 13:39:41
1       2       Home.aspx    2016-08-02 13:40:07
2       1       Other.aspx   2016-08-02 13:40:16
3       1       Items.aspx   2016-08-02 13:40:17
2       2       Other.aspx   2016-08-02 13:40:11
ItemId is the ID of the page, DocLocation is the location of the page, and Occurred is the time at which the user opened the page.
To calculate the bounce rate, we divide the number of bounces by the total number of visits.
A bounce happens when a user leaves the page in less than 5 seconds.
These should be the results for that table:
ItemId  Bounces  Visits  BounceRate (Bounces/Visits)
1       1        2       0.5
2       1        2       0.5
3       0        1       0
I want to detect a bounce by measuring how much time passes between a user's visit to a page and that same user's next visit to another page. If that time is less than 5 seconds, it counts as a bounce.
I'm writing a stored procedure that runs a query to show the bounce rate of each page, but it doesn't work.
SELECT
SUM(CASE
WHEN (DATEDIFF(second, @Occurred,
(SELECT TOP 1 a.Occurred
FROM [AuditPages] a
WHERE a.UserId = @userId
AND a.Occurred > @occurred
ORDER BY a.Occurred ASC))) < 30
THEN 1.0
ELSE 0.0
END) / COUNT(@itemId)
Does anyone know how I can calculate this bounce rate?
Thanks for all the answers.

I like using row_number for this type of sequenced problem. The query below gives the desired result. I find performance with CTEs can sometimes be problematic with larger tables and you may need to convert to a temp table. You might consider using milliseconds if there is a chance you would want to use 4.5 seconds or such in the future.
declare @bounce_seconds int = 5;
with audit_cte as (
    -- number each user's visits in chronological order
    select *, ROW_NUMBER() over (partition by UserId order by Occurred) row_num
    from AuditPages
    --order by UserId,row_num
)
select a.ItemId, sum(a.bounce) Bounces, count(1) Visits, sum(a.bounce)/convert(float, count(1)) BounceRate
from (
    -- pair each visit (a1) with the same user's next visit (a2)
    select a1.ItemId, datediff(s,a1.Occurred, a2.Occurred) elapsed, case when datediff(s,a1.Occurred, a2.Occurred) < @bounce_seconds then 1 else 0 end bounce
    from audit_cte a1
    left join audit_cte a2
        on a2.UserId = a1.UserId
        and a2.row_num = a1.row_num + 1
    --order by a1.UserId, a1.row_num
) a
group by a.ItemId
order by a.ItemId;

SELECT ItemId,COUNT(1) VISITS,SUM(BOUNCE_IND) BOUNCE, cast(SUM(BOUNCE_IND) as decimal(5,2))/cast(COUNT(1) as decimal(5,2)) BOUNCE_RATE
FROM (
Select
UserID,
ItemID,
DocLocation,
Occurred as Entry_time,
Lead(Occurred,1) Over (Partition by Userid order by Occurred) Exit_time,
CASE WHEN DATEDIFF(ss,Occurred,Lead(Occurred,1) Over (Partition by Userid order by Occurred)) <= 5 THEN 1 ELSE 0 END BOUNCE_IND
FROM Web_Data_Sample
) TBL GROUP BY ItemId
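
The next-visit idea can be checked against the sample rows from the question. The following is a sketch under assumptions: it runs on Python's built-in sqlite3 module instead of SQL Server (window functions need SQLite 3.25 or newer), uses `strftime('%s', ...)` in place of DATEDIFF, and treats "less than 5 seconds" as the bounce threshold. The `visits` CTE name is made up.

```python
import sqlite3

# In-memory SQLite stands in for SQL Server (assumption, for runnability only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE AuditPages (ItemId INT, UserId INT, DocLocation TEXT, Occurred TEXT)"
)
# Sample rows copied from the question.
conn.executemany("INSERT INTO AuditPages VALUES (?, ?, ?, ?)", [
    (1, 1, "Home.aspx", "2016-08-02 13:39:41"),
    (1, 2, "Home.aspx", "2016-08-02 13:40:07"),
    (2, 1, "Other.aspx", "2016-08-02 13:40:16"),
    (3, 1, "Items.aspx", "2016-08-02 13:40:17"),
    (2, 2, "Other.aspx", "2016-08-02 13:40:11"),
])

sql = """
WITH visits AS (
    -- for each visit, find the same user's next visit (NULL if there is none)
    SELECT ItemId, Occurred,
           LEAD(Occurred) OVER (PARTITION BY UserId ORDER BY Occurred) AS NextOccurred
    FROM AuditPages
)
SELECT ItemId,
       SUM(CASE WHEN CAST(strftime('%s', NextOccurred) AS INTEGER)
                     - CAST(strftime('%s', Occurred) AS INTEGER) < 5
                THEN 1 ELSE 0 END) AS Bounces,
       COUNT(*) AS Visits,
       1.0 * SUM(CASE WHEN CAST(strftime('%s', NextOccurred) AS INTEGER)
                           - CAST(strftime('%s', Occurred) AS INTEGER) < 5
                      THEN 1 ELSE 0 END) / COUNT(*) AS BounceRate
FROM visits
GROUP BY ItemId
ORDER BY ItemId
"""
rows = conn.execute(sql).fetchall()
for row in rows:
    print(row)
```

A visit with no later visit by the same user has a NULL gap, so the CASE falls through to 0 and the visit is not counted as a bounce. On the sample data this reproduces the expected table from the question.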

How to subtract two values from the same column SQL

I am building a procedure that, given a CustomerID, subtracts the balance of the customer's type 2 (credit card) account from the balance of the type 1 (savings) account, provided a savings account exists.
(e.g. savings balance - credit card balance = total balance)
My table is set up like this:
ID   Number    Balance      AccountType  CustomerID
-----------------------------------------------------------
1    2434789   451.23       1            1
2    2435656   1425.12      1            2
3    2434789   12.56        2            1
4    4831567   45894.23     2            2
5    8994785   500.00       2            3
6    4582165   243.10       2            4
7    7581462   1567842.21   1            3
8    2648956   1058.63      2            5
9    4582165   4865.12      1            4
10   4186545   481.56       2            6
I have tried looking this up for guidance, but nothing I have found has quite helped. If someone can explain or show me what I need to do, that would be great. This is the only part of my assignment I am stuck on.

You could group by CustomerId and get the sums of the savings and credit balances:
select
    c.CustomerId,
    SUM(CASE WHEN AccountType = 1 THEN Balance ELSE 0 END) Saving,
    SUM(CASE WHEN AccountType = 2 THEN Balance ELSE 0 END) Credit
from
    Customer c
group by
    c.CustomerId
And then you can easily get the total with the query below:
Select
    CustomerId,
    Saving - Credit AS Total
from
(
    select
        c.CustomerId,
        SUM(CASE WHEN AccountType = 1 THEN Balance ELSE 0 END) Saving,
        SUM(CASE WHEN AccountType = 2 THEN Balance ELSE 0 END) Credit
    from Customer c
    group by c.CustomerId
) cust

You can join the table to itself, where each side of the join only includes the records of the appropriate account type:
SELECT coalesce(s.CustomerID, cc.CustomerID) CustomerID
    ,coalesce(s.Number, cc.Number) Number
    ,coalesce(s.Balance,0) - coalesce(cc.Balance,0) Balance
FROM (SELECT * FROM [accounts] WHERE AccountType = 1) s -- savings
FULL JOIN (SELECT * FROM [accounts] WHERE AccountType = 2) cc on cc.customerID = s.customerID -- credit card
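
As a quick check of the savings-minus-credit idea, the sketch below loads the rows from the question and computes the total per customer with conditional aggregation. It is only an illustration: it assumes Python's built-in sqlite3 module as a stand-in for SQL Server, the table name `accounts`, a `Total` alias, and that a missing account of either type counts as a balance of 0.

```python
import sqlite3

# In-memory SQLite stands in for SQL Server (assumption, for runnability only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (ID INTEGER, Number INTEGER, Balance REAL, "
    "AccountType INTEGER, CustomerID INTEGER)"
)
# Rows copied from the question: AccountType 1 = savings, 2 = credit card.
conn.executemany("INSERT INTO accounts VALUES (?, ?, ?, ?, ?)", [
    (1, 2434789, 451.23, 1, 1),
    (2, 2435656, 1425.12, 1, 2),
    (3, 2434789, 12.56, 2, 1),
    (4, 4831567, 45894.23, 2, 2),
    (5, 8994785, 500.00, 2, 3),
    (6, 4582165, 243.10, 2, 4),
    (7, 7581462, 1567842.21, 1, 3),
    (8, 2648956, 1058.63, 2, 5),
    (9, 4582165, 4865.12, 1, 4),
    (10, 4186545, 481.56, 2, 6),
])

sql = """
SELECT CustomerID,
       ROUND(SUM(CASE WHEN AccountType = 1 THEN Balance ELSE 0 END)
           - SUM(CASE WHEN AccountType = 2 THEN Balance ELSE 0 END), 2) AS Total
FROM accounts
GROUP BY CustomerID
ORDER BY CustomerID
"""
# Map each CustomerID to savings balance minus credit card balance.
totals = dict(conn.execute(sql).fetchall())
for customer_id, total in totals.items():
    print(customer_id, total)
```

Customers 5 and 6 have no savings account, so their totals are simply the negated credit card balance. If such customers should be excluded instead, a HAVING clause on the savings sum would be one option.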

Difference of two months expenses category wise in sql

I am trying to find the difference between the expenses of the previous month and the current month in SQL.
I have a table like this:
Date       Amount  Category
2/18/2015  100     Salary
2/12/2015  150     Rent
2/21/2015  200     Allowances
1/4/2015   200     Salary
1/17/2015  50      Rent
1/20/2015  100     Allowances
Now I want a result like this:
Category    CurrentMonthAmount  PreviousMonthAmount  Difference
Salary      100                 200                  100
Rent        150                 50                   100
Allowances  200                 100                  100

Try using conditional aggregation:
;WITH cte
AS (SELECT Category,
        Max(CASE WHEN Month([date]) = Month(Getdate()) AND Year([date]) = Year(Getdate()) THEN amount END) CurrentMonthAmount,
        -- Dateadd also handles January, where Month(Getdate()) - 1 would be 0
        Max(CASE WHEN Month([date]) = Month(Dateadd(month, -1, Getdate())) AND Year([date]) = Year(Dateadd(month, -1, Getdate())) THEN amount END) PreviousMonthAmount
    FROM Yourtable
    GROUP BY Category)
SELECT Category,
    CurrentMonthAmount,
    PreviousMonthAmount,
    [Difference]=Abs(CurrentMonthAmount - PreviousMonthAmount)
FROM cte
SQLFIDDLE DEMO
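
The pivot-by-month idea can be tried on the rows from the question. The sketch below is an illustration under assumptions: it uses Python's built-in sqlite3 module as a stand-in for SQL Server, stores the dates as ISO text, fixes "today" at 2015-02-25 instead of calling GETDATE(), and sums the amounts per month (so several expenses in one category and month would be added up rather than reduced to their maximum). The last statement checks that the previous-month expression survives the year boundary.

```python
import sqlite3

# In-memory SQLite stands in for SQL Server (assumption, for runnability only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Yourtable ([Date] TEXT, Amount INTEGER, Category TEXT)")
# Rows from the question, with dates rewritten in ISO format.
conn.executemany("INSERT INTO Yourtable VALUES (?, ?, ?)", [
    ("2015-02-18", 100, "Salary"),
    ("2015-02-12", 150, "Rent"),
    ("2015-02-21", 200, "Allowances"),
    ("2015-01-04", 200, "Salary"),
    ("2015-01-17", 50, "Rent"),
    ("2015-01-20", 100, "Allowances"),
])

today = "2015-02-25"  # assumed reference date, replaces GETDATE()

sql = """
WITH cte AS (
    SELECT Category,
           SUM(CASE WHEN strftime('%Y-%m', [Date]) = strftime('%Y-%m', :today)
                    THEN Amount END) AS CurrentMonthAmount,
           SUM(CASE WHEN strftime('%Y-%m', [Date]) =
                         strftime('%Y-%m', :today, 'start of month', '-1 month')
                    THEN Amount END) AS PreviousMonthAmount
    FROM Yourtable
    GROUP BY Category
)
SELECT Category, CurrentMonthAmount, PreviousMonthAmount,
       ABS(CurrentMonthAmount - PreviousMonthAmount) AS Difference
FROM cte
ORDER BY Category
"""
rows = conn.execute(sql, {"today": today}).fetchall()
for row in rows:
    print(row)

# Year boundary: the month before January 2015 is December 2014.
prev_of_january = conn.execute(
    "SELECT strftime('%Y-%m', '2015-01-10', 'start of month', '-1 month')"
).fetchone()[0]
print(prev_of_january)
```

Comparing year and month together, and deriving the previous month by date arithmetic rather than by subtracting 1 from the month number, keeps the query correct in January.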

Are you trying to do the computations in SQL only, or via a script such as PHP? Also, can you state whether you are selecting specific records (specific rows, I mean) for this operation? Please give some more clarification.

Why do some dates give worse performance than others in MS SQL Server

I have a query in MS SQL Server that asks for a name and some date-related information, depending on two dates: a start date and an end date.
The problem is that I'm not always getting the same performance. Whenever I request something between the dates
2010-07-01 00:00:00.000 and
2011-07-21 23:59:59.999
the performance is excellent. I get my result within milliseconds. When I request something between these dates, for example,
2011-07-01 00:00:00.000 and
2011-07-21 23:59:59.999
the performance is less than good, taking between 20 and 28 seconds for each query. Note that the dates giving good performance span more than a year, while the latter span only 20 days.
Is there any particular reason (maybe related to how DATETIME works) for this?
EDIT: The query,
SELECT ENAME,
SUM(CASE DATE WHEN 0 THEN 1 ELSE 0 END) AS U2,
SUM(CASE DATE WHEN 1 THEN 1 ELSE 0 END) AS B_2_4,
SUM(CASE DATE WHEN 2 THEN 1 ELSE 0 END) AS B_4_8,
SUM(CASE DATE WHEN 3 THEN 1 ELSE 0 END) AS B_8_16,
SUM(CASE DATE WHEN 4 THEN 1 ELSE 0 END) AS B_16_24,
SUM(CASE DATE WHEN 5 THEN 1 ELSE 0 END) AS B_24_48,
SUM(CASE DATE WHEN 6 THEN 1 ELSE 0 END) AS O_48,
SUM(CASE DATE WHEN 7 THEN 1 ELSE 0 END) AS status,
AVG(AVG) AS AVG,
SUM(DATE) AS TOTAL
FROM
(SELECT ENAME,
(CASE
WHEN status = 'Öppet' THEN 7
WHEN DATE < 48 THEN
(CASE WHEN DATE BETWEEN 0 AND 2 THEN 0
WHEN DATE BETWEEN 2 AND 4 THEN 1
WHEN DATE BETWEEN 4 AND 8 THEN 2
WHEN DATE BETWEEN 8 AND 16 THEN 3
WHEN DATE BETWEEN 16 AND 24 THEN 4
WHEN DATE BETWEEN 24 AND 48 THEN 5
ELSE - 1 END)
ELSE 6 END) AS DATE,
DATE AS AVG
FROM
(SELECT DATEDIFF(HOUR, cases.date, status.date) AS DATE,
extern.name AS ENAME,
status.status
FROM
cases INNER JOIN
status ON cases.id = status.caseid
AND status.date =
(SELECT MAX(date) AS Expr1
FROM status AS status_1
WHERE (caseid = cases.id)
GROUP BY caseid) INNER JOIN
extern ON cases.owner = extern.id
WHERE (cases.org = 'Expert')
AND (cases.date BETWEEN '2009-01-15 09:48:25.633'
AND '2011-07-21 09:48:25.633'))
AS derivedtbl_1)
AS derivedtbl_2
GROUP BY ENAME
ORDER BY ENAME
(parts of) The tables:
Extern
-ID (->cases.owner)
-name
Cases
-Owner (->Extern.id)
-id (->status.caseid)
-date (case created at this date)
Status
-caseid (->cases.id)
-Status
-Date (can be multiple, MAX(status.date) gives us date when
status was last changed)

I would suspect a statistics issue.
When you select only the most recent dates, those dates may not yet be represented in the statistics, because the threshold that triggers an automatic statistics update has not been reached.
See this blog post for an example.
