Group by two columns with case query in SQL Server

I'm trying to retrieve driver data with the total accepted and total ignored ride requests for the current date.
Based on the Drivers and DriverReceivedRequests tables I get the total count, but the twist is that the DriverReceivedRequests table holds duplicate rows for the same DriverId and RideId. A driver can receive multiple requests on the current date, and can also receive the request for the same ride two or three times, so the grouping has to take both DriverId and RideId into account.
This is the table structure for DriverReceivedRequests:
Id        DriverId  RideId    ReceivedStatusId  DateTime
--------------------------------------------------------
0014d26b  93665f55  fef6fb96  NULL              04:55.6
00175c65  6e62a94e  cb214a84  NULL              09:32.1
0017c22b  ec9e1297  4b47dc8a  4211357D          10:28:5
0014d26b  6e62a94e  fef6fb96  NULL              04:56.8
This is the query I have tried:
select
d.Id, d.FirstName, d.LastName,
Sum(case when drrs.Number = 1 then 1 else 0 end) as TotalAccepted,
Sum(case when drrs.Number = 2 or drr.ReceivedStatusId is null then 1 else 0 end) as TotalRejected
from
dbo.[DriverReceivedRequests] drr
inner join
dbo.[Drivers] d on drr.DriverId = d.Id
left join
dbo.[DriverReceivedRequestsStatus] drrs on drr.ReceivedStatusId = drrs.Id
where
Day(drr.Datetime) = Day(getdate())
and month(drr.DateTime) = Month(getdate())
group by
d.FirstName, d.LastName, d.Id
In the above query, if I also group by RideId, each driver appears several times and the totals are wrong. I've also tried a PARTITION BY clause on DriverId, but that does not give the correct result either.
This query generates the same result:
WITH cte AS
(
SELECT
d.Id,
d.FirstName, d.LastName,
TotalAccepted = SUM (CASE WHEN drrs.Number = 1 THEN 1 ELSE 0 END) OVER (PARTITION BY drr.DriverId),
TotalRejected = SUM (CASE WHEN drrs.Number = 2 OR drr.ReceivedStatusId IS NULL THEN 1 ELSE 0 END)
OVER (PARTITION BY drr.DriverId),
rn = ROW_NUMBER() OVER(PARTITION BY drr.DriverId
ORDER BY drr.DateTime DESC)
FROM
DriverReceivedRequests drr
INNER JOIN
dbo.[Drivers] d ON drr.DriverId = d.Id
LEFT JOIN
dbo.[DriverReceivedRequestsStatus] drrs ON drr.ReceivedStatusId = drrs.Id
WHERE
DAY (drr.Datetime) = DAY (GETDATE())
AND MONTH (drr.DateTime) = MONTH (GETDATE())
)
SELECT
Id,
FirstName, LastName,
TotalAccepted, TotalRejected
FROM
cte
WHERE
rn = 1
My question is: how can I group the incoming ride requests per individual driver?
Note: the driver receives the same ride request multiple times.
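One possible approach, sketched under assumptions: collapse the duplicates to one row per (DriverId, RideId) first, then aggregate per driver. The table variables and sample rows below are hypothetical stand-ins for the real tables, and the column types are guesses. The rule that a ride counts as accepted when any of its rows has status Number = 1 is an assumption about the intended logic. The date filter is a range on today's date instead of DAY()/MONTH(), which would also match the same day and month in other years.

```sql
-- Hypothetical sample data; column types are guesses, not the real schema.
DECLARE @Drivers  TABLE (Id varchar(8) PRIMARY KEY, FirstName nvarchar(50), LastName nvarchar(50));
DECLARE @Status   TABLE (Id varchar(8) PRIMARY KEY, Number int);  -- 1 = accepted, 2 = rejected (as implied by the question's CASE expressions)
DECLARE @Requests TABLE (DriverId varchar(8), RideId varchar(8),
                         ReceivedStatusId varchar(8) NULL, [DateTime] datetime);

INSERT INTO @Drivers  VALUES ('D1', N'Ann', N'Lee'), ('D2', N'Bob', N'Ray');
INSERT INTO @Status   VALUES ('S1', 1), ('S2', 2);
INSERT INTO @Requests VALUES
    ('D1', 'R1', NULL, GETDATE()),  -- ride R1 offered to D1, ignored
    ('D1', 'R1', 'S1', GETDATE()),  -- same ride offered again, accepted
    ('D1', 'R2', NULL, GETDATE()),  -- ignored
    ('D2', 'R1', 'S2', GETDATE()),  -- rejected
    ('D2', 'R1', NULL, GETDATE());  -- same ride offered again, ignored

WITH per_ride AS (
    -- One row per driver and ride, however often the request was repeated.
    SELECT
        drr.DriverId,
        drr.RideId,
        MAX(CASE WHEN drrs.Number = 1 THEN 1 ELSE 0 END) AS IsAccepted  -- assumption: any accepted row wins
    FROM @Requests AS drr
    LEFT JOIN @Status AS drrs
        ON drr.ReceivedStatusId = drrs.Id
    WHERE drr.[DateTime] >= CAST(GETDATE() AS date)
      AND drr.[DateTime] <  DATEADD(day, 1, CAST(GETDATE() AS date))
    GROUP BY drr.DriverId, drr.RideId
)
SELECT
    d.Id, d.FirstName, d.LastName,
    SUM(pr.IsAccepted)     AS TotalAccepted,
    SUM(1 - pr.IsAccepted) AS TotalRejected
FROM per_ride AS pr
INNER JOIN @Drivers AS d
    ON d.Id = pr.DriverId
GROUP BY d.Id, d.FirstName, d.LastName
ORDER BY d.Id;
-- D1  Ann  Lee  TotalAccepted = 1  TotalRejected = 1
-- D2  Bob  Ray  TotalAccepted = 0  TotalRejected = 1
```

Against the real schema, the table variables would be replaced by dbo.DriverReceivedRequests, dbo.DriverReceivedRequestsStatus and dbo.Drivers. If a repeated request should be judged by its latest row instead, the MAX would have to give way to a ROW_NUMBER() pick per (DriverId, RideId).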

Related

Query running tally of open issues on a given day

I have been banging my head on this for a while now and I think I've been drastically over-complicating things at this point. What I have is a table containing the fields
OpenDate
ClosedDate
Client
Contract
Service
What I need to turn that into is
Date
Client
Contract
Service
OpenedOnThisDay
OpenedYesterday
ClosedOnThisDay
ClosedYesterday
OpenAtStartOfTomorrow
OpenAtStartOfToday
For any given Date, there may or may not be any issues opened or closed on that day. That day should still be included, with 0s.
I have come at this a number of ways and can produce one of the desired results at a time (opened on, closed on, open at end of), but I cannot get them all at once, at least not without exponentially increasing the query time.
My queries, currently implemented as views, are as follows:
Opened On
select Cast(EntryDateTime as Date) as DateStamp
,ContractNumber
,Client
,services.Service
,sum(1) as Count
,lag(sum(1)) OVER (
partition by tickets.ContractNumber
,services.Service ORDER BY Cast(EntryDateTime as Date) ASC
) as CountDayBefore
from v_JiraImpactedServices as services
LEFT JOIN v_JiraTickets as tickets ON services.ticketnumber = tickets.TicketNumber
WHERE tickets.Client is not null
AND tickets.TicketNumber IS NOT NULL
and tickets.ContractNumber is not null
GROUP BY Cast(tickets.EntryDateTime as Date)
,tickets.ContractNumber
,tickets.Client
,services.Service;
Closed On
select Cast(ResolvedDateTime as Date) as DateStamp
,ContractNumber
,Client
,services.Service
,sum(1) as Count
,lag(sum(1)) OVER (
partition by tickets.ContractNumber
,services.Service ORDER BY Cast(ResolvedDateTime as Date) ASC
) as CountDayBefore
from v_JiraImpactedServices as services
LEFT JOIN v_JiraTickets as tickets ON services.ticketnumber = tickets.TicketNumber
WHERE tickets.Client is not null
and tickets.TicketNumber is not null
AND tickets.ContractNumber is not null
GROUP BY Cast(tickets.ResolvedDateTime as Date)
,tickets.ContractNumber
,tickets.Client
,services.Service;
Open On
SELECT calendar.FullDate as DateStamp
,tickets.ContractNumber
,tickets.client
,services.Service
,IsNull(count(tickets.TicketNumber), 0) as Count
,IsNull(lag(count(tickets.TicketNumber), 1) OVER (
partition by tickets.ContractNumber
,services.Service Order By FullDate ASC
), 0) as CountDayBefore
FROM v_Calendar as calendar
LEFT JOIN v_JiraTickets as tickets ON Cast(tickets.EntryDateTime as Date) <= calendar.FullDate
AND (
Cast(tickets.ResolvedDateTime as Date) > calendar.FullDate
OR tickets.ResolvedDateTime is null
)
LEFT JOIN v_JiraImpactedServices as services ON services.ticketnumber = tickets.TicketNumber
WHERE tickets.Client is not null
AND tickets.ContractNumber is not null
GROUP BY calendar.FullDate
,tickets.ContractNumber
,tickets.Client
,services.Service;
As I said, each of these by itself gives almost the desired results, but omits days with 0 values.
Aside from producing days with 0 values, I also need to combine these into a single result set. All attempts so far have either produced obviously wrong JOIN results or taken an hour to execute.
I would be most grateful if someone could point me in the right direction here.

Just to give you an idea, although the fieldnames don't match your scenario, this is how I would approach this:
WITH
SourceData (ClientID, ContractID, ServiceID, DateStamp) AS (
    -- One row per client/contract/service per calendar day, so days without tickets still appear
    SELECT a.ID, b.ID, c.ID, d.DateStamp
    FROM clients a
    JOIN contracts b ON a.ID = b.ClientID
    JOIN [services] c ON b.ID = c.ContractID
    CROSS JOIN calendar d
    WHERE d.DateStamp >= DATEADD(day, -60, GETDATE())
)
-- The calendar alias d only exists inside the CTE; out here the date comes from s
SELECT s.DateStamp, s.ClientID, s.ContractID, s.ServiceID
    , COUNT(CASE WHEN Cast(EntryDateTime as Date) = s.DateStamp THEN 1 END) AS OpenedOn
    , COUNT(CASE WHEN Cast(ResolvedDateTime as Date) = s.DateStamp THEN 1 END) AS ClosedOn
    , COUNT(CASE WHEN Cast(ResolvedDateTime as Date) > s.DateStamp OR (ResolvedDateTime IS NULL AND EntryDateTime IS NOT NULL) THEN 1 END) AS InProgress
FROM SourceData s
LEFT JOIN tickets t
    ON s.ClientID = t.ClientID
    AND s.ContractID = t.ContractID
    AND s.ServiceID = t.ServiceID
    AND s.DateStamp >= Cast(EntryDateTime as Date)
    AND (s.DateStamp <= ResolvedDateTime OR ResolvedDateTime IS NULL)
GROUP BY s.DateStamp, s.ClientID, s.ContractID, s.ServiceID
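To also get the "yesterday" columns from the question, one option is to compute the per-day counts first and then apply LAG over the calendar-driven result. Because every day has a row (zeros included), the previous row is the previous day, provided the calendar has no gaps. This is a minimal sketch with hypothetical table variables and sample dates; it leaves out the Client/Contract/Service columns, which would go into the GROUP BY and into a PARTITION BY on each LAG.

```sql
-- Hypothetical stand-ins for the calendar and ticket sources.
DECLARE @Calendar TABLE (DateStamp date PRIMARY KEY);
DECLARE @Tickets  TABLE (EntryDateTime datetime, ResolvedDateTime datetime NULL);

INSERT INTO @Calendar VALUES ('2024-01-01'), ('2024-01-02'), ('2024-01-03');
INSERT INTO @Tickets  VALUES
    ('2024-01-01T09:00:00', '2024-01-03T10:00:00'),  -- opened on the 1st, closed on the 3rd
    ('2024-01-01T11:00:00', NULL);                   -- opened on the 1st, still open

WITH daily AS (
    SELECT c.DateStamp,
           COUNT(CASE WHEN CAST(t.EntryDateTime    AS date) = c.DateStamp THEN 1 END) AS OpenedOnThisDay,
           COUNT(CASE WHEN CAST(t.ResolvedDateTime AS date) = c.DateStamp THEN 1 END) AS ClosedOnThisDay
    FROM @Calendar AS c
    LEFT JOIN @Tickets AS t
        ON  CAST(t.EntryDateTime AS date) <= c.DateStamp
        AND (CAST(t.ResolvedDateTime AS date) >= c.DateStamp OR t.ResolvedDateTime IS NULL)
    GROUP BY c.DateStamp
)
SELECT DateStamp,
       OpenedOnThisDay,
       LAG(OpenedOnThisDay, 1, 0) OVER (ORDER BY DateStamp) AS OpenedYesterday,
       ClosedOnThisDay,
       LAG(ClosedOnThisDay, 1, 0) OVER (ORDER BY DateStamp) AS ClosedYesterday
FROM daily
ORDER BY DateStamp;
-- 2024-01-01: OpenedOnThisDay 2, OpenedYesterday 0, ClosedOnThisDay 0, ClosedYesterday 0
-- 2024-01-02: OpenedOnThisDay 0, OpenedYesterday 2, ClosedOnThisDay 0, ClosedYesterday 0
-- 2024-01-03: OpenedOnThisDay 0, OpenedYesterday 0, ClosedOnThisDay 1, ClosedYesterday 0
```

The third argument of LAG supplies the 0 for the first day, which has no predecessor.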

Select from same column under different conditons

I need to join these two tables. I need to select occurrences where:
ex_head_of_family_active = 1 AND tax_year = 2017
and also:
ex_head_of_family_active = 0 AND tax_year = 2016
The first time I tried to join these two tables, I got the error that warehouse_data.dbo.tb_master_ascend and warehouse_data.dbo.tb_master_ascend in the FROM clause have the same exposed names. With the query as shown below, I get a syntax error on the "where". What am I doing wrong? Thank you
use [warehouse_data]
select
parcel_number as Account,
pact_code as type,
owner_name as Owner,
case
when ex_head_of_family_active >= 1
then 'X'
else ''
end 'Head_Of_Fam'
from
warehouse_data.dbo.tb_master_ascend
inner join
warehouse_data.dbo.tb_master_ascend on parcel_number = parcel_number
where
warehouse_data.dbo.tb_master_ascend.tax_year = '2016'
and ex_head_of_family_active = 0
where
warehouse_data.dbo.tb_master_ascend.t2.tax_year = '2017'
and ex_head_of_family_active >= 1
and (eff_from_date <= getdate())
and (eff_to_date is null or eff_to_date >= getdate())
@marc_s I changed the where statements and updated my code, however the filter is not working now:
use [warehouse_data]
select
wh2.parcel_number as Account
,wh2.pact_code as Class_Type
,wh2.owner_name as Owner_Name
,case when wh2.ex_head_of_family_active >= 1 then 'X'
else ''
end 'Head_Of_Fam_2017'
from warehouse_data.dbo.tb_master_ascend as WH2
left join warehouse_data.dbo.tb_master_ascend as WH1 on ((WH2.parcel_number = wh1.parcel_number)
and (WH1.tax_year = '2016')
and (WH1.ex_head_of_family_active is null))
where WH2.tax_year = '2017'
and wh2.ex_head_of_family_active >= 1
and (wh2.eff_from_date <= getdate())
and (wh2.eff_to_date is null or wh2.eff_to_date >= getdate())

I would use a CTE to get all your parcels that meet your 2016 rules.
Then join that against your 2017 rules on parcel ID.
I'm summarizing:
with cte as
(
select parcelID
from table
where [2016 rules]
group by parcelID --If this isn't unique, the join will produce a Cartesian product of your results
)
select columns
from table
join cte on table.parcelid=cte.parcelID
where [2017 rules]
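Spelled out with the names from the question, the outline above might look like the following. This is only a sketch: the table variable, its column types and the sample rows are hypothetical, the 2016 rule is taken to be ex_head_of_family_active = 0 as stated in the question, and the effective-date checks are applied to the 2017 rows only.

```sql
-- Hypothetical stand-in for warehouse_data.dbo.tb_master_ascend; types are guesses.
DECLARE @tb_master_ascend TABLE (
    parcel_number varchar(10), tax_year char(4), pact_code varchar(10), owner_name varchar(50),
    ex_head_of_family_active int, eff_from_date date, eff_to_date date NULL);

INSERT INTO @tb_master_ascend VALUES
    ('P1', '2016', 'RES', 'Ann', 0, '2016-01-01', '2016-12-31'),  -- no exemption in 2016
    ('P1', '2017', 'RES', 'Ann', 1, '2017-01-01', NULL),          -- exemption in 2017
    ('P2', '2016', 'RES', 'Bob', 1, '2016-01-01', '2016-12-31'),  -- already exempt in 2016
    ('P2', '2017', 'RES', 'Bob', 1, '2017-01-01', NULL);

WITH parcels_2016 AS (
    SELECT parcel_number
    FROM @tb_master_ascend
    WHERE tax_year = '2016'
      AND ex_head_of_family_active = 0
    GROUP BY parcel_number   -- one row per parcel, so the join cannot multiply rows
)
SELECT
    m.parcel_number AS Account,
    m.pact_code     AS Class_Type,
    m.owner_name    AS Owner_Name,
    'X'             AS Head_Of_Fam_2017
FROM @tb_master_ascend AS m
INNER JOIN parcels_2016 AS p
    ON p.parcel_number = m.parcel_number
WHERE m.tax_year = '2017'
  AND m.ex_head_of_family_active >= 1
  AND m.eff_from_date <= GETDATE()
  AND (m.eff_to_date IS NULL OR m.eff_to_date >= GETDATE());
-- Returns only P1: inactive in 2016, active in 2017.
```

Giving the 2017 side its own alias (m) avoids the "same exposed names" error from the first attempt.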

Group by in table

My code is :
SELECT
Student_ID ,dbo.tblVahed.Vahed_ID,
COUNT(Student_ID) AS State_All,
CASE
WHEN tblStudentsDocument.Student_Sex = N'مرد'
THEN COUNT(Student_ID)
END AS Count_Man,
CASE
WHEN tblStudentsDocument.Student_Sex = N'زن'
THEN COUNT(Student_ID)
END AS Count_Woman
FROM
dbo.tblStudentsDocument
INNER JOIN
dbo.tblVahed ON dbo.tblStudentsDocument.Vahed_ID = dbo.tblVahed.Vahed_ID
GROUP BY
dbo.tblVahed.Vahed_ID, Student_ID, Student_Sex
but I should group by only dbo.tblVahed.Vahed_ID. Any help appreciated.

Anything that's not aggregated in the SELECT field list must be included in the group by. If you don't want to group by it, then you shouldn't include it in the select list unaggregated. Your query should work as follows.
SELECT dbo.tblVahed.Vahed_ID,
COUNT( Student_ID) AS State_All ,
SUM(CASE WHEN tblStudentsDocument.Student_Sex = N'مرد' THEN 1 ELSE 0 END) AS Count_Man ,
SUM(CASE WHEN tblStudentsDocument.Student_Sex = N'زن' THEN 1 ELSE 0 END) AS Count_Woman
FROM dbo.tblStudentsDocument
INNER JOIN dbo.tblVahed ON dbo.tblStudentsDocument.Vahed_ID = dbo.tblVahed.Vahed_ID
GROUP BY dbo.tblVahed.Vahed_ID
Note that I've removed Student_ID from the select list and moved the CASE expressions inside SUM, so each one is aggregated and no longer needs to appear in the GROUP BY.

Postgres COUNT number of column values with INNER JOIN

I am creating a report in Postgres 9.3. This is my SQL Fiddle.
Basically I have two tables, responses and questions, the structure is:
responses
->id
->question_id
->response
questions
->id
->question
->costperlead
For the column response there can only be 3 values, Yes/No/Possibly,
and my report should have the columns:
question_id
, # of Yes Responses
, # of No Responses
, # of Possibly Responses
, Revenue
Then:
# of Yes Responses - count of all Yes values in the response column
# of No Responses - count of all No values in the response column
# of Possibly Responses - count of all Possibly values in the response column
Revenue is the costperlead * (Number of Yes Responses + Number of Possibly Responses).
I don't know how to construct the query; I'm new, and I came from MySQL, so some things are different in Postgres. In my SQL Fiddle sample most responses are Yes and Null; that's fine, since eventually there will be Possibly and No.
So far I have only:
SELECT a.question_id
FROM responses a
INNER JOIN questions b ON a.question_id = b.id
WHERE a.created_at = '2015-07-17'
GROUP BY a.question_id;

You should try:
SELECT a.question_id,
SUM(CASE WHEN a.response = 'Yes' THEN 1 ELSE 0 END) AS NumsOfYes,
SUM(CASE WHEN a.response = 'No' THEN 1 ELSE 0 END) AS NumsOfNo,
SUM(CASE WHEN a.response = 'Possibly' THEN 1 ELSE 0 END) AS NumOfPossibly,
costperlead * (SUM(CASE WHEN a.response = 'Yes' THEN 1 ELSE 0 END) + SUM(CASE WHEN a.response = 'Possibly' THEN 1 ELSE 0 END)) AS revenue
FROM responses a
INNER JOIN questions b ON a.question_id = b.id
GROUP BY a.question_id, b.costperlead

Since the only predicate filters rows from table responses, it would be most efficient to aggregate responses first, then join to questions:
SELECT *, q.costperlead * (r.ct_yes + r.ct_maybe) AS revenue
FROM (
SELECT question_id
, count(*) FILTER (WHERE response = 'Yes') AS ct_yes
, count(*) FILTER (WHERE response = 'No') AS ct_no
, count(*) FILTER (WHERE response = 'Possibly') AS ct_maybe
FROM responses
WHERE created_at = '2015-07-17'
GROUP BY 1
) r
JOIN questions q ON q.id = r.question_id;
This uses the aggregate FILTER clause (in Postgres 9.4 or later). See:
Aggregate columns with additional (distinct) filters
Aside: consider implementing response as boolean type with true/false/null.
For Postgres 9.3:
SELECT *, q.costperlead * (r.ct_yes + r.ct_maybe) AS revenue
FROM (
SELECT question_id
, count(response = 'Yes' OR NULL) AS ct_yes
, count(response = 'No' OR NULL) AS ct_no
, count(response = 'Possibly' OR NULL) AS ct_maybe
FROM responses
WHERE created_at = '2015-07-17'
GROUP BY 1
) r
JOIN questions q ON q.id = r.question_id;
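The 9.3 variant relies on three-valued logic: response = 'Yes' OR NULL is true when the comparison is true and NULL otherwise, and count(expression) only counts non-NULL values. A tiny self-contained check, using hypothetical inline rows in place of the responses table:

```sql
-- Postgres; the VALUES list is made-up sample data.
SELECT count(*)                         AS all_rows
     , count(response = 'Yes' OR NULL)  AS ct_yes
     , count(response = 'No'  OR NULL)  AS ct_no
FROM  (VALUES ('Yes'), ('Yes'), ('No'), (NULL)) AS t(response);
-- all_rows = 4, ct_yes = 2, ct_no = 1
```

The NULL response is counted by count(*) but by neither of the conditional counts, because NULL = 'Yes' OR NULL evaluates to NULL.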
Comprehensive comparison of techniques:
For absolute performance, is SUM faster or COUNT?

paging over SELECT UNION super slow and killing my server

I have an SP that returns paged data from a query that contains a UNION. This is killing my DB and sometimes takes 30 seconds to run. Am I missing something obvious here? What can I do to improve its performance?
Tables Involved: Products, Categories, CategoryProducts
Goal:
Any Products that are not in a Category or have been deleted from a category UNION all Products currently in a category and page over them for a web service.
I have Indexes on all columns that I am joining on and there are 427,996 Products, 6148 Categories and 409,691 CategoryProducts in the database.
Here is my query, which takes between 6 and 30 seconds to run:
SELECT * FROM (
    SELECT ROW_NUMBER() OVER(ORDER BY Products.ItemID, Products.ManufacturerID) AS RowNum, *
    FROM
    (
        SELECT Products.*,
            CategoryID = NULL, CategoryName = NULL,
            CategoryProductID = NULL,
            ContainerMinimumQuantity =
                CASE COALESCE(Products.ContainerMinQty, 0)
                    WHEN 0 THEN Products.OrderMinimumQuantity
                    ELSE Products.ContainerMinQty
                END,
            Products.IsDeleted,
            SortOrder = NULL
        FROM CategoryProducts RIGHT OUTER JOIN Products
            ON CategoryProducts.ManufacturerID = Products.ManufacturerID
            AND CategoryProducts.ItemID = Products.ItemID
        WHERE (Products.ManufacturerID = @ManufacturerID)
            AND (Products.ModifiedOn > @tStamp)
            AND ((CategoryProducts.IsDeleted = 1) OR (CategoryProducts.IsDeleted IS NULL))
        UNION
        SELECT Products.*,
            CategoryProducts.CategoryID, CategoryProducts.CategoryName,
            CategoryProducts.CategoryProductID,
            ContainerMinimumQuantity =
                CASE COALESCE(Products.ContainerMinQty, 0)
                    WHEN 0 THEN Products.OrderMinimumQuantity
                    ELSE Products.ContainerMinQty
                END,
            CategoryProducts.IsDeleted,
            CategoryProducts.SortOrder
        FROM Categories INNER JOIN
            CategoryProducts ON Categories.CategoryID = CategoryProducts.CategoryID INNER JOIN
            Products ON CategoryProducts.ManufacturerID = Products.ManufacturerID
            AND CategoryProducts.ItemID = Products.ItemID
        WHERE (Products.ManufacturerID = @ManufacturerID)
            AND (Products.ModifiedOn > @tStamp OR CategoryProducts.ModifiedOn > @tStamp)
    ) AS Products
) AS C
WHERE RowNum >= @StartRow AND RowNum <= @EndRow
Any insight would be greatly appreciated.

If I read your situation correctly, the only reason for having two distinct queries is the treatment of missing/deleted CategoryProducts. I tried to address this with a left join filtered on IsDeleted = 0, which turns all deleted CategoryProducts into nulls so I don't have to test them again. The ModifiedOn part gets an additional null test to cover the missing/deleted CategoryProducts you wish to retrieve.
select *
from (
SELECT
Products.*,
-- Following three columns will be null for deleted/missing categories
CategoryProducts.CategoryID,
CategoryProducts.CategoryName,
CategoryProducts.CategoryProductID ,
ContainerMinimumQuantity = COALESCE(nullif(Products.ContainerMinQty, 0),
Products.OrderMinimumQuantity),
CategoryProducts.IsDeleted,
CategoryProducts.SortOrder,
ROW_NUMBER() OVER(ORDER BY Products.ItemID,
Products.ManufacturerID) AS RowNum
FROM Products
LEFT JOIN CategoryProducts
ON CategoryProducts.ManufacturerID = Products.ManufacturerID
AND CategoryProducts.ItemID = Products.ItemID
-- Filter IsDeleted in join so we get nulls for deleted categories
-- And treat them the same as missing ones
AND CategoryProducts.IsDeleted = 0
LEFT JOIN Categories
ON Categories.CategoryID = CategoryProducts.CategoryID
WHERE Products.ManufacturerID = @ManufacturerID
AND (Products.ModifiedOn > @tStamp
-- Deleted/missing categories
OR CategoryProducts.ModifiedOn is null
OR CategoryProducts.ModifiedOn > @tStamp)
) C
WHERE RowNum >= @StartRow AND RowNum <= @EndRow
On a third look, I don't see that Categories is used at all except as a filter on CategoryProducts. If that is the case, the second LEFT JOIN should be changed to an INNER JOIN, and the CategoryProducts/Categories join should be enclosed in parentheses.
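A reduced sketch of what the parenthesized join could look like, assuming the reading above is right. The table variables, their columns and the sample rows are hypothetical and cut down to the join keys. The inner join inside the parentheses is evaluated first, and its result is then left-joined to Products:

```sql
-- Hypothetical, reduced stand-ins for the three tables.
DECLARE @Products         TABLE (ManufacturerID int, ItemID int);
DECLARE @Categories       TABLE (CategoryID int);
DECLARE @CategoryProducts TABLE (ManufacturerID int, ItemID int, CategoryID int, IsDeleted bit);

INSERT INTO @Products         VALUES (1, 10), (1, 11), (1, 12);
INSERT INTO @Categories       VALUES (100);
INSERT INTO @CategoryProducts VALUES
    (1, 10, 100, 0),   -- live link to an existing category
    (1, 11, 100, 1),   -- deleted link
    (1, 12, 999, 0);   -- link to a category that does not exist

SELECT p.ItemID, cp.CategoryID
FROM @Products AS p
LEFT JOIN (
    @CategoryProducts AS cp
    INNER JOIN @Categories AS c
        ON c.CategoryID = cp.CategoryID
)
    ON  cp.ManufacturerID = p.ManufacturerID
    AND cp.ItemID         = p.ItemID
    AND cp.IsDeleted      = 0
ORDER BY p.ItemID;
-- ItemID 10 -> CategoryID 100; ItemID 11 -> NULL; ItemID 12 -> NULL
```

Every product still comes back exactly once here, while links that are deleted or point to a missing category show up as NULL, the same as products with no link at all.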