How to write this SQL Server query: Add values in unique rows?

I have the query below. The tables are related as follows:
Each truck may have multiple drivers. Table List links each row in table Truck to rows in table Driver. I want the count of unique trucks that meet a certain condition, and the total size of those unique trucks.
Here is what I have:
SELECT t.Year AS [Year]
, t.Month AS [Month]
, t.Day AS [Day]
-- Count will not count NULL
, COUNT( DISTINCT (CASE WHEN (t.Sent = 1 AND r.Internal=1) THEN L.TruckId
ELSE NULL
END) ) AS [Count]
, SUM(CASE WHEN (t.Sent = 1 AND r.Internal = 1) THEN t.Size
END) AS [Size]
FROM Truck t
INNER JOIN List L ON t.Id = L.TruckId
INNER JOIN Driver r ON L.DriverId = r.Id
GROUP BY t.Year, t.Month, t.Day
The COUNT is correct, but the SUM is not: a truck with several matching drivers has its Size added once per driver row.
My question is how to get the correct SUM without writing two queries and joining them.
Thanks

You can try a query like the one below:
; with cte as (
SELECT
DISTINCT
t.Year AS [Year]
, t.Month AS [Month]
, t.Day AS [Day]
, L.TruckId
, t.Size
FROM Truck t
INNER JOIN List L ON t.Id = L.TruckId
INNER JOIN Driver r ON L.DriverId = r.Id
WHERE t.Sent = 1 AND r.Internal=1
)
select
Year
, Month
, Day
, count(TruckId) AS [Count]
, sum(Size) AS [Size]
from cte
group by Year, Month, Day
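
As an alternative, here is a minimal sketch that avoids the join fan-out altogether by testing for an internal driver with EXISTS. It assumes Truck.Id is unique per truck, which the question implies but does not state. Like the CTE version, it omits days that have no qualifying trucks.

```sql
-- Sketch: each truck row is visited once, so SUM(t.Size) cannot be
-- inflated by trucks that have several internal drivers.
-- Assumption: Truck.Id is unique.
SELECT t.Year      AS [Year]
     , t.Month     AS [Month]
     , t.Day       AS [Day]
     , COUNT(*)    AS [Count]
     , SUM(t.Size) AS [Size]
FROM Truck t
WHERE t.Sent = 1
  AND EXISTS (SELECT 1
              FROM List L
              INNER JOIN Driver r ON L.DriverId = r.Id
              WHERE L.TruckId = t.Id
                AND r.Internal = 1)
GROUP BY t.Year, t.Month, t.Day;
```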

Related

BigQuery LEFT JOIN a table and filter its array elements based on conditions

I want to join a table to another table containing arrays and in the joined result I want to have only the array elements which pass a condition. In this case a date condition.
The code snippet below illustrates my problem. I want the output to contain only ids with record_dates less than '2019-10-15'
WITH platform AS (
SELECT 'u1' AS id, 'm1' AS platform_id, '2019-10-12' as record_date
UNION ALL
SELECT 'u2' AS id, 'm1' AS platform_id, '2019-10-13' as record_date
UNION ALL
SELECT 'u21' AS id, 'm1' AS platform_id, '2019-10-16' as record_date
),
platform_agg AS (
SELECT platform_id
, ARRAY_AGG(id) as ids
, ARRAY_AGG(record_date) as record_dates
FROM platform
GROUP BY platform_id
),
orders AS(
SELECT 'u2' AS id, 'c1' AS order_id, '2019-10-15' as order_date
),
orders_plus_platform AS (
SELECT order_id
, orders.id
, orders.order_date
, platform.platform_id
, CASE WHEN platform.platform_id IS NOT NULL THEN platform_agg.ids ELSE [orders.id] END AS ids
, CASE WHEN platform.platform_id IS NOT NULL THEN platform_agg.record_dates ELSE NULL END AS record_dates
FROM orders
LEFT JOIN platform
ON orders.id = platform.id and platform.record_date <= orders.order_date
LEFT JOIN platform_agg
ON platform.platform_id = platform_agg.platform_id
)
SELECT * FROM orders_plus_platform
The current query output still contains u21. In the desired output the u21 element should be filtered out, because its record_date is after '2019-10-15'.
Thank you,

The solution below worked for me. Basically, you join to the platform table twice to get all the ids associated with a platform, instead of joining to a pre-aggregated version of it. This makes it easier to apply filters.
orders_plus_platform AS (
SELECT order_id
, orders.id
, orders.order_date
, platform.platform_id
, ARRAY_AGG(CASE WHEN platform.platform_id IS NOT NULL THEN platform2.id ELSE orders.id END) AS ids
, ARRAY_AGG(CASE WHEN platform.platform_id IS NOT NULL THEN platform2.record_date ELSE NULL END) AS record_dates
FROM orders
LEFT JOIN platform
ON orders.id = platform.id and platform.record_date <= orders.order_date
LEFT JOIN platform platform2
ON platform.platform_id = platform2.platform_id AND platform2.record_date <= orders.order_date
GROUP BY
order_id
, orders.id
, orders.order_date
, platform.platform_id
)

You can use subqueries in your WHERE clause. Subqueries can run on unnested arrays and return a boolean value - e.g. count of dates < something should be more than zero:
SELECT c_id
, c.id
, c.c_date
, cxd.record_id
, CASE WHEN cxd.record_id IS NOT NULL THEN rd_agg.ids ELSE [c.id] END AS ids
, CASE WHEN cxd.record_id IS NOT NULL THEN rd_agg.record_dates ELSE NULL END AS record_dates
FROM c
LEFT JOIN record_ids cxd
ON c.id = cxd.id and cxd.record_date <= c.c_date
LEFT JOIN record_ids_agg rd_agg
ON cxd.record_id = rd_agg.record_id
WHERE (SELECT COUNT(1)>0 FROM UNNEST(record_dates) AS r WHERE r < '2019-10-15')
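
Note that a subquery in WHERE keeps or drops whole rows. If the goal is to trim elements inside the array, a subquery over UNNEST can also rebuild the array itself. The following is a minimal BigQuery Standard SQL sketch. It assumes the ids and dates are aggregated together as an array of structs; the column name `entries` is hypothetical and does not appear in the question.

```sql
-- Sketch: rebuild the array, keeping only elements on or before the order date.
-- Assumption: ids and record_dates are stored together as ARRAY<STRUCT>.
WITH platform_agg AS (
  SELECT 'm1' AS platform_id,
         [STRUCT('u1' AS id, '2019-10-12' AS record_date),
          STRUCT('u2' AS id, '2019-10-13' AS record_date),
          STRUCT('u21' AS id, '2019-10-16' AS record_date)] AS entries
)
SELECT platform_id,
       ARRAY(SELECT e.id
             FROM UNNEST(entries) AS e
             WHERE e.record_date <= '2019-10-15'  -- ISO date strings compare correctly as text
             ORDER BY e.record_date) AS ids
FROM platform_agg;
```

With the sample rows from the question, this keeps u1 and u2 and drops u21.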

SUM with CASE counts duplicate rows in SQL GROUP BY

I'm trying to do a SUM against all items which match a certain condition, like so:
SELECT l.Building_Name,
SUM(CASE WHEN s.Date >= '20180930' THEN 1 ELSE 0 END) Validated,
COUNT(DISTINCT s.id) Total
FROM Lab_Space s
JOIN Locations l ON s.Building_Code = l.Building_Code
GROUP BY l.Building_Name
The COUNT there is correct, and will say something like 20 because I can put the DISTINCT s.id in there. However, my SUM ends up with something like 1500. This is because when I do the JOIN rows are duplicated multiple times, and thus the SUM is counting against each one.
How can I do a SUM/CASE like this but make sure it only applies to distinct rows?
s.id  l.building_name  s.date
1     JF               2018-11-10
1     JF               2018-11-10
2     JF               2018-12-12
So with data like that, I get the correct count of 2, but Validated will be 3, because id 1 appears twice as a result of the JOIN.

Here is an example using temp tables; edit it as you see fit.
create table #temp_Lab_Space
([Date] date null
,Building_Code int null
)
create table #temp_Locations
( Building_Code int null
,Building_Name varchar(10) null
)
insert into #temp_Lab_Space values
('2018-11-10',1)
,('2018-11-10', 1)
,('2018-12-12' , 1)
insert into #temp_Locations values
(1, 'JF')
select Building_Name,
SUM(CASE WHEN Date >= '20180930' THEN 1 ELSE 0 END) Validated,
COUNT(DISTINCT Building_Code) Total
from (
select distinct l.Building_Name, s.Building_Code, s.Date
,Rank_1 = rank() over(partition by l.Building_Name order by s.Date asc)
FROM #temp_Lab_Space s
JOIN #temp_Locations l ON s.Building_Code = l.Building_Code
) a
group by Building_Name

A wild guess:
select l.Building_Name
, count(s.Id)
, sum(s.Validated)
from Locations l
cross apply ( select s.Id
, max(case
when s.Date >= '20180930' then 1
else 0
end) as Validated
from Lab_Space s
where s.Building_Code = l.Building_Code
group by s.Id) s
group by l.Building_Name
This should give you the distinct Lab_Space ids and a flag showing whether each one is validated.
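
Another way to express the same idea is to remove the duplicates first and aggregate afterwards. This is only a sketch against the tables named in the question, and it assumes each Lab_Space id carries a single Date.

```sql
-- Sketch: collapse the duplicated join rows, then aggregate.
-- Assumption: one Date per s.id, so DISTINCT leaves one row per space.
SELECT d.Building_Name
     , SUM(CASE WHEN d.Date >= '20180930' THEN 1 ELSE 0 END) AS Validated
     , COUNT(*) AS Total
FROM (
    SELECT DISTINCT s.id, l.Building_Name, s.Date
    FROM Lab_Space s
    JOIN Locations l ON s.Building_Code = l.Building_Code
) d
GROUP BY d.Building_Name;
```

For the three sample rows in the question, the derived table holds two rows (ids 1 and 2), so both Validated and Total come out as 2.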

query is not returning distinct record

Hi, can you please take a look at why my query is not returning distinct records? I want results that meet the following conditions: OE1 = 'SCHEDCHNG', only the most recent record per ORDERID or ORDERNUM (that is, one record per order), and DROPDATE is null. My query is below:
select DISTINCT TOP 100 OE.ORDERID,OE.ID,OE.ORDERNUM,OE.OE4 from OrderExports OE
inner join (
select ORDERNUM, max(OE4) as MaxDate
from OrderExports
group by ORDERNUM
) tm
on OE.ORDERNUM = tm.ORDERNUM and OE.OE4 = tm.MaxDate
inner join orde_ O on OE.ORDERID = O.ORDERID
WHERE OE1='SCHEDCHNG' AND O.DROPDATE is null

Pretty sparse on details here, but I think you want something along these lines.
with SortedResults as
(
select OE.ORDERID
, OE.ID
, OE.ORDERNUM
, OE.OE4
, ROW_NUMBER() over(partition by OE.ORDERID, OE.ORDERNUM order by OE.OE4 desc) as RowNum
from OrderExports OE
inner join
(
select ORDERNUM
, max(OE4) as MaxDate
from OrderExports
group by ORDERNUM
) tm on OE.ORDERNUM = tm.ORDERNUM and OE.OE4 = tm.MaxDate
inner join orde_ O on OE.ORDERID = O.ORDERID
WHERE OE1='SCHEDCHNG'
AND O.DROPDATE is null
)
select ORDERID
, ID
, ORDERNUM
, OE4
from SortedResults
where RowNum = 1

You can try using MAX and GROUP BY as below:
SELECT a.ID, max(a.ORDERID) as OrderID, max(a.ORDERNUM) as OrderNum, max(a.OE4) as OE4 FROM
(
--your query
) a
group by a.ID

Getting most recent date from multiple SQL columns

The suggested answer, in this post, works great for two columns.
I have about 50 different date columns, where I need to be able to report on the most recent interaction, regardless of table.
In this case, I am bringing the columns in to a view, since they are coming from different tables in two different databases...
CREATE VIEW vMyView
AS
SELECT
comp_name AS Customer
, Comp_UpdatedDate AS Last_Change
, CmLi_UpdatedDate AS Last_Communication
, Case_UpdatedDate AS Last_Case
, AdLi_UpdatedDate AS Address_Change
FROM Company
LEFT JOIN Comm_Link on Comp_CompanyId = CmLi_Comm_CompanyId
LEFT JOIN Cases ON Comp_CompanyId = Case_PrimaryCompanyId
LEFT JOIN Address_Link on Comp_CompanyId = AdLi_CompanyID
...
My question is, how I would easily account for the many possibilities of one column being greater than the others?
Using only the two first columns, as per the example above, works great. But considering that one row could have column 3 as the highest value, another row could have column 14 etc...
SELECT Customer, MAX(CASE WHEN (Last_Change IS NULL OR Last_Communication> Last_Change)
THEN Last_Communication ELSE Last_Change
END) AS MaxDate
FROM vMyView
GROUP BY Customer
So, how can I easily grab the highest value for each row in any of the 50(ish) columns?
I am using SQL Server 2008 R2, but I also need this to work in versions 2012 and 2014.
Any help would be greatly appreciated.
EDIT:
I just discovered that the second database is storing the dates in NUMERIC fields, rather than DATETIME. (Stupid! I know!)
So I get the error:
The type of column "ARCUS" conflicts with the type of other columns specified in the UNPIVOT list.
I tried to resolve this with a CAST to make it DATETIME, but that only resulted in more errors.
;WITH X AS
(
SELECT Customer
,Value [Date]
,ColumnName [Entity]
,BusinessEmail
,ROW_NUMBER() OVER (PARTITION BY Customer ORDER BY Value DESC) rn
FROM (
SELECT comp_name AS Customer
, Pers_EmailAddress AS BusinessEmail
, Comp_UpdatedDate AS Company
, CmLi_UpdatedDate AS Communication
, Case_UpdatedDate AS [Case]
, AdLi_UpdatedDate AS [Address]
, PLink_UpdatedDate AS Phone
, ELink_UpdatedDate AS Email
, Pers_UpdatedDate AS Person
, oppo_updateddate as Opportunity
, samdat.dbo.ARCUS.AUDTDATE AS ARCUS
FROM vCompanyPE
LEFT JOIN Comm_Link on Comp_CompanyId = CmLi_Comm_CompanyId
LEFT JOIN Cases ON Comp_CompanyId = Case_PrimaryCompanyId
LEFT JOIN Address_Link on Comp_CompanyId = AdLi_CompanyID
LEFT JOIN PhoneLink on Comp_CompanyId = PLink_RecordID
LEFT JOIN EmailLink on Comp_CompanyId = ELink_RecordID
LEFT JOIN vPersonPE on Comp_CompanyId = Pers_CompanyId
LEFT JOIN Opportunity on Comp_CompanyId = Oppo_PrimaryCompanyId
LEFT JOIN Orders on Oppo_OpportunityId = Orde_opportunityid
LEFT JOIN SAMDAT.DBO.ARCUS on IDCUST = Comp_IdCust
COLLATE Latin1_General_CI_AS
WHERE Comp_IdCust IS NOT NULL
AND Comp_deleted IS NULL
) t
UNPIVOT (Value FOR ColumnName IN
(
Company
,Communication
,[Case]
,[Address]
,Phone
,Email
,Person
,Opportunity
,ARCUS
)
)up
)
SELECT Customer
, BusinessEmail
,[Date]
,[Entity]
FROM X
WHERE rn = 1 AND [DATE] >= DATEADD(year,-2,GETDATE()) and BusinessEmail is not null
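
UNPIVOT requires every listed column to have exactly the same data type, which is why the NUMERIC column raises the error above. One possible fix is to convert the numeric value inside the inner SELECT. The sketch below assumes AUDTDATE stores dates as yyyymmdd numbers (for example 20190115); that format is an assumption, since the question does not show the data. It uses ISDATE rather than TRY_CONVERT so that it also runs on SQL Server 2008 R2.

```sql
-- Sketch: turn a yyyymmdd NUMERIC into DATETIME so it matches the other
-- UNPIVOT columns. Assumption: AUDTDATE is stored as yyyymmdd.
-- Values that are not valid dates (such as 0) become NULL.
SELECT IDCUST
     , CASE WHEN ISDATE(CONVERT(char(8), AUDTDATE)) = 1
            THEN CONVERT(datetime, CONVERT(char(8), AUDTDATE), 112)
       END AS ARCUS
FROM SAMDAT.dbo.ARCUS;
```

The same CASE expression could replace `samdat.dbo.ARCUS.AUDTDATE AS ARCUS` in the inner query. UNPIVOT drops NULLs, so unconvertible values simply do not take part in the ranking.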

You could use CROSS APPLY to manually pivot your fields, then use MAX()
SELECT
vMyView.*,
greatest.val
FROM
vMyView
CROSS APPLY
(
SELECT
MAX(val) AS val
FROM
(
SELECT vMyView.field01 AS val
UNION ALL SELECT vMyView.field02 AS val
...
UNION ALL SELECT vMyView.field50 AS val
)
AS manual_pivot
)
AS greatest
The innermost query pivots each field into a new row, then MAX() re-aggregates them back into a single row. (It also skips NULLs, so you don't need to cater for them explicitly.)
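
The UNION ALL list can be written more compactly with a table value constructor, which is available from SQL Server 2008 onward. This is a sketch using the four column names from the view in the question; the remaining columns would be added to the VALUES list in the same way.

```sql
-- Sketch: same technique, with VALUES in place of the UNION ALL chain.
-- Returns one MaxDate per row of the view, not per customer.
SELECT v.Customer
     , g.MaxDate
FROM vMyView v
CROSS APPLY (
    SELECT MAX(d.val) AS MaxDate
    FROM (VALUES (v.Last_Change)
               , (v.Last_Communication)
               , (v.Last_Case)
               , (v.Address_Change)) AS d(val)
) g;
```

If the joins in the view return several rows per customer, wrapping this in `SELECT Customer, MAX(MaxDate) ... GROUP BY Customer` would reduce it to one row each.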

;WITH X AS
(
SELECT Customer
,Value [Date]
,ColumnName [CommunicationType]
,ROW_NUMBER() OVER (PARTITION BY Customer ORDER BY Value DESC) rn
FROM (
SELECT comp_name AS Customer
, Comp_UpdatedDate AS Last_Change
, CmLi_UpdatedDate AS Last_Communication
, Case_UpdatedDate AS Last_Case
, AdLi_UpdatedDate AS Address_Change
FROM Company
LEFT JOIN Comm_Link on Comp_CompanyId = CmLi_Comm_CompanyId
LEFT JOIN Cases ON Comp_CompanyId = Case_PrimaryCompanyId
LEFT JOIN Address_Link on Comp_CompanyId = AdLi_CompanyID
) t
UNPIVOT (Value FOR ColumnName IN (Last_Change,Last_Communication,
Last_Case,Address_Change))up
)
SELECT Customer
,[Date]
,[CommunicationType]
FROM X
WHERE rn = 1

Sum of missing data

The query below shows each site with its total orders within the last week.
If a site has no orders in the last week, I should still see that site with a sum of zero.
At the moment it only gives me four sites, because no orders have been made in the last week for the other three.
select SITE
,SUM(Case When OrderDate >= dateadd(dd,(datediff(dd,-53690,getdate()-1)/7)*7,-53690)
Then 1
Else 0
End) as COMPLETED
from
(
SELECT DISTINCT ORDERS.SITE, ORDERS.ORDERDATE FROM ORDERS
INNER JOIN PHONEDATA AS P
ON ORDERS.RECID = P.OrderID
where SITE IN ('SITE1','SITE2','SITE3','SITE4','SITE5','SITE6','SITE7')
) X
GROUP BY SITE
order by SITE
RESULT:
Site---------------------Completed
SITE1-----------------------2
SITE2-----------------------2
SITE3-----------------------2
SITE4-----------------------2
EXPECTED RESULT:
Site---------------------Completed
SITE1-----------------------2
SITE2-----------------------2
SITE3-----------------------2
SITE4-----------------------2
SITE5-----------------------0
SITE6-----------------------0
SITE7-----------------------0
updated:
select SITE
,SUM(Case When OrderDate >= dateadd(dd,(datediff(dd,-53690,getdate()-1)/7)*7,-53690)
Then 1
Else 0
End) as COMPLETED
from
(
SELECT DISTINCT ORDERS.SITE, ORDERS.ORDERDATE FROM ORDERS
where SITE IN ('SITE1','SITE2','SITE3','SITE4','SITE5','SITE6','SITE7')
) X
GROUP BY SITE
order by SITE
I have now removed the inner join with the PHONEDATA table, so I am getting the missing sites. I had avoided this approach because, if I rely only on the ORDERS table, the ORDERDATE value is inserted several times for a given order, and only the final order makes it to the PHONEDATA table. As a result, I now get higher values in the COMPLETED count, but it should only consider the latest value for each day for each site.
Result of the update:
Site---------------------Completed
SITE1-----------------------5
SITE2-----------------------5
SITE3-----------------------5
SITE4-----------------------5
SITE5-----------------------0
SITE6-----------------------0
SITE7-----------------------0
expected
Site---------------------Completed
SITE1-----------------------2
SITE2-----------------------2
SITE3-----------------------2
SITE4-----------------------2
SITE5-----------------------0
SITE6-----------------------0
SITE7-----------------------0

If there are no rows in the table with the sites that have no orders, how can it return any rows to count? Perhaps you have a table with all the possible sites that can be joined to? Or create a temp table with the site values. You could then left join the orders table to this. i.e.
create table #sites (site varchar(25));
insert into #sites values ('SITE1'),('SITE2'),('SITE3'),('SITE4'),('SITE5'),('SITE6'),('SITE7');
...
from
(
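-- take SITE from #sites so sites without orders are kept;
-- the inner join is nested so it does not cancel out the left join
SELECT DISTINCT s.site AS SITE, ORDERS.ORDERDATE FROM
#sites s left join (ORDERS
INNER JOIN PHONEDATA AS P
ON ORDERS.RECID = P.OrderID) on ORDERS.SITE = s.site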
) X
...

Try using a left join instead of the inner join. It is probably not getting rows from the phone data table:
select SITE
,SUM(Case When OrderDate >= dateadd(dd,(datediff(dd,-53690,getdate()-1)/7)*7,-53690)
Then 1
Else 0
End) as COMPLETED
from
(
SELECT DISTINCT ORDERS.SITE, ORDERS.ORDERDATE FROM ORDERS
Left JOIN PHONEDATA AS P
ON ORDERS.RECID = P.OrderID
where SITE IN ('SITE1','SITE2','SITE3','SITE4','SITE5','SITE6','SITE7')
) X
GROUP BY SITE
order by SITE

It'd be best to start with a "Site" table and then left join to your results. This example mimics the behavior, and can be used as a hack-workaround.
DECLARE @table TABLE
(
site VARCHAR(10) ,
Completed TINYINT
)
INSERT INTO @table
( site, Completed )
VALUES ( 'SITE1', 0 ),
( 'SITE2', 0 ),
( 'SITE3', 0 ),
( 'SITE4', 0 ),
( 'SITE5', 0 ),
( 'SITE6', 0 ),
( 'SITE7', 0 )
-- the statement before a CTE must be terminated, hence the leading semicolon
;WITH cte
AS ( SELECT SITE ,
SUM(CASE WHEN OrderDate >= DATEADD(dd,( DATEDIFF(dd, -53690, GETDATE() - 1) / 7 ) * 7, -53690)
THEN 1
ELSE 0
END) AS COMPLETED
FROM ( SELECT DISTINCT
ORDERS.SITE ,
ORDERS.ORDERDATE
FROM ORDERS
INNER JOIN PHONEDATA AS P ON ORDERS.RECID = P.OrderID
WHERE SITE IN ( 'SITE1', 'SITE2', 'SITE3',
'SITE4', 'SITE5', 'SITE6',
'SITE7' )
) X -- a derived table needs an alias
GROUP BY SITE
)
SELECT t.site ,
-- sites missing from the CTE come back as NULL, so default them to 0
t.Completed + ISNULL(cte.COMPLETED, 0) AS COMPLETED
FROM @table t
LEFT OUTER JOIN cte ON t.site = cte.SITE
ORDER BY t.site
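
The same idea can be sketched without a table variable by listing the sites inline and counting only the matched order dates. This assumes, as in the question, that an order counts only if it has a row in PHONEDATA. The aliases s, o, p and x are chosen here for illustration.

```sql
-- Sketch: start from the full site list, left join the deduplicated
-- orders for the period, and let COUNT ignore the unmatched (NULL) rows.
SELECT s.SITE
     , COUNT(x.ORDERDATE) AS COMPLETED
FROM (VALUES ('SITE1'), ('SITE2'), ('SITE3'), ('SITE4'),
             ('SITE5'), ('SITE6'), ('SITE7')) AS s(SITE)
LEFT JOIN (
    SELECT DISTINCT o.SITE, o.ORDERDATE
    FROM ORDERS o
    INNER JOIN PHONEDATA p ON o.RECID = p.OrderID
    WHERE o.ORDERDATE >= DATEADD(dd, (DATEDIFF(dd, -53690, GETDATE() - 1) / 7) * 7, -53690)
) x ON x.SITE = s.SITE
GROUP BY s.SITE
ORDER BY s.SITE;
```

COUNT(x.ORDERDATE) skips NULLs, so a site with no matching orders reports 0 instead of disappearing from the result.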
