Calculate a Running Monthly Average in SQL Server - sql-server

We want to create a data set that shows the monthly average count in our equipment table, broken down by its status: Active, Scrapped, New.
The more I ponder this, the more it seems that the only way to accomplish it is to first create a container temp table and evaluate each record using a cursor.
Can this be accomplished without a temp table?
The following just shows the fields we're working with:
SELECT a1.statusdate, a1.CreateDate,
RunningTotalActive = count([status]='Active'),
RunningTotalScrapped = count([status]='Scrapped'),
NewEquipment = count(Month(a1.CreateDate) )
FROM dbo.Equipment AS a1
INNER JOIN dbo.Equipment AS a2
ON a2.statusdate <= a1.CreateDate
GROUP BY a1.statusdate
ORDER BY a1.statusdate desc

I'm making a few SWAGs about your data, but the idea is to get simple counts (sums) by month and then average them, using CTEs.
; WITH A AS (
SELECT a1.statusdate,
Active = CASE a1.[status] WHEN 'Active' THEN 1 ELSE 0 END,
Scrapped = CASE a1.[status] WHEN 'Scrapped' THEN 1 ELSE 0 END,
New = CASE WHEN a2.statusdate = a1.CreateDate THEN 1 ELSE 0 END --Guessing here that "new" means status date and create date are the same
FROM dbo.Equipment AS a1
INNER JOIN dbo.Equipment AS a2
ON a2.statusdate <= a1.CreateDate --"status" can be older than "create" for a piece of equipment? Not sure I understand this criteria. May need sample data.
), B AS (
SELECT Y = DATEPART(YEAR, statusdate)
, M = DATEPART(MONTH, statusdate)
, SumActive = SUM(Active)
, SumScrapped = SUM(Scrapped)
, SumNew = SUM(New)
FROM A
GROUP BY DATEPART(YEAR, statusdate), DATEPART(MONTH, statusdate)
)
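SELECT Y, M,
RunningTotalActive = AVG(1.0 * SumActive) OVER (ORDER BY Y, M ROWS UNBOUNDED PRECEDING), --no PARTITION BY Y, M here: that would put every month in its own window, so nothing would accumulate
RunningTotalScrapped = AVG(1.0 * SumScrapped) OVER (ORDER BY Y, M ROWS UNBOUNDED PRECEDING),
NewEquipment = AVG(1.0 * SumNew) OVER (ORDER BY Y, M ROWS UNBOUNDED PRECEDING)
FROM B;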

Can you show some sample data?
I modified your script based on my understanding of the requirement. Can you try it?
SELECT YEAR(a1.statusdate) AS yr,MONTH(a1.statusdate) AS mon,
RunningTotalActive = count(CASE WHEN a1.[status]='Active' THEN 1 ELSE NULL END ), -- or SUM(CASE WHEN [status]='Active' THEN 1 ELSE 0 END ),
RunningTotalScrapped = count(CASE WHEN a1.[status]='Scrapped' THEN 1 ELSE NULL END),
NewEquipment = count(CASE WHEN YEAR(a1.CreateDate)*12+ MONTH(a1.CreateDate)= YEAR(a1.statusdate)*12+ MONTH(a1.statusdate) THEN 1 ELSE NULL END ) -- created and last updated in the same month
FROM dbo.Equipment AS a1
GROUP BY YEAR(a1.statusdate),MONTH(a1.statusdate)
ORDER BY YEAR(a1.statusdate) DESC,MONTH(a1.statusdate) DESC
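For what it's worth, here is a minimal, self-contained sketch of the same idea on made-up data: count per status per month, then take a running average of those monthly counts with a window frame. The table variable @Equipment, its sample rows and the output column names are assumptions for illustration only, and DATEFROMPARTS and the ROWS frame need SQL Server 2012 or later.

```sql
-- Hypothetical sample data; the real dbo.Equipment table will differ.
DECLARE @Equipment TABLE (EquipmentId int, [status] varchar(10), statusdate date);

INSERT INTO @Equipment (EquipmentId, [status], statusdate) VALUES
(1, 'Active',   '20240105'),
(2, 'Active',   '20240120'),
(3, 'Scrapped', '20240122'),
(4, 'Active',   '20240203'),
(5, 'Scrapped', '20240210'),
(6, 'Scrapped', '20240215'),
(7, 'Active',   '20240301');

WITH Monthly AS (
    -- One row per calendar month with a simple count per status
    SELECT MonthStart    = DATEFROMPARTS(YEAR(statusdate), MONTH(statusdate), 1),
           ActiveCount   = SUM(CASE WHEN [status] = 'Active'   THEN 1 ELSE 0 END),
           ScrappedCount = SUM(CASE WHEN [status] = 'Scrapped' THEN 1 ELSE 0 END)
    FROM @Equipment
    GROUP BY DATEFROMPARTS(YEAR(statusdate), MONTH(statusdate), 1)
)
SELECT MonthStart,
       ActiveCount,
       ScrappedCount,
       -- 1.0 * forces decimal arithmetic; AVG over int columns would truncate
       RunningAvgActive   = AVG(1.0 * ActiveCount)
                            OVER (ORDER BY MonthStart ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW),
       RunningAvgScrapped = AVG(1.0 * ScrappedCount)
                            OVER (ORDER BY MonthStart ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
FROM Monthly
ORDER BY MonthStart;
```

With these rows the monthly Active counts are 2, 1 and 1, so the running average comes out as 2, 1.5 and roughly 1.33. No temp table or cursor is involved.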

Related

Group by two columns with case query in SQL Server

I'm trying to retrieve driver data with the total accepted and total ignored ride requests for the current date.
Based on the Drivers and DriverReceivedRequests tables, I can get the total count, but the twist is that the DriverReceivedRequests table contains duplicate rows for the same DriverId and RideId. So the grouping has to take both DriverId and RideId into account: a driver gets multiple requests on the current date, but can also receive the request for the same ride two or three times.
This is the table structure for DriverReceivedRequests:
Id        DriverId  RideId    ReceivedStatusId  DateTime
--------  --------  --------  ----------------  --------
0014d26b  93665f55  fef6fb96  NULL              04:55.6
00175c65  6e62a94e  cb214a84  NULL              09:32.1
0017c22b  ec9e1297  4b47dc8a  4211357D          10:28:5
0014d26b  6e62a94e  fef6fb96  NULL              04:56.8
This is the query I have tried:
select
d.Id, d.FirstName, d.LastName,
Sum(case when drrs.Number = 1 then 1 else 0 end) as TotalAccepted,
Sum(case when drrs.Number = 2 or drr.ReceivedStatusId is null then 1 else 0 end) as TotalRejected
from
dbo.[DriverReceivedRequests] drr
inner join
dbo.[Drivers] d on drr.DriverId = d.Id
left join
dbo.[DriverReceivedRequestsStatus] drrs on drr.ReceivedStatusId = drrs.Id
where
Day(drr.Datetime) = Day(getdate())
and month(drr.DateTime) = Month(getdate())
group by
d.FirstName, d.LastName, d.Id
In the above query, if I also group by RideId, it generates duplicate driver names with incorrect data. I've also tried a PARTITION BY clause on DriverId, but it does not give the correct result.
This also generates the same result:
WITH cte AS
(
SELECT
d.Id,
d.FirstName, d.LastName,
TotalAccepted = SUM (CASE WHEN drrs.Number = 1 THEN 1 ELSE 0 END) OVER (PARTITION BY drr.DriverId),
TotalRejected = SUM (CASE WHEN drrs.Number = 2 OR drr.ReceivedStatusId IS NULL THEN 1 ELSE 0 END)
OVER (PARTITION BY drr.DriverId),
rn = ROW_NUMBER() OVER(PARTITION BY drr.DriverId
ORDER BY drr.DateTime DESC)
FROM
DriverReceivedRequests drr
INNER JOIN
dbo.[Drivers] d ON drr.DriverId = d.Id
LEFT JOIN
dbo.[DriverReceivedRequestsStatus] drrs ON drr.ReceivedStatusId = drrs.Id
WHERE
DAY (drr.Datetime) = DAY (GETDATE())
AND MONTH (drr.DateTime) = MONTH (GETDATE())
)
SELECT
Id,
FirstName, LastName,
TotalAccepted, TotalRejected
FROM
cte
WHERE
rn = 1
My question is how can I group by individual driver data in terms of incoming ride requests?
Note: the driver receives same ride request multiple times
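One possible way to think about it, sketched on made-up data: collapse the duplicates to one row per (DriverId, RideId) first, and only then count per driver. The table variable, the integer ids and the flattened StatusNumber column (1 = accepted, 2 = rejected, NULL = ignored) are all hypothetical stand-ins for the real tables and joins, the date filter is left out, and the rule "a ride counts as accepted if any of its duplicate rows was accepted" is an assumption that may not match the real business rule.

```sql
-- Hypothetical, simplified stand-in for DriverReceivedRequests joined to its status table
DECLARE @Requests TABLE (DriverId int, RideId int, StatusNumber int NULL);

INSERT INTO @Requests (DriverId, RideId, StatusNumber) VALUES
(1, 10, NULL),  -- same ride offered twice to driver 1 ...
(1, 10, 1),     -- ... and accepted the second time
(1, 11, NULL),
(1, 11, NULL),  -- ride 11 offered twice, never accepted
(2, 10, 2);     -- driver 2 rejected ride 10

WITH PerRide AS (
    -- Step 1: one row per driver and ride, so repeated requests count once
    SELECT DriverId,
           RideId,
           Accepted = MAX(CASE WHEN StatusNumber = 1 THEN 1 ELSE 0 END)
    FROM @Requests
    GROUP BY DriverId, RideId
)
-- Step 2: aggregate the de-duplicated rides per driver
SELECT DriverId,
       TotalAccepted = SUM(Accepted),
       TotalRejected = SUM(1 - Accepted)
FROM PerRide
GROUP BY DriverId
ORDER BY DriverId;
```

On this sample, driver 1 ends up with 1 accepted and 1 rejected ride, and driver 2 with 0 accepted and 1 rejected. Joining the result back to Drivers for the names would then not multiply any rows, because there is exactly one row per driver.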

SQL - Finding Gaps in Coverage

I am working on this problem in SQL Server.
Here is my problem.
I have something like this:
Dataset A
FK_ID  StartDate  EndDate     Type
1      10/1/2018  11/30/2018  M
1      12/1/2018  2/28/2019   N
1      3/1/2019   10/31/2019  M
I have a second data source I have no control over with data something like this:
Dataset B
FK_ID  SpanStart  SpanEnd     Type
1      10/1/2018  10/15/2018  M
1      10/1/2018  10/25/2018  M
1      2/15/2019  4/30/2019   M
1      5/1/2019   10/31/2019  M
What I am trying to accomplish is to check to make sure every date within each TYPE M record in Dataset A has at least 1 record in Dataset B.
For example record 1 in Dataset A does NOT have coverage from 10/26/2018 through 11/30/2018. I really only care about when the coverage ends, in this case I want to return 10/26/2018 because it is the first date where the span has no coverage from Dataset B.
I've written a function that does this but it is pretty slow because it is cycling through each date within each M record and counting the number of records in Dataset B. It exits the loop when it finds the first one but I would really like to make this more efficient. I am sure I am not thinking about this properly so any suggestions anyone can offer would be helpful.
This is the section of code I'm currently running
else if @SpanType = 'M'
begin
set @CurrDate = @SpanStart
set @UncovDays = 0
while @CurrDate <= @SpanEnd
Begin
if (SELECT count(*)
FROM eligiblecoverage ec join eligibilityplan ep on ec.plandescription = ep.planname
WHERE ec.masterindividualid = @IndID
and ec.planbegindate <= @CurrDate and ec.planenddate >= @CurrDate
and ec.sourcecreateddate = @MaxDate
and ep.medicaidcoverage = 1) = 0
begin
SET @Result = concat('NON Starting ',format(@CurrDate, 'M/d/yyyy'))
BREAK
end
set @CurrDate = @CurrDate + 1
end
end
I am not married to having a function; I just could not find a way to do this in queries that wasn't very, very slow.
EDIT: Dataset B will never have any TYPEs except M so that is not a consideration
EDIT 2: The code offered by DonPablo does de-overlap the data but only in cases where there is an overlap at all. It reduces dataset B to:
FK_ID  SpanStart  SpanEnd     Type
1      10/1/2018  10/25/2018  M
instead of
FK_ID  SpanStart  SpanEnd     Type
1      10/1/2018  10/25/2018  M
1      2/15/2019  4/30/2019   M
1      5/1/2019   10/31/2019  M
I am still futzing around with it but it's a start.
I would approach this by focusing on B. My assumption is that any absent record would follow span_end in the table. So here is the idea:
Unpivot the dates in B (adding "1" to the end dates)
Add a flag if they are present with type "M".
Check to see if any not-present records are in the span for A.
Check the first and last dates as well.
So, this looks like:
with bdates as (
select v.dte,
(case when exists (select 1
from b b2
where v.dte between b2.spanstart and b2.spanend and
b2.type = 'M'
)
then 1 else 0
end) as in_b
from b cross apply
(values (spanstart), (dateadd(day, 1, spanend))
) v(dte)
where b.type = 'M' -- all we care about
group by v.dte -- no need for duplicates
)
select a.*,
(case when not exists (select 1
from b b2
where a.startdate between b2.spanstart and b2.spanend and
b2.type = 'M'
)
then 0
when not exists (select 1
from b b2
where a.enddate between b2.spanstart and b2.spanend and
b2.type = 'M'
)
then 0
when exists (select 1
from bdates bd
where bd.dte between a.startdate and a.enddate and
bd.in_b = 0
)
then 0
when exists (select 1
from b b2
where a.startdate between b2.spanstart and b2.spanend and
b2.type = 'M'
)
then 1
else 0
end)
from a;
What is this doing? Four validity checks:
Is the starttime valid?
Is the endtime valid?
Are any intermediate dates invalid?
Is there at least one valid record?
Start by framing the problem in smaller pieces, as a sequence of actions, like I did in the comment.
See George Polya, "How To Solve It" (1945).
Then Google is your friend: search for "sql de-overlap date ranges into one record" (over a million results).
UPDATED: I picked "Merge overlapping dates in SQL Server"
and updated it for our table and column names.
Also look at the theory in Allen's Interval Algebra (1983): https://www.ics.uci.edu/~alspaugh/cls/shr/allen.html
Or, from 2014: https://stewashton.wordpress.com/2014/03/11/sql-for-date-ranges-gaps-and-overlaps/
This is a primer on how to set up test data for this problem.
Finally, determine what counts by ranking the various pairs of A vs B:
bypass those that are totally within, then work with the earliest partial overlaps, and lastly handle the precede/follow items.
--from Merge overlapping dates in SQL Server
with SpanStarts as
(
select distinct FK_ID, SpanStart
from Coverage_B as t1
where not exists
(select * from Coverage_B as t2
where t2.FK_ID = t1.FK_ID
and t2.SpanStart < t1.SpanStart
and t2.SpanEnd >= t1.SpanStart)
),
SpanEnds as
(
select distinct FK_ID, SpanEnd
from Coverage_B as t1
where not exists
(select * from Coverage_B as t2
where t2.FK_ID = t1.FK_ID
and t2.SpanEnd > t1.SpanEnd
and t2.SpanStart <= t1.SpanEnd)
),
DeOverlapped_B as
(
Select FK_ID, SpanStart,
(select min(SpanEnd) from SpanEnds as e
where e.FK_ID = s.FK_ID
and SpanEnd >= SpanStart) as SpanEnd
from SpanStarts as s
)
Select * from DeOverlapped_B
Now we have something to feed into the next steps, and we can use the above as a CTE
======================================
with SpanStarts as
(
select distinct FK_ID, SpanStart
from Coverage_B as t1
where not exists
(select * from Coverage_B as t2
where t2.FK_ID = t1.FK_ID
and t2.SpanStart < t1.SpanStart
and t2.SpanEnd >= t1.SpanStart)
),
SpanEnds as
(
select distinct FK_ID, SpanEnd
from Coverage_B as t1
where not exists
(select * from Coverage_B as t2
where t2.FK_ID = t1.FK_ID
and t2.SpanEnd > t1.SpanEnd
and t2.SpanStart <= t1.SpanEnd)
),
DeOverlapped_B as
(
Select FK_ID, SpanStart,
(select min(SpanEnd) from SpanEnds as e
where e.FK_ID = s.FK_ID
and SpanEnd >= SpanStart) as SpanEnd
from SpanStarts as s
),
-- find A row's coverage
ACoverage as (
Select
a.*, b.SpanEnd, b.SpanStart,
Case
When SpanStart <= StartDate And StartDate <= SpanEnd
And SpanStart <= EndDate And EndDate <= SpanEnd
Then '1within' -- starts, equals, during, finishes
When EndDate < SpanStart
Or SpanEnd < StartDate
Then '3beforeAfter' -- precedes, meets, preceded by, met by
Else '2overlap' -- one or two ends hang over spanStart/End
End as relation
From Coverage_A a
Left Join DeOverlapped_B b
On a.FK_ID = b.FK_ID
Where a.Type = 'M'
)
Select
*
,Case
When relation1 = '2' And StartDate < SpanStart Then StartDate
When relation1 = '2' Then DateAdd(d, 1, SpanEnd)
When relation1 = '3' Then StartDate
End as UnCoveredBeginning
From (
Select
*
,SUBSTRING(relation,1,1) as relation1
,ROW_NUMBER() Over (Partition by A_ID Order by relation, SpanStart) as Rownum
from ACoverage
) aRNO
Where Rownum = 1
And relation1 <> '1'
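As a cross-check for either approach, here is a small self-contained sketch that runs against the sample rows from the question. It relies on one observation: the first uncovered date inside an A span can only be the span's own StartDate or the day after some B SpanEnd, so only those candidate dates need testing rather than every day. The table variables @A and @B and the output column FirstUncoveredDate are assumptions made for the example, not names from the real schema.

```sql
-- Sample rows copied from the question (Dataset A and Dataset B)
DECLARE @A TABLE (FK_ID int, StartDate date, EndDate date, [Type] char(1));
DECLARE @B TABLE (FK_ID int, SpanStart date, SpanEnd date, [Type] char(1));

INSERT INTO @A (FK_ID, StartDate, EndDate, [Type]) VALUES
(1, '20181001', '20181130', 'M'),
(1, '20181201', '20190228', 'N'),
(1, '20190301', '20191031', 'M');

INSERT INTO @B (FK_ID, SpanStart, SpanEnd, [Type]) VALUES
(1, '20181001', '20181015', 'M'),
(1, '20181001', '20181025', 'M'),
(1, '20190215', '20190430', 'M'),
(1, '20190501', '20191031', 'M');

SELECT a.FK_ID, a.StartDate, a.EndDate, g.FirstUncoveredDate
FROM @A AS a
OUTER APPLY (
    -- Earliest candidate date that no B span covers (NULL when fully covered)
    SELECT FirstUncoveredDate = MIN(c.dte)
    FROM (
        SELECT a.StartDate                 -- candidate 1: the A span's first day
        UNION ALL
        SELECT DATEADD(DAY, 1, b.SpanEnd)  -- candidate 2: the day after each B span ends
        FROM @B AS b
        WHERE b.FK_ID = a.FK_ID
          AND b.SpanEnd >= a.StartDate
          AND b.SpanEnd <  a.EndDate       -- keeps the candidate inside the A span
    ) AS c (dte)
    WHERE NOT EXISTS (
        SELECT 1
        FROM @B AS b2
        WHERE b2.FK_ID = a.FK_ID
          AND c.dte BETWEEN b2.SpanStart AND b2.SpanEnd
    )
) AS g
WHERE a.[Type] = 'M'
ORDER BY a.StartDate;
```

For the first M row (10/1/2018 to 11/30/2018) this returns 10/26/2018, the date the question expects; for the second M row (3/1/2019 to 10/31/2019) it returns NULL because the B spans cover it completely. Overlapping B spans do not need to be merged first, since every candidate is tested against all spans.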

Select from same column under different conditions

I need to join this table to itself. I need to select occurrences where:
ex_head_of_family_active = 1 AND tax_year = 2017
and also:
ex_head_of_family_active = 0 AND tax_year = 2016
The first time I tried the join, I got the error that warehouse_data.dbo.tb_master_ascend and warehouse_data.dbo.tb_master_ascend in the FROM clause have the same exposed names.
With the query as shown below, I get a syntax error on the "where". What am I doing wrong? Thank you
use [warehouse_data]
select
parcel_number as Account,
pact_code as type,
owner_name as Owner,
case
when ex_head_of_family_active >= 1
then 'X'
else ''
end 'Head_Of_Fam'
from
warehouse_data.dbo.tb_master_ascend
inner join
warehouse_data.dbo.tb_master_ascend on parcel_number = parcel_number
where
warehouse_data.dbo.tb_master_ascend.tax_year = '2016'
and ex_head_of_family_active = 0
where
warehouse_data.dbo.tb_master_ascend.t2.tax_year = '2017'
and ex_head_of_family_active >= 1
and (eff_from_date <= getdate())
and (eff_to_date is null or eff_to_date >= getdate())
@marc_s I changed the where statements and updated my code; however, the filter is not working now:
use [warehouse_data]
select
wh2.parcel_number as Account
,wh2.pact_code as Class_Type
,wh2.owner_name as Owner_Name
,case when wh2.ex_head_of_family_active >= 1 then 'X'
else ''
end 'Head_Of_Fam_2017'
from warehouse_data.dbo.tb_master_ascend as WH2
left join warehouse_data.dbo.tb_master_ascend as WH1 on ((WH2.parcel_number = wh1.parcel_number)
and (WH1.tax_year = '2016')
and (WH1.ex_head_of_family_active is null))
where WH2.tax_year = '2017'
and wh2.ex_head_of_family_active >= 1
and (wh2.eff_from_date <= getdate())
and (wh2.eff_to_date is null or wh2.eff_to_date >= getdate())
I would use a CTE to get all your parcels that meet your 2016 rules.
Then join that against your 2017 rules on parcel ID.
I'm summarizing:
with cte as
(
select parcelID
from table
where [2016 rules]
group by parcelID --If this isn't unique, the join will multiply (Cartesian) your results
)
select columns
from table
join cte on table.parcelid=cte.parcelID
where [2017 rules]
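Filled in with the column names from the question, the outline above might look like the sketch below. The table variable, its sample rows and the integer type for tax_year are assumptions made so the example runs on its own; the real query would read from warehouse_data.dbo.tb_master_ascend and keep the eff_from_date / eff_to_date filters.

```sql
-- Hypothetical sample rows standing in for warehouse_data.dbo.tb_master_ascend
DECLARE @tb_master_ascend TABLE (
    parcel_number            varchar(20),
    tax_year                 int,
    owner_name               varchar(50),
    ex_head_of_family_active int
);

INSERT INTO @tb_master_ascend (parcel_number, tax_year, owner_name, ex_head_of_family_active) VALUES
('P1', 2016, 'Smith', 0), ('P1', 2017, 'Smith', 1),  -- gained the exemption in 2017
('P2', 2016, 'Jones', 1), ('P2', 2017, 'Jones', 1),  -- already had it in 2016
('P3', 2016, 'Brown', 0), ('P3', 2017, 'Brown', 0);  -- never had it

WITH parcels_2016 AS (
    -- Parcels that meet the 2016 rules, one row per parcel
    SELECT parcel_number
    FROM @tb_master_ascend
    WHERE tax_year = 2016
      AND ex_head_of_family_active = 0
    GROUP BY parcel_number
)
SELECT cur.parcel_number AS Account,
       cur.owner_name    AS Owner_Name,
       'X'               AS Head_Of_Fam_2017
FROM @tb_master_ascend AS cur
JOIN parcels_2016 AS p
  ON p.parcel_number = cur.parcel_number
WHERE cur.tax_year = 2017
  AND cur.ex_head_of_family_active >= 1;
```

Only parcel P1 comes back here. Giving each copy of the table its own alias (or putting one copy in a CTE, as above) is also what avoids the "same exposed names" error.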

Updating multiple columns for each record from table with multiple records for each ID

This is my first time posting a question here so please be gentle, I searched as exhaustively as I could. Sometimes it's how to search for the answer that is half the battle.
What I'm trying to do is update Table 1 with the data from Table 2 for each person for each period. Some people will have category A,B, and/or C records in Table 2, but not all and not necessarily all three.
The problem I'm running into is that my statement I'm using to do the update will update some of the columns but not all. I'm guessing this is because the updates weren't committed yet while doing the update so it can't fetch values that haven't been committed yet.
Do I need to do 3 separate update statements, or can this somehow be handled through a case statement? I'm looking for the most efficient method here, since Table 1 has 2 million records for each period.
Table 1 - Period_Perf
CustID
Period_Date
Perf_Cat_A
Perf_Cat_B
Perf_Cat_C
Table 2 - Period_Perf_Detail
CustID
Period_Date
Perf_Category (will contain A, B, or C)
Perf_Points (will contain an integer value)
Here's essentially the statement I've been trying to use:
UPDATE
Period_Perf
SET
Perf_Cat_A = CASE WHEN pd.Perf_Category = 'A' then pd.Total_Perf_Points else Perf_Cat_A END,
Perf_Cat_B = CASE WHEN pd.Perf_Category = 'B' then pd.Total_Perf_Points else Perf_Cat_B END,
Perf_Cat_C = CASE WHEN pd.Perf_Category = 'C' then pd.Total_Perf_Points else Perf_Cat_C END
from
Period_Perf
INNER JOIN
(
select
CustID
,Period_Date
,Perf_Category
,sum(Perf_Points) as Total_Perf_Points
from
Period_Perf_Detail
group by CustID, Period_Date, Perf_Category
) as pd
ON
Period_Perf.CustID = pd.CustID and Period_Perf.Period_Date = pd.Period_Date
To update all the values at once, you'll need to have data that has all those three values in one row, so something like this:
UPDATE P
SET
Perf_Cat_A = pd.Cat_A,
Perf_Cat_B = pd.Cat_B,
Perf_Cat_C = pd.Cat_C
from
Period_Perf P
cross apply (
select
sum(case when Perf_Category = 'A' then Perf_Points else 0 end) as Cat_A,
sum(case when Perf_Category = 'B' then Perf_Points else 0 end) as Cat_B,
sum(case when Perf_Category = 'C' then Perf_Points else 0 end) as Cat_C
from
Period_Perf_Detail D
where
D.CustID = P.CustID and
D.Period_Date = P.Period_Date
) as pd
I haven't tested this, but hopefully there are no bugs.
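To try the CROSS APPLY version without touching real data, something like the following sketch should work. The table variables and sample rows are made up for illustration. The likely reason the original statement sets only some columns is that its join produces up to three source rows (one per category) for each target row, and UPDATE applies only one of them to a given row; it has nothing to do with commits.

```sql
-- Hypothetical stand-ins for Period_Perf and Period_Perf_Detail
DECLARE @Period_Perf TABLE (CustID int, Period_Date date, Perf_Cat_A int, Perf_Cat_B int, Perf_Cat_C int);
DECLARE @Period_Perf_Detail TABLE (CustID int, Period_Date date, Perf_Category char(1), Perf_Points int);

INSERT INTO @Period_Perf (CustID, Period_Date, Perf_Cat_A, Perf_Cat_B, Perf_Cat_C) VALUES
(1, '20240101', NULL, NULL, NULL),
(2, '20240101', 1, 1, 1);          -- customer 2 has no detail rows at all

INSERT INTO @Period_Perf_Detail (CustID, Period_Date, Perf_Category, Perf_Points) VALUES
(1, '20240101', 'A', 5),
(1, '20240101', 'A', 3),
(1, '20240101', 'C', 10);

UPDATE P
SET Perf_Cat_A = pd.Cat_A,
    Perf_Cat_B = pd.Cat_B,
    Perf_Cat_C = pd.Cat_C
FROM @Period_Perf AS P
CROSS APPLY (
    -- Exactly one row per target row, with all three categories side by side
    SELECT SUM(CASE WHEN Perf_Category = 'A' THEN Perf_Points ELSE 0 END) AS Cat_A,
           SUM(CASE WHEN Perf_Category = 'B' THEN Perf_Points ELSE 0 END) AS Cat_B,
           SUM(CASE WHEN Perf_Category = 'C' THEN Perf_Points ELSE 0 END) AS Cat_C
    FROM @Period_Perf_Detail AS D
    WHERE D.CustID = P.CustID
      AND D.Period_Date = P.Period_Date
) AS pd;

SELECT CustID, Period_Date, Perf_Cat_A, Perf_Cat_B, Perf_Cat_C
FROM @Period_Perf
ORDER BY CustID;
```

Customer 1 ends up with 8, 0 and 10. Customer 2 ends up with NULL in all three columns, because an aggregate without GROUP BY still returns one row over an empty set and SUM of no rows is NULL. If existing values should be kept for customers without detail rows, wrapping each assignment in ISNULL(pd.Cat_A, P.Perf_Cat_A) and so on is one option.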
Use SELECT statements first to understand your data, then build an update based on that select statement.
First use the following query to better understand why your update statement is not producing your desired results.
SELECT CASE WHEN pd.Perf_Category = 'A' THEN pd.Total_Perf_Points ELSE Perf_Cat_A END AS Perf_Cat_A
, CASE WHEN pd.Perf_Category = 'B' THEN pd.Total_Perf_Points ELSE Perf_Cat_B END AS Perf_Cat_B
, CASE WHEN pd.Perf_Category = 'C' THEN pd.Total_Perf_Points ELSE Perf_Cat_C END AS Perf_Cat_C
FROM Period_Perf
INNER JOIN
(
SELECT CustID
, Period_Date
, Perf_Category
, SUM(Perf_Points) AS Total_Perf_Points
FROM Period_Perf_Detail
GROUP BY CustID
, Period_Date
, Perf_Category ) AS pd
ON Period_Perf.CustID = pd.CustID
AND Period_Perf.Period_Date = pd.Period_Date
I suggest the following query for your update. T-SQL does not allow an aggregate in the SET list of an UPDATE, so the sums are computed in a derived table first.
UPDATE pp
SET Perf_Cat_A = pd.Cat_A
, Perf_Cat_B = pd.Cat_B
, Perf_Cat_C = pd.Cat_C
FROM Period_Perf AS pp
INNER JOIN
(
SELECT p.CustID
, p.Period_Date
, SUM(CASE WHEN d.Perf_Category = 'A' THEN d.Perf_Points END) AS Cat_A
, SUM(CASE WHEN d.Perf_Category = 'B' THEN d.Perf_Points END) AS Cat_B
, SUM(CASE WHEN d.Perf_Category = 'C' THEN d.Perf_Points END) AS Cat_C
FROM Period_Perf AS p
LEFT JOIN Period_Perf_Detail AS d
ON p.CustID = d.CustID
AND p.Period_Date = d.Period_Date
AND d.Perf_Category IN ('A', 'B', 'C')
GROUP BY p.CustID , p.Period_Date ) AS pd
ON pp.CustID = pd.CustID
AND pp.Period_Date = pd.Period_Date
Please notice that instead of putting the category criteria in a WHERE, they are in the join itself. This way the LEFT JOIN still keeps every Period_Perf row, even one that has no matching detail rows.

SQL Percentage calculation

Is it possible in SQL to calculate the percentage of the 'StaffEntered' column's "Yes" values (case when calculated column) out of the grand total number of orders by that user (RequestedBy)? I'm basically doing this function now myself in Excel with a Pivot table, but thought it may be easier to build it into the query. Here is the existing sample SQL code:
Select
Distinct
RequestedBy = HStaff.Name,
AccountID = isnull(pv.AccountID, ''),
StaffEntered = Case When DictionaryItem2.Name like '%PLB%' Then 'Yes' Else 'No' end
FROM
[dbo].[HOrd] HOrd WITH ( NOLOCK )
left outer join HStaff HStaff with (nolock)
on HOrd.Requestedby_oid = HStaff.ObjectID
and HStaff.Active = 1
left outer join DictionaryItem DictionaryItem2 WITH (NOLOCK)
ON HSUser1.PreferenceGroup_oid = DictionaryItem2.ObjectID
AND DictionaryItem2.ItemType_oid = 98
Here is what I am doing in Excel currently with the query results, I have a pivot table and I am dividing the "Yes" values of the "StaffEntered" field out of the Grand Total number of entries for that specific "RequestedBy" user. Essentially Excel is doing the summarization and then I am doing a simple division calculation to obtain the percentage.
Thanks in advance!
You didn't provide a lot in the way of details but I think this should be pretty close to what you are looking for.
select HStaff.Name as RequestedBy
, isnull(pv.AccountID, '') as AccountID
, Case When DictionaryItem2.Name like '%PLB%' Then 'Yes' Else 'No' end as StaffEntered
, 100.0 * sum(Case When DictionaryItem2.Name like '%PLB%' Then 1 Else 0 end) / GrandTotal -- 100.0 avoids integer division
From SomeTable
group by HStaff.Name
, isnull(pv.AccountID, '')
, GrandTotal
Giving the FROM part of your SQL Statement would allow us to create a more correct answer. This statement will get the totals of yes/no per HStaff name and add it to each detail record in your SQL statement:
WITH cte
AS ( SELECT HStaff.Name ,
SUM(CASE WHEN dictionaryItem2.Name LIKE '%PLB%' THEN 1
ELSE 0
END) AS YesCount ,
SUM(CASE WHEN dictionaryItem2.Name NOT LIKE '%PLB%'
THEN 1
ELSE 0
END) AS NotCount
FROM YourTable
GROUP BY HStaff.Name
)
SELECT HStaff.Name AS requestedBy ,
ISNULL(pv.AccountID, '') AS AccountID ,
CASE WHEN DictionaryItem2.Name LIKE '%PLB%' THEN 'Yes'
ELSE 'No'
END AS StaffEntered ,
100.0 * cte.YesCount / NULLIF(cte.YesCount + cte.NotCount, 0) AS PLB_Percentage -- 100.0 avoids integer division, NULLIF avoids divide-by-zero
FROM yourtable
INNER JOIN cte ON yourtable.Hstaff.Name = cte.NAME
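If only the per-user percentage is needed (the number the Excel pivot produces), a plain GROUP BY may be enough. The sketch below uses a made-up table variable @Orders that stands in for the result of the original query, with one row per order; the names @Orders, YesCount, TotalOrders and PctStaffEntered are assumptions for the example.

```sql
-- Hypothetical stand-in for the query result: one row per order
DECLARE @Orders TABLE (RequestedBy varchar(50), StaffEntered varchar(3));

INSERT INTO @Orders (RequestedBy, StaffEntered) VALUES
('Ann', 'Yes'), ('Ann', 'No'), ('Ann', 'Yes'), ('Ann', 'Yes'),
('Bob', 'No'),  ('Bob', 'No');

SELECT RequestedBy,
       YesCount        = SUM(CASE WHEN StaffEntered = 'Yes' THEN 1 ELSE 0 END),
       TotalOrders     = COUNT(*),
       -- 100.0 makes the division decimal; int / int would truncate to 0 or 1
       PctStaffEntered = 100.0 * SUM(CASE WHEN StaffEntered = 'Yes' THEN 1 ELSE 0 END) / COUNT(*)
FROM @Orders
GROUP BY RequestedBy
ORDER BY RequestedBy;
```

Ann comes out at 75 percent (3 of 4 orders) and Bob at 0 percent. Note that the DISTINCT in the original query would remove repeated rows before they can be counted, so it probably has to go if every order should count.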

Resources