T-SQL: group date ranges and set end date based on an on/off column - sql-server

I'm struggling to find an effective, concise way, without a loop, to produce groups of active dates. Activity (grant/rescind) is determined by the Switch column, which is 0 for off and 1 for on, together with the start and end dates.
TransactionDate EffectiveDate TerminationDate Switch
-------------------------------------------------------------------
2013-06-14 2013-05-29 NULL 1
2013-06-14 2013-06-05 2013-06-05 0
2013-10-03 2013-05-29 2013-05-29 0
2013-10-12 2013-05-29 NULL 1
2013-10-12 2013-06-06 2013-06-06 0
The final output should be a single row:
2013-05-29 to 2013-06-06
The output is one row because, in the last two transactions, the switch was turned on for 2013-05-29 and the last switch-off was for 2013-06-06, which becomes the end date of the span.
In addition, the dates should be grouped by active span. If there were a record for another year in here, it would need to be on a separate row.
Can I please get some help with a query to solve this issue?

Is this what you want?
select max(case when switch = 1 then effective_date end) as on_date,
       (case when max(case when switch = 1 then effective_date end) <
                  max(case when switch = 0 then effective_date end)
             then max(case when switch = 0 then effective_date end)
        end) as off_date  -- if any
from t;
This gets the last date on which the switch was turned on, and then the last switch-off date if it falls after that date (otherwise NULL).
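To see what the query returns for the question's data, here is a minimal sketch that loads the five sample rows into a table variable and runs the same conditional aggregation. The table variable @t and its snake_case column names are assumptions that mirror the answer's placeholder names, not the asker's real schema.

```sql
-- Hypothetical stand-in for the real table; names follow the answer's placeholders.
DECLARE @t TABLE (
    transaction_date date,
    effective_date   date,
    termination_date date,
    switch           tinyint
);

INSERT INTO @t (transaction_date, effective_date, termination_date, switch)
VALUES ('2013-06-14', '2013-05-29', NULL,         1),
       ('2013-06-14', '2013-06-05', '2013-06-05', 0),
       ('2013-10-03', '2013-05-29', '2013-05-29', 0),
       ('2013-10-12', '2013-05-29', NULL,         1),
       ('2013-10-12', '2013-06-06', '2013-06-06', 0);

SELECT MAX(CASE WHEN switch = 1 THEN effective_date END) AS on_date,
       -- Report the last switch-off only when it falls after the last switch-on.
       CASE WHEN MAX(CASE WHEN switch = 1 THEN effective_date END) <
                 MAX(CASE WHEN switch = 0 THEN effective_date END)
            THEN MAX(CASE WHEN switch = 0 THEN effective_date END)
       END AS off_date
FROM @t;
-- on_date = 2013-05-29, off_date = 2013-06-06
```

Note that this returns a single row for the whole table, so on its own it does not split the data into several active spans.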

Related

SQL Server: add column for rows since value changed

I have a table that contains 3 columns: personID, weeknumber, and event. Event is 0 if there was no event for that person in that week and 1 if there was.
I need to create a new column weekssincelastevent which will be 0 for the week where event=1 and then 1,2,3,4 etc for the weeks afterwards. If there is a later event then it starts from 0 again. E.g.
personID  weeknumber  event  weekssincelastevent
------------------------------------------------
1         1           0      NULL
1         2           0      NULL
1         3           1      0
1         4           0      1
1         5           0      2
1         6           0      3
2         1           0      NULL
2         2           1      0
2         3           0      1
2         4           1      0
2         5           0      1
The column should be NULL before a person's first event, and all values should be NULL where a personID never has an event.
I can't think how to write this in SQL.
The table has ~600m rows (60m personIDs with 100 weeknumbers each, although some personIDs don't have all the weeknumbers).
Many thanks for any insight.
This is a gaps-and-islands problem. The first part, in the CTE, puts the data into "groups": each time there is an event, a new group starts. It also calculates the number of weeks that have passed since the prior row (set to 0 for rows hosting an event). Then, in the outer query, we SUM the weeks passed within each group, giving the number of weeks since the last event:
WITH Groups AS(
    SELECT PersonID,
           WeekNumber,
           Event,
           COUNT(CASE Event WHEN 1 THEN 1 END) OVER (PARTITION BY PersonID ORDER BY WeekNumber ASC
                                                     ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS Events,
           CASE Event WHEN 0 THEN WeekNumber - LAG(WeekNumber) OVER (PARTITION BY PersonID ORDER BY WeekNumber ASC) ELSE 0 END AS WeeksPassed
    FROM dbo.YourTable)
SELECT PersonID,
       WeekNumber,
       Event,
       CASE WHEN Events = 0 THEN NULL
            ELSE SUM(WeeksPassed) OVER (PARTITION BY PersonID, Events ORDER BY WeekNumber ASC
                                        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
       END AS WeekSinceLastEvent
FROM Groups;
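As a rough illustration of the grouping step, the sketch below runs only the CTE's two window expressions against person 1 from the question, so the intermediate columns are visible. The table variable @weeks is an assumption standing in for dbo.YourTable, and LAG requires SQL Server 2012 or later.

```sql
-- Hypothetical sample data: person 1 from the question.
DECLARE @weeks TABLE (PersonID int, WeekNumber int, Event tinyint);

INSERT INTO @weeks (PersonID, WeekNumber, Event)
VALUES (1, 1, 0), (1, 2, 0), (1, 3, 1), (1, 4, 0), (1, 5, 0), (1, 6, 0);

SELECT PersonID,
       WeekNumber,
       Event,
       -- Running count of events: every event row opens a new group.
       COUNT(CASE Event WHEN 1 THEN 1 END)
           OVER (PARTITION BY PersonID ORDER BY WeekNumber
                 ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS Events,
       -- Weeks elapsed since the previous row; 0 on the event row itself.
       CASE Event WHEN 0
            THEN WeekNumber - LAG(WeekNumber) OVER (PARTITION BY PersonID ORDER BY WeekNumber)
            ELSE 0
       END AS WeeksPassed
FROM @weeks
ORDER BY PersonID, WeekNumber;

-- WeekNumber  Event  Events  WeeksPassed
-- 1           0      0       NULL
-- 2           0      0       1
-- 3           1      1       0
-- 4           0      1       1
-- 5           0      1       1
-- 6           0      1       1
```

Summing WeeksPassed within each (PersonID, Events) group then gives 0, 1, 2, 3 for weeks 3 to 6, while the rows with Events = 0 are mapped to NULL.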
You can do this with a conditional aggregate within a windowed function:
SELECT t.PersonID,
       t.WeekNumber,
       t.Event,
       WeeksSinceLastEvent = t.WeekNumber - MAX(CASE WHEN t.Event = 1 THEN t.WeekNumber END)
                                                OVER(PARTITION BY t.PersonID ORDER BY t.WeekNumber)
FROM dbo.T AS t;
The key parts are:
CASE WHEN t.Event = 1 THEN t.WeekNumber END - Only consider the week number where there is an event. Since MAX ignores NULLs, only the event rows contribute.
OVER (PARTITION BY t.PersonID ORDER BY t.WeekNumber) - Only consider rows for the current person where the week number is less than or equal to that of the current row.
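The windowed conditional MAX can be checked against the question's sample rows with a small self-contained sketch. The table variable @T is an assumption standing in for dbo.T.

```sql
-- Hypothetical sample data copied from the question.
DECLARE @T TABLE (PersonID int, WeekNumber int, Event tinyint);

INSERT INTO @T (PersonID, WeekNumber, Event)
VALUES (1, 1, 0), (1, 2, 0), (1, 3, 1), (1, 4, 0), (1, 5, 0), (1, 6, 0),
       (2, 1, 0), (2, 2, 1), (2, 3, 0), (2, 4, 1), (2, 5, 0);

SELECT t.PersonID,
       t.WeekNumber,
       t.Event,
       -- Latest event week seen so far for this person; NULL before the first event.
       t.WeekNumber - MAX(CASE WHEN t.Event = 1 THEN t.WeekNumber END)
                          OVER (PARTITION BY t.PersonID ORDER BY t.WeekNumber) AS WeeksSinceLastEvent
FROM @T AS t
ORDER BY t.PersonID, t.WeekNumber;

-- Person 1: NULL, NULL, 0, 1, 2, 3
-- Person 2: NULL, 0, 1, 0, 1
```

Subtracting a NULL maximum yields NULL, which is what produces the required NULLs before the first event and for people with no events at all.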

SQL Server based on current date need to check the date range in WHERE condition

Based on the current date, I need to apply a CASE in the WHERE condition.
select
count(id)
from
table1
case
when (DAY(GETDATE())) between 1 and 15
Then created_date between (getdate()-DAY(GETDATE())+1) and (getdate()+15-DAY(GETDATE()))
else created_date between (getdate()+16-DAY(GETDATE())) and (getdate()+31-DAY(GETDATE()))
End
but it does not work for me.
For example,
Table1:
Id App_Name Created_Date
1 app1 2016-12-05
2 app2 2016-12-10
3 app3 2016-12-16
4 app4 2016-12-25
5 app5 2016-12-28
Today's date is 2016-12-15, so I need to take created_date between 2016-12-01 and 2016-12-15.
The expected output is "2".
You can try using a nested CASE expression along with conditional aggregation:
SELECT CASE WHEN DAY(GETDATE()) <= 15 THEN
           SUM(CASE WHEN created_date BETWEEN (getdate()-DAY(GETDATE())+1) AND
                                              (getdate()+15-DAY(GETDATE()))
                    THEN 1 ELSE 0 END)
       ELSE
           SUM(CASE WHEN created_date BETWEEN (getdate()+16-DAY(GETDATE())) AND
                                              (getdate()+31-DAY(GETDATE()))
                    THEN 1 ELSE 0 END)
       END AS summary
FROM table1
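One caveat with this date arithmetic is that GETDATE() carries a time of day, and the second half of a month does not always end on day 31. The sketch below is one possible alternative that computes the half-month boundaries as plain dates first and then filters in the WHERE clause. It assumes SQL Server 2012 or later (for DATEFROMPARTS and EOMONTH); the table variable @table1 and the fixed @today value are assumptions chosen to reproduce the question's example.

```sql
-- Hypothetical copy of the question's sample table.
DECLARE @table1 TABLE (id int, app_name varchar(20), created_date date);

INSERT INTO @table1 (id, app_name, created_date)
VALUES (1, 'app1', '2016-12-05'),
       (2, 'app2', '2016-12-10'),
       (3, 'app3', '2016-12-16'),
       (4, 'app4', '2016-12-25'),
       (5, 'app5', '2016-12-28');

-- Fixed here for reproducibility; use CAST(GETDATE() AS date) in practice.
DECLARE @today date = '2016-12-15';
DECLARE @first date = DATEFROMPARTS(YEAR(@today), MONTH(@today), 1);

-- First half: day 1 to 15. Second half: day 16 to the end of the month.
DECLARE @from date = CASE WHEN DAY(@today) <= 15 THEN @first
                          ELSE DATEADD(DAY, 15, @first) END;
DECLARE @to   date = CASE WHEN DAY(@today) <= 15 THEN DATEADD(DAY, 14, @first)
                          ELSE EOMONTH(@today) END;

SELECT COUNT(id) AS summary
FROM @table1
WHERE created_date >= @from
  AND created_date <  DATEADD(DAY, 1, @to);  -- half-open range also covers datetime columns
-- summary = 2
```

Keeping the column bare on the left-hand side of the comparison leaves an index on created_date usable, if one exists.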
A simple way of doing it, which works perfectly. Try it:
SELECT CASE
           WHEN (DAY(GETDATE())) < 16 THEN
               (Select Count(1) from SEP_RetailStore WITH (NOLOCK) WHERE DAY(CreateDate) BETWEEN 1 and 15)
           ELSE (Select Count(1) from SEP_RetailStore WITH (NOLOCK) WHERE DAY(CreateDate) >= 16)
       END

Need query for counting for 2 columns at a time

I have a table named "Orders"
It has following fields:
OrderID, OrderDate, ..... ,City, StatusID.
I want this result as return:
City       No. of Delivered Orders   No. of Pending (Not Delivered)
-------------------------------------------------------------------
London     3                         4
Paris      5                         6
New York   7                         8
Since we have only one field to track the delivery status (StatusID), I am having difficulty counting for two conditions at once.
Thanks in advance :)
select City,
       sum(case when StatusID = 'delivered' then 1 else 0 end) as [No. of Delivered Orders],
       sum(case when StatusID = 'not_delivered' then 1 else 0 end) as [No. of Pending]
from Orders
group by City  -- one output row per city
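Conditional aggregation of this kind needs a GROUP BY City to yield one row per city. Here is a minimal sketch on made-up data; the table variable @Orders, its rows, and the 'delivered' / 'not_delivered' status values are all assumptions, since the question does not say what StatusID actually contains.

```sql
-- Hypothetical sample data; the real StatusID values are unknown.
DECLARE @Orders TABLE (OrderID int, City varchar(50), StatusID varchar(20));

INSERT INTO @Orders (OrderID, City, StatusID)
VALUES (1, 'London', 'delivered'),
       (2, 'London', 'delivered'),
       (3, 'London', 'not_delivered'),
       (4, 'Paris',  'delivered'),
       (5, 'Paris',  'not_delivered'),
       (6, 'Paris',  'not_delivered');

SELECT City,
       SUM(CASE WHEN StatusID = 'delivered'     THEN 1 ELSE 0 END) AS [No. of Delivered Orders],
       SUM(CASE WHEN StatusID = 'not_delivered' THEN 1 ELSE 0 END) AS [No. of Pending]
FROM @Orders
GROUP BY City
ORDER BY City;

-- London  2  1
-- Paris   1  2
```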

Selecting multiple counts grouped by time from same table

I have a table logging application activity.
Each row contains a DateTime ("Time") and an "EventType" column... (And obviously some others that are unimportant here)
I would like to be able to get a count of the number of different EventTypes that occur every hour.
I'm currently getting a basic count of a single EventType with:
select DATEADD(hh, (DATEDIFF(hh,0,Time)),0) as 'hour', count(*) as 'apploads'
from PlaySessionEvent
where EventType = 0
Group By DATEADD(hh,(DATEDIFF(hh,0,Time)),0)
order by hour
What is the easiest way to extend this to count multiple different EventTypes within the same hour?
Update:
I should have specified that I haven't simply grouped by EventType as well because I only want a subset of all the available EventTypes (i.e. not boring trace/debug data).
Also, I wanted the different event types as columns, rather than additional rows duplicating the DateTime entries.
Eg...
DateTime EventType1 EventType2
12:12:12 12/12/12 45 22
Apologies for the inexact initial question!
select EventType, DATEADD(hh, (DATEDIFF(hh,0,Time)),0) as 'hour', count(*) as 'apploads'
from PlaySessionEvent
where EventType = 0
Group By DATEADD(hh,(DATEDIFF(hh,0,Time)),0), EventType
order by hour
Edit:
New solution for the changed question:
select DATEADD(hh, (DATEDIFF(hh,0,Time)),0) as 'hour',
sum(case when EventType = 0 then 1 else 0 end) as 'apploads',
sum(case when EventType = 1 then 1 else 0 end) EventType1,
sum(case when EventType = 2 then 1 else 0 end) EventType2
from PlaySessionEvent
where EventType in (0, 1, 2) -- keep only the event types being counted
group By DATEDIFF(hh,0,Time)
order by hour
Here is a slightly different way of writing it:
select DATEADD(hh, (DATEDIFF(hh,0,Time)),0) as [hour],
COUNT(case when EventType = 0 then 1 end) [apploads],
COUNT(case when EventType = 1 then 1 end) EventType1,
COUNT(case when EventType = 2 then 1 end) EventType2
from PlaySessionEvent
where EventType in (0, 1, 2) -- keep only the event types being counted
group By DATEDIFF(hh,0,Time)
order by hour
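All of the queries in this thread bucket rows by hour with the same DATEADD/DATEDIFF idiom. As a small sketch of what it does to a single value (the variable @Time and its value are assumptions for illustration):

```sql
DECLARE @Time datetime = '2012-12-12T12:12:12';

SELECT DATEDIFF(hh, 0, @Time)                 AS hours_since_base, -- whole hours since 1900-01-01 00:00
       DATEADD(hh, DATEDIFF(hh, 0, @Time), 0) AS hour_bucket;      -- 2012-12-12 12:00:00.000
```

The 0 is implicitly converted to the datetime base date 1900-01-01, so counting whole hours from it and adding them back truncates the minutes and seconds.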

Why some dates give worse performance than other in MS SQL Server

I have a query in MS SQL Server asking for name and some date-related information, depending on two dates, a start- and an enddate.
The problem is that I'm not always getting the same performance. Whenever I request something between the dates
2010-07-01 00:00:00.000 and
2011-07-21 23:59:59.999
the performance is excellent: I get my result within milliseconds. When I request something between these dates, for example,
2011-07-01 00:00:00.000 and
2011-07-21 23:59:59.999
the performance is less than good, taking 20-28 seconds per query. Note that the range giving good performance spans more than a year, while the slow one spans only 20 days.
Is there any particular reason (maybe related to how DATETIME works) for this?
EDIT: The query,
SELECT ENAME,
       SUM(CASE DATE WHEN 0 THEN 1 ELSE 0 END) AS U2,
       SUM(CASE DATE WHEN 1 THEN 1 ELSE 0 END) AS B_2_4,
       SUM(CASE DATE WHEN 2 THEN 1 ELSE 0 END) AS B_4_8,
       SUM(CASE DATE WHEN 3 THEN 1 ELSE 0 END) AS B_8_16,
       SUM(CASE DATE WHEN 4 THEN 1 ELSE 0 END) AS B_16_24,
       SUM(CASE DATE WHEN 5 THEN 1 ELSE 0 END) AS B_24_48,
       SUM(CASE DATE WHEN 6 THEN 1 ELSE 0 END) AS O_48,
       SUM(CASE DATE WHEN 7 THEN 1 ELSE 0 END) AS status,
       AVG(AVG) AS AVG,
       SUM(DATE) AS TOTAL
FROM
    (SELECT ENAME,
            (CASE
                 WHEN status = 'Öppet' THEN 7
                 WHEN DATE < 48 THEN
                     (CASE WHEN DATE BETWEEN 0 AND 2 THEN 0
                           WHEN DATE BETWEEN 2 AND 4 THEN 1
                           WHEN DATE BETWEEN 4 AND 8 THEN 2
                           WHEN DATE BETWEEN 8 AND 16 THEN 3
                           WHEN DATE BETWEEN 16 AND 24 THEN 4
                           WHEN DATE BETWEEN 24 AND 48 THEN 5
                           ELSE -1 END)
                 ELSE 6 END) AS DATE,
            DATE AS AVG
     FROM
         (SELECT DATEDIFF(HOUR, cases.date, status.date) AS DATE,
                 extern.name AS ENAME,
                 status.status
          FROM
              cases INNER JOIN
              status ON cases.id = status.caseid
                  AND status.date =
                      (SELECT MAX(date) AS Expr1
                       FROM status AS status_1
                       WHERE (caseid = cases.id)
                       GROUP BY caseid) INNER JOIN
              extern ON cases.owner = extern.id
          WHERE (cases.org = 'Expert')
              AND (cases.date BETWEEN '2009-01-15 09:48:25.633'
                                  AND '2011-07-21 09:48:25.633'))
         AS derivedtbl_1)
    AS derivedtbl_2
GROUP BY ENAME
ORDER BY ENAME
(parts of) The tables:
Extern
-ID (->cases.owner)
-name
Cases
-Owner (->Extern.id)
-id (->status.caseid)
-date (case created at this date)
Status
-caseid (->cases.id)
-Status
-Date (can be multiple, MAX(status.date) gives us date when
status was last changed)
I would guess this is a statistics issue.
When you select only the most recent dates, those dates may not yet be represented in the statistics, because the modification threshold that triggers an automatic statistics update has not yet been reached.
See this blog post for an example.
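One way to test this hypothesis is to check when the statistics on the table were last updated and then refresh them manually. This is only a sketch: it assumes the cases table lives in the dbo schema, which the question does not state.

```sql
-- When were the statistics on the (assumed) dbo.cases table last updated?
SELECT s.name                              AS stats_name,
       STATS_DATE(s.object_id, s.stats_id) AS last_updated
FROM sys.stats AS s
WHERE s.object_id = OBJECT_ID('dbo.cases');

-- Refresh them manually, then re-run the slow query to compare.
UPDATE STATISTICS dbo.cases;
```

If the recent date range becomes fast after the update, stale statistics were the likely cause.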
