How to get one line per key in a left outer join - sql-server

The context is a transaction table with a date and a user account. The table contains about a billion rows.
dOperationValueDate sUserAccount
------------------- ----------------------------------------------
2016-03-05 00000000001
2016-03-06 00000000002
2016-03-07 00000000003
2016-03-08 00000000004
2016-03-09 00000000005
2016-04-05 00000000002
2016-10-05 00000000001
2016-10-06 00000000001
2016-10-06 00000000005
I would like to find the accounts in my table that meet these criteria:
At least one transaction more than 6 months ago (like TOP 1 *)
No transaction in the last 6 months
In my example, the results would be accounts 2, 3, 4.
I started with a LEFT OUTER JOIN in order to remove every user account with a transaction in the last 6 months. But the processing time is horrible: it has been running for 4 hours so far.
SELECT b.sUserAccount FROM
(SELECT sUserAccount FROM T_Operations WITH (readuncommitted) WHERE dOperationValueDate < DATEADD(month, -6, DATEADD(month, DATEDIFF(month, 0, GETUTCDATE()), 0)) GROUP BY sUserAccount) b -- all operations before 6 months ago
LEFT JOIN
(SELECT sUserAccount FROM T_Operations WITH (readuncommitted) WHERE dOperationValueDate >= DATEADD(month, -6, DATEADD(month, DATEDIFF(month, 0, GETUTCDATE()), 0)) GROUP BY sUserAccount) c -- all operations since 6 months
ON b.sUserAccount = c.sUserAccount
WHERE c.sUserAccount IS NULL -- remove all customers who have operations both before and within the last 6 months / keep only customers whose operations are all older than 6 months
I think the solution is to find only one operation per account in the b query, so that SQL Server stops as soon as it finds one row. The only problem case is a user with no transaction older than 6 months; for the others, it will be fine.
On the other hand, I have to check every transaction from the last 6 months in order to remove those customers from the scope.
I read about CROSS APPLY, but I'm not sure how it works.
The main problem here is the processing time. I need a "quick" query (less than 1 hour).

I think you should be able to just use NOT EXISTS here.
SELECT b.sUserAccount
FROM T_Operations b WITH (READUNCOMMITTED)
WHERE b.dOperationValueDate < DATEADD(month,-6,DATEADD(month,DATEDIFF(month,0,GETUTCDATE()),0))
AND NOT EXISTS ( SELECT 1
FROM T_Operations WITH (READUNCOMMITTED)
WHERE sUserAccount = b.sUserAccount
AND dOperationValueDate >= DATEADD(month,-6,DATEADD(month,DATEDIFF(month,0,GETUTCDATE()),0)) )
GROUP BY b.sUserAccount -- all operations before 6 months ago
Or actually, you might be able to just use GROUP BY with HAVING:
SELECT sUserAccount
FROM T_Operations WITH (READUNCOMMITTED)
GROUP BY sUserAccount
HAVING MAX(dOperationValueDate) < DATEADD(month,-6,DATEADD(month,DATEDIFF(month,0,GETUTCDATE()),0))
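Both queries above can be tried on the question's sample rows. The following is a minimal sketch, assuming a table variable @T_Operations standing in for the real table and a fixed, hypothetical cutoff @cutoff of 2016-04-06 (six months before the last sample date) in place of the GETUTCDATE() expression:

```sql
-- Hypothetical fixed cutoff so the example is repeatable.
DECLARE @cutoff date = '2016-04-06';

-- Table variable standing in for T_Operations (assumption for this sketch).
DECLARE @T_Operations TABLE (dOperationValueDate date, sUserAccount varchar(11));

INSERT INTO @T_Operations (dOperationValueDate, sUserAccount) VALUES
    ('2016-03-05', '00000000001'),
    ('2016-03-06', '00000000002'),
    ('2016-03-07', '00000000003'),
    ('2016-03-08', '00000000004'),
    ('2016-03-09', '00000000005'),
    ('2016-04-05', '00000000002'),
    ('2016-10-05', '00000000001'),
    ('2016-10-06', '00000000001'),
    ('2016-10-06', '00000000005');

-- Variant 1: accounts whose most recent operation is older than the cutoff.
SELECT sUserAccount
FROM @T_Operations
GROUP BY sUserAccount
HAVING MAX(dOperationValueDate) < @cutoff;

-- Variant 2: accounts with an old operation and no operation since the cutoff.
SELECT b.sUserAccount
FROM @T_Operations AS b
WHERE b.dOperationValueDate < @cutoff
  AND NOT EXISTS (SELECT 1
                  FROM @T_Operations AS r
                  WHERE r.sUserAccount = b.sUserAccount
                    AND r.dOperationValueDate >= @cutoff)
GROUP BY b.sUserAccount;
```

With this cutoff, both statements should return accounts 2, 3 and 4, matching the expected result in the question. Every account in the table has at least one operation, so "latest operation before the cutoff" already implies "at least one old operation and none recent".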
As a side note, DATEADD(month,-6,DATEADD(month,DATEDIFF(month,0,GETUTCDATE()),0)) would return 2016-04-01.
If you want the current date minus six months, you can use DATEADD(month,-6,CAST(GETUTCDATE() AS DATE)) or DATEADD(month,-6,DATEADD(day,DATEDIFF(day,0,GETUTCDATE()),0)).
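The difference between the two cutoffs is easier to see with a fixed timestamp. This is a small sketch, assuming a hypothetical variable @now set to a date in October 2016 instead of calling GETUTCDATE():

```sql
-- Hypothetical fixed "now" so the results do not change from day to day.
DECLARE @now datetime = '2016-10-10T15:30:00';

SELECT
    -- Truncate to the first day of the month, then go back six months.
    DATEADD(month, -6, DATEADD(month, DATEDIFF(month, 0, @now), 0)) AS CutoffFromMonthStart, -- 2016-04-01
    -- Truncate to the day only, then go back six months.
    DATEADD(month, -6, CAST(@now AS date))                          AS CutoffFromToday,      -- 2016-04-10
    -- Same day-level truncation using DATEDIFF/DATEADD instead of CAST.
    DATEADD(month, -6, DATEADD(day, DATEDIFF(day, 0, @now), 0))     AS CutoffFromTodayAlt;   -- 2016-04-10
```

The first expression always lands on the first day of a month, so it can include up to a month of extra history compared with the other two.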

DECLARE @dt datetime = DATEADD(month, -6, DATEADD(month, DATEDIFF(month, 0, GETUTCDATE()), 0));
SELECT sUserAccount
FROM T_Operations WITH (readuncommitted)
WHERE dOperationValueDate < @dt
EXCEPT
SELECT sUserAccount
FROM T_Operations WITH (readuncommitted)
WHERE dOperationValueDate >= @dt;
Have an index on dOperationValueDate

Related

build month Start and End dates intervals SQL

My getdate() returns '2022-03-21 09:24:34.313'.
I'd like to build month start and month end date intervals in SQL (SQL Server), as in the following screenshot:
You can use the EOMONTH and DATEADD functions to get the data you want.
But, the best approach would be to use a calendar table and map it against the current date and get the data you want.
DECLARE @DATE DATE = getdate()
SELECT DATEADD(DAY,1,EOMONTH(@DATE,-1)) AS MonthM_Start, EOMONTH(@DATE) AS MonthM_End,
DATEADD(DAY,1,EOMONTH(@DATE,-2)) AS MonthOneBack_Start, EOMONTH(@DATE,-1) AS MonthOneBack_End,
DATEADD(DAY,1,EOMONTH(@DATE,-3)) AS MonthTwoBack_Start, EOMONTH(@DATE,-2) AS MonthTwoBack_End,
DATEADD(DAY,1,EOMONTH(@DATE,-4)) AS MonthThreeBack_Start, EOMONTH(@DATE,-3) AS MonthThreeBack_End
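MonthM_Start MonthM_End MonthOneBack_Start MonthOneBack_End MonthTwoBack_Start MonthTwoBack_End MonthThreeBack_Start MonthThreeBack_End
------------ ---------- ------------------ ---------------- ------------------ ---------------- -------------------- ------------------
2022-03-01   2022-03-31 2022-02-01         2022-02-28       2022-01-01         2022-01-31       2021-12-01           2021-12-31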
You can use a recursive CTE to avoid having to hard-code an expression for each month boundary you need, making it very easy to handle fewer or more months by just changing a parameter.
Do you really need the end date for processing? Seems more appropriate for a label, since date/time types can vary - meaning the last day of the month at midnight isn't very useful if you're trying to pull any data from after midnight on the last day of the month.
This also shows how to display the data for each month even if there isn't any data in the table for that month.
DECLARE @number_of_months int = 4,
@today date = DATEFROMPARTS(YEAR(GETDATE()), MONTH(GETDATE()), 1);
;WITH m(s) AS
(
SELECT @today UNION ALL SELECT DATEADD(MONTH, -1, s) FROM m
WHERE s > DATEADD(MONTH, 1-@number_of_months, @today)
)
SELECT MonthStart = m.s, MonthEnd = EOMONTH(m.s)--, other cols/aggs
FROM m
--LEFT OUTER JOIN dbo.SourceTable AS t
--ON t.datetime_column >= m.s
--AND t.datetime_column < DATEADD(MONTH, 1, m.s);
Output (without the join):
MonthStart MonthEnd
---------- ----------
2022-03-01 2022-03-31
2022-02-01 2022-02-28
2022-01-01 2022-01-31
2021-12-01 2021-12-31
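The earlier remark about end dates can be illustrated with a few rows. This is a minimal sketch, assuming a hypothetical table variable @Events with a datetime column EventTime; it compares a BETWEEN filter that ends at EOMONTH with a half-open range that ends before the first day of the next month:

```sql
-- Hypothetical sample data: two events on the last day of March, one on April 1.
DECLARE @Events TABLE (EventTime datetime);
INSERT INTO @Events (EventTime) VALUES
    ('2022-03-31T00:00:00'),
    ('2022-03-31T14:30:00'),
    ('2022-04-01T00:00:00');

DECLARE @MonthStart date = '2022-03-01';

-- EOMONTH returns a date, which compares as midnight on 2022-03-31,
-- so the 14:30 event on the last day is missed.
SELECT COUNT(*) AS BetweenCount
FROM @Events
WHERE EventTime BETWEEN @MonthStart AND EOMONTH(@MonthStart);   -- 1

-- Half-open range: everything from the month start up to, but not including,
-- the first day of the next month.
SELECT COUNT(*) AS HalfOpenCount
FROM @Events
WHERE EventTime >= @MonthStart
  AND EventTime <  DATEADD(MONTH, 1, @MonthStart);              -- 2
```

This is why the commented-out join in the answer uses >= and < against DATEADD(MONTH, 1, ...) and treats the end date as a display label only.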
But, as mentioned in a comment, you could easily store this information in a calendar table, too, and just outer join to that:
SELECT c.TheFirstOfMonth, c.TheLastOfMonth --, other cols/aggs
FROM dbo.CalendarTable AS c
LEFT OUTER JOIN dbo.SourceTable AS t
ON t.datetime_column >= c.TheFirstOfMonth
AND t.datetime_column < c.TheFirstOfNextMonth
WHERE c.TheFirstOfMonth >= DATEADD(MONTH, -4, GETDATE())
AND c.TheFirstOfMonth < GETDATE();

T-SQL : filter out all employees who left before > 6 months

I need to create a new employee database that filters out all employees who left the company more than 6 months ago from today.
I have a table with entry date and exit date, and tried something like:
WHERE [exit date] >= DATEADD(M, -6, getdate())
That didn't work because it shows only the employees who left the company longer than 6 months ago. I just want to filter them out automatically and only show the employees who are still employed or left the company less than 6 months ago.
Thanks in advance for your help.
You need to check for null:
WHERE ([exit date] IS NULL OR [exit date] >= DATEADD(M, -6, getdate()))
Do not try tricks with NOT and <:
WHERE NOT ([exit date] < DATEADD(M, -6, getdate()))
This doesn't work because NULL rows will just result in UNKNOWN, so those rows will not be returned.
You can do this though, you may find it performs faster or slower than the first version:
WHERE NOT EXISTS (SELECT 1
WHERE [exit date] < DATEADD(M, -6, getdate()))
This works because if exit date is null then no row gets returned from the subquery.
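The three predicates can be compared side by side on a tiny data set. This is a minimal sketch, assuming a hypothetical table variable @Employees and a fixed cutoff @cutoff in place of DATEADD(M, -6, getdate()):

```sql
-- Hypothetical fixed cutoff and sample rows for illustration only.
DECLARE @cutoff date = '2022-01-01';
DECLARE @Employees TABLE (Name varchar(20), ExitDate date NULL);
INSERT INTO @Employees (Name, ExitDate) VALUES
    ('still employed', NULL),
    ('left recently',  '2022-03-01'),
    ('left long ago',  '2021-01-01');

-- Explicit NULL check: returns 'still employed' and 'left recently'.
SELECT Name FROM @Employees
WHERE ExitDate IS NULL OR ExitDate >= @cutoff;

-- NOT with <: NULL < @cutoff is UNKNOWN, and NOT UNKNOWN is still UNKNOWN,
-- so 'still employed' is dropped and only 'left recently' is returned.
SELECT Name FROM @Employees
WHERE NOT (ExitDate < @cutoff);

-- NOT EXISTS: for a NULL exit date the inner query returns no row,
-- so the row is kept. Returns 'still employed' and 'left recently'.
SELECT Name FROM @Employees AS e
WHERE NOT EXISTS (SELECT 1 WHERE e.ExitDate < @cutoff);
```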

SQL Server: Calculate Four Weeks From a Month

I need a simple solution to get 4 weeks for a month based on the current date (each week running from Monday to Friday).
For each week I need to update a table that already has the current date and set a counter from Week 1 - 4, then continue in the following month with Week 6 - 8, and start from the beginning after week 8.
The query below returns the week number, but for 7-day weeks.
Can I use something similar for just 5 days?
DECLARE @MyDate DATETIME = '2020-08-03'
--This assumes the week runs from Monday - Sunday
DECLARE @WeekNumber INTEGER = (DATEPART(DAY, DATEDIFF(DAY, 0, @MyDate)/7 * 7)/7 +1)
SELECT @WeekNumber
The previous answer was not useful so I got rid of it. This should do what you're looking for
declare @date datetime= '2020-08-03';
select dateadd(d, -4, dt.dt) start_dt,
dt.dt end_dt,
row_number() over (order by v.n) n
from
(select datefromparts(year(@date),month(@date),1) first_dt) fd
cross apply
(select datediff(week, 0, fd.first_dt) wk_diff) wd
cross apply
(values (1),(2),(3),(4),(5),(6)) v(n)
cross apply
(select dateadd(d, -((datepart(weekday, fd.first_dt) + 1 + @@datefirst) % 7), fd.first_dt) calc_dt) calc_dt
cross apply
(select dateadd(d, (v.n-1)*7, calc_dt) dt) dt
where
dt.dt>=fd.first_dt;
Results
start_dt end_dt n
2020-08-03 2020-08-07 1
2020-08-10 2020-08-14 2
2020-08-17 2020-08-21 3
2020-08-24 2020-08-28 4
2020-08-31 2020-09-04 5

Getting an "Aggregate" at a GROUP BY query

I have 2 tables which are linked by the serial number (DeviceID).
Table 1 (C) lists all downloaded cyclist data.
Table 2 (T) lists the data about the device and when the last download took place.
Now I want to group by device and get the average speed of the last 3, 6 or 12 months (counted from today).
This works without problems.
However, when I try to get the average speed of the last 3, 6 or 12 months counted from the last download, I get this error:
An aggregate may not appear in the WHERE clause unless it is in a subquery contained in a HAVING clause or a select list, and the column being aggregated is an outer reference.
Code 1, which works:
SELECT
C.DeviceID
, AVG(C.Speed) AS AVG_Speed
, DATEDIFF(MONTH, C.LogDateTime, GETDATE()) AS Months
FROM Compass C
JOIN Transfer T ON C.DeviceID = T.DeviceID
WHERE DATEDIFF(MONTH, C.LogDateTime, GETDATE()) <= @EvalTimeFrame
GROUP BY C.DeviceID
Code 2, which fails:
SELECT
C.DeviceID
, AVG(C.Speed) AS AVG_Speed
, (DATEDIFF(MONTH, C.LogDateTime, GETDATE())) - (DATEDIFF(MONTH, MAX(T.TransferDateTime), GETDATE())) AS Months
FROM Compass C
JOIN Transfer T ON C.DeviceID = T.DeviceID
WHERE (DATEDIFF(MONTH, C.LogDateTime, GETDATE())) - (DATEDIFF(MONTH, MAX(T.TransferDateTime), GETDATE())) <= @EvalTimeFrame
GROUP BY C.DeviceID
Actually, what I want to have is:
GROUP BY C.DeviceID, (DATEDIFF(MONTH, C.LogDateTime, GETDATE())) - (DATEDIFF(MONTH, MAX(T.TransferDateTime), GETDATE()))
It would help me a lot - any idea?
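One possible way around the error, offered as a sketch rather than a confirmed answer, is to compute each device's last transfer in a derived table first, so that the WHERE clause compares against a plain column instead of an aggregate. The table variables @Compass and @Transfer, the alias LastTransfer and the sample rows below are assumptions made for illustration:

```sql
DECLARE @EvalTimeFrame int = 3;

-- Hypothetical stand-ins for the Compass and Transfer tables.
DECLARE @Compass  TABLE (DeviceID int, Speed decimal(5,1), LogDateTime datetime);
DECLARE @Transfer TABLE (DeviceID int, TransferDateTime datetime);

INSERT INTO @Compass (DeviceID, Speed, LogDateTime) VALUES
    (1, 20.0, '2022-02-01'),
    (1, 30.0, '2022-01-15'),
    (1, 10.0, '2021-06-01');   -- nine months before the last transfer
INSERT INTO @Transfer (DeviceID, TransferDateTime) VALUES
    (1, '2021-12-01'),
    (1, '2022-03-10');

SELECT C.DeviceID,
       AVG(C.Speed) AS AVG_Speed
FROM @Compass AS C
JOIN (
        -- One row per device with its most recent transfer.
        SELECT DeviceID, MAX(TransferDateTime) AS LastTransfer
        FROM @Transfer
        GROUP BY DeviceID
     ) AS L
  ON L.DeviceID = C.DeviceID
-- Month boundaries between the log entry and the last transfer;
-- LastTransfer is an ordinary column here, so no aggregate appears in WHERE.
WHERE DATEDIFF(MONTH, C.LogDateTime, L.LastTransfer) <= @EvalTimeFrame
GROUP BY C.DeviceID;
```

With these rows, only the two log entries from early 2022 fall inside the three-month window before the last transfer, so the average is taken over those two speeds. Joining to the pre-aggregated transfers also avoids repeating each Compass row once per Transfer row, which the direct join in the question would do.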

Microsoft SQL Calculating Backlog

I would like to calculate the backlog for every week in the past month. Date format is in (MM/DD/YY)
| mutation | issued_date | queryno_i | status |
-----------------------------------------------
01/05/14 12/31/13 321 OPEN
01/02/14 08/01/13 323 CLOSED
01/01/14 06/06/13 123 OPEN
01/01/14 01/01/14 1240 CLOSED
01/02/14 01/01/14 1233 OPEN
01/03/14 01/03/14 200 CLOSED
01/05/14 01/04/14 300 OPEN
01/06/14 01/05/14 231 OPEN
01/07/14 01/06/14 232 CLOSED
01/09/14 01/10/14 332 OPEN
01/11/14 01/11/14 224 CLOSED
01/15/14 01/14/14 225 CLOSED
01/16/14 01/15/14 223 OPEN
I want my result set to look like this:
WeekNum | Opened | Closed | Total Open
--------------------------------------
1 4 3 4 <= (2-4)+ data in week 2 so (2-4)+(1-2)+7
2 4 2 6 <= (1-2)+7
3 2 1 7 <= total count
My code is below; however, I am not sure how to query the last part. I am not even sure whether this is possible.
WITH
issued_queries AS
(
SELECT DATEPART(wk, issued_date) AS 'week_number'
,COUNT(queryno_i) AS 'opened'
FROM t.tech_query
WHERE DATEADD(D,-12,issued_date) > GETDATE()-40
GROUP BY DATEPART(wk, issued_date)
),
closed_queries AS
(
SELECT DATEPART(wk, mutation) AS 'week_number'
,COUNT(queryno_i) AS 'closed'
FROM t.tech_query
WHERE status=3 AND DATEADD(D,-12,issued_date) > GETDATE()-40
GROUP BY DATEPART(wk, mutation)
),
total as
(
SELECT COUNT(*) AS 'total'
FROM t.tech_query
WHERE status!='3'
)
SELECT issued_queries.week_number
, issued_queries.opened
, closed_queries.closed
FROM issued_queries JOIN closed_queries
ON (issued_queries.week_number = closed_queries.week_number)
ORDER BY week_number
Backlog for every week in the past month.
I've taken this to mean last 4 weeks, as that appears to be what you are doing.
Assuming "mutation" represents the date a record was updated (maybe set to closed).
So first, I generate a list of dates, so that way there will be an answer for week number X even if there are no new/closed records.
declare @SundayJustGone datetime
-- We need to get rid of the time component, done through convert.
set @SundayJustGone = convert(date, dateadd(d, 1-DATEPART(dw, getdate()), getdate()))
-- If earlier than sql 2008, can get rid of time component through: set @SundayJustGone = DATEADD(dd, 0, DATEDIFF(dd, 0, @SundayJustGone))
;with
Last4Weeks as
(
-- Get the sunday of the week just gone.
select @SundayJustGone as SundayDate -- Sunday just gone
union all
select dateadd(d, -7, SundayDate) -- Get the previous Sunday
from Last4Weeks
where dateadd(d, -7, SundayDate) > dateadd(Wk, -4, @SundayJustGone) -- where the new date is not more than 4 weeks old
)
select A.SundayDate,
DATEPART(wk, DateAdd(d, -1, A.SundayDate)) as Week_Number, -- SQL considers Sunday the first day of the week, so we need to move it back 1 day to get the right week
(select count(*)
from t.tech_query
where issued_date between DateAdd(d, -6, A.SundayDate) and A.SundayDate -- Was issued this week. (between monday - sunday)
) as Opened,
(select count(*)
from t.tech_query
where status = 3 -- where it is closed
and mutation between DateAdd(d, -6, A.SundayDate) and A.SundayDate -- and the mutation was this week. (between monday - sunday)
) as Closed,
(select count(*)
from t.tech_query
where (status != 3 or datediff(d, mutation, A.SundayDate) < 0 ) -- Is still open, or was closed after this week.
and datediff(d, issued_date, A.SundayDate) >= 0 -- and it was issued on or before the sunday.
) as TotalOpen
from Last4Weeks as A
Hopefully this helps.
The results are different from yours because I assume Monday is the first day of the week. To make Sunday the start of the week, Saturday needs to be considered the end of the week, so change set @SundayJustGone = convert(date, dateadd(d, 1-DATEPART(dw, getdate()), getdate())) to set @SundayJustGone = convert(date, dateadd(d, -DATEPART(dw, getdate()), getdate())) (the 1 is removed).

Resources