SQL Query to Calculate the Rolling Difference by Date - sql-server

I can't seem to work this one out to be exactly what I need.
I'm using MS SQL Server Management Studio 2008.
I have a table (several, actually), but let's keep it simple. The table contains daily stock figures for each item (SKU).
SKU DataDate Web_qty
2 2014-11-17 00:00:00 404
2 2014-11-18 00:00:00 373
2 2014-11-19 00:00:00 1350
66 2014-11-17 00:00:00 3624
66 2014-11-18 00:00:00 3576
66 2014-11-19 00:00:00 3570
67 2014-11-17 00:00:00 9353
67 2014-11-18 00:00:00 9297
67 2014-11-19 00:00:00 9250
I simply need the Select Query to return this:
SKU DataDate Difference
2 2014-11-17 00:00:00 ---
2 2014-11-18 00:00:00 -31
2 2014-11-19 00:00:00 +977
66 2014-11-17 00:00:00 ---
66 2014-11-18 00:00:00 -48
66 2014-11-19 00:00:00 -6
67 2014-11-17 00:00:00 ---
67 2014-11-18 00:00:00 -56
67 2014-11-19 00:00:00 -47
I do not need the --- parts; I have just shown them to draw attention to the fact that the difference cannot be calculated for the first record of each SKU.
I've tried using derived tables, but it's getting a little confusing. I need to play with a working example so I can understand it better.
If someone could point me in the right direction I'm sure I'll be able to join the other tables back together (i.e. SKU Description and prices).
Really appreciate everyone's time
Kev

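Try this. Use a correlated subquery to find the rolling difference. It matches each row to the row dated exactly one day earlier, so the difference is NULL for the first record of each SKU and wherever a day is missing.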
CREATE TABLE #tem
(SKU INT,DataDate DATETIME,Web_qty INT)
INSERT #tem
VALUES( 2,'2014-11-17 00:00:00',404),
(2,'2014-11-18 00:00:00',373),
(2,'2014-11-19 00:00:00',1350),
(66,'2014-11-17 00:00:00',3624),
(66,'2014-11-18 00:00:00',3576),
(66,'2014-11-19 00:00:00',3570),
(67,'2014-11-17 00:00:00',9353),
(67,'2014-11-18 00:00:00',9297),
(67,'2014-11-19 00:00:00',9250)
SELECT *,
Web_qty - (SELECT Web_qty
FROM #tem a
WHERE a.sku = b.SKU
AND a.DataDate = Dateadd(dd, -1, b.DataDate)) Roll_diff
FROM #tem b
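If a day is missing for a SKU, the exact-date match above returns NULL for the following row. Here is a minimal sketch of a variant that compares each row with the most recent earlier row instead. It runs through Python's sqlite3 module as a stand-in for SQL Server (an assumption made only to keep the example self-contained; in T-SQL the subquery would use TOP 1 rather than LIMIT 1). The names rolling_diff, SAMPLE and tem are illustrative, not from the original post.

```python
import sqlite3

# Sample rows for one SKU; 2014-11-19 is deliberately missing.
SAMPLE = [
    (2, "2014-11-17", 404),
    (2, "2014-11-18", 373),
    (2, "2014-11-20", 1350),
]


def rolling_diff(rows):
    """Return (SKU, DataDate, Web_qty, Roll_diff) per row.

    Roll_diff compares each row with the latest earlier row of the same SKU
    and is None for the first row of each SKU.
    """
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE tem (SKU INTEGER, DataDate TEXT, Web_qty INTEGER)")
    con.executemany("INSERT INTO tem VALUES (?, ?, ?)", rows)
    result = con.execute(
        """
        SELECT b.SKU, b.DataDate, b.Web_qty,
               b.Web_qty - (SELECT a.Web_qty
                            FROM tem a
                            WHERE a.SKU = b.SKU
                              AND a.DataDate < b.DataDate
                            ORDER BY a.DataDate DESC
                            LIMIT 1) AS Roll_diff
        FROM tem b
        ORDER BY b.SKU, b.DataDate
        """
    ).fetchall()
    con.close()
    return result


if __name__ == "__main__":
    for row in rolling_diff(SAMPLE):
        print(row)
```

With the sample rows, the row for 2014-11-20 is compared against 2014-11-18 even though 2014-11-19 is absent.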

I know this is an old thread, but I had a similar problem and ended up solving it with window functions. It works in SQL Server 2014; LAG, LEAD and DATEFROMPARTS were introduced in SQL Server 2012, so it will not run on 2008.
It also handles potentially non-continuous data as well as rows with no changes. Hopefully it helps someone out there!
CREATE TABLE #tem
(SKU INT,DataDate DATETIME,Web_qty INT)
INSERT #tem
VALUES( 2,'2014-11-17 00:00:00',404),
(2,'2014-11-18 00:00:00',373),
(2,'2014-11-19 00:00:00',1350),
(2,'2014-11-20 00:00:00',1350),
(2,'2014-11-21 00:00:00',1350),
(66,'2014-11-17 00:00:00',3624),
(66,'2014-11-18 00:00:00',3576),
(66,'2014-11-19 00:00:00',3570),
(66,'2014-11-20 00:00:00',3590),
(66,'2014-11-21 00:00:00',3578),
(67,'2014-11-17 00:00:00',9353),
(67,'2014-11-18 00:00:00',9297),
(67,'2014-11-19 00:00:00',9250),
(67,'2014-11-20 00:00:00',9250),
(67,'2014-11-21 00:00:00',9240)
;WITH A AS (
SELECT
SKU,
DataDate,
Web_Qty,
Web_qty - LAG(Web_qty,1, 0)
OVER (PARTITION BY SKU ORDER BY DataDate) Roll_diff
FROM #tem b
)
SELECT
SKU,
DataDate ValidFromDate,
Lead(DataDate, 1, DateFromParts(9999,12,31)) OVER (PARTITION BY SKU ORDER BY DataDate) ValidToDate,
Web_Qty
FROM A WHERE Roll_diff <> 0
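To see what the query above produces, here is a rough sketch that runs the same LAG/LEAD logic on the SKU 2 rows through Python's sqlite3 module. SQLite is only a stand-in for SQL Server here (an assumption; window functions need SQLite 3.25 or later), the literal '9999-12-31' replaces DateFromParts, and the names validity_ranges, SKU2 and tem are illustrative.

```python
import sqlite3

SKU2 = [
    (2, "2014-11-17", 404),
    (2, "2014-11-18", 373),
    (2, "2014-11-19", 1350),
    (2, "2014-11-20", 1350),  # unchanged, should be folded into the 19th
    (2, "2014-11-21", 1350),  # unchanged, should be folded into the 19th
]


def validity_ranges(rows):
    """Return (SKU, ValidFromDate, ValidToDate, Web_qty) for rows where the
    quantity changed compared with the previous row of the same SKU."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE tem (SKU INTEGER, DataDate TEXT, Web_qty INTEGER)")
    con.executemany("INSERT INTO tem VALUES (?, ?, ?)", rows)
    result = con.execute(
        """
        WITH A AS (
            SELECT SKU, DataDate, Web_qty,
                   Web_qty - LAG(Web_qty, 1, 0)
                       OVER (PARTITION BY SKU ORDER BY DataDate) AS Roll_diff
            FROM tem
        )
        SELECT SKU,
               DataDate AS ValidFromDate,
               LEAD(DataDate, 1, '9999-12-31')
                   OVER (PARTITION BY SKU ORDER BY DataDate) AS ValidToDate,
               Web_qty
        FROM A
        WHERE Roll_diff <> 0
        ORDER BY SKU, DataDate
        """
    ).fetchall()
    con.close()
    return result


if __name__ == "__main__":
    for row in validity_ranges(SKU2):
        print(row)
```

Because the WHERE clause is applied before LEAD is evaluated in the outer query, each kept row's ValidToDate is the date of the next change rather than the next calendar row. One thing to watch: with a LAG default of 0, a first row whose quantity is 0 has a Roll_diff of 0 and is filtered out.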

Related

How to get > 240 days data from today's date in SQL Server, where particular unit key of some records are within and outside 240 days

How can I get data older than 240 days (or 8 months) from today's date in SQL Server, where a particular unit_key may have some records within 240 days and some outside 240 days? Those unit_keys should be eliminated. Data is available in this table from 2019, and the search should start from 01.01.2020 (allocation_date).
FL_DATA TABLE
fl_key fl_no allocation_date unit_key unit_name
1 FL/3352 02/20/2021 54 A Pradhan
5 FL/3374 07/14/2020 54 A Pradhan
8 FL/3469 08/16/2019 54 A Pradhan
11 FL/3578 06/22/2019 54 A Pradhan
15 FL/3670 06/15/2020 60 D Raj
22 FL/3692 04/22/2020 60 D Raj
9 FL/3542 07/20/2020 64 K Parihar
33 FL/3599 05/27/2020 64 K Parihar
46 FL/3645 11/13/2019 64 K Parihar
48 FL/3640 02/22/2021 22 R Raja
52 FL/3724 12/28/2020 22 R Raja
12 FL/3342 03/20/2020 22 R Raja
13 FL/3355 12/20/2019 22 R Raja
Output Table:
fl_no unit_key unit_name
FL/3670 60 D Raj
FL/3692 60 D Raj
FL/3542 64 K Parihar
FL/3599 64 K Parihar
My probable query is
SELECT fl_no,unit_key,unit_name
from FL_DATA
where DATEDIFF(day, allocation_date, getdate()) > 240
and allocation_date>='01/01/2020'
order by unit_key
But if I write >240 or <240 in the where clause, the query only returns rows inside or outside that day boundary. What I am describing is different: if a unit_key has records both within and outside 240 days, that unit_key should be eliminated, because I want only those unit_keys that have not registered with us for more than 240 days. All remaining unit_keys should appear in the query output.
In my table, data is available from the year 2018, and I need data from 2020 to date; that's why I have included "allocation_date>='01/01/2020'".
I hope I have made my point clear, and I would appreciate the exact query.
There you go:
SELECT T.fl_no,
T.unit_key,
T.unit_name,
DATEDIFF(DAY, T.allocation_date, GETDATE()) AS days_past
FROM FL_DATA AS T
WHERE T.allocation_date >= '2020-01-01'
AND NOT EXISTS
(
SELECT 1
FROM FL_DATA AS Recent
WHERE Recent.allocation_date >= DATEADD(DAY, -240, GETDATE())
AND Recent.unit_key = T.unit_key
)
Please let me know if it solves it, or if some adjustments are needed.
If you want to include the days in a separate column, I suggest using OUTER APPLY instead:
SELECT T.fl_no,
T.unit_key,
T.unit_name,
unit_key_agg.days_past
FROM FL_DATA AS T
OUTER APPLY
(
SELECT DATEDIFF(
DAY,
MAX(Recent.allocation_date),
GETDATE()
) AS days_past
FROM FL_DATA AS Recent
WHERE Recent.unit_key = T.unit_key
) AS unit_key_agg
WHERE T.allocation_date >= '2020-01-01'
AND unit_key_agg.days_past > 240
This query would give you this result:
fl_no unit_key unit_name days_past
FL/3670 60 D Raj 375
FL/3692 60 D Raj 375
FL/3542 64 K Parihar 340
FL/3599 64 K Parihar 340
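The NOT EXISTS query can be checked against the sample rows with a fixed "today". The sketch below uses Python's sqlite3 module as a stand-in for SQL Server (an assumption; SQLite's date(x, '-240 days') plays the role of DATEADD, and dates are stored as ISO text). The reference date 2021-06-25 is inferred from the days_past values shown above, not stated in the thread, and the name inactive_units is illustrative.

```python
import sqlite3

# (fl_no, allocation_date as ISO text, unit_key, unit_name)
FL_DATA = [
    ("FL/3352", "2021-02-20", 54, "A Pradhan"),
    ("FL/3374", "2020-07-14", 54, "A Pradhan"),
    ("FL/3469", "2019-08-16", 54, "A Pradhan"),
    ("FL/3578", "2019-06-22", 54, "A Pradhan"),
    ("FL/3670", "2020-06-15", 60, "D Raj"),
    ("FL/3692", "2020-04-22", 60, "D Raj"),
    ("FL/3542", "2020-07-20", 64, "K Parihar"),
    ("FL/3599", "2020-05-27", 64, "K Parihar"),
    ("FL/3645", "2019-11-13", 64, "K Parihar"),
    ("FL/3640", "2021-02-22", 22, "R Raja"),
    ("FL/3724", "2020-12-28", 22, "R Raja"),
    ("FL/3342", "2020-03-20", 22, "R Raja"),
    ("FL/3355", "2019-12-20", 22, "R Raja"),
]


def inactive_units(rows, today, days=240):
    """Rows from 2020 onward whose unit_key has no record in the last `days` days."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE FL_DATA (fl_no TEXT, allocation_date TEXT, "
        "unit_key INTEGER, unit_name TEXT)"
    )
    con.executemany("INSERT INTO FL_DATA VALUES (?, ?, ?, ?)", rows)
    result = con.execute(
        """
        SELECT T.fl_no, T.unit_key, T.unit_name
        FROM FL_DATA AS T
        WHERE T.allocation_date >= '2020-01-01'
          AND NOT EXISTS (
              SELECT 1
              FROM FL_DATA AS Recent
              WHERE Recent.unit_key = T.unit_key
                AND Recent.allocation_date >= date(:today, :offset)
          )
        ORDER BY T.unit_key, T.fl_no
        """,
        {"today": today, "offset": f"-{days} days"},
    ).fetchall()
    con.close()
    return result


if __name__ == "__main__":
    for row in inactive_units(FL_DATA, "2021-06-25"):
        print(row)
```

Passing the reference date as a parameter instead of calling GETDATE() keeps the result reproducible, which is convenient when testing this kind of filter.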

Right join isn't pulling in all rows from right table

I've got two tables for a call center in a Microsoft SQL Server database: fact_queue, which holds the number of calls received, and dim_interval, which is used to convert the interval number (0-95) into time stamps (e.g. 07:15-07:30). It's set up this way so you can easily change the time zone in which the data is pulled.
I'm trying to get a result that shows all 96 intervals regardless of whether there's a call or not, but it's not working as expected.
Here's an example of what's in the tables:
Fact_Queue
date_id queue_id interval_id calls_offered
7780 40 0 1
7780 40 2 5
7780 40 3 6
7780 40 5 10
Dim_Interval
interval_id interval_name
0 00:00 - 00:15
1 00:15 - 00:30
2 00:30 - 00:45
3 00:45 - 01:00
-- --
95 23:45 - 24:00
I've played around with a couple of variations of a query, and I believe the following should work, but it doesn't:
SELECT dim_interval.interval_name
,fact_queue.calls_offered
FROM dim_interval
RIGHT JOIN fact_queue
ON fact_queue.interval_id = dim_interval.interval_id
WHERE fact_queue.date_id= '7780'
AND fact_queue.queue_id = '40'
ORDER BY dim_interval.interval_id
This just results in
interval_name calls_offered
00:00 - 00:15 1
00:30 - 00:45 5
00:45 - 01:00 6
01:15 - 01:30 10
but what I want is
interval_name calls_offered
00:00 - 00:15 1
00:15 - 00:30 null
00:30 - 00:45 5
00:45 - 01:00 6
01:00 - 01:15 null
Why is the query not working? If it matters I'm using DBeaver version 21.0.3.202104181339
In the snippet dim_interval RIGHT JOIN fact_queue, you've placed the dimension table on the left, not the right. Since you want all of the dimension table's rows, you want a left outer join...
FROM dim_interval LEFT JOIN fact_queue
That only gets you halfway there, though, because the WHERE clause is applied after the join. The intervals without calls have NULL in every fact_queue column, so the WHERE clause filters those rows out again.
So, you need to do the filtering during the join...
SELECT dim_interval.interval_name
,fact_queue.calls_offered
FROM dim_interval
LEFT JOIN fact_queue
ON fact_queue.interval_id = dim_interval.interval_id
AND fact_queue.date_id= '7780'
AND fact_queue.queue_id = '40'
ORDER BY dim_interval.interval_id
Some people prefer to do the filtering before the join, but that's not necessary and typically yields the same execution plan...
SELECT dim_interval.interval_name
,fact_queue.calls_offered
FROM dim_interval
LEFT JOIN (
SELECT *
FROM fact_queue
WHERE date_id= '7780'
AND queue_id = '40'
) AS fact_queue
ON fact_queue.interval_id = dim_interval.interval_id
ORDER BY dim_interval.interval_id
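The difference between filtering in the ON clause and filtering in the WHERE clause can be seen on a small copy of the tables. This is a sketch using Python's sqlite3 module as a stand-in for SQL Server (an assumption). It loads only the first six intervals, uses the column name calls_offered as it appears in the question's table listing, and the name interval_report is illustrative.

```python
import sqlite3


def interval_report(filter_in_on):
    """Run the LEFT JOIN with the date/queue filter either in the ON clause
    (filter_in_on=True) or in a WHERE clause (filter_in_on=False)."""
    con = sqlite3.connect(":memory:")
    con.executescript(
        """
        CREATE TABLE dim_interval (interval_id INTEGER, interval_name TEXT);
        CREATE TABLE fact_queue (date_id INTEGER, queue_id INTEGER,
                                 interval_id INTEGER, calls_offered INTEGER);
        INSERT INTO dim_interval VALUES
            (0, '00:00 - 00:15'), (1, '00:15 - 00:30'), (2, '00:30 - 00:45'),
            (3, '00:45 - 01:00'), (4, '01:00 - 01:15'), (5, '01:15 - 01:30');
        INSERT INTO fact_queue VALUES
            (7780, 40, 0, 1), (7780, 40, 2, 5), (7780, 40, 3, 6), (7780, 40, 5, 10);
        """
    )
    condition = "f.date_id = 7780 AND f.queue_id = 40"
    # Same condition, attached either to the join or to the final filter.
    tail = f"AND {condition}" if filter_in_on else f"WHERE {condition}"
    rows = con.execute(
        f"""
        SELECT d.interval_name, f.calls_offered
        FROM dim_interval AS d
        LEFT JOIN fact_queue AS f
          ON f.interval_id = d.interval_id
        {tail}
        ORDER BY d.interval_id
        """
    ).fetchall()
    con.close()
    return rows


if __name__ == "__main__":
    print(interval_report(filter_in_on=True))   # every interval, None where no calls
    print(interval_report(filter_in_on=False))  # only intervals that had calls
```

With the filter in ON, the unmatched intervals survive with a NULL call count; with the filter in WHERE, those rows are removed after the join, which is the behaviour the question ran into.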

Longest gap between test matches in Sachin's career

I am new to SQL Server and was practising with Sachin's batting stats (cricket), which I found here (Sachin Batting Statistics). I wanted to find the longest gap between two Test matches in Sachin's career. So basically I have to filter on Test matches and find the max difference in the Start_DateAscending column? I hope that makes sense. A sample table is added in case the link doesn't help.
EDIT: I created a sample table with different dates. The column is named DateValue. Now I want the code to find the maximum difference between any two successive rows in the DateValue column. For example, in this case the answer is 2 years and 17 days, between December 09, 1989 and December 26, 1991.
IF OBJECT_ID('TempDB..#mytable','U') IS NOT NULL
DROP TABLE #mytable
CREATE TABLE #mytable
(
ID INT IDENTITY(1,1) PRIMARY KEY CLUSTERED,
DateValue DATETIME
)
SET DATEFORMAT DMY
SET IDENTITY_INSERT #mytable ON
INSERT INTO #mytable
(ID, DateValue)
SELECT '11', 'Nov 15 1989 12:00AM' UNION ALL
SELECT '59', 'Nov 23 1989 12:00AM' UNION ALL
SELECT '37', 'Dec 09 1989 12:00AM' UNION ALL
SELECT '44', 'Dec 26 1991 12:00AM' UNION ALL
SELECT '55', 'May 31 1993 12:00AM' UNION ALL
SELECT '60', 'May 15 1995 12:00AM' UNION ALL
SELECT '57', 'Jan 12 1996 12:00AM' UNION ALL
SELECT '43', 'Jan 19 1996 12:00AM' UNION ALL
SELECT '49', 'Jan 31 1996 12:00AM' UNION ALL
SELECT '18', 'Oct 17 1997 12:00AM'
Here's a solution I found on this website, the answer I obtained was 1900-01-01!
SELECT MAX(#mytable.DateValue-h.DateValue) as maxDiff
FROM #mytable
LEFT JOIN #mytable h
ON h.ID=[dbo].#mytable.ID AND #mytable.DateValue>=h.DateValue
WHERE h.DateValue IS NOT NULL
If you're using SQL Server 2012 or above, this SQL will return the biggest gap in days between two tests:
select max(datediff(day, a.TestDate, a.NextTest)) as BiggestGap
from (
select DateValue as TestDate, lead(DateValue) over (order by DateValue) as NextTest
from #mytable m
) a
The first thing this query does (inside the parentheses) is build a table that lists every test match alongside the date of the next test match. The inner query selects all test dates and, using the lead function, the date of the match immediately after each one.
The data from that parenthesised select (with ID included for reference) looks like this:
ID TestDate NextTest
----------- ----------------------- -----------------------
11 1989-11-15 00:00:00.000 1989-11-23 00:00:00.000
59 1989-11-23 00:00:00.000 1989-12-09 00:00:00.000
37 1989-12-09 00:00:00.000 1991-12-26 00:00:00.000
44 1991-12-26 00:00:00.000 1993-05-31 00:00:00.000
55 1993-05-31 00:00:00.000 1995-05-15 00:00:00.000
60 1995-05-15 00:00:00.000 1996-01-12 00:00:00.000
57 1996-01-12 00:00:00.000 1996-01-19 00:00:00.000
43 1996-01-19 00:00:00.000 1996-01-31 00:00:00.000
49 1996-01-31 00:00:00.000 1997-10-17 00:00:00.000
18 1997-10-17 00:00:00.000 NULL
After that (outside the parentheses), it's simply a case of finding the row with the biggest difference between dates. In SQL Server, it's best to use the datediff function to get the difference between two dates instead of mathematical operators such as the - in the example you saw, so we use that to get the difference in days for each row. max then picks the largest of those, returning the biggest gap between two matches.
Using the example SQL data provided, the biggest gap is 747 days (approximately 2 years 17 days).
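As a cross-check of the 747-day figure, here is a small sketch that runs the same LEAD idea through Python's sqlite3 module. SQLite stands in for SQL Server (an assumption; window functions need SQLite 3.25 or later), the difference of julianday values replaces datediff, and the dates are stored as ISO text. The names biggest_gap, TEST_DATES and mytable are illustrative.

```python
import sqlite3

TEST_DATES = [
    "1989-11-15", "1989-11-23", "1989-12-09", "1991-12-26", "1993-05-31",
    "1995-05-15", "1996-01-12", "1996-01-19", "1996-01-31", "1997-10-17",
]


def biggest_gap(dates):
    """Return (start, end, days) for the longest gap between successive dates."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE mytable (DateValue TEXT)")
    con.executemany("INSERT INTO mytable VALUES (?)", [(d,) for d in dates])
    row = con.execute(
        """
        SELECT TestDate, NextTest,
               CAST(julianday(NextTest) - julianday(TestDate) AS INTEGER) AS Gap
        FROM (
            SELECT DateValue AS TestDate,
                   LEAD(DateValue) OVER (ORDER BY DateValue) AS NextTest
            FROM mytable
        )
        WHERE NextTest IS NOT NULL   -- the last match has no successor
        ORDER BY Gap DESC
        LIMIT 1
        """
    ).fetchone()
    con.close()
    return row


if __name__ == "__main__":
    print(biggest_gap(TEST_DATES))
```

Returning the two dates along with the number of days makes it easy to see which pair of matches produced the gap, which the plain max(...) query does not show.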

T-SQL Count of Records in Status for Previous Months

I have a T-SQL Quotes table and need to be able to count how many quotes were in an open status during past months.
The dates I have to work with are an 'Add_Date' timestamp and an 'Update_Date' timestamp. Once a quote has a value of '1' in the 'Win' or 'Loss' column, it can no longer be updated. Therefore, the 'Update_Date' effectively becomes the Closed_Status timestamp.
Here are a few example records:
Quote_No Add_Date Update_Date Open_Quote Win Loss
001 01-01-2016 NULL 1 0 0
002 01-01-2016 3-1-2016 0 1 0
003 01-01-2016 4-1-2016 0 0 1
Here's a link to all the data:
https://drive.google.com/open?id=0B4xdnV0LFZI1T3IxQ2ZKRDhNd1k
I asked this question previously this year and have been using the following code:
with n as (
select row_number() over (order by (select null)) - 1 as n
from master..spt_values
)
select format(dateadd(month, n.n, q.add_date), 'yyyy-MM') as yyyymm,
count(*) as Open_Quote_Count
from quotes q join
n
on (closed_status = 1 and dateadd(month, n.n, q.add_date) <= q.update_date) or
(closed_status = 0 and dateadd(month, n.n, q.add_date) <= getdate())
group by format(dateadd(month, n.n, q.add_date), 'yyyy-MM')
order by yyyymm;
The problem is that this code returns a cumulative value. So January was fine, but then Feb is really Jan + Feb, March is Jan + Feb + March, and so on. It took me a while to discover this; the numbers returned are now way, way off and I'm trying to correct them.
From the full data set the results of this code are:
Year-Month Open_Quote_Count
2017-01 153
2017-02 265
2017-03 375
2017-04 446
2017-05 496
2017-06 560
2017-07 609
The desired result would be how many quotes were in an open status during that particular month, not the cumulative total:
Year-Month Open_Quote_Count
2017-01 153
2017-02 112
2017-03 110
2017-04 71
Thank you in advance for your help!
Unless I am missing something, LAG() would be a good fit here
Example
Declare @YourTable Table ([Year-Month] varchar(50),[Open_Quote_Count] int)
Insert Into @YourTable Values
('2017-01',153)
,('2017-02',265)
,('2017-03',375)
,('2017-04',446)
,('2017-05',496)
,('2017-06',560)
,('2017-07',609)
Select *
,NewValue = [Open_Quote_Count] - lag([Open_Quote_Count],1,0) over (Order by [Year-Month])
From @YourTable --<< Replace with your initial query
Returns
Year-Month Open_Quote_Count NewValue
2017-01 153 153
2017-02 265 112
2017-03 375 110
2017-04 446 71
2017-05 496 50
2017-06 560 64
2017-07 609 49

DateTime in sql only comparing the time

ID DateTime Code
---------- -------------- ----------
58 2015-01-01 20:00:00 1111
58 2015-01-11 10:00:00 8523
58 2015-01-11 03:00:00 4555
58 2015-01-19 00:01:00 8888
9 2015-01-01 20:00:00 4444
How do I count the number of codes for a specific ID, ignoring the date, where the time must be between 20:00:00 and 06:00:00?
select count(code) as count from table 1 where ID='58' and DateTime between '20:00:00' and '06:00:00'
the expected output would be
count
3
SELECT count(code) as count
FROM table1
WHERE
ID='58' and
(CAST(DateTime as time) >= '20:00'
or CAST(DateTime as time) <= '06:00')
EDIT: John, I understand the issue. Here is a full solution to handle those cases:
In order to use variables:
DECLARE @HourBegin time = '07:00'
DECLARE @HourEnd time = '17:30'
SELECT count(code) as count
FROM table1
WHERE
ID='58' and
(CAST(DateTime as time) between @HourBegin and @HourEnd or
((CAST(DateTime as time) <= @HourEnd or
CAST(DateTime as time) >= @HourBegin) and
@HourBegin > @HourEnd)
)
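The two cases handled by the variables (a window inside one day, and a window that wraps past midnight) can be tried on the sample rows. This is a sketch using Python's sqlite3 module as a stand-in for SQL Server (an assumption; SQLite's time() plays the role of CAST(... as time), and times are compared as 'HH:MM:SS' text). The name count_in_window is illustrative.

```python
import sqlite3

ROWS = [
    (58, "2015-01-01 20:00:00", 1111),
    (58, "2015-01-11 10:00:00", 8523),
    (58, "2015-01-11 03:00:00", 4555),
    (58, "2015-01-19 00:01:00", 8888),
    (9, "2015-01-01 20:00:00", 4444),
]


def count_in_window(rows, row_id, begin, end):
    """Count codes for row_id whose time of day falls in [begin, end].

    If begin > end, the window is treated as wrapping past midnight.
    begin and end are 'HH:MM:SS' strings.
    """
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE table1 (ID INTEGER, DateTime TEXT, Code INTEGER)")
    con.executemany("INSERT INTO table1 VALUES (?, ?, ?)", rows)
    (count,) = con.execute(
        """
        SELECT COUNT(Code)
        FROM table1
        WHERE ID = :id
          AND (
                (:begin <= :end AND time(DateTime) BETWEEN :begin AND :end)
             OR (:begin >  :end AND (time(DateTime) >= :begin
                                     OR time(DateTime) <= :end))
          )
        """,
        {"id": row_id, "begin": begin, "end": end},
    ).fetchone()
    con.close()
    return count


if __name__ == "__main__":
    print(count_in_window(ROWS, 58, "20:00:00", "06:00:00"))  # overnight window
    print(count_in_window(ROWS, 58, "07:00:00", "17:30:00"))  # daytime window
```

For ID 58, the overnight window matches the rows at 20:00, 03:00 and 00:01, which agrees with the expected count of 3 in the question.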
Almost the same as the previous answer, but using hours looks nicer to me, and you might need DISTINCT code:
SELECT count(DISTINCT code) as count
FROM table1
WHERE
ID='58' and
(DATEPART(HOUR,DateTime) >= 20
or DATEPART(HOUR,DateTime) < 6)
UPDATED: changed from <= 6 to < 6
Update
This answer applies to MySQL.
When I started writing the answer, the question was tagged mysql and sql-server. The OP edited it in the meantime.
This query should do what you want on MySQL.
SELECT count(code) AS `count`
FROM `table 1`
WHERE ID='58'
AND TIME(`DateTime`) NOT BETWEEN '06:00:01' AND '19:59:59'
The MySQL function TIME() extracts only the time component from a DATETIME value.
MySQL 5.6.4 added support for fractional seconds (up to 6 digits) in DATETIME columns. With fractional seconds, the query above also counts entries whose time is greater than 06:00:00 but smaller than 06:00:01 (events that happened during the first second after 6 AM sharp).
For MySQL 5.6.4 and newer, the correct query is:
SELECT count(code) AS `count`
FROM `table 1`
WHERE ID='58'
AND (TIME(`DateTime`) <= '06:00:00' OR '20:00:00' <= TIME(`DateTime`))
I don't know about SQL Server.
