Detect samples with dates differences within same data column


Table:
How can I scan each individual SampleRef and flag the ones that have any delivery dates more than 6 months apart, using T-SQL? In the sample above, only SampleRef A has dates that far apart, for example 01/04/2013 and 16/02/2014.
Result:
Thank you!

A couple of other options for you:
Take the MIN and MAX delivery date per SampleRef and check whether the difference between them is greater than 6 months.
Or use LAG() to get the previous delivery date for each record, then see which records had a previous delivery more than six months earlier.
-- Table variable holding the sample data. The date literals are written
-- month/day/year, so they rely on a us_english-style DATEFORMAT.
DECLARE @TestData TABLE
(
    [SampleRef] CHAR(1)
  , [DeliveryDate] DATE
);
INSERT INTO @TestData (
    [SampleRef]
  , [DeliveryDate]
)
VALUES ( 'A', '4/1/2013' )
     , ( 'A', '2/3/2013' )
     , ( 'A', '2/16/2014' )
     , ( 'A', '6/12/2015' )
     , ( 'A', '6/26/2015' )
     , ( 'A', '6/26/2015' )
     , ( 'A', '2/10/2015' )
     , ( 'B', '6/26/2015' )
     , ( 'B', '6/27/2015' )
     , ( 'B', '6/28/2015' )
     , ( 'B', '6/29/2015' )
     , ( 'B', '6/30/2015' )
     , ( 'B', '7/1/2015' );
--This looks at the min and max date per SampleRef and keeps those whose overall span is greater than 6 months
SELECT *
FROM (
    SELECT [SampleRef]
         , MIN([DeliveryDate]) AS [MinDeliveryDate]
         , MAX([DeliveryDate]) AS [MaxDeliveryDate]
    FROM @TestData
    GROUP BY [SampleRef]
) AS [SampleRef]
WHERE DATEDIFF(
          MONTH
        , [SampleRef].[MinDeliveryDate]
        , [SampleRef].[MaxDeliveryDate]
      ) > 6;
--This gets the prior delivery date for each record, so you can see every record where the gap since the previous delivery was greater than six months.
--The first record of each SampleRef uses its own date as the default, so its gap is 0.
SELECT *
     , DATEDIFF(
           MONTH
         , [SampleRef].[PreviousDelivery]
         , [SampleRef].[DeliveryDate]
       ) AS [MonthSincePreviousDelivery]
FROM (
    SELECT *
         , LAG([DeliveryDate], 1, [DeliveryDate]) OVER ( PARTITION BY [SampleRef]
                                                         ORDER BY [DeliveryDate]
                                                       ) AS [PreviousDelivery]
    FROM @TestData
) AS [SampleRef]
WHERE DATEDIFF(
          MONTH
        , [SampleRef].[PreviousDelivery]
        , [SampleRef].[DeliveryDate]
      ) > 6;
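One thing to keep in mind with both queries above: DATEDIFF(MONTH, ...) counts the month boundaries crossed between two dates, not full elapsed months. The sketch below uses made-up dates (not from the question) to show where that differs from a DATEADD comparison. Whether it matters depends on how strictly you need "more than 6 months" to be interpreted.

```sql
-- Hypothetical dates chosen only to illustrate the boundary counting.
DECLARE @d1 DATE = '20130131', @d2 DATE = '20130201';
SELECT DATEDIFF(MONTH, @d1, @d2) AS MonthBoundariesCrossed;  -- 1, although the dates are one day apart

DECLARE @Earlier DATE = '20130101', @Later DATE = '20130731';
SELECT DATEDIFF(MONTH, @Earlier, @Later) AS BoundariesCrossed,                                    -- 6
       CASE WHEN DATEDIFF(MONTH, @Earlier, @Later) > 6 THEN 1 ELSE 0 END AS FlaggedByDateDiff,   -- 0
       CASE WHEN @Later > DATEADD(MONTH, 6, @Earlier) THEN 1 ELSE 0 END AS FlaggedByDateAdd;     -- 1
```

If you want calendar-accurate gaps, comparing against DATEADD(MONTH, 6, ...) in the WHERE clause is one alternative to DATEDIFF(...) > 6.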

--Is there a row for a SampleRef that has no delivery within the six months before it,
--even though some earlier delivery exists?
SELECT DISTINCT T1.SampleRef
FROM YourTable T1
WHERE EXISTS (
    SELECT 0
    FROM YourTable T2
    WHERE T1.SampleRef = T2.SampleRef
      AND NOT EXISTS ( -- no delivery in the six months before T2's delivery
          SELECT 0
          FROM YourTable T3
          WHERE T3.SampleRef = T2.SampleRef
            AND T3.DeliveryDate >= DATEADD(mm, -6, T2.DeliveryDate)
            AND T3.DeliveryDate < T2.DeliveryDate
      )
      AND EXISTS ( -- but check that there was an earlier delivery
          SELECT 0
          FROM YourTable T4
          WHERE T4.SampleRef = T2.SampleRef
            AND T4.DeliveryDate < T2.DeliveryDate
      )
);

You can use ROW_NUMBER() to order the samples within a SampleRef, then join that ordered set to itself and find any records where the next available sample is more than 6 months later. (Note that the example code below won't tell you if it has been more than 6 months since the final sample in the set. You could modify the query to do so if necessary.)
You didn't specify a name for the table, so replace YourTableNameHere in the query below with the name of your table.
WITH SamplesNumberedByGroup AS (
SELECT
SampleRef,
DeliveryDate,
ROW_NUMBER() OVER (PARTITION BY SampleRef ORDER BY DeliveryDate) AS 'SampleNum'
FROM
YourTableNameHere
)
SELECT
DISTINCT
S.SampleRef
FROM
SamplesNumberedByGroup S
INNER JOIN SamplesNumberedByGroup S2 ON S.SampleRef = S2.SampleRef AND S2.SampleNum = S.SampleNum + 1
WHERE
S2.DeliveryDate > DATEADD(MONTH,6,S.DeliveryDate);
If you want to see each sample where the next available sample is more than 6 months away (instead of just seeing which SampleRef has a gap of more than 6 months), use the code below instead.
WITH SamplesNumberedByGroup AS (
SELECT
SampleRef,
Price, -- assumed to be a column of your table; it must be selected here so the outer query can return S.Price
DeliveryDate,
ROW_NUMBER() OVER (PARTITION BY SampleRef ORDER BY DeliveryDate) AS SampleNum
FROM
YourTableNameHere
)
SELECT
S.SampleRef
,S.Price
,S.DeliveryDate
FROM
SamplesNumberedByGroup S
INNER JOIN SamplesNumberedByGroup S2 ON S.SampleRef = S2.SampleRef AND S2.SampleNum = S.SampleNum + 1
WHERE
S2.DeliveryDate > DATEADD(MONTH,6,S.DeliveryDate);
If you also need to include entries that are more than 6 months old but have no "next" entry, replace INNER JOIN with LEFT OUTER JOIN and add OR (S2.DeliveryDate IS NULL AND GETDATE() > DATEADD(MONTH,6,S.DeliveryDate)) to the WHERE clause.
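Put together, the LEFT OUTER JOIN variant might look like the sketch below. YourTableNameHere is still the placeholder table name, and the NextDeliveryDate column is only added here for illustration.

```sql
-- Sketch only: YourTableNameHere is a placeholder for your actual table.
WITH SamplesNumberedByGroup AS (
    SELECT
        SampleRef,
        DeliveryDate,
        ROW_NUMBER() OVER (PARTITION BY SampleRef ORDER BY DeliveryDate) AS SampleNum
    FROM
        YourTableNameHere
)
SELECT
    S.SampleRef,
    S.DeliveryDate,
    S2.DeliveryDate AS NextDeliveryDate  -- NULL when S is the last sample for its SampleRef
FROM
    SamplesNumberedByGroup S
    LEFT OUTER JOIN SamplesNumberedByGroup S2
        ON S.SampleRef = S2.SampleRef
       AND S2.SampleNum = S.SampleNum + 1
WHERE
    -- the next sample is more than 6 months later ...
    S2.DeliveryDate > DATEADD(MONTH, 6, S.DeliveryDate)
    -- ... or there is no next sample and the last one is more than 6 months old
    OR (S2.DeliveryDate IS NULL AND GETDATE() > DATEADD(MONTH, 6, S.DeliveryDate));
```

Because the second condition depends on GETDATE(), the rows it returns will change over time as the last sample of each SampleRef ages.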

Use EXISTS() to check whether there is any row whose next later row for the same SampleRef is more than 6 months after it.
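The answer gives no code, so here is a minimal sketch of one way that idea could be written. The @Deliveries table variable and its rows are a hypothetical stand-in for the real table, and "next later row" is interpreted as "no other delivery strictly between the two dates".

```sql
-- Hypothetical stand-in for the real table, with a few illustrative rows.
DECLARE @Deliveries TABLE (SampleRef CHAR(1), DeliveryDate DATE);
INSERT INTO @Deliveries (SampleRef, DeliveryDate)
VALUES ('A', '20130401'), ('A', '20140216'), ('A', '20150210'),
       ('B', '20150626'), ('B', '20150701');

SELECT DISTINCT d1.SampleRef
FROM @Deliveries AS d1
WHERE EXISTS (
    SELECT 1
    FROM @Deliveries AS d2
    WHERE d2.SampleRef = d1.SampleRef
      -- d2 is more than 6 months after d1 ...
      AND d2.DeliveryDate > DATEADD(MONTH, 6, d1.DeliveryDate)
      -- ... and no other delivery falls strictly between d1 and d2
      AND NOT EXISTS (
          SELECT 1
          FROM @Deliveries AS d3
          WHERE d3.SampleRef = d1.SampleRef
            AND d3.DeliveryDate > d1.DeliveryDate
            AND d3.DeliveryDate < d2.DeliveryDate
      )
);
-- Returns only 'A' for the rows above.
```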

Related

How to select the top 1 in case distinct returns 2 rows

I have a SELECT DISTINCT query that can return 2 rows with the same code, since not all columns have the same value. Now my boss wants to get only the first one. How do I do it? Below is the sample result. I want to return only the first two unique pro

Use ROW_NUMBER() in your query:
; with cte as (
select row_number() over (partition by pro order by actual_quantity) as Slno, * from yourtable
) select * from cte where slno = 1

Your chances of getting a proper answer are much higher if you spend some time preparing the question properly. Provide the DDL and sample data, and add the desired result.
To solve your problem, you need to know which ordering makes a row unique, so that you get 1 record per window group. Search for "window functions". In my example the rule is: a single row for every pro, with the earliest proforma_invoice_received_date and the smallest actual_quantity on that date.
DROP TABLE IF EXISTS #tmp;
GO
CREATE TABLE #tmp
(
pro VARCHAR(20) ,
actual_quantity DECIMAL(12, 2) ,
proforma_invoice_received_date DATE ,
import_permit DATE
);
GO
INSERT INTO #tmp
( pro, actual_quantity, proforma_invoice_received_date, import_permit )
VALUES ( 'N19-00945', 50000, '20190516', '20190517' ),
( 'N19-00945', 50001, '20190516', '20190517' )
, ( 'N19-00946', 50002, '20190516', '20190517' )
, ( 'N19-00946', 50003, '20190516', '20190517' );
SELECT a.pro ,
a.actual_quantity ,
a.proforma_invoice_received_date ,
a.import_permit
FROM ( SELECT pro ,
actual_quantity ,
proforma_invoice_received_date ,
import_permit ,
ROW_NUMBER() OVER ( PARTITION BY pro ORDER BY proforma_invoice_received_date, actual_quantity ) AS rn
FROM #tmp
) a
WHERE rn = 1;
-- you can also use WITH TIES for that to save some lines of code
SELECT TOP ( 1 ) WITH TIES
pro ,
actual_quantity ,
proforma_invoice_received_date ,
import_permit
FROM #tmp
ORDER BY ROW_NUMBER() OVER ( PARTITION BY pro ORDER BY proforma_invoice_received_date, actual_quantity );
DROP TABLE #tmp;

Try this:
SELECT * FROM
(
SELECT *,
ROW_NUMBER() OVER (PARTITION BY pro ORDER BY pro) RN
-- Add other columns to the ORDER BY clause alongside 'pro'
-- to pick the row you want. Ordering by 'pro' alone leaves
-- the choice of first row arbitrary, so it can vary between
-- executions.
FROM your_table
)A
WHERE RN = 1

CREATE TABLE T (
A [numeric](10, 2) NULL,
B [numeric](10, 2) NULL
)
INSERT INTO T VALUES (100,20)
INSERT INTO T VALUES (100,30)
INSERT INTO T VALUES (200,40)
INSERT INTO T VALUES (200,50)
select *
from T
/*
A B
100.00 20.00
100.00 30.00
200.00 40.00
200.00 50.00
*/
select U.A, U.B
from
(select row_number() over(Partition By A Order By B) as row_num, *
from T ) U
where row_num = 1
/*
A B
100.00 20.00
200.00 40.00
*/

SQL Server contiguous dates - summarizing multiple rows into contiguous start and end date rows without CTE's, loops,...s

Is it possible to write an sql query that will summarize rows with start and end dates into rows that have contiguous start and end dates?
The constraint is that it has to be regular sql, i.e. no CTE's, loops and the like as a third party tool is used that only allows an sql statement to start with Select.
e.g.:
ID StartDate EndDate
1001, Jan-1-2018, Jan-04-2018
1002, Jan-5-2018, Jan-13-2018
1003, Jan-14-2018, Jan-18-2018
1004, Jan-25-2018, Feb-05-2018
The required output needs to be:
Jan-1-2018, Jan-18-2018
Jan-25-2018, Feb-05-2018
Thank you

You can take advantage of window functions together with a concept called gaps-and-islands. In your case, each run of contiguous dates is an island, and the gaps are self-explanatory.
I wrote the answer below in a verbose way to help make it clear what the query is doing, but it could most likely be written in a different way that is more concise. Please see my comments in the answer explaining what each step (sub-query) does.
--Determine Final output
select min(c.StartDate) as StartDate
, max(c.EndDate) as EndDate
from (
--Assign a number to each group of Contiguous Records
select b.ID
, b.StartDate
, b.EndDate
, b.EndDatePrev
, b.IslandBegin
, sum(b.IslandBegin) over (order by b.ID asc) as IslandNbr
from (
--Determine if its Contiguous (IslandBegin = 1, means its not Contiguous with previous record)
select a.ID
, a.StartDate
, a.EndDate
, a.EndDatePrev
, case when a.EndDatePrev is NULL then 1
when datediff(d, a.EndDatePrev, a.StartDate) > 1 then 1
else 0
end as IslandBegin
from (
--Determine Prev End Date
select tt.ID
, tt.StartDate
, tt.EndDate
, lag(tt.EndDate, 1, NULL) over (order by tt.ID asc) as EndDatePrev
from dbo.Table_Name as tt
) as a
) as b
) as c
group by c.IslandNbr
order by c.IslandNbr

I hope the following SQL query helps you identify the gaps and the covered dates for your case.
I did not use a CTE, a dates table function, etc.
Instead, I used master..spt_values as a numbers table to generate the list of dates, and checked each date against the periods.
You can create your own numbers table or dates table if this does not fit your requirements.
To catch the changes at the borders, I used the SQL LAG() function, which lets me compare each row with the previous value of a column in a sorted list.
select
max(startdate) as startdate,
max(enddate) as enddate
from (
select
date,
case when exist = 1 then date else null end as startdate,
case when exist = 0 then dateadd(d,-1,date) else null end as enddate,
( row_number() over (order by date) + 1) / 2 as rn
from (
select date, exist, case when exist <> (lag(exist,1,'') over (order by date)) then 1 else 0 end as changed
from (
select
d.date,
case when exists (select * from Periods where d.date between startdate and enddate) then 1 else 0 end as exist
from (
SELECT dateadd(dd,number,'20180101') date
FROM master..spt_values
WHERE Type = 'P' and dateadd(dd,number,'20180101') <= '20180228'
) d
) cte
) tbl
where changed = 1
) dates
group by rn
Here is the result

MDX: Joining months with total

I want to create table like this:
[January] [February] ...other months... [Total for Year]
item1
item2
item3
It's easy to create 2 different queries, for months and total, like this:
SELECT
[Time].[Month].[Month] ON COLUMNS,
TOPCOUNT([Items], 5, [Count]) ON ROWS
FROM [Cube]
WHERE([Time].[Year].[Year].&[2015-01-01T00:00:00])
and
WITH
MEMBER [Total] AS SUM([Count], [Time].[Year].[Year].&[2015-01-01T00:00:00])
SELECT
[Total] ON COLUMNS,
TOPCOUNT([Items], 5, [Count]) ON ROWS
FROM [Cube]
But how can I concatenate them, or write a single query?

You could expand the WITH statement like this:
WITH
MEMBER [Time].[Month].[All].[Total] AS --<<THIS IS HOSTED IN SAME HIERARCHY AS THE SET THAT FOLLOWS
Sum
(
[Count]
,[Time].[Year].[Year].&[2015-01-01T00:00:00]
)
SET [mths] AS
Exists
(
[Time].[Month].[Month].MEMBERS
,[Time].[Year].[Year].&[2015-01-01T00:00:00]
)
SET [concatenatet_set] AS
{
--<<THE FOLLOWING CAN BE BROUGHT TOGETHER AS THEY HAVE THE SAME "DIMENSIONALITY" I.E. FROM THE SAME HIERARCHY
[mths]
,[Time].[Month].[All].[Total]
}
SELECT
[concatenatet_set] ON COLUMNS
,TopCount
(
[Items]
,5
,[Count]
) ON ROWS
FROM [Cube];
Here is the script I have used to test the above idea against AdvWrks:
WITH
MEMBER [Date].[Calendar].[All].[Total] AS
Sum
(
[Measures].[Internet Sales Amount]
,(
[Date].[Calendar].[All Periods]
,[Date].[Calendar Year].&[2007]
,[Date].[Calendar Quarter of Year].&[CY Q1]
)
)
SET [mths] AS
Exists
(
[Date].[Calendar].[Month]
,(
[Date].[Calendar Year].&[2007]
,[Date].[Calendar Quarter of Year].&[CY Q1]
)
)
SET [concatenatet_set] AS
{
[mths]
,[Date].[Calendar].[All].[Total]
}
SELECT
[concatenatet_set] ON COLUMNS
,TopCount
(
NonEmpty
(
[Product].[Subcategory].[Subcategory]
,(
[Date].[Calendar Year].&[2007]
,[Date].[Calendar Quarter of Year].&[CY Q1]
)
)
,5
,[Measures].[Internet Sales Amount]
) ON ROWS
FROM [Adventure Works]
WHERE
[Measures].[Internet Sales Amount];
It results in the following which seems reasonable:

Try just changing your first query to:
SELECT
[Time].[Month].Members ON COLUMNS,
TOPCOUNT([Items], 5, [Count]) ON ROWS
FROM [Cube]
WHERE([Time].[Year].[Year].&[2015-01-01T00:00:00])

TSQL matching the first instances of multiple values in a resultset

Say I have part of a large query, as below, that returns a resultset with multiple rows of the same key information (PolNum) with different value information (PolPremium) in a random order.
Would it be possible to select the first matching row for each PolNum and sum up the PolPremium values? In this case I know that there are 2 PolNums used, so given the screenshot of the resultset (yes, I know it starts at 14, for illustration purposes) I want to return the first values and sum the result.
First match for PolNum 000035789547
(ROW 14) PolPremium - 32.00
First match for PolNum 000071709897
(ROW 16) PolPremium - 706043.00
Total summed should be 32.00 + 706043.00 = 706075.00
Query
OUTER APPLY
(
SELECT PolNum, PolPremium
FROM PN20
WHERE PolNum IN(SELECT PolNum FROM SvcPlanPolicyView
WHERE SvcPlanPolicyView.ControlNum IN (SELECT val AS ServedCoverages FROM ufn_SplitMax(
(SELECT TOP 1 ServicedCoverages FROM SV91 WHERE SV91.AccountKey = 3113413), ';')))
ORDER BY PN20.PolEffDate DESC
)
Resultset

Suppose that picture is the final result your query produces. Then you can do something like:
DECLARE @t TABLE
(
PolNum VARCHAR(20) ,
PolPremium MONEY
);
INSERT INTO @t
VALUES ( '000035789547', 32 ),
( '000035789547', 76 ),
( '000071709897', 706043.00 ),
( '000071709897', 1706043.00 );
-- Keep the first row per PolNum (here: the lowest premium), then sum per PolNum
-- and add a grand total row through the empty grouping set.
SELECT t.PolNum ,
SUM(PolPremium) AS PolPremium
FROM ( SELECT * ,
ROW_NUMBER() OVER ( PARTITION BY PolNum ORDER BY PolPremium ) AS rn
FROM @t
) t
WHERE rn = 1
GROUP BY GROUPING SETS(t.PolNum, ( ));
Output:
PolNum        PolPremium
000035789547  32.00
000071709897  706043.00
NULL          706075.00
Just replace @t with your query. I also assume that the row with the minimum premium is the first one. You could probably filter for the top row inside the OUTER APPLY part, but it is not really clear to me what is going on there without some sample data.

sql conditional aggregate function with date type

I have table like:
CREATE TABLE myissues
(
id int IDENTITY(1,1) primary key,
title varchar(20),
status varchar(30),
submitdate datetime,
updatedate datetime
);
INSERT INTO myissues
(title, status,submitdate,updatedate)
VALUES
('issue1', 'closed','2014-01-01 07:59:59.000','2014-01-02 10:59:59.000'),
('issue2', 'closed','2014-01-01 08:59:59.000','2014-01-02 12:59:59.000'),
('issue3', 'closed','2014-01-01 09:59:59.000','2014-01-02 10:59:59.000'),
('issue4', 'closed','2014-01-02 07:59:59.000','2014-01-03 10:59:59.000'),
('issue5', 'closed','2014-01-02 08:59:59.000','2014-01-03 11:59:59.000'),
('issue6', 'closed','2014-01-03 08:59:59.000','2014-01-03 12:59:59.000');
I want to get counts of the issues for each day, in two categories: open issues, counted by submit date, and closed issues, counted by update date where status='closed'.
Here is my SQL script:
SELECT
convert(nvarchar(10),submitdate,112) as Dates,
COUNTS_OPEN = SUM(case when (submitdate > CONVERT(datetime, '2014-01-01 00:00:00.000') and submitdate < CONVERT(datetime, '2014-01-05 00:00:00.000') ) then 1 else 0 end),
COUNTS_CLOSED = SUM(case when (status='closed' and (updatedate > CONVERT(datetime, '2014-01-01 00:00:00.000') and updatedate < CONVERT(datetime, '2014-01-05 00:00:00.000')) ) then 1 else 0 end)
FROM myissues
GROUP BY convert(nvarchar(10),submitdate,112)
order by convert(nvarchar(10),submitdate,112)
The result in SQL Fiddle is:
DATES COUNTS_OPEN COUNTS_CLOSED
20140101 3 3
20140102 2 2
20140103 1 1
As you can see, the result is wrong for COUNTS_CLOSED. Correct result should be 0,3,3 for the listed dates above.
I think I'm not grouping it correctly. Can anyone help?
thanks!

You need to separate the closed count from the main query. Try this:
SELECT convert(nvarchar(10),submitdate,112) as Dates,
COUNTS_OPEN = COUNT(1),
COUNTS_CLOSED = (SELECT COUNT(1) FROM myissues f WHERE f.status = 'closed' AND convert(nvarchar(10),updatedate,112) = convert(nvarchar(10), m.submitdate,112))
FROM myissues m
GROUP BY convert(nvarchar(10),submitdate,112)
order by convert(nvarchar(10),submitdate,112)
Be warned that this query will miss any day on which an issue was closed but no issue was submitted. To do this properly you need an exhaustive list of dates.
Try something like this:
SELECT m.Dates,
COUNTS_OPEN = (SELECT COUNT(1) FROM myissues f WHERE convert(nvarchar(10),submitdate,112) = m.Dates),
COUNTS_CLOSED = (SELECT COUNT(1) FROM myissues f WHERE f.status = 'closed' AND convert(nvarchar(10),updatedate,112) = m.Dates)
FROM (
SELECT convert(nvarchar(10),submitdate,112) as Dates
FROM myissues
UNION
SELECT convert(nvarchar(10),updatedate,112)
FROM myissues
) m
order by m.Dates
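As an alternative to the correlated subqueries, the same counts could probably be produced in one pass by stacking the submit and close events with UNION ALL and summing flag columns. This is only a sketch against the myissues table from the question. The IsOpen and IsClosed column names are made up for illustration.

```sql
-- Each issue contributes one "open" event on its submit date and,
-- if closed, one "closed" event on its update date.
SELECT e.Dates,
       SUM(e.IsOpen)   AS COUNTS_OPEN,
       SUM(e.IsClosed) AS COUNTS_CLOSED
FROM (
    SELECT CONVERT(char(8), submitdate, 112) AS Dates, 1 AS IsOpen, 0 AS IsClosed
    FROM myissues
    UNION ALL
    SELECT CONVERT(char(8), updatedate, 112), 0, 1
    FROM myissues
    WHERE status = 'closed'
) AS e
GROUP BY e.Dates
ORDER BY e.Dates;
```

With the sample rows from the question this gives 3/0 for 20140101, 2/3 for 20140102 and 1/3 for 20140103, which matches the expected 0, 3, 3 for COUNTS_CLOSED. Days on which issues were only closed also appear, because the date list comes from both columns.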
