I have a table similar to the one represented below.
myID | some data | start_date | end_date
1 Tom 2016-01-01 2016-05-09
2 Mike 2015-03-01 2017-03-09
...
I have a function that, given start_date, end_date, and an interval (for example, weeks),
returns data like the following (it splits the range between the start and end dates into week intervals):
select * from my_function('2016-01-01','2016-01-12', 'ww')
2015-12-28 00:00:00.000 | 2016-01-03 00:00:00.000 15W53
2016-01-04 00:00:00.000 | 2016-01-10 00:00:00.000 16W1
2016-01-11 00:00:00.000 | 2016-01-17 00:00:00.000 16W2
I would like to write a query that returns all of the values from the first table, but splits each row's start date and end date into multiple rows using the function.
myID | some data | Week_start_date | Week_end_date | (optional)week_num
1 Tom 2015-12-28 2016-01-03 15W53
1 Tom 2016-01-04 2016-01-10 16W1
1 Tom 2016-01-11 2016-01-17 16W2
...
2 Mike etc....
Could someone please help me create such a query?
select myID,some_data,b.Week_start_date,b.Week_end_date,b.week_num from #a cross apply
(select * from my_function('2016-01-01','2016-01-12', 'ww'))b
For example, I tried it with this sample data:
create table #a
(
myID int, some_data varchar(50) , start_date date, end_date date)
insert into #a values
(1,'Tom','2016-01-01','2016-05-09'),
(2,'Mike','2015-03-01','2017-03-09')
Here I am storing the function's result in a temp table:
create table #b
(
a datetime,b datetime, c varchar(50)
)
insert into #b values
('2015-12-28 00:00:00.000','2016-01-03 00:00:00.000','15W53'),
('2016-01-04 00:00:00.000','2016-01-10 00:00:00.000','16W1 '),
('2016-01-11 00:00:00.000','2016-01-17 00:00:00.000','16W2 ')
select myID,some_data,b.a,b.b,b.c from #a cross apply
(select * from #b)b
The output looks like this:
myID some_data a b c
1 Tom 2015-12-28 00:00:00.000 2016-01-03 00:00:00.000 15W53
1 Tom 2016-01-04 00:00:00.000 2016-01-10 00:00:00.000 16W1
1 Tom 2016-01-11 00:00:00.000 2016-01-17 00:00:00.000 16W2
2 Mike 2015-12-28 00:00:00.000 2016-01-03 00:00:00.000 15W53
2 Mike 2016-01-04 00:00:00.000 2016-01-10 00:00:00.000 16W1
2 Mike 2016-01-11 00:00:00.000 2016-01-17 00:00:00.000 16W2
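The CROSS APPLY above feeds the same fixed dates (or the same temp table) to every row, so Tom and Mike get identical weeks. If the goal is to split each row's own date range, a minimal sketch would pass the row's columns into the function instead. This assumes my_function is a table-valued function that accepts column arguments and that its output columns are named Week_start_date, Week_end_date and week_num (names borrowed from the desired output in the question, not confirmed by the source):

```sql
-- Sketch only: the output column names of my_function are assumptions.
SELECT a.myID,
       a.some_data,
       f.Week_start_date,
       f.Week_end_date,
       f.week_num
FROM #a AS a
CROSS APPLY my_function(a.start_date, a.end_date, 'ww') AS f  -- evaluated once per row of #a
ORDER BY a.myID, f.Week_start_date;
```

CROSS APPLY drops rows for which the function returns nothing; OUTER APPLY would keep them with NULL week columns.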
Based on your current result and your expected result, the only difference I see is myID,
so you will need to frame your query like this:
;with cte
as
(
select * from my_function('2016-01-01','2016-01-12', 'ww')
)
select dense_rank() over (order by somedata) as col,
* from cte
DENSE_RANK assigns the same value to rows that tie on the ORDER BY expression and assigns the next consecutive value to the next group, unlike RANK, which leaves gaps after ties.
Look here for more info:
https://stackoverflow.com/a/7747342/2975396
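To make the difference concrete, here is a small self-contained sketch that compares the two functions over an inline list of names (the values are made up for illustration):

```sql
-- Inline sample values; no real table is needed.
SELECT v.name,
       RANK()       OVER (ORDER BY v.name) AS rnk,
       DENSE_RANK() OVER (ORDER BY v.name) AS dense_rnk
FROM (VALUES ('Mike'), ('Mike'), ('Tom'), ('Tom'), ('Zed')) AS v(name);
-- Mike rows: rnk 1, dense_rnk 1
-- Tom rows:  rnk 3, dense_rnk 2
-- Zed row:   rnk 5, dense_rnk 3
```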
Using T-SQL, I want a new column that will show me the first day of each month, for the current year of getdate().
After that I need to count the rows on this specific date. Should I do it with CTE or a temp table?
If 2012+, you can use DateFromParts()
To Get a List of Dates
Select D = DateFromParts(Year(GetDate()),N,1)
From (values (1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),(12)) N(N)
Returns
D
2017-01-01
2017-02-01
2017-03-01
2017-04-01
2017-05-01
2017-06-01
2017-07-01
2017-08-01
2017-09-01
2017-10-01
2017-11-01
2017-12-01
Edit: transaction counts
To get transaction counts (assuming by month), it becomes a small matter of a left join to the generated dates.
-- This is Just a Sample Table Variable for Demonstration.
-- Remove this and Use your actual Transaction Table
--------------------------------------------------------------
Declare @Transactions table (TransDate date,MoreFields int)
Insert Into @Transactions values
('2017-02-18',6)
,('2017-02-19',9)
,('2017-03-05',5)
Select TransMonth = A.MthBeg
,TransCount = count(B.TransDate)
From (
Select MthBeg = DateFromParts(Year(GetDate()),N,1)
,MthEnd = EOMonth(DateFromParts(Year(GetDate()),N,1))
From (values (1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),(12)) N(N)
) A
Left Join @Transactions B on TransDate between MthBeg and MthEnd
Group By A.MthBeg
Returns
TransMonth TransCount
2017-01-01 0
2017-02-01 2
2017-03-01 1
2017-04-01 0
2017-05-01 0
2017-06-01 0
2017-07-01 0
2017-08-01 0
2017-09-01 0
2017-10-01 0
2017-11-01 0
2017-12-01 0
For an ad hoc table of months for a given year:
declare @year date = dateadd(year,datediff(year,0,getdate() ),0)
;with Months as (
select
MonthStart=dateadd(month,n,@year)
from (values(0),(1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11)) t(n)
)
select MonthStart
from Months
rextester demo: http://rextester.com/POKPM51023
returns:
+------------+
| MonthStart |
+------------+
| 2017-01-01 |
| 2017-02-01 |
| 2017-03-01 |
| 2017-04-01 |
| 2017-05-01 |
| 2017-06-01 |
| 2017-07-01 |
| 2017-08-01 |
| 2017-09-01 |
| 2017-10-01 |
| 2017-11-01 |
| 2017-12-01 |
+------------+
The first part, dateadd(year,datediff(year,0,getdate() ),0), adds the number of years since 1900-01-01 (date 0) back onto 1900-01-01, so it returns the first day of the current year. You can also swap year for other levels of truncation, such as quarter, month, day, hour, or minute. Note that second does not work with a base date of 0, because datediff returns an int and the number of seconds since 1900 overflows it.
The second part uses a common table expression and the table value constructor (values (...),(...)) to source numbers 0-11, which are added as months to the start of the year.
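As a quick illustration of the truncation pattern described above, the following sketch applies the same dateadd/datediff idiom at several levels in one row (the column aliases are just illustrative names):

```sql
-- Each expression counts whole units since date 0 (1900-01-01) and adds them back,
-- which discards everything below that unit.
SELECT
    YearStart    = DATEADD(year,    DATEDIFF(year,    0, GETDATE()), 0),
    QuarterStart = DATEADD(quarter, DATEDIFF(quarter, 0, GETDATE()), 0),
    MonthStart   = DATEADD(month,   DATEDIFF(month,   0, GETDATE()), 0),
    DayStart     = DATEADD(day,     DATEDIFF(day,     0, GETDATE()), 0);
```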
Not sure why you need recursion, but for the first day of the current month you can try a query like the one below:
Select Dateadd(day,1,eomonth(Dateadd(month, -1,getdate())))
declare @year date = dateadd(year,datediff(year,0,getdate() ),0)
;WITH months(MonthNumber) AS
(
SELECT 0
UNION ALL
SELECT MonthNumber+1
FROM months
WHERE MonthNumber < 11
)
select dateadd(month,MonthNumber,@year)
from months
I need to update a foreign key in table 1 with the correct entry based on table 2. The correct foreign key is the table 2 entry with the latest effective date that is on or before the period start date; that is, the period start falls on or after that entry's effective date but before the next effective date in table 2. If there are multiple entries in table 2 with the same effective date, then use the modified date column as a tie breaker and pick the most recent one. Here is the basic table structure (all dates are in Date format):
Table 1
pK1 PeriodStartDate pK2
1 2016-04-01 00:00:00.000
2 2016-07-01 00:00:00.000
Table 2
pK2 EffectiveFrom ModifiedDate
3 2016-03-01 00:00:00.000 2016-04-01 00:00:00.000
4 2016-05-01 00:00:00.000 2016-06-01 00:00:00.000
5 2016-05-01 00:00:00.000 2016-06-02 00:00:00.000
So in the above example table 1 would look like this:
pK1 PeriodStartDate pK2
1 2016-04-01 00:00:00.000 3
2 2016-07-01 00:00:00.000 5
This is because for row 1 it falls between March 1st and May 1st (from table 2). And for row 2 it is after the last date, but as there are two similar start dates we choose the last modified.
I'm not sure of the solution. I was trying something like this:
UPDATE table1
SET pK2 = table2.pK2
FROM table2
WHERE PeriodStartDate > (SELECT FIRST(table2.EffectiveFrom) FROM table2)
I'm just not sure how to find an entry that is bounded by another row (and then needs another column for the tie breaker)
First off, you need to apply a row_number() over Table2, partitioned on the PeriodStart and ordered by the ModifiedDate (desc). Call this MaxModified; and 1 is always the most recently modified record.
pK2 PeriodStart ModifiedDate MaxModified
3 2016-03-01 00:00:00.000 2016-04-01 00:00:00.000 1
5 2016-05-01 00:00:00.000 2016-06-02 00:00:00.000 1
4 2016-05-01 00:00:00.000 2016-06-01 00:00:00.000 2
Then, only for the rows where MaxModified=1, you add a new "id" so we can line up each start date with the next row's start date (our end date). This is also done with the row_number() function, ordered by the PeriodStart.
pK2 PeriodStart ModifiedDate MaxModified myID
3 2016-03-01 00:00:00.000 2016-04-01 00:00:00.000 1 1
5 2016-05-01 00:00:00.000 2016-06-02 00:00:00.000 1 2
Then we take that result and join it to itself offset by one row to get an end date value for each original row.
pK2 PeriodStart ModifiedDate MaxModified myID PeriodEnd
3 2016-03-01 00:00:00.000 2016-04-01 00:00:00.000 1 1 2016-05-01 00:00:00.000
5 2016-05-01 00:00:00.000 2016-06-02 00:00:00.000 1 2 NULL
Once we have that, it's a simple matter of joining on the start/end dates to get our pK2 value.
Full script:
DECLARE @Table1 TABLE (pK1 INT, PeriodStart DATETIME, pK2 INT)
DECLARE @Table2 TABLE (pK2 INT, PeriodStart DATETIME, ModifiedDate DATETIME)
INSERT INTO @Table1
VALUES (1,'2016-04-01',NULL),
(2,'2016-07-01',NULL)
INSERT INTO @Table2
VALUES (3,'2016-03-01','2016-04-01'),
(4,'2016-05-01','2016-06-01'),
(5,'2016-05-01','2016-06-02')
;WITH OrderedList AS
(
SELECT *,
ROW_NUMBER() OVER(PARTITION BY PeriodStart ORDER BY ModifiedDate DESC) AS MaxModified
FROM @Table2
),X AS
(
SELECT *,
ROW_NUMBER() OVER(ORDER BY PeriodStart) AS myID
FROM OrderedList
WHERE MaxModified=1
), Y AS
(
SELECT L.*, R.PeriodStart AS PeriodEnd
FROM X L
LEFT JOIN X R ON L.myID=R.myID-1 AND R.MaxModified=1
WHERE L.MaxModified=1
)
UPDATE T SET pK2=Y.pK2
FROM @Table1 T
LEFT JOIN Y ON T.PeriodStart >= Y.PeriodStart AND T.PeriodStart < COALESCE(Y.PeriodEnd,CURRENT_TIMESTAMP)
SELECT *
FROM @Table1
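As a possible alternative to building start/end ranges, the same lookup can be sketched with CROSS APPLY and TOP (1): for each period, pick the latest effective row on or before the period start, breaking ties by the most recent modified date. This is a different technique from the script above, not a restatement of it; the table variable names @t1 and @t2 are hypothetical, and the column names follow the question's table 2.

```sql
-- Hypothetical table variables holding the question's sample data.
DECLARE @t1 TABLE (pK1 INT, PeriodStartDate DATETIME, pK2 INT);
DECLARE @t2 TABLE (pK2 INT, EffectiveFrom DATETIME, ModifiedDate DATETIME);

INSERT INTO @t1 VALUES (1, '2016-04-01', NULL), (2, '2016-07-01', NULL);
INSERT INTO @t2 VALUES (3, '2016-03-01', '2016-04-01'),
                       (4, '2016-05-01', '2016-06-01'),
                       (5, '2016-05-01', '2016-06-02');

UPDATE T
SET pK2 = X.pK2
FROM @t1 AS T
CROSS APPLY (
    SELECT TOP (1) T2.pK2
    FROM @t2 AS T2
    WHERE T2.EffectiveFrom <= T.PeriodStartDate           -- effective on or before the period start
    ORDER BY T2.EffectiveFrom DESC, T2.ModifiedDate DESC  -- latest effective date, then latest modification
) AS X;

SELECT * FROM @t1;  -- pK1 = 1 gets pK2 = 3, pK1 = 2 gets pK2 = 5
```

Because CROSS APPLY is used, periods that start before any effective date are left untouched rather than set to NULL.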
Given a table Orders with columns:
id | revision | insertedAt
1 0 2016-01-01 00:00.000
1 1 2016-01-01 02:00.000
2 0 2016-01-01 02:00.000
Where the id, revision combination is unique.
How can I best migrate to this:
id | revision | applyFrom | applyTo
1 0 2016-01-01 00:00.000 2016-01-01 01:99.999
1 1 2016-01-01 02:00.000 9999-31-12 00:00.000
2 0 2016-01-01 02:00.000 9999-31-12 00:00.000
I've tried iterating over a CURSOR and updating as I go along.
UPDATE orders SET applyFrom = @newApplyFrom, applyTo = @newApplyTo
WHERE id = @id AND revision = @revision;
But with 226 million rows, estimated runtime is somewhere near 60 hours, even hitting the index.
Is there a faster way of achieving the same result? I can add indices as needed. Currently, there is a clustered index on (id, revision).
You can do the update as shown below. I am using LEAD, and showing the result with a SELECT:
;with cte as (
select *, lead(insertedAt) over(partition by id order by revision) migdate from Orders
)
select *, isnull(DATEADD(S, -1, migdate), '9999-12-31') as applyto from cte
Here is a version including LEAD and a self join. Not sure about the performance on large data sets, but I've included batching just in case.
WITH cte AS (
SELECT
id,
revision,
insertedAt,
applyFrom,
applyTo,
LEAD(insertedAt) OVER (PARTITION BY id ORDER BY id, revision) AS newApplyTo
FROM orders
)
UPDATE TOP (@BatchSize) o SET
applyFrom = o.insertedAt,
applyTo = ISNULL(DATEADD(s, -1, o.newApplyTo), '9999-12-31')
FROM cte o
WHERE
o.applyFrom IS NULL AND
o.applyTo IS NULL;
The dataset I've used (with results) is:
Id revision insertedAt applyFrom applyTo
----------- ----------- --------------------------- --------------------------- ---------------------------
1 0 2016-01-01 00:00:00.0000000 2016-01-01 00:00:00.0000000 2016-01-01 01:59:59.0000000
1 1 2016-01-01 02:00:00.0000000 2016-01-01 02:00:00.0000000 9999-12-31 00:00:00.0000000
2 0 2016-01-01 02:00:00.0000000 2016-01-01 02:00:00.0000000 9999-12-31 00:00:00.0000000
3 0 2016-01-01 00:00:00.0000000 2016-01-01 00:00:00.0000000 2016-10-31 23:59:59.0000000
3 1 2016-11-01 00:00:00.0000000 2016-11-01 00:00:00.0000000 2016-11-30 23:59:59.0000000
3 2 2016-12-01 00:00:00.0000000 2016-12-01 00:00:00.0000000 9999-12-31 00:00:00.0000000
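The batched UPDATE above touches at most one batch per execution, so it has to be repeated until no unprocessed rows remain. A minimal sketch of such a driver loop follows; the batch size of 100000 and the @Rows variable are assumptions for illustration, not values from the answer:

```sql
DECLARE @BatchSize INT = 100000;  -- assumed value; tune to your log and locking budget
DECLARE @Rows INT = 1;

WHILE @Rows > 0
BEGIN
    ;WITH cte AS (
        SELECT insertedAt, applyFrom, applyTo,
               LEAD(insertedAt) OVER (PARTITION BY id ORDER BY revision) AS newApplyTo
        FROM orders
    )
    UPDATE TOP (@BatchSize) o
    SET applyFrom = o.insertedAt,
        applyTo   = ISNULL(DATEADD(second, -1, o.newApplyTo), '9999-12-31')
    FROM cte o
    WHERE o.applyFrom IS NULL AND o.applyTo IS NULL;

    SET @Rows = @@ROWCOUNT;  -- 0 once every row has been filled in
END;
```

Each pass still evaluates LEAD over the table, so on hundreds of millions of rows it may be worth restricting each batch to a range of id values instead; that variation is not shown here.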
Using SQL Server 2012, I need to get the datediff between the dates of all rows in a Log table that share the same ID, for example:
ID | Version | Status | Date
-----------------------------------------------------
12345 | 1 | new | 2014-05-01 00:00:00.000
12345 | 2 | up | 2014-05-02 00:00:00.000
12345 | 3 | appr | 2014-05-03 00:00:00.000
67890 | 1 | new | 2014-05-04 00:00:00.000
67890 | 2 | up | 2014-05-08 00:00:00.000
67890 | 3 | rej | 2014-05-13 00:00:00.000
I need to get the date diff of all sequential dates (date between 1, 2 and date between 2, 3)
I have tried creating a while but with no luck!
Your help is really appreciated!
This calculates the DATEDIFF per your request for the "date diff of all sequential dates"; if the next date is not sequential (exactly one day later), it just shows the row's own date. Also, please don't use reserved keywords as column names.
SELECT ID,
[VERSION],
[STATUS],
[DATE],
-- Both CASE branches are converted to VARCHAR so the column has a single data type
CASE WHEN LEAD([DATE]) OVER (PARTITION BY ID ORDER BY [VERSION])=DATEADD(DAY,1,[DATE])
THEN CAST(DATEDIFF(DAY,[DATE],LEAD([DATE]) OVER (PARTITION BY ID ORDER BY [VERSION])) AS VARCHAR(23))
ELSE CONVERT(VARCHAR(23),[DATE],121) END AS DATEDIFFF
FROM
#TEMP
Another way with OUTER APPLY (get the previous value) :
SELECT t.*,
DATEDIFF(day,p.[Date],t.[Date]) as dd
FROM YourTable t
OUTER APPLY (
SELECT TOP 1 *
FROM YourTable
WHERE ID = t.ID AND [DATE] < t.[Date] AND [Version] < t.[Version]
ORDER BY [Date] DESC
) as p
Output:
ID Version Status Date dd
12345 1 new 2014-05-01 00:00:00.000 NULL
12345 2 up 2014-05-02 00:00:00.000 1
12345 3 appr 2014-05-03 00:00:00.000 1
67890 1 new 2014-05-04 00:00:00.000 NULL
67890 2 up 2014-05-08 00:00:00.000 4
67890 3 rej 2014-05-13 00:00:00.000 5
Note: If you are using SQL Server 2012 then better use LEAD and LAG functions.
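For reference, a LAG-based version of the same query might look like the sketch below. It reuses the YourTable placeholder name from the OUTER APPLY example and should give the same dd values as the output shown above:

```sql
SELECT t.*,
       -- LAG returns the previous row's date within the same ID (NULL for the first version)
       DATEDIFF(day,
                LAG(t.[Date]) OVER (PARTITION BY t.ID ORDER BY t.[Version]),
                t.[Date]) AS dd
FROM YourTable t
ORDER BY t.ID, t.[Version];
```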
I have a table with several duplicate cust_id values. I would like to keep the row where prendate_next is nearest to the current date and delete the rest of the duplicates. Please show me how; I am new to this.
cust_id prendate_next
1000105737 2014-11-30 00:00:00.000
1000105836 2014-11-20 00:00:00.000
1000143646 2014-11-10 00:00:00.000
1000143646 2015-03-09 00:00:00.000
1000179487 2014-12-05 00:00:00.000
1000182253 2015-01-01 00:00:00.000
1000192740 2014-10-02 00:00:00.000
1000192740 2015-01-10 00:00:00.000
1000199419 2015-09-30 00:00:00.000
1000170578 2014-12-26 00:00:00.000
1000188890 2015-06-23 00:00:00.000
1000189075 2015-03-01 00:00:00.000
1000189075 2015-03-01 00:00:00.000
1000189144 2015-04-04 00:00:00.000
;WITH cte AS (
SELECT cust_id, prendate_next,
ROW_NUMBER() OVER (PARTITION BY cust_id ORDER BY ABS(DATEDIFF(DAY,prendate_next,GETDATE()))) AS RowNumber
FROM MyTable
)
-- Delete through the CTE so that exact duplicates (same cust_id and prendate_next) keep one row;
-- joining back to MyTable on both columns would delete every copy of such a pair.
DELETE FROM cte
WHERE RowNumber > 1
ABS(DATEDIFF(DAY,prendate_next,GETDATE())) counts how many days prendate_next is from today, so RowNumber 1 is the row nearest to the current date for each cust_id.