How can I update a date column in a SQL Server (MSSQL) table by 1 year for the first 1000 rows, 2 years for the second 1000 rows, and so on? I know how to do it by assigning a temporary id, but is there a way to update the data in a loop?
For example:
Suppose I have 6000 rows in a table with a joined_date column ranging from 2012-01-01 to 2017-01-01, ordered ascending. I want to update the first thousand rows by adding 1 year, the 2nd thousand rows by 1 year as well, and so on...
If my first thousand rows have a joined date in 2012, I want to update them to 2013, and if my 2nd thousand rows have joined dates in 2012 to 2013, I want to increment them by 1 as well.
We can try assigning a row number to your table, then use it to do the updates:
WITH cte AS (
SELECT joined_date, ROW_NUMBER() OVER (ORDER BY joined_date) - 1 rn
FROM yourTable
)
UPDATE cte
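SET joined_date = DATEADD(year, CAST(rn / 1000 AS int) + 1, joined_date);
The trick here is that the first 1000 rows, which receive row numbers 0 through 999, have an rn / 1000 value of 0 under integer division, to which we add 1 to get the number of years to add. The next 1000 rows (1000 through 1999) get 2 years added, and so on. Note that rn % 1000 would not work here, because it cycles from 0 to 999 inside every batch. The cast is there because ROW_NUMBER() returns bigint, while DATEADD expects an int for its number argument.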
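The batch arithmetic is easy to check outside the database. The following is a minimal sketch in plain Python, not T-SQL; the function names and the `batch_size` parameter are hypothetical and exist only for illustration, with `//` standing in for integer division.

```python
def years_to_add(rn: int, batch_size: int = 1000) -> int:
    """Years to add for a 0-based row number: 1 for the first batch, 2 for the second, ..."""
    return rn // batch_size + 1


def years_to_add_modulo(rn: int, batch_size: int = 1000) -> int:
    """Modulo variant, shown only for contrast: it restarts at 1 inside every batch."""
    return rn % batch_size + 1


# Print the row number, the division result and the modulo result side by side.
for rn in (0, 999, 1000, 1999, 5999):
    print(rn, years_to_add(rn), years_to_add_modulo(rn))
```

Integer division keeps every row of a batch on the same offset, which is what a per-thousand increment needs.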
Is there a relatively simple way to create rows in a table based on a range of dates?
For example, given:
ID    Date_min      Date_max
1     2022-02-01    2022-02-05
2     2022-02-09    2022-02-12
I want to output:
ID    Date_in_Range
1     2022-02-01
1     2022-02-02
1     2022-02-03
1     2022-02-04
1     2022-02-05
2     2022-02-09
2     2022-02-10
2     2022-02-11
2     2022-02-12
I saw a solution when the range is integer based (How to create rows based on the range of all values between min and max in Snowflake (SQL)?)
But in order to use that approach GENERATOR(ROWCOUNT => 1000) I have to convert my dates to integers and back, and it just gets very messy very quick, especially since I need to apply this to millions of rows.
So, I was wondering if there is a simpler way to do it when dealing with dates instead of integers? Any hints anyone can provide?
Another option, without using a generator:
with data (ID, Date_min, Date_max) as (
    select * from values
    (1, to_date('2022-02-01', 'YYYY-MM-DD'), to_date('2022-02-05', 'YYYY-MM-DD')),
    (2, to_date('2022-02-09', 'YYYY-MM-DD'), to_date('2022-02-12', 'YYYY-MM-DD'))
)
select id,
       Date_min,
       Date_max,
       -- index starts at 1, so day_slots runs from the day after Date_min through Date_max
       dateadd(day, index, Date_min) day_slots
from data,
     table(split_to_table(repeat(',', datediff(day, Date_min, Date_max) - 1), ','));
SQL that includes the first date:
with data (ID, Date_min, Date_max) as (
    select * from values
    (1, to_date('2022-02-01', 'YYYY-MM-DD'), to_date('2022-02-05', 'YYYY-MM-DD')),
    (2, to_date('2022-02-09', 'YYYY-MM-DD'), to_date('2022-02-12', 'YYYY-MM-DD'))
)
select id,
       -- index - 1 starts at 0, so day_slots runs from Date_min through Date_max
       dateadd(day, index - 1, Date_min) day_slots
from data,
     table(split_to_table(repeat(',', datediff(day, Date_min, Date_max)), ','));
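To see why the comma count and the index offset have to move together, here is a small Python sketch of the same splitting trick. It is an illustration only, not Snowflake code: `day_slots` and `include_first` are hypothetical names, and the loop counts from 1 to mirror the 1-based INDEX column that SPLIT_TO_TABLE returns.

```python
from datetime import date, timedelta


def day_slots(date_min: date, date_max: date, include_first: bool = True) -> list:
    """Expand a date range by splitting a string of commas, one piece per output row."""
    span = (date_max - date_min).days             # plays the role of DATEDIFF(day, ...)
    commas = span if include_first else span - 1
    pieces = ("," * commas).split(",")            # n commas split into n + 1 pieces
    offset = -1 if include_first else 0           # "index - 1" versus plain "index"
    return [date_min + timedelta(days=index + offset)
            for index in range(1, len(pieces) + 1)]


with_first = day_slots(date(2022, 2, 1), date(2022, 2, 5))
without_first = day_slots(date(2022, 2, 1), date(2022, 2, 5), include_first=False)
print(with_first)
print(without_first)
```

A string of n commas always splits into n + 1 pieces, so dropping one comma and dropping the `- 1` offset together shift the result by exactly one day.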
"But in order to use that approach GENERATOR(ROWCOUNT => 1000) I have to convert my dates to integers and back, and it just gets very messy very quick, especially since I need to apply this to millions of rows."
There is no need to convert dates to integers and back; a simple DATEADD('day', num, start_date) is enough.
Example:
WITH sample_data(id, date_min, date_max) AS (
SELECT 1, '2022-02-01'::DATE, '2022-02-05'::DATE
UNION
SELECT 2, '2022-02-09'::DATE, '2022-02-12'::DATE
) , numbers AS (
SELECT ROW_NUMBER() OVER(ORDER BY SEQ4())-1 AS num -- 0 based
FROM TABLE(GENERATOR(ROWCOUNT => 1000)) -- should match max anticipated span
)
SELECT s.id, DATEADD(DAY, n.num, s.date_min) AS calculated_date
FROM sample_data AS s
JOIN numbers AS n
ON DATEADD('DAY', n.num, s.date_min) BETWEEN s.date_min AND s.date_max
ORDER BY s.id, calculated_date;
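The same idea, adding a 0-based offset to the start date, can be sketched in Python to check the expected rows. This is an illustration under assumptions, not part of the Snowflake query: `expand_range` is a hypothetical helper name, and the sample rows are the two from the question.

```python
from datetime import date, timedelta
from typing import Iterator, Tuple


def expand_range(row_id: int, date_min: date, date_max: date) -> Iterator[Tuple[int, date]]:
    """Yield one (id, day) pair per calendar day in [date_min, date_max]."""
    span = (date_max - date_min).days      # number of days between the two bounds
    for num in range(span + 1):            # 0-based offsets, like the numbers CTE
        yield row_id, date_min + timedelta(days=num)


sample_data = [
    (1, date(2022, 2, 1), date(2022, 2, 5)),
    (2, date(2022, 2, 9), date(2022, 2, 12)),
]
rows = [pair for record in sample_data for pair in expand_range(*record)]
for row in rows:
    print(row)
```

Unlike the fixed GENERATOR(ROWCOUNT => 1000), this loop derives the number of offsets from each row's own span, so no maximum range has to be anticipated.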
I have a table Stock:
Id    OpeningQty    OpeningRate    CurrentQty    CurrentRate    ConsumedQty    ProductId    OpDate
1     10            100            4             100            6              20           2022-01-01
2     5             500            2             500            3              25           2022-01-20
I am trying to get all the columns of the Stock table for a given date range, but each day's OpeningQty should be the previous day's CurrentQty (the closing quantity), and so on for each following day.
I tried:
select
Id,
OpeningQty,
OpeningRate,
CurrentQty,
CurrentRate,
ConsumedQty,
ProductId,
OpDate
from
Stock
where
convert(date, OpDate) between '2022-01-18' and '2022-01-23'
I don't know how to get the previous day's closing quantity as the opening quantity.
You can use the lag() window function to access "previous" rows in a partition.
SELECT id,
lag(currentqty, 1, openingqty) OVER (PARTITION BY productid
ORDER BY opdate) AS openingqty,
openingrate,
currentqty,
currentrate,
consumedqty,
productid,
opdate
FROM stock
WHERE opdate >= '2022-01-18'
AND opdate < '2022-01-24';
Also, don't use casting and BETWEEN for your condition on a point in time. Use a half-open range with the next day (or hour, or minute, ...) as the exclusive upper boundary. That way you don't have to cast, which can prevent index use, and you don't have to worry if the column's precision increases.
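Both ideas, the half-open range and carrying the previous closing quantity forward, can be tried out in a few lines of Python. This is only a sketch of the logic, not the SQL itself: the helper names are hypothetical, and the rows below are made up for illustration rather than taken from the question's table.

```python
from datetime import datetime

# (id, opening_qty, current_qty, product_id, op_date); made-up sample rows.
stock = [
    (1, 10, 4, 20, datetime(2022, 1, 18, 9, 0)),
    (2, 99, 3, 20, datetime(2022, 1, 19, 9, 0)),
    (3, 5, 2, 25, datetime(2022, 1, 23, 23, 59, 59)),
    (4, 7, 1, 25, datetime(2022, 1, 24, 0, 0)),
]


def in_range(ts: datetime, start: datetime, end_exclusive: datetime) -> bool:
    """Half-open test: start is included, end_exclusive is not."""
    return start <= ts < end_exclusive


def with_carried_opening(rows):
    """Mimic lag(currentqty, 1, openingqty) per product, ordered by date."""
    previous_closing = {}
    result = []
    for row_id, opening, current, product, op_date in sorted(rows, key=lambda r: (r[3], r[4])):
        carried = previous_closing.get(product, opening)  # default: the row's own opening
        result.append((row_id, carried, current, product, op_date))
        previous_closing[product] = current
    return result


selected = [r for r in stock if in_range(r[4], datetime(2022, 1, 18), datetime(2022, 1, 24))]
report = with_carried_opening(selected)
for row in report:
    print(row)
```

The row stamped 23:59:59 on the last day is kept and the midnight row of the next day is excluded, without truncating any timestamp to a date.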
I have a table containing the available records of users. From it I have to prioritize users who have data only on the 5th day back from the current date, then users who have data only on the 4th or 5th day back, then users who have data only on the 3rd, 4th or 5th day back, and so on.
I have grouped the last 5 days of data by user and date and get the result below:
      User    Date          Available records
1.    1001    31-08-2019    2
2.    1001    30-08-2019    3
3.    1002    27-08-2019    1
4.    1002    28-08-2019    3
5.    1003    27-08-2019    2
Now I need to select only those users who have data available on the 5th day back only. For example, if the 5th day back from the current date is the 27th, then I need to get user 1003 only. After that I need to get the users with data on the 4th or 5th day back only, i.e. 1002.
I have tried this with the query below but didn't get the desired result:
SELECT *
FROM AgentCallDateWiseCalls
WHERE (CAST(CallDate AS DATE) <= CAST(GETDATE() - 5 AS DATE)
AND CallsAvailable >= 0)
AND (CAST(CallDate AS DATE) > CAST(GETDATE() - 5 AS DATE) AND CallsAvailable = 0)
Where 5 will be variable.
Use this:
SELECT a1.*
FROM [AgentCallDateWiseCalls] a1
LEFT JOIN (
SELECT DISTINCT [User]
FROM [AgentCallDateWiseCalls]
WHERE (CAST([CallDate] AS DATE) > CAST(GETDATE() - 5 AS DATE)
AND [CallsAvailable] > 0)
) a2 ON a1.[User] = a2.[User]
WHERE (CAST(a1.[CallDate] AS DATE) <= CAST(GETDATE() - 5 AS DATE)
AND a1.[CallsAvailable] > 0)
AND a2.[User] IS NULL
This query selects all users that have a record with CallsAvailable > 0 on or before the 5th day back and no such record in the 4 most recent days.
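The anti-join can be tried against the sample rows from the question with Python's built-in sqlite3 module. This is a sketch under assumptions: SQLite stands in for SQL Server, the dates are rewritten as ISO strings so they compare correctly as text, a `:cutoff` parameter replaces the GETDATE() - 5 expression, and `users_with_only_old_data` is a hypothetical helper name.

```python
import sqlite3

ROWS = [
    (1001, "2019-08-31", 2),
    (1001, "2019-08-30", 3),
    (1002, "2019-08-27", 1),
    (1002, "2019-08-28", 3),
    (1003, "2019-08-27", 2),
]

# Keep users with records on or before the cutoff and none after it.
ANTI_JOIN = """
SELECT DISTINCT a1.[User]
FROM AgentCallDateWiseCalls a1
LEFT JOIN (
    SELECT DISTINCT [User]
    FROM AgentCallDateWiseCalls
    WHERE CallDate > :cutoff AND CallsAvailable > 0
) a2 ON a1.[User] = a2.[User]
WHERE a1.CallDate <= :cutoff
  AND a1.CallsAvailable > 0
  AND a2.[User] IS NULL
ORDER BY a1.[User]
"""


def users_with_only_old_data(conn, cutoff):
    """Return users whose available records all fall on or before the cutoff date."""
    return [row[0] for row in conn.execute(ANTI_JOIN, {"cutoff": cutoff})]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE AgentCallDateWiseCalls ([User] INTEGER, CallDate TEXT, CallsAvailable INTEGER)"
)
conn.executemany("INSERT INTO AgentCallDateWiseCalls VALUES (?, ?, ?)", ROWS)

print(users_with_only_old_data(conn, "2019-08-27"))  # cutoff = 5th day back
print(users_with_only_old_data(conn, "2019-08-28"))  # cutoff = 4th day back
```

Moving the cutoff one day forward widens the group step by step, which matches the prioritization order described in the question.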
I have a SQL Server 2008 R2 database.
I have a table called hystrealdata which stores production data from an automotive machine every n seconds. It is structured like this:
dataregvalue timestamp
--------------------------------------------------------------------------
0 1507190476
0 1507190577
0 1507190598
0 1507190628
1 1507190719
1 1507190750
1 1507190780
1 1507190811
1 1507190841
2 1507190861
2 1507190892
2 1507190922
2 1507190953
2 1507190983
5 1507190477
I need to select the first occurrence of dataregvalue in the first row, then the difference between each subsequent dataregvalue and the previous one. Next to this I would like the first timestamp at which dataregvalue changes. An example of the result would be:
data_change timestamp
---------------------------
0 1507190476 <- first time in which the dataregvalue is 0
1 1507190719 <- first time in which the dataregvalue changes
1 1507190861 <- first time in which the dataregvalue changes
3 1507190477 <- first time in which the dataregvalue changes
If this is too difficult, it would be fine to have the information about the difference between dataregvalues in a new column like this:
dataregvalue data_change timestamp
---------------------------------------------
0 0 1507190476
1 1 1507190719
2 1 1507190861
5 3 1507190477
How can this be done?
Thanks in advance!
You can use the LAG analytic function (available in SQL Server 2012 and later) to read the previous value in a partition, e.g.:
Select
dataregvalue,
dataregvalue - LAG(dataregvalue,1) OVER (ORDER BY timestamp) as data_change,
timestamp
from MyTable
This will return the change on all rows. The rows where there is a change will have a data_change value >0. The first row will have a NULL value because there is no previous row.
Unfortunately, you can't refer to data_change in the WHERE clause, because the alias is not visible there and window functions are not allowed in WHERE. You'll have to use a CTE:
WITH changes as (
Select
dataregvalue,
dataregvalue - LAG(dataregvalue,1) OVER (ORDER BY timestamp) as data_change,
timestamp
from MyTable
)
select *
from changes
where
data_change >0 or
data_change is null
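The filtering logic of the CTE can be mirrored in a short Python sketch to check which rows survive. It is an illustration only: `changes` is a hypothetical function name, and the readings are a subset of the question's rows with increasing timestamps.

```python
def changes(readings):
    """Keep the first row and every row whose value is greater than the previous one.

    readings: iterable of (dataregvalue, timestamp) pairs.
    Returns (dataregvalue, data_change, timestamp) tuples ordered by timestamp.
    """
    result = []
    previous = None
    for value, ts in sorted(readings, key=lambda r: r[1]):
        # No previous row means no difference, like LAG returning NULL.
        data_change = None if previous is None else value - previous
        if data_change is None or data_change > 0:
            result.append((value, data_change, ts))
        previous = value
    return result


readings = [
    (0, 1507190476),
    (0, 1507190577),
    (1, 1507190719),
    (1, 1507190750),
    (2, 1507190861),
    (2, 1507190892),
]
changed = changes(readings)
for row in changed:
    print(row)
```

Because the rows are ordered by timestamp, a reading whose timestamp is out of sequence would be compared with a different neighbour than the one listed next to it in the table.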
The LAG and the corresponding LEAD functions can also be used to detect gaps and islands in a sequence. In a gap-free sequence, each row has an ID that is one greater than the previous one; where there is a gap, the difference is greater than 1.
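As a rough illustration of the gap test, the sketch below compares each ID with its predecessor, which is the comparison LAG would provide in SQL. The function name `find_gaps` and the sample IDs are hypothetical.

```python
def find_gaps(ids):
    """Return (previous_id, current_id) pairs where the step between neighbours exceeds 1."""
    ordered = sorted(ids)
    return [(prev, cur) for prev, cur in zip(ordered, ordered[1:]) if cur - prev > 1]


gaps = find_gaps([1, 2, 3, 7, 8, 10])
print(gaps)
```

Each returned pair marks the end of one island and the start of the next.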
This is the input table:
Customer_ID Date Amount
1 4/11/2014 20
1 4/13/2014 10
1 4/14/2014 30
1 4/18/2014 25
2 5/15/2014 15
2 6/21/2014 25
2 6/22/2014 35
2 6/23/2014 10
There is information pertaining to multiple customers and I want to get a rolling sum across a 3 day window for each customer.
The solution should be as below:
Customer_ID Date Amount Rolling_3_Day_Sum
1 4/11/2014 20 20
1 4/13/2014 10 30
1 4/14/2014 30 40
1 4/18/2014 25 25
2 5/15/2014 15 15
2 6/21/2014 25 25
2 6/22/2014 35 60
2 6/23/2014 10 70
The biggest issue is that I don't have transactions for every day, which is why partitioning by row number doesn't work.
The closest example I found on SO was:
SQL Query for 7 Day Rolling Average in SQL Server
but even in that case there were transactions every day, which accommodated the ROW_NUMBER()-based solutions.
The rownumber query is as follows:
select customer_id, Date, Amount,
Rolling_3_day_sum = CASE WHEN ROW_NUMBER() OVER (partition by customer_id ORDER BY Date) > 2
THEN SUM(Amount) OVER (partition by customer_id ORDER BY Date ROWS BETWEEN 2 PRECEDING AND CURRENT ROW)
END
from #tmp_taml9
order by customer_id
I was wondering if there is a way to replace "BETWEEN 2 PRECEDING AND CURRENT ROW" by "BETWEEN [DATE - 2] and [DATE]".
One option would be to use a calendar table (or something similar) to get the complete range of dates and left join your table with that and use the row_number based solution.
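The calendar idea can be sketched in Python for a single customer: fill in the missing days with an amount of 0, after which a plain "2 preceding rows" window is also a 3-day window. The helper names `densify` and `rolling_sum_rows` are hypothetical, and the data is customer 1 from the question.

```python
from datetime import date, timedelta


def densify(transactions):
    """transactions: dict of date -> amount. Returns (day, amount) for every day, 0 where missing."""
    first, last = min(transactions), max(transactions)
    days = (last - first).days
    return [
        (first + timedelta(days=i), transactions.get(first + timedelta(days=i), 0))
        for i in range(days + 1)
    ]


def rolling_sum_rows(dense, window=3):
    """Row-based rolling sum: the current row plus the (window - 1) preceding rows."""
    amounts = [amount for _, amount in dense]
    return [
        (day, sum(amounts[max(0, i - window + 1): i + 1]))
        for i, (day, _) in enumerate(dense)
    ]


customer_1 = {
    date(2014, 4, 11): 20,
    date(2014, 4, 13): 10,
    date(2014, 4, 14): 30,
    date(2014, 4, 18): 25,
}
rolling = dict(rolling_sum_rows(densify(customer_1)))
for day in sorted(customer_1):
    print(day, rolling[day])
```

The filler days would normally be dropped from the final result again, for example by keeping only the dates that had a transaction.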
Another option that might work (not sure about performance) would be to use an apply query like this:
select customer_id, Date, Amount, coalesce(Rolling_3_day_sum, Amount) Rolling_3_day_sum
from #tmp_taml9 t1
cross apply (
select sum(amount) Rolling_3_day_sum
from #tmp_taml9
where Customer_ID = t1.Customer_ID
and datediff(day, date, t1.date) <= 2
and t1.Date >= date
) o
order by customer_id;
I suspect performance might not be great though.
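For comparison, the date-window logic of the apply query can be written out in Python and checked against the expected output from the question. This is a sketch, not the T-SQL itself; `rolling_day_sum` is a hypothetical name, and the window covers the current date and the two days before it.

```python
from datetime import date, timedelta


def rolling_day_sum(rows, days=3):
    """rows: list of (customer_id, date, amount).

    For each row, sum the amounts of the same customer whose date falls in
    [date - (days - 1), date], regardless of how many rows that is.
    """
    result = []
    for customer_id, day, amount in rows:
        window_start = day - timedelta(days=days - 1)
        total = sum(
            other_amount
            for other_customer, other_day, other_amount in rows
            if other_customer == customer_id and window_start <= other_day <= day
        )
        result.append((customer_id, day, amount, total))
    return result


rows = [
    (1, date(2014, 4, 11), 20),
    (1, date(2014, 4, 13), 10),
    (1, date(2014, 4, 14), 30),
    (1, date(2014, 4, 18), 25),
    (2, date(2014, 5, 15), 15),
    (2, date(2014, 6, 21), 25),
    (2, date(2014, 6, 22), 35),
    (2, date(2014, 6, 23), 10),
]
totals = [total for _, _, _, total in rolling_day_sum(rows)]
print(totals)
```

Like the apply query, this rescans the customer's rows for every row, which is the reason for the performance concern on large tables.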