How to update SQL rows based on other rows with shared ID? - sql-server

Currently, I have a table that looks like below:
ID|Date |Val
1 |1/1/2016|1
2 |1/1/2016|0
3 |1/1/2016|0
1 |2/1/2016|0
2 |2/1/2016|1
3 |2/1/2016|1
1 |3/1/2016|0
2 |3/1/2016|0
3 |3/1/2016|0
I want to update it so that the value carries over for each ID, but not on earlier dates than when the value first appeared. Also, the value can only change 0 to 1, not vice versa. So the final product would look like:
ID|Date |Val
1 |1/1/2016|1
2 |1/1/2016|0
3 |1/1/2016|0
1 |2/1/2016|1
2 |2/1/2016|1
3 |2/1/2016|1
1 |3/1/2016|1
2 |3/1/2016|1
3 |3/1/2016|1
I've tried a few code combinations, but the conditional of carrying the value after the date where the value first appears is tripping me up. I'd appreciate any help!

In SQL Server 2012+, using the aggregate max() as a window function with over() (inside a common table expression to simplify the update):
;with cte as (
select *
, MaxVal = max(convert(int,val)) over (partition by id order by date)
from t
)
update cte
set val = maxVal
where val <> maxVal
rextester demo: http://rextester.com/ZPGWB94088
result:
+----+------------+-----+
| id | Date | Val |
+----+------------+-----+
| 1 | 2016-01-01 | 1 |
| 2 | 2016-01-01 | 0 |
| 3 | 2016-01-01 | 0 |
| 1 | 2016-02-01 | 1 |
| 2 | 2016-02-01 | 1 |
| 3 | 2016-02-01 | 1 |
| 1 | 2016-03-01 | 1 |
| 2 | 2016-03-01 | 1 |
| 3 | 2016-03-01 | 1 |
+----+------------+-----+
Prior to SQL Server 2012, you could use something like this:
update t
set Val = 1
from t
inner join (
select i.Id, min(i.Date) as Date
from t as i
where i.Val = 1
group by i.Id
) as m
on t.Id = m.Id
and t.Date >= m.Date
and t.Val = 0
rextester demo: http://rextester.com/RLEAO15622
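To see what the windowed max() will write before running the update, it can help to preview the running maximum with a plain select. This is only a minimal sketch: the table variable @t and its column types are assumptions that mirror the sample data in the question, and it needs SQL Server 2012+ like the first query.

```sql
-- Assumed sample data mirroring the question; @t and its column types are illustrative.
declare @t table (id int, [date] date, val bit);
insert into @t values
 (1,'20160101',1),(2,'20160101',0),(3,'20160101',0)
,(1,'20160201',0),(2,'20160201',1),(3,'20160201',1)
,(1,'20160301',0),(2,'20160301',0),(3,'20160301',0);

-- With ORDER BY inside OVER(), max() becomes a running maximum per id,
-- so MaxVal turns to 1 on the first date where val = 1 and stays 1 afterwards.
select id, [date], val
, MaxVal = max(convert(int,val)) over (partition by id order by [date])
from @t
order by [date], id;
```

Rows where val and MaxVal differ are exactly the rows the update would change.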

Related

SQL Server: Flag only First duplicate row

I want to flag only the first duplicate ID-VL combination in the dataset shown below. Column FirstOccurence is what I want the end result to be.
ID VL FirstOccurence
1 a 1
1 b 1
2 a 1
2 a 0
3 a 1
3 a 0
4 a 1
4 a 0
5 a 1
5 b 1
5 a 0
There is currently not a unique index available in the original table.
Is there any way to do this with for instance the LAG-functionality? I cannot find any examples online that result in the flagging of duplicates. Any suggestions are much appreciated!
Kind regards,
Igor
One method is with ROW_NUMBER() along with a CASE expression:
SELECT
ID
,VL
,CASE ROW_NUMBER() OVER(PARTITION BY ID, VL ORDER BY ID, VL) WHEN 1 THEN 1 ELSE 0 END AS FirstOccurance
FROM dbo.example
ORDER BY
ID
,VL
,FirstOccurance;
Results:
+----+----+----------------+
| ID | VL | FirstOccurance |
+----+----+----------------+
| 1 | a | 1 |
| 1 | b | 1 |
| 2 | a | 0 |
| 2 | a | 1 |
| 3 | a | 0 |
| 3 | a | 1 |
| 4 | a | 0 |
| 4 | a | 1 |
| 5 | a | 0 |
| 5 | a | 1 |
| 5 | b | 1 |
+----+----+----------------+
Note that this result order differs from your end result. If there are one or more columns present in the table that provide the same ordering as the results in your question, specify them in the ORDER BY clause instead.
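As a sketch of that last point, suppose the table had a column recording the original row order. The column LoadSeq and the table variable @example below are hypothetical and do not appear in the question; they only illustrate how a deterministic ordering column would be used.

```sql
-- Hypothetical: LoadSeq stands in for any column that records the original row order.
declare @example table (LoadSeq int, ID int, VL char(1));
insert into @example values
 (1,1,'a'),(2,1,'b'),(3,2,'a'),(4,2,'a'),(5,3,'a'),(6,3,'a')
,(7,4,'a'),(8,4,'a'),(9,5,'a'),(10,5,'b'),(11,5,'a');

-- Ordering the window by LoadSeq makes "first" well defined within each ID, VL pair.
SELECT
ID
,VL
,CASE ROW_NUMBER() OVER(PARTITION BY ID, VL ORDER BY LoadSeq) WHEN 1 THEN 1 ELSE 0 END AS FirstOccurance
FROM @example
ORDER BY LoadSeq;
```

With such a column, the flags come back in the same row order as the sample in the question (1, 1, 1, 0, 1, 0, 1, 0, 1, 1, 0).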

Calculating week numbers from custom dates

I have client IDs and their login dates. I want to calculate the week number relative to each client's first login date.
I am fairly new to SQL.
Demo output
ClientID Date of login Week Number
1 2019-12-20 1
1 2019-12-21 1
1 2019-12-21 1
1 2019-12-22 1
1 2019-12-29 2
1 2019-12-29 2
2 2020-01-27 1
2 2020-01-28 1
2 2020-02-05 2
2 2020-02-06 2
2 2020-02-16 3
This is simple date arithmetic that only requires the min DateOfLogin for each ClientID, which you can find with a windowed function.
Calculate the datediff in days between this date and the current DateOfLogin, integer divide by 7 (to discard the fractional week) and then add 1 so that the first week is numbered 1 rather than 0:
declare @l table(ClientID int, DateOfLogin date);
insert into @l values(1,'2019-12-20'),(1,'2019-12-21'),(1,'2019-12-21'),(1,'2019-12-22'),(1,'2019-12-29'),(1,'2019-12-29'),(2,'2020-01-27'),(2,'2020-01-28'),(2,'2020-02-05'),(2,'2020-02-06'),(2,'2020-02-16');
select ClientID
,DateOfLogin
,(datediff(day,min(DateOfLogin) over (partition by ClientID),DateOfLogin) / 7) + 1 as WeekNum
from @l;
Output
+----------+-------------+---------+
| ClientID | DateOfLogin | WeekNum |
+----------+-------------+---------+
| 1 | 2019-12-20 | 1 |
| 1 | 2019-12-21 | 1 |
| 1 | 2019-12-21 | 1 |
| 1 | 2019-12-22 | 1 |
| 1 | 2019-12-29 | 2 |
| 1 | 2019-12-29 | 2 |
| 2 | 2020-01-27 | 1 |
| 2 | 2020-01-28 | 1 |
| 2 | 2020-02-05 | 2 |
| 2 | 2020-02-06 | 2 |
| 2 | 2020-02-16 | 3 |
+----------+-------------+---------+
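To make the arithmetic concrete, here is the calculation for a single row, written out step by step. It is only an illustration using client 1's first login (2019-12-20) and the later login on 2019-12-29 from the sample data.

```sql
-- Client 1 first logged in on 2019-12-20; the login on 2019-12-29 is 9 days later.
select datediff(day, '2019-12-20', '2019-12-29')         as DaysSinceFirst -- 9
     , datediff(day, '2019-12-20', '2019-12-29') / 7     as WholeWeeks     -- 1 (integer division)
     , datediff(day, '2019-12-20', '2019-12-29') / 7 + 1 as WeekNum;       -- 2
```

Both operands of the division are integers, so SQL Server truncates the result, which is why no explicit rounding is needed.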
This query returns the week number within the calendar year:
select DATENAME(WW, '2019-12-20')
This is for MSSQL.
Here is a possible solution. You may need to adjust how you do the insert, and it could probably be optimized further.
select 1 AS 'ClientID', '2019-12-20' AS 'LogInDate', 1 AS 'Week'
into #test
insert into #test
select top(1) 1, '2020-02-05', case DATEDIFF(week,'2020-02-05',LogInDate) when 0 then week else Week +1 end from #test where ClientID = 1 order by LogInDate desc

How to find difference between two tables in MSSQL

I have got two tables 'Customer'.
The first one:
ID | UserID | Date
1. | 1 | 2018-05-01
2. | 1 | 2018-05-02
The second one:
ID | UserID | Date
1. | 1 | 2018-05-01
2. | 1 | 2018-05-02
3. | 1 | 2018-05-03
So, as you can see in the second table, there is one row more.
I have written so far this code:
;with cte_table1 as (
select UserID, count(id) cnt from db1.Customer group by UserID
),
cte_table2 as (
select UserID, count(id) cnt from db2.Customer group by UserID
)
select * from cte_table1 t1
join cte_table2 t2 on t2.UserID = t1.UserID
where t1.cnt <> t2.cnt
and this gives me expected result:
UserID | cnt | UserID | cnt
1 | 2 | 1 | 3
And so far, everything is fine. The thing is, these two tables have many rows and I'd like to have result with dates, where cnt does not match.
In other words, I'd like to have something like this:
UserID | cnt | Date | UserID | cnt | Date
1 | 2 | 2018-05-01 | 1 | 3 | 2018-05-01
1 | 2 | 2018-05-02 | 1 | 3 | 2018-05-01
1 | 2 | NULL | 1 | 3 | 2018-05-03
The best solution would be a result set where both CTEs are joined to give this:
UserID | cnt | Date | UserID | cnt | Date
1 | 2 | 2018-05-01 | 1 | 3 | 2018-05-01
1 | 2 | 2018-05-02 | 1 | 3 | 2018-05-01
1 | 2 | NULL | 1 | 3 | 2018-05-03
1 | 2 | 2018-05-30 | 1 | 3 | NULL
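You should do a FULL OUTER JOIN on both UserID and Date, like below. The Customer tables have no cnt column, so the counts are computed with windowed counts here, assuming each (UserID, Date) pair appears at most once per table:
Select
C1.UserID,
count(C1.ID) over (partition by coalesce(C1.UserID, C2.UserID)) as cnt,
C1.Date,
C2.UserID,
count(C2.ID) over (partition by coalesce(C1.UserID, C2.UserID)) as cnt,
C2.Date
from
db1.Customer C1
FULL OUTER JOIN
db2.Customer C2
on C1.UserId=C2.UserId and C1.date=C2.Date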

SQL group by date difference with previous row

I'm looking for a way to group daily datetime rows into date range intervals.
My table is something like:
id | A | B | Date
1 | 1 | 2 | 1/10/2010
2 | 1 | 2 | 2/10/2010
3 | 1 | 2 | 3/10/2010
4 | 1 | 3 | 4/10/2010
5 | 1 | 3 | 5/10/2010
6 | 1 | 2 | 6/10/2010
7 | 1 | 2 | 7/10/2010
8 | 1 | 2 | 8/10/2010
My first try was:
SELECT A, B, MIN(DATE), MAX(date)
FROM table
GROUP BY A, B
After grouping by A, B and using min and max on the date in my select, I get invalid results due to the repetition of B = 2.
A B Date A B min(Date) max(Date)
1 | 1 | 2 | 1/10/2010 1 2 | 1/10/2010 8/10/2010
2 | 1 | 2 | 2/10/2010 Invalid
3 | 1 | 2 | 3/10/2010 ------->
6 | 1 | 2 | 6/10/2010
7 | 1 | 2 | 7/10/2010
8 | 1 | 2 | 8/10/2010
I'm looking for how to calculate the third member of the group by...
So the expected intervals results:
A B Start Date End Date
.. | 1 | 2 | 1/10/2010 | 3/10/2010
.. | 1 | 3 | 4/10/2010 | 5/10/2010
.. | 1 | 2 | 6/10/2010 | 8/10/2010
I need to support SQL Server 2008
Thank you in advance for your help
The following is an easy way to deal with "islands and gaps" where you need to find gaps in consecutive dates:
SELECT A, B, StartDate = MIN([Date]), EndDate = MAX([Date])
FROM
(
SELECT *,
RN = DATEDIFF(DAY, 0, [Date]) - ROW_NUMBER() OVER (PARTITION BY A, B ORDER BY [Date])
FROM myTable
) AS T
GROUP BY A, B, RN;
To break it down into slightly simpler-to-understand logic: you assign each date a number (DATEDIFF(DAY, 0, [Date]) here) and each date a row number (partitioned by A and B here), then any time there's a gap in the dates, the difference between those two will change.
There are a variety of resources you can use to understand different approaches to "islands and gaps" problems. Here is one that might help you with tackling other varieties of this in the future: https://www.red-gate.com/simple-talk/sql/t-sql-programming/the-sql-of-gaps-and-islands-in-sequences/
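To see why grouping by RN works, it may help to look at the intermediate values for the sample data. This is a sketch only: the table variable @myTable is an illustrative stand-in, and the dates are assumed to be day/month/year, as the consecutive daily rows in the question suggest.

```sql
-- Assumed sample data mirroring the question; @myTable is illustrative.
declare @myTable table (id int, A int, B int, [Date] datetime);
insert into @myTable values
 (1,1,2,'20101001'),(2,1,2,'20101002'),(3,1,2,'20101003')
,(4,1,3,'20101004'),(5,1,3,'20101005')
,(6,1,2,'20101006'),(7,1,2,'20101007'),(8,1,2,'20101008');

-- DayNum rises by 1 per calendar day; RowNum rises by 1 per row within (A, B).
-- Their difference RN stays constant inside a run of consecutive days
-- and jumps when days are missing from the partition.
select id, A, B, [Date]
, DayNum = datediff(day, 0, [Date])
, RowNum = row_number() over (partition by A, B order by [Date])
, RN = datediff(day, 0, [Date]) - row_number() over (partition by A, B order by [Date])
from @myTable
order by id;
```

Rows 1 to 3 share one RN value, and rows 6 to 8 share a value two higher, because two days (4 and 5 October) are missing from the B = 2 partition. Rows 4 and 5 sit in their own B = 3 partition. Grouping by A, B, RN therefore yields the three expected intervals.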

How to combine multiple rows into one row and multiple column in SQL Server?

I built a temp table from several different tables, and here is its result set:
car_id | car_type | status | count
--------+----------+---------+------
100421 | 1 | 1 | 9
100421 | 1 | 2 | 8
100421 | 1 | 3 | 3
100421 | 2 | 1 | 6
100421 | 2 | 2 | 8
100421 | 2 | 3 | 3
100422 | 1 | 1 | 5
100422 | 1 | 2 | 8
100422 | 1 | 3 | 7
Here is the meaning of status column:
1 as sale
2 as purchase
3 as return
Now I want to show this result set as below
car_id | car_type | sale | purchase | return
--------+----------+------+----------+----------
100421 | 1 | 9 | 8 | 3
100421 | 2 | 6 | 8 | 3
100422 | 1 | 5 | 8 | 7
I tried but unable to generate this result set. Can anyone help?
You can also use a CASE expression.
Query
select [car_id], [car_type],
max(case [status] when 1 then [count] end) as [sale],
max(case [status] when 2 then [count] end) as [purchase],
max(case [status] when 3 then [count] end) as [return]
from [your_table_name]
group by [car_id], [car_type]
order by [car_id];
Try this
select car_id ,car_type, [1] as Sale,[2] as Purchase,[3] as [return]
from (select car_id , car_type , [status] ,[count] from tempTable)d
pivot(sum([count]) for [status] in([1],[2],[3]) ) as pvt
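You can also remove the subquery if you have no filter condition and the table contains only these columns (PIVOT groups by every column that is not aggregated or pivoted), like: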
select car_id ,car_type, [1] as Sale,[2] as Purchase,[3] as [return]
from tempTable d
pivot(sum([count]) for [status] in([1],[2],[3]) ) as pvt
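For anyone who wants to try the pivot without the original temp table, here is a self-contained sketch. The table variable @tempTable is an assumption standing in for the asker's temp table, loaded with the sample rows from the question.

```sql
-- Assumed sample data mirroring the question; @tempTable is an illustrative stand-in.
declare @tempTable table (car_id int, car_type int, [status] int, [count] int);
insert into @tempTable values
 (100421,1,1,9),(100421,1,2,8),(100421,1,3,3)
,(100421,2,1,6),(100421,2,2,8),(100421,2,3,3)
,(100422,1,1,5),(100422,1,2,8),(100422,1,3,7);

-- The derived table limits the input to the grouping columns (car_id, car_type),
-- the pivoted column ([status]) and the aggregated column ([count]).
select car_id, car_type, [1] as sale, [2] as purchase, [3] as [return]
from (select car_id, car_type, [status], [count] from @tempTable) as d
pivot (sum([count]) for [status] in ([1],[2],[3])) as pvt
order by car_id, car_type;
```

Keeping the derived table is the safer default: if the source later gains extra columns, they would otherwise become additional grouping columns and split the rows.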
