I am looking for a solution to a join problem in MSSQL.
I have two tables.
The first one is the basic table:
ID, Name, Key
1, Test1, 1x11
2, Test2, 2x22
3, Test3, 3x33
The second is the table which I want to join to the Basic table:
Key, Action, create, close, duration
1x11, 1, 01/01/2021 06:00,01/01/2021 07:00, 1
1x11, 5, 01/01/2021 07:00,01/01/2021 10:00, 1
1x11, 10, 01/01/2021 10:00,0, 0
2x22, 1, 01/01/2021 10:00,01/01/2021 11:00, 1
2x22, 5, 01/01/2021 11:00,01/01/2021 12:00, 1
2x22, 7, 01/01/2021 12:00,01/01/2021 13:00, 1
2x22, 5, 01/01/2021 13:00,01/01/2021 14:00, 1
2x22, 10, 01/01/2021 14:00,0, 0
3x33, 1, 01/01/2021 10:00,01/01/2021 12:00, 2
3x33, 10, 01/01/2021 12:00,0, 0
In this table the close date was not given, so I had to use the following expression to derive it (the close date is the next create date for the same Key):
lead (create,1) OVER (PARTITION BY Key ORDER BY create) AS close
Now my goal is to join the sum of duration for Action 5, per Key, to the basic table.
Can someone tell me how to do that? I am really frustrated.
Final Table:
ID, Name, Key, join(sum of 5)
1, Test1, 1x11,1
2, Test2, 2x22,2 (because there are two times one hour that means 2h)
3, Test3, 3x33,0
Thanks for helping. Christian
If the two tables exist then this should be a simple aggregation.
SELECT
B.ID,
B.Name,
B.[Key], -- Key is a reserved word in T-SQL, so it has to be bracketed
CountAction5 = SUM(CASE WHEN S.Action = 5 THEN S.duration ELSE 0 END)
FROM
BasicTable B
INNER JOIN SecondTable S ON S.[Key] = B.[Key]
GROUP BY
B.ID,
B.Name,
B.[Key]
This is simple, all you need is to do conditional aggregation:
SELECT [key], SUM(CASE WHEN Action = 5 THEN duration ELSE 0 END) AS sum_of_5
FROM t
GROUP BY [key]
where t is the second table.
Output:
key sum_of_5
-------------
1x11 1
2x22 2
3x33 0
To join back to the original table use a derived table:
SELECT t1.[key], t1.name, t2.sum_of_5
FROM t1
JOIN (
SELECT
[key]
, SUM(CASE WHEN Action = 5 THEN duration ELSE 0 END) AS sum_of_5
FROM t
GROUP BY [key]
) t2 ON t1.[key] = t2.[key]
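For anyone who wants to try conditional aggregation without a SQL Server instance, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in. The table names basic and actions are hypothetical, the data is copied from the question, and the LEFT JOIN is my own assumption so that a key with no action rows at all would still show up with 0.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE basic (id INTEGER, name TEXT, "key" TEXT);
CREATE TABLE actions ("key" TEXT, action INTEGER, duration INTEGER);
INSERT INTO basic VALUES
  (1, 'Test1', '1x11'), (2, 'Test2', '2x22'), (3, 'Test3', '3x33');
INSERT INTO actions VALUES
  ('1x11', 1, 1), ('1x11', 5, 1), ('1x11', 10, 0),
  ('2x22', 1, 1), ('2x22', 5, 1), ('2x22', 7, 1), ('2x22', 5, 1), ('2x22', 10, 0),
  ('3x33', 1, 2), ('3x33', 10, 0);
""")

# Conditional aggregation: only rows with action = 5 contribute their duration.
rows = conn.execute("""
SELECT b.id, b.name, b."key",
       SUM(CASE WHEN a.action = 5 THEN a.duration ELSE 0 END) AS sum_of_5
FROM basic AS b
LEFT JOIN actions AS a ON a."key" = b."key"
GROUP BY b.id, b.name, b."key"
ORDER BY b.id
""").fetchall()

for row in rows:
    print(row)
```

The same SELECT should carry over to T-SQL with [key] in place of "key".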
I have a table that looks like this.
Category, Type, fromDate, Value
1, 1, 1/1/2022, 5
1, 2, 1/1/2022, 10
2, 1, 1/1/2022, 7.5
2, 2, 1/1/2022, 15
3, 1, 1/1/2022, 3.5
3, 2, 1/1/2022, 5
3, 1, 4/1/2022, 5
3, 2, 4/1/2022, 10
I'm trying to filter this table down to keep only the most recent row for each Category/Type grouping. I.e. rows 5 and 6 would be removed by the query since they are older records.
So far I have the below query but I am getting an aggregate error due to not aggregating the "Value" column. My question is how do I get around this without aggregating? I want to keep the actual value that is in the column.
SELECT T1.Category, T1.Type, T2.maxDate, T1.Value
FROM (SELECT Category, Type, MAX(fromDate) AS maxDate
FROM Table GROUP BY Category,Type) T2
INNER JOIN Table T1 ON T1.Category=T2.Category
GROUP BY T1.Category, T1.Type, T2.MaxDate
This has been asked and answered dozens and dozens of times. But it was quick and painless to type up an answer. This should work for you.
declare @MyTable table
(
Category int
, Type int
, fromDate date
, Value decimal(5,2)
)
insert @MyTable
select 1, 1, '1/1/2022', 5 union all
select 1, 2, '1/1/2022', 10 union all
select 2, 1, '1/1/2022', 7.5 union all
select 2, 2, '1/1/2022', 15 union all
select 3, 1, '1/1/2022', 3.5 union all
select 3, 2, '1/1/2022', 5 union all
select 3, 1, '4/1/2022', 5 union all
select 3, 2, '4/1/2022', 10
select Category
, Type
, fromDate
, Value
from
(
select *
, RowNum = ROW_NUMBER() over(partition by Category, Type order by fromDate desc)
from @MyTable
) x
) x
where x.RowNum = 1
order by x.Category
, x.Type
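If you want to play with the ROW_NUMBER pattern outside SQL Server, the following is a rough sketch using Python's sqlite3 module (SQLite 3.25 or later is needed for window functions). The table name my_table is hypothetical, and I am assuming the dates in the question are month/day/year, rewritten here as ISO strings so that they sort correctly as text.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table (category INTEGER, type INTEGER, from_date TEXT, value REAL)"
)
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?, ?)",
    [
        (1, 1, "2022-01-01", 5.0),
        (1, 2, "2022-01-01", 10.0),
        (2, 1, "2022-01-01", 7.5),
        (2, 2, "2022-01-01", 15.0),
        (3, 1, "2022-01-01", 3.5),
        (3, 2, "2022-01-01", 5.0),
        (3, 1, "2022-04-01", 5.0),
        (3, 2, "2022-04-01", 10.0),
    ],
)

# Number the rows in each (category, type) group, newest first, and keep row 1.
latest = conn.execute("""
SELECT category, type, from_date, value
FROM (
    SELECT *,
           ROW_NUMBER() OVER (
               PARTITION BY category, type ORDER BY from_date DESC
           ) AS row_num
    FROM my_table
) AS x
WHERE row_num = 1
ORDER BY category, type
""").fetchall()

for row in latest:
    print(row)
```

Because the filter runs on the row number rather than on an aggregate, the original Value of the newest row is returned untouched, which is what the question asked for.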
I have a SQL Server table, table1, which I want to fill with dummy rows per acct up to the most recent previous month-end date; as of now that would be 2021-06-30.
In this example, acct 1 has n number of rows which ends at 2020-05-31, and I want to insert dummy rows with same values for acct and amt with begin_date and end_date incrementing by 1 month up to 06-30-2021.
Let's assume acct 2 already ends at 06-30-2021 so this doesn't need dummy rows to be inserted.
acct,amt,begin_date,end_date
1 , 10, 2020-04-01, 2020-04-30
1 , 10, 2020-05-01, 2020-05-31
2 , 50, 2021-05-01, 2021-05-31
2 , 50, 2021-06-01, 2021-06-30
So for acct 1, I want n number of rows to be inserted from last period of 2020-05-31 up to previous month end which is now 06-30-2021 and I want the amt and acct to remain same. So it would look like this below:
acct,amt,begin_date,end_date
1 , 10, 2020-04-01, 2020-04-30
1 , 10, 2020-05-01, 2020-05-31
1 , 10, 2020-06-01, 2020-06-30
1 , 10, 2020-07-01, 2020-07-31
.............................
.............................
1 , 10, 2021-06-01, 2021-06-30
Based on some data anomalies, I realize I need to add another condition to the solution. Suppose another column, type, was added to table1. Then acct and type would be the composite key that identifies related rows, hence acct 2 type A and acct 2 type B are not related. So we have the updated table:
acct,type,amt,begin_date,end_date
1, A, 10, 2020-04-01, 2020-04-30
1, A, 10, 2020-05-01, 2020-05-31
2, A, 50, 2021-05-01, 2021-05-31
2, A, 50, 2021-06-01, 2021-06-30
2, B, 50, 2021-01-01, 2021-01-31
2, B, 50, 2021-02-01, 2021-02-28
I would now need dummy rows to be created for acct 2 type B up to 2021-06-30. We already know acct 2 type A would be ok since it already has rows up to 2021-06-30
You can generate the rows using a recursive CTE:
with cte as (
select acct, amt,
dateadd(day, 1, end_date) as begin_date,
eomonth(dateadd(day, 1, end_date)) as end_date
from (select t.*,
row_number() over (partition by acct order by end_date desc) as seqnum
from t
) t
where seqnum = 1 and end_date < '2021-06-30'
union all
select acct, amt, dateadd(month, 1, begin_date),
eomonth(dateadd(month, 1, begin_date))
from cte
where begin_date < '2021-06-01'
)
select *
from cte;
You can then use insert to insert these rows into a table. Or use union all if you simply want a result set with all the rows.
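As a rough illustration of the same idea extended to the (acct, type) composite key from the updated question, here is a sketch using Python's sqlite3 module. It is not the T-SQL above: SQLite has no EOMONTH, so the month end is computed with date modifiers, and the cutoff is passed in as a hypothetical :cutoff parameter. It also assumes every existing end_date is a month end.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table1 (acct INTEGER, type TEXT, amt INTEGER, begin_date TEXT, end_date TEXT)"
)
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?, ?, ?)",
    [
        (1, "A", 10, "2020-04-01", "2020-04-30"),
        (1, "A", 10, "2020-05-01", "2020-05-31"),
        (2, "A", 50, "2021-05-01", "2021-05-31"),
        (2, "A", 50, "2021-06-01", "2021-06-30"),
        (2, "B", 50, "2021-01-01", "2021-01-31"),
        (2, "B", 50, "2021-02-01", "2021-02-28"),
    ],
)

filler_rows = conn.execute("""
WITH RECURSIVE filler(acct, type, amt, begin_date, end_date) AS (
    -- Seed: the month after the latest period of each (acct, type) pair.
    SELECT acct, type, amt,
           date(end_date, '+1 day'),
           date(end_date, '+1 day', 'start of month', '+1 month', '-1 day')
    FROM (
        SELECT t.*,
               ROW_NUMBER() OVER (
                   PARTITION BY acct, type ORDER BY end_date DESC
               ) AS seqnum
        FROM table1 AS t
    ) AS last_period
    WHERE seqnum = 1 AND end_date < :cutoff
    UNION ALL
    -- Step: move one month forward until the cutoff month is reached.
    SELECT acct, type, amt,
           date(begin_date, '+1 month'),
           date(begin_date, '+2 months', '-1 day')
    FROM filler
    WHERE date(begin_date, '+1 month') < :cutoff
)
SELECT * FROM filler ORDER BY acct, type, begin_date
""", {"cutoff": "2021-06-30"}).fetchall()

for row in filler_rows:
    print(row)
```

With the sample data this yields filler rows for acct 1 type A and acct 2 type B only, since acct 2 type A already ends at the cutoff.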
IP QID ScanDate Rank
101.110.32.80 6 2016-09-28 18:33:21.000 3
101.110.32.80 6 2016-08-28 18:33:21.000 2
101.110.32.80 6 2016-05-30 00:30:33.000 1
I have a table with certain records, grouped by IP address and QID. My requirement is to find out which records break the sequence in the date column, in other words where the date difference to the previous record is more than 30 days. In the above table the date difference between rank 1 and rank 2 is more than 30 days, so I should flag the rank 2 record.
You can use LAG in SQL Server 2012+:
declare @Tbl Table (Ip VARCHAR(50), QID INT, ScanDate DATETIME,[Rank] INT)
INSERT INTO @Tbl
VALUES
('101.110.32.80', 6, '2016-09-28 18:33:21.000', 3),
('101.110.32.80', 6, '2016-08-28 18:33:21.000', 2),
('101.110.32.80', 6, '2016-05-30 00:30:33.000', 1)
;WITH Result
AS
(
SELECT
T.Ip ,
T.QID ,
T.ScanDate ,
T.[Rank],
LAG(T.[Rank]) OVER (ORDER BY T.[Rank]) PrivSRank,
LAG(T.ScanDate) OVER (ORDER BY T.[Rank]) PrivScanDate
FROM
@Tbl T
)
SELECT
R.Ip ,
R.QID ,
R.ScanDate ,
R.Rank ,
R.PrivScanDate,
IIF(DATEDIFF(DAY, R.PrivScanDate, R.ScanDate) > 30, 'This is greater than 30 day. Rank ' + CAST(R.PrivSRank AS VARCHAR(10)), '') CFlag
FROM
Result R
Result:
Ip QID ScanDate Rank CFlag
------------------------ ----------- ----------------------- ----------- --------------------------------------------
101.110.32.80 6 2016-05-30 00:30:33.000 1
101.110.32.80 6 2016-08-28 18:33:21.000 2 This is greater than 30 day. Rank 1
101.110.32.80 6 2016-09-28 18:33:21.000 3 This is greater than 30 day. Rank 2
While window functions could be used here, I think a self join might be more straightforward and easier to understand:
SELECT
t1.IP,
t1.QID,
t1.Rank,
t1.ScanDate as endScanDate,
t2.ScanDate as beginScanDate,
datediff(day, t2.scandate, t1.scandate) as scanDateDays
FROM
table as t1
INNER JOIN table as t2 ON
t1.ip = t2.ip
AND t1.rank - 1 = t2.rank --get the record from t2 that is one less in rank
WHERE datediff(day, t2.scandate, t1.scandate) > 30 --only records greater than 30 days
It's pretty self-explanatory. We are joining the table to itself and joining the ranks together where rank 2 gets joined to rank 1, rank 3 gets joined to rank 2, and so on. Then we just test for records that are greater than 30 days using the datediff function.
I would use a window function to avoid the self join; in many cases it will perform better.
WITH cte
AS (
SELECT
t.IP
, t.QID
, LAG(t.ScanDate) OVER (PARTITION BY t.IP ORDER BY T.ScanDate) AS beginScanDate
, t.ScanDate AS endScanDate
, DATEDIFF(DAY,
LAG(t.ScanDate) OVER (PARTITION BY t.IP ORDER BY t.ScanDate),
t.ScanDate) AS Diff
FROM
MyTable AS t
)
SELECT
*
FROM
cte c
WHERE
Diff > 30;
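To see the LAG approach run end to end, here is a small sketch with Python's sqlite3 module as a stand-in for SQL Server. The table name scans is hypothetical, partitioning by both ip and qid is my assumption based on the question's grouping, and since SQLite has no DATEDIFF the day difference is approximated by subtracting the Julian day numbers of the two calendar dates.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scans (ip TEXT, qid INTEGER, scan_date TEXT)")
conn.executemany(
    "INSERT INTO scans VALUES (?, ?, ?)",
    [
        ("101.110.32.80", 6, "2016-09-28 18:33:21"),
        ("101.110.32.80", 6, "2016-08-28 18:33:21"),
        ("101.110.32.80", 6, "2016-05-30 00:30:33"),
    ],
)

# LAG fetches the previous scan in the same (ip, qid) group; the outer query
# keeps only rows whose calendar-day gap to that previous scan exceeds 30.
flagged = conn.execute("""
WITH with_prev AS (
    SELECT ip, qid, scan_date,
           LAG(scan_date) OVER (
               PARTITION BY ip, qid ORDER BY scan_date
           ) AS prev_scan_date
    FROM scans
)
SELECT ip, qid, scan_date, prev_scan_date,
       CAST(julianday(date(scan_date)) - julianday(date(prev_scan_date)) AS INTEGER)
           AS gap_days
FROM with_prev
WHERE julianday(date(scan_date)) - julianday(date(prev_scan_date)) > 30
ORDER BY scan_date
""").fetchall()

for row in flagged:
    print(row)
```

The first scan of each group has no predecessor, so its gap is NULL and it drops out of the filter on its own.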
I have a table with three columns, ID, Date, Value. I want to rank the rows such that, within an ID, the Ranking goes up with each date where Value is at least X, otherwise, Ranking stays the same.
Given ID, Date, and Values like these
1, 6/1, 8
1, 6/2, 12
1, 6/3, 14
1, 6/4, 9
1, 6/5, 11
I would like to return a ranking based on values of at least 10, such that I would have ID, Date, Value, and Rank like this:
1, 6/1, 8, 0
1, 6/2, 12, 1
1, 6/3, 14, 2
1, 6/4, 9, 2
1, 6/5, 11, 3
In other words, the ranking increases each time the value reaches the threshold; otherwise it stays the same.
What I have tried is
SELECT T1.*, X.Ranking FROM TABLE T1
LEFT JOIN ( SELECT *, DENSE_RANK( ) OVER ( PARTITION BY T2.ID ORDER BY T2.DATE ) Ranking
FROM TABLE T2 WHERE T2.VALUE >= 10 ) X
ON T1.ID = X.ID AND T1.Date = X.Date
This almost works. It gets me output like
1, 6/1, 8, NULL
1, 6/2, 12, 1
1, 6/3, 14, 2
1, 6/4, 9, NULL
1, 6/5, 11, 3
Then, I want to turn the first NULL into a 0, and the second into a 2.
I turned the above query into a cte and tried
SELECT T1.*, CASE WHEN T1.Ranking IS NULL THEN ISNULL( (
SELECT MAX( T2.Ranking )
FROM cte T2 WHERE T1.ID = T2.ID AND T1.Date > T2.Date ), 0 )
ELSE T1.Ranking END NewRanking
FROM cte T1
This looks like it would work, but my table has 200,000 rows and the query ran for 25 minutes... So, I'm looking for something a little more out of the box than the SELECT MAX.
You are using SQL Server 2012, so you can do a cumulative sum:
select t.*,
sum(case when value >= 10 then 1 else 0 end) over
(partition by id order by date) as ranking
from table t;
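A quick way to check the cumulative sum against the sample data is the sketch below, again using Python's sqlite3 module rather than SQL Server. The table name readings, the column name obs_date and the year 2020 are all made up for illustration, since the question only gives month/day.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (id INTEGER, obs_date TEXT, value INTEGER)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?, ?)",
    [
        (1, "2020-06-01", 8),
        (1, "2020-06-02", 12),
        (1, "2020-06-03", 14),
        (1, "2020-06-04", 9),
        (1, "2020-06-05", 11),
    ],
)

# Running total of a 0/1 flag: it grows by one on every row with value >= 10
# and stays flat on the other rows.
ranked = conn.execute("""
SELECT id, obs_date, value,
       SUM(CASE WHEN value >= 10 THEN 1 ELSE 0 END) OVER (
           PARTITION BY id ORDER BY obs_date
       ) AS ranking
FROM readings
ORDER BY id, obs_date
""").fetchall()

for row in ranked:
    print(row)
```

This is a single pass over each partition, which is why it should scale far better than the correlated SELECT MAX from the question.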
EDIT: This actually does not work. In spirit it fetches the previous value with LAG and increments it, but that is not how LAG works: the expression would be recursive in essence, which results in a 'my_rank' is undefined error. The better solution is the accepted answer based on a cumulative sum.
If you have SQL Server 2012 (you didn't tag your question), you can do something like:
SELECT
LAG(my_rank, 1, 0) OVER (ORDER BY DATE)
+ CASE WHEN VALUE >= 10 THEN 1 ELSE 0 END AS my_rank
FROM T1
I am running a SQL query in MSSQL 2008 R2 which should always return a consistent resultset, meaning that all dates within a selected date range should be shown, although there are no rows/values in the database for a particular date within the date range. It should for example look like this for the dates 2013-07-03 - 2013-07-04 when there are values for id 1 and 2.
Scenario 1
Date-hour, value, id
2013-07-03-1, 10, 1
2013-07-03-2, 12, 1
2013-07-03-...
2013-07-03-24, 9, 1
2013-07-04-1, 10, 1
2013-07-04-2, 10, 1
2013-07-04-...
2013-07-04-24, 10, 1
2013-07-03-1, 11, 2
2013-07-03-2, 12, 2
2013-07-03-...
2013-07-03-24, 9, 2
2013-07-04-1, 10, 2
2013-07-04-2, 12, 2
2013-07-04-...
2013-07-04-24, 10, 2
However, if id 2 is missing values for 2013-07-04, I will normally only get a resultset which looks like this:
Scenario 2
Date-hour, value, id
2013-07-03-1, 10, 1
2013-07-03-2, 12, 1
2013-07-03-...
2013-07-03-24, 9, 1
2013-07-04-1, 10, 1
2013-07-04-2, 10, 1
2013-07-04-...
2013-07-04-24, 10, 1
2013-07-03-1, 11, 2
2013-07-03-2, 12, 2
2013-07-03-...
2013-07-03-24, 9, 2
Scenario 2 will create an inconsistent resultset which will affect the output. Is there any way to make the SQL query always return as scenario 1 even when there are missing values, so at least to return NULL if there are no values for a specific date within the date range. If the resultset returns id 1 and 2 then all dates for id 1 and 2 should be covered. If id 1, 2 and 3 are returned then all dates for id 1, 2 and 3 should be covered.
I have two tables which look like this:
tbl_measurement
id, date, hour1, hour2, ..., hour24
tbl_plane
planeId, id, maxSpeed
The SQL query I am running looks like this:
SELECT DISTINCT hour00_01, hour01_02, mr.date, mr.id, maxSpeed
FROM tbl_measurement as mr, tbl_plane as p
WHERE (date >= '2013-07-03' AND date <= '2013-07-04') AND p.id = mr.id
GROUP BY mr.id, mr.date, hour00_01, hour01_02, p.maxSpeed
ORDER BY mr.id, mr.date
I have been looking around quite a bit, and perhaps PIVOT tables are the way to solve this? Could you please help me out? I would appreciate if you can help me out with how to write the SQL query for this purpose.
You can use a recursive CTE to generate a list of dates. If you cross join that with planes, you get one row per date per plane. With a left join, you can link in measurements if they exist. A left join will leave the row even if no measurement is found.
For example:
declare @startDt date = '2013-01-01'
declare @endDt date = '2013-06-30'
; with AllDates as
(
select @startDt as dt
union all
select dateadd(day, 1, dt)
from AllDates
where dateadd(day, 1, dt) <= @endDt
)
select *
from AllDates ad
cross join
tbl_plane p
left join
(
select row_number() over (partition by Id, cast([date] as date) order by id) rn
, *
from tbl_measurement
where inputType = 'forecast'
) m
on p.Id = m.Id
and m.date = ad.dt
and m.rn = 1 -- Only one per day
where p.planeType = 3
option (maxrecursion 0)
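The date-scaffold idea (generate every date, cross join with the ids, then left join the measurements) can be tried in miniature with Python's sqlite3 module. This is only a sketch under simplifying assumptions: the tables planes and measurements and the columns meas_date and hour1 are hypothetical, reduced stand-ins for tbl_plane and tbl_measurement, and the date range comes from the question.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE planes (id INTEGER);
CREATE TABLE measurements (id INTEGER, meas_date TEXT, hour1 INTEGER);
INSERT INTO planes VALUES (1), (2);
-- id 2 has no measurement for 2013-07-04.
INSERT INTO measurements VALUES
  (1, '2013-07-03', 10), (1, '2013-07-04', 10), (2, '2013-07-03', 11);
""")

# 1) Recursive CTE builds one row per date in the range.
# 2) CROSS JOIN pairs every date with every plane.
# 3) LEFT JOIN attaches a measurement where one exists, NULL otherwise.
scaffold = conn.execute("""
WITH RECURSIVE all_dates(dt) AS (
    SELECT :start_dt
    UNION ALL
    SELECT date(dt, '+1 day')
    FROM all_dates
    WHERE date(dt, '+1 day') <= :end_dt
)
SELECT p.id, ad.dt, m.hour1
FROM all_dates AS ad
CROSS JOIN planes AS p
LEFT JOIN measurements AS m
       ON m.id = p.id AND m.meas_date = ad.dt
ORDER BY p.id, ad.dt
""", {"start_dt": "2013-07-03", "end_dt": "2013-07-04"}).fetchall()

for row in scaffold:
    print(row)
```

Every (id, date) pair appears exactly once, with None (NULL) in place of the missing measurement, which matches the consistent shape asked for in scenario 1.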