TSQL - Select the rows with the higher count and when count is the same, the row with the higher id value - sql-server

HELP!!! I'm stumped and have tried several options to no avail...
I need to return one row for each Pub_id, and the row that is returned should be the one with the higher Count and when there is more than one row with the highest count, I need the one with the higher price_id.
I have populated a table with this data...
pub_id, price_id, count
7, 59431, 5
22, 39964, 4
39, 112831, 3
39, 120715, 2
47, 95359, 2
74, 142825, 5
74, 106688, 5
74, 37514, 1
And this is what I need to return...
pub_id, price_id, count
7, 59431, 5
22, 39964, 4
39, 112831, 3
47, 95359, 2
74, 142825, 5

;WITH T
AS (SELECT *,
ROW_NUMBER() OVER (PARTITION BY pub_id
ORDER BY [count] DESC, price_id DESC) AS rn
FROM your_table)
SELECT pub_id,
[count],
price_id
FROM T
WHERE rn=1
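To try the ROW_NUMBER approach without creating a real table, here is a small self-contained sketch. The table variable @prices is a made-up name that stands in for your_table and is loaded with the rows from the question.

```sql
-- Hypothetical table variable standing in for your_table.
DECLARE @prices TABLE (pub_id int, price_id int, [count] int);

INSERT INTO @prices (pub_id, price_id, [count]) VALUES
(7, 59431, 5), (22, 39964, 4), (39, 112831, 3), (39, 120715, 2),
(47, 95359, 2), (74, 142825, 5), (74, 106688, 5), (74, 37514, 1);

-- Number the rows inside each pub_id: highest count first,
-- ties broken by the highest price_id. Row 1 is the one to keep.
WITH T AS (
    SELECT pub_id, price_id, [count],
           ROW_NUMBER() OVER (PARTITION BY pub_id
                              ORDER BY [count] DESC, price_id DESC) AS rn
    FROM @prices
)
SELECT pub_id, price_id, [count]
FROM T
WHERE rn = 1
ORDER BY pub_id;
```

For pub_id 74 both 142825 and 106688 have a count of 5, so the price_id DESC tie-breaker is what picks 142825.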

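Do you want something like this?
select t.pub_id,
t.[count],
max(t.price_id) as price_id
from table_name t
-- der_tab holds the highest count per pub_id
join (select pub_id,
max([count]) as max_count
from table_name
group by pub_id) der_tab
on der_tab.pub_id = t.pub_id
and der_tab.max_count = t.[count]
-- among the rows tied on the highest count, keep the highest price_id
group by t.pub_id,
t.[count]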

Related

Query to get ids of entries which appear in arbitrary amount different lists

id, category_id, product_id, status
13, 93, 2137, 1
14, 94, 2137, 1
15, 93, 2138, 2
16, 94, 2138, 2
17, 87, 2128, 1
18, 94, 2128, 1
19, 87, 2139, 2
20, 94, 2139, 2
21, 88, 2132, 1
22, 93, 2132, 1
23, 88, 2140, 2
24, 93, 2140, 2
25, 87, 2137, 1
26, 87, 2141, 2
27, 93, 2136, 1
28, 93, 2137, 1
29, 88, 2134, 1
30, 88, 2143, 2
I have this kind of data presented to me. For my query I'm given a list of category ids.
Let's say I'm given three lists with
1. {93, 94}
2. {88, 87, 86}
3. {93}
Now I would need a query, which would give me product ids, which appear at least once in ALL of those lists and for which the status is 1. So for the example query the result should be:
product_id
2137
The first step in any solution is to normalize the selection criteria into a table of the form { category_group_id, category_id } with only one category_id per row. There are several ways to do this, but I've used the relatively new STRING_SPLIT function here (same as Luis LL). The normalized criteria may be loaded into a temp table or included as a Common Table Expression (CTE), as is done below.
Once the criteria are normalized, the real problem can be solved by (1) filtering the input data by status, (2) joining it with the normalized selection criteria from above, (3) grouping by product ID, and then (4) counting the number of distinct category group IDs matched. If that count matches the total number of category group IDs (three for the sample data), we have a match.
;WITH NormalizedCategoryIds AS (
SELECT C.category_group_id, CAST(S.Value AS INT) AS category_id
FROM CategoryIds C
CROSS APPLY STRING_SPLIT(
REPLACE(REPLACE(category_id_list, '{', ''), '}', ''),
',') S
)
SELECT D.product_id
FROM SampleData D
JOIN NormalizedCategoryIds C on C.category_id = D.category_id
WHERE D.status = 1
GROUP BY D.product_id
HAVING COUNT(DISTINCT C.category_group_id) = (SELECT COUNT(*) FROM CategoryIds)
If we started with criteria that was already normalized, the HAVING clause could be changed to:
HAVING COUNT(DISTINCT C.category_group_id)
= (SELECT COUNT(DISTINCT C2.category_group_id) FROM NormalizedCategoryIds C2)
That value could also be calculated ahead of the query.
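As a rough sketch of that last variation, the group count can be stored in a variable first. The table variables @SampleData and @Criteria and the variable @group_count are names made up for this illustration; @Criteria is assumed to hold the already-normalized criteria, and only a subset of the question's rows is loaded.

```sql
-- Hypothetical table variables; only some of the question's rows are loaded.
DECLARE @SampleData TABLE (id int, category_id int, product_id int, status int);
INSERT INTO @SampleData (id, category_id, product_id, status) VALUES
(13, 93, 2137, 1), (14, 94, 2137, 1), (15, 93, 2138, 2), (16, 94, 2138, 2),
(17, 87, 2128, 1), (18, 94, 2128, 1), (21, 88, 2132, 1), (22, 93, 2132, 1),
(25, 87, 2137, 1), (28, 93, 2137, 1);

-- Already-normalized criteria: one row per (group, category).
DECLARE @Criteria TABLE (category_group_id int, category_id int);
INSERT INTO @Criteria (category_group_id, category_id) VALUES
(1, 93), (1, 94), (2, 88), (2, 87), (2, 86), (3, 93);

-- Number of groups, calculated once ahead of the main query.
DECLARE @group_count int =
    (SELECT COUNT(DISTINCT category_group_id) FROM @Criteria);

SELECT D.product_id
FROM @SampleData D
JOIN @Criteria C ON C.category_id = D.category_id
WHERE D.status = 1
GROUP BY D.product_id
HAVING COUNT(DISTINCT C.category_group_id) = @group_count
ORDER BY D.product_id;
```

With these rows, 2128 matches only groups 1 and 2 and 2138 has status 2, so both are filtered out, leaving 2132 and 2137.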
Sample results:
product_id
2132
2137
Even though not in the original posted results, 2132 is also included here, because it matches all three category groups. The 93 row matches category groups 1 and 3 and the 88 record matches category group 2.
See this db<>fiddle for a working demo including some extra test data.
This should work for SQL Server 2016 and above.
CREATE TABLE table1
(
id INT,
category_id INT,
product_id INT,
status INT
)
INSERT INTO table1
(id, category_id, product_id, status)
VALUES
( 13, 93, 2137, 1)
,( 14, 94, 2137, 1)
,( 15, 93, 2138, 2)
,( 16, 94, 2138, 2)
,( 17, 87, 2128, 1)
,( 18, 94, 2128, 1)
,( 19, 87, 2139, 2)
,( 20, 94, 2139, 2)
,( 21, 88, 2132, 1)
,( 22, 93, 2132, 1)
,( 23, 88, 2140, 2)
,( 24, 93, 2140, 2)
,( 25, 87, 2137, 1)
,( 26, 87, 2141, 2)
,( 27, 93, 2136, 1)
,( 28, 93, 2137, 1)
,( 29, 88, 2134, 1)
,( 30, 88, 2143, 2)
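CREATE TABLE Input
(ListId INT IDENTITY(1,1), -- identifies each list, so matches can be counted per list
IdLst varchar(100))
INSERT INTO Input (IdLst)
VALUES
('{93, 94}')
,('{88, 87, 86}')
,('{93}')
;WITH Categories AS (
SELECT ListId, CONVERT(INT, Value ) category_id
FROM Input
CROSS APPLY STRING_SPLIT(REPLACE(REPLACE( IdLst, '{', ''), '}', ''), ',')
)
SELECT product_id
FROM Categories
INNER JOIN table1 ON table1.category_id = Categories.category_id
WHERE table1.status = 1 -- only products with status 1
GROUP BY product_id
-- a product qualifies when it matches at least one category in every list
HAVING COUNT(DISTINCT ListId) = (SELECT COUNT(1) cntLists FROM Input)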

Insert dummy rows to fill missing values into a SQL Table

I have this SQL Server table table1 which I want to fill with dummy rows per acct up to the latest previous month-end date, e.g. right now that would be up to 2021-06-30.
In this example, acct 1 has n number of rows which ends at 2020-05-31, and I want to insert dummy rows with same values for acct and amt with begin_date and end_date incrementing by 1 month up to 06-30-2021.
Let's assume acct 2 already ends at 06-30-2021 so this doesn't need dummy rows to be inserted.
acct,amt,begin_date,end_date
1 , 10, 2020-04-01, 2020-04-30
1 , 10, 2020-05-01, 2020-05-31
2 , 50, 2021-05-01, 2021-05-31
2 , 50, 2021-06-01, 2021-06-30
So for acct 1, I want n number of rows to be inserted from last period of 2020-05-31 up to previous month end which is now 06-30-2021 and I want the amt and acct to remain same. So it would look like this below:
acct,amt,begin_date,end_date
1 , 10, 2020-04-01, 2020-04-30
1 , 10, 2020-05-01, 2020-05-31
1 , 10, 2020-06-01, 2020-06-30
1 , 10, 2020-07-01, 2020-07-31
.............................
.............................
1 , 10, 2021-06-01, 2021-06-30
Based on some data anomalies, I realize I need to add another condition to the solution. Suppose another column, type, was added to table1. Then acct and type would form the composite key that identifies each set of related rows, so acct 2 type A and acct 2 type B are not related. So we have the updated table:
acct,type,amt,begin_date,end_date
1, A, 10, 2020-04-01, 2020-04-30
1, A, 10, 2020-05-01, 2020-05-31
2, A, 50, 2021-05-01, 2021-05-31
2, A, 50, 2021-06-01, 2021-06-30
2, B, 50, 2021-01-01, 2021-01-31
2, B, 50, 2021-02-01, 2021-02-28
I would now need dummy rows to be created for acct 2 type B up to 2021-06-30. We already know acct 2 type A would be ok since it already has rows up to 2021-06-30
You can generate the rows using a recursive CTE:
with cte as (
select acct, amt,
dateadd(day, 1, end_date) as begin_date,
eomonth(dateadd(day, 1, end_date)) as end_date
from (select t.*,
row_number() over (partition by acct order by end_date desc) as seqnum
from t
) t
where seqnum = 1 and end_date < '2021-06-30'
union all
select acct, amt, dateadd(month, 1, begin_date),
eomonth(dateadd(month, 1, begin_date))
from cte
where begin_date < '2021-06-01'
)
select *
from cte;
You can then use insert to insert these rows into a table. Or use union all if you simply want a result set with all the rows.
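For the updated question with the type column, the same idea might be adapted roughly as follows. This is only a sketch under a few assumptions: the table is table1 with columns acct, [type], amt, begin_date and end_date, the two date columns are of type date, and the cutoff is computed as the end of the previous month instead of being hardcoded. The variable @last_end and the CTE name latest are made up here.

```sql
-- End of the previous month relative to today (assumed cutoff).
DECLARE @last_end date = EOMONTH(GETDATE(), -1);

WITH latest AS (
    -- Most recent period per (acct, type) pair.
    SELECT acct, [type], amt, end_date,
           ROW_NUMBER() OVER (PARTITION BY acct, [type]
                              ORDER BY end_date DESC) AS seqnum
    FROM table1
),
cte AS (
    -- Anchor: the first missing month for each pair that is behind.
    SELECT acct, [type], amt,
           DATEADD(day, 1, end_date) AS begin_date,
           EOMONTH(DATEADD(day, 1, end_date)) AS end_date
    FROM latest
    WHERE seqnum = 1 AND end_date < @last_end
    UNION ALL
    -- Recursive step: move forward one month until the cutoff is reached.
    SELECT acct, [type], amt,
           DATEADD(month, 1, begin_date),
           EOMONTH(DATEADD(month, 1, begin_date))
    FROM cte
    WHERE end_date < @last_end
)
INSERT INTO table1 (acct, [type], amt, begin_date, end_date)
SELECT acct, [type], amt, begin_date, end_date
FROM cte
OPTION (MAXRECURSION 0);  -- lift the default limit of 100 recursion levels
```

Partitioning by both acct and [type] is what keeps acct 2 type A and acct 2 type B separate, so only type B would get new rows in the sample data.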
Here is a db<>fiddle.

How do I SELECT all the entries in a SQL table that have a date within the last week but only if the EQNum has not appeared in the past 6 months

I am very much a SQL novice. I am looking to write a script that will select all the columns from a table where two criteria are met:
The date of the call must have happened within the past 7 days
The EQNum must not have had another call placed on it in the past six months
Here is a sample table:
Call, Date, EQNum, Customer
123, 06-16-2015, 75, ABC Co
125, 06-16-2015, 82, XYZ Co
133, 06-14-2015, 69, DEF Co
101, 05-12-2015, 82, XYZ Co
115, 10-11-2014, 69, DEF Co
The query I need created should return:
123, 06-16-2015, 75, ABC Co
133, 06-14-2015, 69, DEF Co
Call 125 (EQNum 82) is eliminated because, although it occurred in the past week, EQNum 82 had another call (Call 101) within the last 6 months.
Call 133 is valid because the other call for EQNum 69 occurred more than 6 months ago.
Something like this:
SELECT *
FROM tbl
WHERE DateCol > DATEADD(day, -7, getdate())
AND NOT EXISTS (SELECT 1
FROM tbl other
WHERE other.EQNum = tbl.EQNum
AND other.Call <> tbl.Call -- skip the row being tested, otherwise it always matches itself
AND other.DateCol > DATEADD(month, -6, getdate())
)
This is one way, although it probably wouldn't perform well if the table got massive
select
Call,
[Date],
EQNum,
Customer
from #table
where
[Date] > getdate() - 7 and
EQNum not in
(
select
EQNum
from #table
where
[Date] > DATEADD(month, -6, getdate())
group by
EQNum
having count(*) > 1
)
Another way would be to left join...
select
t1.Call,
t1.[Date],
t1.EQNum,
t1.Customer
from #table t1
left join #table t2 on
t1.Call != t2.Call and
t1.EQNum = t2.EQNum and
t2.[Date] > DATEADD(month, -6, getdate())
where
t1.[Date] > getdate() - 7 and
t2.Call is null
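Because the sample dates are from 2015, running any of these against the sample rows with getdate() returns nothing today. One way to check the logic is to swap in a fixed reference date. In this sketch, @today (assumed to be 2015-06-17) and the table variable @calls are made-up names for illustration.

```sql
DECLARE @today date = '2015-06-17';  -- assumed "current" date for the sample data

-- Hypothetical table variable holding the question's sample rows.
DECLARE @calls TABLE ([Call] int, [Date] date, EQNum int, Customer varchar(20));
INSERT INTO @calls ([Call], [Date], EQNum, Customer) VALUES
(123, '2015-06-16', 75, 'ABC Co'),
(125, '2015-06-16', 82, 'XYZ Co'),
(133, '2015-06-14', 69, 'DEF Co'),
(101, '2015-05-12', 82, 'XYZ Co'),
(115, '2014-10-11', 69, 'DEF Co');

SELECT c.[Call], c.[Date], c.EQNum, c.Customer
FROM @calls c
WHERE c.[Date] > DATEADD(day, -7, @today)
  AND NOT EXISTS (SELECT 1
                  FROM @calls o
                  WHERE o.EQNum = c.EQNum
                    AND o.[Call] <> c.[Call]   -- a call does not disqualify itself
                    AND o.[Date] > DATEADD(month, -6, @today))
ORDER BY c.[Call];
```

With that reference date the query returns calls 123 and 133. Call 125 drops out because call 101 on the same EQNum falls inside the six-month window.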

Sql Server Rank on Value Range

I have a table with three columns, ID, Date, Value. I want to rank the rows such that, within an ID, the Ranking goes up with each date where Value is at least X, otherwise, Ranking stays the same.
Given ID, Date, and Values like these
1, 6/1, 8
1, 6/2, 12
1, 6/3, 14
1, 6/4, 9
1, 6/5, 11
I would like to return a ranking based on values of at least 10, such that I would have ID, Date, Value, and Rank like this:
1, 6/1, 8, 0
1, 6/2, 12, 1
1, 6/3, 14, 2
1, 6/4, 9, 2
1, 6/5, 11, 3
In other words, the ranking increases each time the value reaches the threshold, otherwise it stays the same.
What I have tried is
SELECT T1.*, X.Ranking FROM TABLE T1
LEFT JOIN ( SELECT *, DENSE_RANK( ) OVER ( PARTITION BY T2.ID ORDER BY T2.DATE ) Ranking
FROM TABLE T2 WHERE T2.VALUE >= 10 ) X
ON T1.ID = X.ID AND T1.Date = X.Date
This almost works. It gets me output like
1, 6/1, 8, NULL
1, 6/2, 12, 1
1, 6/3, 14, 2
1, 6/4, 9, NULL
1, 6/5, 11, 3
Then, I want to turn the first NULL into a 0, and the second into a 2.
I turned the above query into a cte and tried
SELECT T1.*, CASE WHEN T1.Ranking IS NULL THEN ISNULL( (
SELECT MAX( T2.Ranking )
FROM cte T2 WHERE T1.ID = T2.ID AND T1.Date > T2.Date ), 0 )
ELSE T1.Ranking END NewRanking
FROM cte T1
This looks like it would work, but my table has 200,000 rows and the query ran for 25 minutes... So, I'm looking for something a little more out of the box than the SELECT MAX.
You are using SQL Server 2012, so you can do a cumulative sum:
select t.*,
sum(case when value >= 10 then 1 else 0 end) over
(partition by id order by date) as ranking
from table t;
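Here is the cumulative sum applied to the sample rows from the question, as a small sketch. The table variable @t is a made-up name and the year 2015 is an assumption, since the question only gives month and day.

```sql
-- Hypothetical table variable with the question's sample rows (year assumed).
DECLARE @t TABLE (id int, [date] date, value int);
INSERT INTO @t (id, [date], value) VALUES
(1, '2015-06-01', 8),
(1, '2015-06-02', 12),
(1, '2015-06-03', 14),
(1, '2015-06-04', 9),
(1, '2015-06-05', 11);

-- Each row contributes 1 when value >= 10 and 0 otherwise;
-- the running total per id is the ranking.
SELECT id, [date], value,
       SUM(CASE WHEN value >= 10 THEN 1 ELSE 0 END)
           OVER (PARTITION BY id
                 ORDER BY [date]
                 ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS ranking
FROM @t
ORDER BY id, [date];
```

The ranking column comes out as 0, 1, 2, 2, 3, matching the desired output. The explicit ROWS frame is optional here. Without it the default RANGE frame gives the same result as long as dates are unique within an id.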
EDIT: This actually does not work. In spirit it fetches the previous LAG value and increments it, but this is not how LAG works: it would be 'recursive' in essence, which results in a 'my_rank' is undefined syntax error. A better solution is the accepted answer based on a cumulative sum.
If you have SQL Server 2012 (you didn't tag your question), you can do something like:
SELECT
LAG(my_rank, 1, 0) OVER (ORDER BY DATE)
+ CASE WHEN VALUE >= 10 THEN 1 ELSE 0 END AS my_rank
FROM T1

Add new rows to resultset in MSSQL

I am running a SQL query in MSSQL 2008 R2 which should always return a consistent resultset, meaning that all dates within a selected date range should be shown even if there are no rows/values in the database for a particular date within the range. For example, it should look like this for the dates 2013-07-03 - 2013-07-04 when there are values for id 1 and 2.
Scenario 1
Date-hour, value, id
2013-07-03-1, 10, 1
2013-07-03-2, 12, 1
2013-07-03-...
2013-07-03-24, 9, 1
2013-07-04-1, 10, 1
2013-07-04-2, 10, 1
2013-07-04-...
2013-07-04-24, 10, 1
2013-07-03-1, 11, 2
2013-07-03-2, 12, 2
2013-07-03-...
2013-07-03-24, 9, 2
2013-07-04-1, 10, 2
2013-07-04-2, 12, 2
2013-07-04-...
2013-07-04-24, 10, 2
However, if id 2 is missing values for 2013-07-04, I will normally only get a resultset which looks like this:
Scenario 2
Date-hour, value, id
2013-07-03-1, 10, 1
2013-07-03-2, 12, 1
2013-07-03-...
2013-07-03-24, 9, 1
2013-07-04-1, 10, 1
2013-07-04-2, 10, 1
2013-07-04-...
2013-07-04-24, 10, 1
2013-07-03-1, 11, 2
2013-07-03-2, 12, 2
2013-07-03-...
2013-07-03-24, 9, 2
Scenario 2 will create an inconsistent resultset which will affect the output. Is there any way to make the SQL query always return as scenario 1 even when there are missing values, so at least to return NULL if there are no values for a specific date within the date range. If the resultset returns id 1 and 2 then all dates for id 1 and 2 should be covered. If id 1, 2 and 3 are returned then all dates for id 1, 2 and 3 should be covered.
I have two tables which look like this:
tbl_measurement
id, date, hour1, hour2, ..., hour24
tbl_plane
planeId, id, maxSpeed
The SQL query I am running looks like this:
SELECT DISTINCT hour00_01, hour01_02, mr.date, mr.id, maxSpeed
FROM tbl_measurement as mr, tbl_plane as p
WHERE (date >= '2013-07-03' AND date <= '2013-07-04') AND p.id = mr.id
GROUP BY mr.id, mr.date, hour00_01, hour01_02, p.maxSpeed
ORDER BY mr.id, mr.date
I have been looking around quite a bit, and perhaps PIVOT tables are the way to solve this? Could you please help me out? I would appreciate if you can help me out with how to write the SQL query for this purpose.
You can use a recursive CTE to generate a list of dates. If you cross join that with planes, you get one row per date per plane. With a left join, you can link in measurements if they exist. A left join will leave the row even if no measurement is found.
For example:
declare @startDt date = '2013-01-01'
declare @endDt date = '2013-06-30'
; with AllDates as
(
select @startDt as dt
union all
select dateadd(day, 1, dt)
from AllDates
where dateadd(day, 1, dt) <= @endDt
)
select *
from AllDates ad
cross join
tbl_plane p
left join
(
select row_number() over (partition by Id, cast([date] as date) order by id) rn
, *
from tbl_measurement
where inputType = 'forecast'
) m
on p.Id = m.Id
and m.date = ad.dt
and m.rn = 1 -- Only one per day
where p.planeType = 3
option (maxrecursion 0)
