Query to get IDs of entries that appear in an arbitrary number of different lists - sql-server

id  category_id  product_id  status
13  93           2137        1
14  94           2137        1
15  93           2138        2
16  94           2138        2
17  87           2128        1
18  94           2128        1
19  87           2139        2
20  94           2139        2
21  88           2132        1
22  93           2132        1
23  88           2140        2
24  93           2140        2
25  87           2137        1
26  87           2141        2
27  93           2136        1
28  93           2137        1
29  88           2134        1
30  88           2143        2
I am given data like this. For my query I'm also given several lists of category ids.
Let's say I'm given three lists:
1. {93, 94}
2. {88, 87, 86}
3. {93}
I need a query that returns the product ids that appear at least once in ALL of those lists and for which the status is 1. So for the example the result should be:
product_id
2137

The first step in any solution is to normalize the selection criteria into a table of the form { category_group_id, category_id } with only one category_id per row. There are several ways to do this, but I've used the relatively new STRING_SPLIT function here (same as Luis LL). The normalized criteria may be loaded into a temp table or included as a Common Table Expression (CTE), as is done below.
Once the criteria are normalized, the real problem can be solved by (1) filtering the input data by status, (2) joining it with the normalized selection criteria from above, (3) grouping by product ID, and then (4) counting the number of distinct category group IDs matched. If that count equals the total number of category group IDs (three for the sample data), we have a match.
;WITH NormalizedCategoryIds AS (
SELECT C.category_group_id, CAST(S.Value AS INT) AS category_id
FROM CategoryIds C
CROSS APPLY STRING_SPLIT(
REPLACE(REPLACE(category_id_list, '{', ''), '}', ''),
',') S
)
SELECT D.product_id
FROM SampleData D
JOIN NormalizedCategoryIds C on C.category_id = D.category_id
WHERE D.status = 1
GROUP BY D.product_id
HAVING COUNT(DISTINCT C.category_group_id) = (SELECT COUNT(*) FROM CategoryIds)
If we started with criteria that were already normalized, the HAVING clause could be changed to:
HAVING COUNT(DISTINCT C.category_group_id)
= (SELECT COUNT(DISTINCT C2.category_group_id) FROM NormalizedCategoryIds C2)
That value could also be calculated ahead of the query.
Sample results:
product_id
2132
2137
Even though not in the original posted results, 2132 is also included here, because it matches all three category groups. The 93 row matches category groups 1 and 3 and the 88 record matches category group 2.
See this db<>fiddle for a working demo including some extra test data.
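The four steps above can also be tried without a SQL Server instance. The following is only a sketch: it runs the same grouping query through Python's built-in sqlite3 module instead of T-SQL, and it assumes the criteria are already normalized into a NormalizedCategoryIds table (so STRING_SPLIT is not needed). The function name products_in_all_groups is made up for this illustration.

```python
import sqlite3

def products_in_all_groups():
    """Return product ids with status 1 that match every category group."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE SampleData (id INT, category_id INT, product_id INT, status INT);
        INSERT INTO SampleData VALUES
          (13,93,2137,1),(14,94,2137,1),(15,93,2138,2),(16,94,2138,2),
          (17,87,2128,1),(18,94,2128,1),(19,87,2139,2),(20,94,2139,2),
          (21,88,2132,1),(22,93,2132,1),(23,88,2140,2),(24,93,2140,2),
          (25,87,2137,1),(26,87,2141,2),(27,93,2136,1),(28,93,2137,1),
          (29,88,2134,1),(30,88,2143,2);
        -- Assumption: criteria already normalized, one category per row.
        CREATE TABLE NormalizedCategoryIds (category_group_id INT, category_id INT);
        INSERT INTO NormalizedCategoryIds VALUES
          (1,93),(1,94),(2,88),(2,87),(2,86),(3,93);
    """)
    rows = con.execute("""
        SELECT D.product_id
        FROM SampleData D
        JOIN NormalizedCategoryIds C ON C.category_id = D.category_id
        WHERE D.status = 1
        GROUP BY D.product_id
        -- A product qualifies when it hit every distinct group at least once.
        HAVING COUNT(DISTINCT C.category_group_id)
             = (SELECT COUNT(DISTINCT category_group_id) FROM NormalizedCategoryIds)
        ORDER BY D.product_id
    """).fetchall()
    con.close()
    return [r[0] for r in rows]

print(products_in_all_groups())
```

Counting DISTINCT group ids rather than rows is what keeps a product with two rows in the same group (such as 2137 with two category 93 rows) from being over-counted.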

This should work for SQL Server 2016 and above.
CREATE TABLE table1
(
id INT,
category_id INT,
product_id INT,
status INT
)
INSERT INTO table1
(id, category_id, product_id, status)
VALUES
( 13, 93, 2137, 1)
,( 14, 94, 2137, 1)
,( 15, 93, 2138, 2)
,( 16, 94, 2138, 2)
,( 17, 87, 2128, 1)
,( 18, 94, 2128, 1)
,( 19, 87, 2139, 2)
,( 20, 94, 2139, 2)
,( 21, 88, 2132, 1)
,( 22, 93, 2132, 1)
,( 23, 88, 2140, 2)
,( 24, 93, 2140, 2)
,( 25, 87, 2137, 1)
,( 26, 87, 2141, 2)
,( 27, 93, 2136, 1)
,( 28, 93, 2137, 1)
,( 29, 88, 2134, 1)
,( 30, 88, 2143, 2)
CREATE TABLE Input
(IdLst varchar(100))
INSERT INTO Input (IdLst)
VALUES
('{93, 94}')
,('{88, 87, 86}')
,('{93}')
;WITH Categories AS (
SELECT CONVERT(INT, Value ) category_id
FROM Input
CROSS APPLY STRING_SPLIT(REPLACE(REPLACE( IdLst, '{', ''), '}', ''), ',')
)
SELECT product_id
FROM Categories
INNER JOIN table1 ON table1.category_id = Categories.category_id
WHERE table1.status = 1 -- the question only wants products with status 1
GROUP BY product_id
HAVING COUNT(1) = (SELECT COUNT(1) cntCategories FROM Categories )

Related

SQLite - Match records data

I have a database with a list of game related trades in it, and I'm trying to identify the trades that match. Let me explain better.
I have the following records:
UserID GameID PlatformID RecivingGameID RecivingGamePlatformID PostID
1111 18 167 41 43 2312451124
2222 41 43 18 167 1276949826
3333 41 43 18 21 6798639876
4444 41 43 90 167 4587938698
In this table there is only 1 match, between post 2312451124 and 1276949826. This is because both users have the game that the other user wants for a specific platform.
So if I am the user 1111, the only match I have to see is with the user 2222 and vice versa. Users 3333 and 4444 have no matches in the database.
How can I identify the trades that match?
Here's what I tried:
SELECT * FROM Posts WHERE UserID != 1111 AND (RecivingGameID IN (
SELECT DISTINCT GameID FROM Posts) AND GameID IN (
SELECT DISTINCT RecivingGameID FROM Posts)) AND (RecivingGamePlatformID IN (
SELECT DISTINCT PlatformID FROM Posts) AND PlatformID IN (
SELECT DISTINCT RecivingGamePlatformID FROM Posts))
SELECT * FROM Posts WHERE UserID != 1111 AND RecivingGameID IN (
SELECT DISTINCT GameID FROM Posts) AND GameID IN (
SELECT DISTINCT RecivingGameID FROM Posts) AND RecivingGamePlatformID IN (
SELECT DISTINCT PlatformID FROM Posts) AND PlatformID IN (
SELECT DISTINCT RecivingGamePlatformID FROM Posts)
Use a self join:
SELECT p1.*
FROM Posts p1 INNER JOIN Posts p2
ON p2.UserID <> p1.UserID
AND (p1.GameID, p1.PlatformID) = (p2.RecivingGameID, p2.RecivingGamePlatformID)
AND (p2.GameID, p2.PlatformID) = (p1.RecivingGameID, p1.RecivingGamePlatformID)
WHERE p2.UserID = 1111;
See the demo.
WITH
Posts(UserID, GameID, PlatformID, RecivingGameID, RecivingGamePlatformID, PostID) AS (
VALUES
(1111, 18, 167, 41, 43, 2312451124),
(2222, 41, 43, 18, 167, 1276949826),
(3333, 41, 43, 18, 21, 6798639876),
(4444, 41, 43, 90, 167, 4587938698)
),
matches AS (
SELECT DISTINCT
src.UserID AS UserID, src.PostID AS PostID,
dst.UserID AS RecivingUserID, dst.PostID AS RecivingPostID
FROM Posts AS src, Posts AS dst
WHERE src.GameID = dst.RecivingGameID
AND dst.GameID = src.RecivingGameID
AND src.PlatformID = dst.RecivingGamePlatformID
AND dst.PlatformID = src.RecivingGamePlatformID
AND src.UserID < dst.UserID
)
SELECT * FROM matches;
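For experimenting with the self join outside a demo page, here is a small sketch using Python's built-in sqlite3 module. It spells the comparison out column by column instead of using row values, on the assumption that this also works on older SQLite builds. The function name matching_posts and the aliases mine and theirs are invented for the example.

```python
import sqlite3

def matching_posts(user_id):
    """Return (UserID, PostID) of posts that are a mutual match for user_id."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE Posts (UserID INT, GameID INT, PlatformID INT,
                            RecivingGameID INT, RecivingGamePlatformID INT,
                            PostID INT);
        INSERT INTO Posts VALUES
          (1111, 18, 167, 41, 43, 2312451124),
          (2222, 41, 43, 18, 167, 1276949826),
          (3333, 41, 43, 18, 21, 6798639876),
          (4444, 41, 43, 90, 167, 4587938698);
    """)
    rows = con.execute("""
        SELECT theirs.UserID, theirs.PostID
        FROM Posts AS mine
        JOIN Posts AS theirs
          ON theirs.UserID <> mine.UserID
         -- they offer exactly what I want ...
         AND theirs.GameID = mine.RecivingGameID
         AND theirs.PlatformID = mine.RecivingGamePlatformID
         -- ... and they want exactly what I offer
         AND theirs.RecivingGameID = mine.GameID
         AND theirs.RecivingGamePlatformID = mine.PlatformID
        WHERE mine.UserID = ?
        ORDER BY theirs.PostID
    """, (user_id,)).fetchall()
    con.close()
    return rows

print(matching_posts(1111))
```

The attempt in the question checks each column against the whole table independently, which is why it cannot tell that the game and the platform must come from the same post; the join conditions above tie them to one row.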

How to find only the top 3 values of a column and group the rest of the column values as zero?

Consider a table having data as shown. I want to find the top 3 marks and combine the rest of the values of column marks as a single value 0.
name age marks height
-----------------------------
anil 25 67 5
ashish 23 75 6
ritu 22 0 4
manish 25 0 6
kirti 23 97 5
Output
name age marks height
-----------------------------
kirti 23 97 5
ashish 23 75 6
anil 25 67 5
OTHERS 0 0 0
With TOP 3 and UNION ALL for the last row:
select t.* from (
select top 3 * from tablename
order by marks desc
) t
union all
select 'OTHERS', 0, 0, 0
See the demo.
Results:
> name | age | marks | height
> :----- | --: | ----: | -----:
> kirti | 23 | 97 | 5
> ashish | 23 | 75 | 6
> anil | 25 | 67 | 5
> OTHERS | 0 | 0 | 0
I would use a CTE (Common Table Expression) and the ROW_NUMBER() function:
;WITH cte AS (SELECT
[Name],
Age,
Marks,
Height,
ROW_NUMBER() OVER (ORDER BY Marks DESC) AS [Rank]
FROM
Test
)
SELECT
[Name],
Age,
Marks,
Height
FROM
cte
WHERE
[Rank] <= 3
UNION ALL SELECT 'OTHERS', 0, 0, 0
You can use SELECT TOP 3 or ROW_NUMBER(). With ROW_NUMBER() it looks like this:
declare @mytable table(name varchar(50),age int,marks int,height int)
insert into @mytable values('anil', 25, 67, 5),('ashish', 23, 75, 6),('ritu', 22, 0, 4),('manish', 25, 0, 6),('kirti', 23, 97, 5),
('other',0,0,0);
with cte as(
select name,age,marks,height,row_number() over(partition by 1 order by marks desc) row# from @mytable )
select name,age,marks,height from cte where row#<4 or name='other'
order by row#
Another way, using UNION without inserting ('other',0,0,0) into the table, gives the same result:
declare @mytable table(name varchar(50),age int,marks int,height int)
insert into @mytable values('anil', 25, 67, 5),('ashish', 23, 75, 6),('ritu', 22, 0, 4),('manish', 25, 0, 6),('kirti', 23, 97, 5)
--,('other',0,0,0)
;
with cte as(
select name,age,marks,height,row_number() over(partition by 1 order by marks desc) row# from @mytable )
select name,age,marks,height,row# from cte where row#<4
union select 'others',0,0,0,4
order by row#
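The "top three plus a fixed OTHERS row" idea is not tied to SQL Server. As a rough sketch, the same shape can be run through Python's built-in sqlite3 module, where LIMIT 3 stands in for TOP 3. The helper column is_other is an assumption added here only to make the final ordering explicit; it is not part of any answer above.

```python
import sqlite3

def top_three_with_others():
    """Return the three highest marks followed by a constant OTHERS row."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE Test (name TEXT, age INT, marks INT, height INT);
        INSERT INTO Test VALUES
          ('anil', 25, 67, 5), ('ashish', 23, 75, 6), ('ritu', 22, 0, 4),
          ('manish', 25, 0, 6), ('kirti', 23, 97, 5);
    """)
    rows = con.execute("""
        SELECT name, age, marks, height
        FROM (
            -- the three best rows, tagged 0 so they sort first
            SELECT name, age, marks, height, 0 AS is_other
            FROM (SELECT name, age, marks, height
                  FROM Test ORDER BY marks DESC LIMIT 3)
            UNION ALL
            -- the constant summary row, tagged 1 so it sorts last
            SELECT 'OTHERS', 0, 0, 0, 1
        )
        ORDER BY is_other, marks DESC
    """).fetchall()
    con.close()
    return rows

for row in top_three_with_others():
    print(row)
```

Sorting on an explicit tag avoids relying on the order in which a UNION happens to return its branches.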

How to join tables when there are duplicates in right table

I have three tables. Table Cust has a custID field, plus various other values (name, address etc.).
Table List has a single column ID. Each ID is a custID in the Cust table.
Edit: the purpose of this is to filter the records, restricting the results to ones where the CustID appears in the List table.
All three tables are indexed.
Table Trans has a TransactionID field, a Cust field that holds a customer ID, and other transaction fields.
Edit: I should have mentioned that in some cases there will be no transaction record. In this case I want one row of customer info with the transaction fields null or blank.
I want a query to return cust and transaction ID for each ID in the List table. If there is more than one matching row in the transaction table, I want each included along with the matching cust info. So if the tables look like this:
Cust
ID Name
01 John
02 Mary
03 Mike
04 Jane
05 Sue
06 Frank
List
ID
01
03
05
06
Transact
TransID CustId Msg
21 01 There
22 01 is
23 02 a
24 03 tide
25 04 in
26 04 the
27 05 affairs
28 05 of
29 05 men
I want the result set to be:
CustID Name TransID Msg
01 John 21 There
01 John 22 is
03 Mike 24 tide
05 Sue 27 affairs
05 Sue 28 of
05 Sue 29 men
06 Frank -- --
(Where -- represents NULL or BLANK)
Obviously the actual tables are much larger (millions of rows), but that shows the pattern: one row for every item in the transaction table that matches any of the items in the List table, with matching fields from the Cust table. If there is no matching transaction, one row of customer info for that ID in the List table. CustID is unique in the Cust and List tables, but not in the transaction table.
This needs to work on any version of SQL server from 2005 onward, if that matters.
Any suggestions?
Unless I'm missing something, this is all you need to do:
Select T.CustID, C.Name, T.TransID, T.Msg
From Transact T
Join Cust C On C.Id = T.CustId
Join List L On L.Id = C.Id
Order By T.CustID, T.TransID
;with cust (id, name) as
(
select 1, 'John' union all
select 2, 'Mary' union all
select 3, 'Mike' union all
select 4, 'Jane' union all
select 5, 'Sue'
), list (id) as
(
select 1 union all
select 3 union all
select 5
), transact (TransId, CustId, Msg) as
(
select 21, 1, 'There '
union all select 22, 1, 'is'
union all select 23, 2, 'a'
union all select 24, 3, 'tide'
union all select 25, 4, 'in'
union all select 26, 4, 'the'
union all select 27, 5, 'affairs'
union all select 28, 5, 'of'
union all select 29, 5, 'men'
)
select
CustId = c.id,
Name = c.Name,
TransId = t.TransId,
Msg = t.Msg
from cust c
inner join list l
on c.id = l.id
inner join transact t
on l.id = t.custid
yields:
CustId Name TransId Msg
----------- ---- ----------- -------
1 John 21 There
1 John 22 is
3 Mike 24 tide
5 Sue 27 affairs
5 Sue 28 of
5 Sue 29 men
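The question's edit also asks for customers in List that have no transactions (Frank, ID 06), and an inner join to the transaction table drops those rows. One possible way to keep them is a LEFT JOIN onto the transaction table. The sketch below runs that variant through Python's built-in sqlite3 module rather than SQL Server, using the question's sample data; the function name customers_with_transactions is invented for this illustration.

```python
import sqlite3

def customers_with_transactions():
    """One row per matching transaction, or one NULL-padded row if none."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE Cust (ID INT, Name TEXT);
        INSERT INTO Cust VALUES
          (1,'John'),(2,'Mary'),(3,'Mike'),(4,'Jane'),(5,'Sue'),(6,'Frank');
        CREATE TABLE List (ID INT);
        INSERT INTO List VALUES (1),(3),(5),(6);
        CREATE TABLE Transact (TransID INT, CustId INT, Msg TEXT);
        INSERT INTO Transact VALUES
          (21,1,'There'),(22,1,'is'),(23,2,'a'),(24,3,'tide'),(25,4,'in'),
          (26,4,'the'),(27,5,'affairs'),(28,5,'of'),(29,5,'men');
    """)
    rows = con.execute("""
        SELECT c.ID, c.Name, t.TransID, t.Msg
        FROM List l
        JOIN Cust c ON c.ID = l.ID               -- List acts as the filter
        LEFT JOIN Transact t ON t.CustId = c.ID  -- keep customers without rows
        ORDER BY c.ID, t.TransID
    """).fetchall()
    con.close()
    return rows

for row in customers_with_transactions():
    print(row)
```

The inner join to List restricts the customers, while the outer join to the transaction table only adds columns, so a customer with no transactions still appears once with NULLs.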

How do I SELECT all the entries in a SQL table that have a date within the last week but only if the EQNum has not appeared in the past 6 months

I am very much a SQL novice. I am looking to write a script that will select all the columns from a table where two criteria are met:
The date of the call must have happened within the past 7 days
The EQNum must not have had another call placed on it in the past six months
Here is a sample table:
Call, Date, EQNum, Customer
123, 06-16-2015, 75, ABC Co
125, 06-16-2015, 82, XYZ Co
133, 06-14-2015, 69, DEF Co
101, 05-12-2015, 82, XYZ Co
115, 10-11-2014, 69, DEF Co
The query I need created should return:
123, 06-16-2015, 75, ABC Co
133, 06-14-2015, 69, DEF Co
Call 125 (EQNum 82) is eliminated because, though it occurred in the past week, EQNum 82 had another call (Call 101) within the last 6 months.
Call 133 is valid because the other call for EQNum 69 occurred more than 6 months ago.
Something like this:
SELECT *
from tbl
WHERE DateCol > DATEADD(day, -7, getdate())
AND NOT EXISTS (SELECT 1
FROM tbl this
WHERE this.EQNum = tbl.EQNum
AND this.[Call] <> tbl.[Call] -- ignore the row being tested, otherwise it always matches itself
AND this.DateCol > DATEADD(month, -6, getdate())
)
This is one way, although it probably wouldn't perform well if the table got massive
select
Call,
[Date],
EQNum,
Customer
from #table
where
[Date] > getdate() - 7 and
EQNum not in
(
select
EQNum
from #table
where
[Date] > DATEADD(month, -6, getdate())
group by
EQNum
having count(*) > 1
)
Another way would be to left join...
select
t1.Call,
t1.[Date],
t1.EQNum,
t1.Customer
from #table t1
left join #table t2 on
t1.Call != t2.Call and
t1.EQNum = t2.EQNum and
t2.Date > DATEADD(month, -6, getdate())
where
t1.[Date] > getdate() - 7 and
t2.Call is null
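To check the two date windows against the question's sample rows, here is a sketch that uses Python's built-in sqlite3 module instead of SQL Server. It makes several assumptions for the sake of a repeatable run: dates are stored as ISO text, "today" is passed in as a parameter rather than read from the clock, and the names Calls, CallNo and CallDate are made up for the example.

```python
import sqlite3

def first_calls_this_week(today):
    """Calls from the last 7 days whose EQNum had no other call in 6 months."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE Calls (CallNo INT, CallDate TEXT, EQNum INT, Customer TEXT);
        INSERT INTO Calls VALUES
          (123, '2015-06-16', 75, 'ABC Co'),
          (125, '2015-06-16', 82, 'XYZ Co'),
          (133, '2015-06-14', 69, 'DEF Co'),
          (101, '2015-05-12', 82, 'XYZ Co'),
          (115, '2014-10-11', 69, 'DEF Co');
    """)
    rows = con.execute("""
        SELECT c.CallNo
        FROM Calls c
        WHERE c.CallDate > date(:today, '-7 days')
          AND NOT EXISTS (
              SELECT 1
              FROM Calls o
              WHERE o.EQNum = c.EQNum
                AND o.CallNo <> c.CallNo   -- a call must not rule itself out
                AND o.CallDate > date(:today, '-6 months')
          )
        ORDER BY c.CallNo
    """, {"today": today}).fetchall()
    con.close()
    return [r[0] for r in rows]

print(first_calls_this_week("2015-06-17"))
```

ISO-formatted date strings compare correctly as plain text, which is why the string comparison against date(...) is sufficient in this sketch.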

TSQL - Select the rows with the higher count and when count is the same, the row with the higher id value

HELP!!! I'm stumped and have tried several options to no avail...
I need to return one row for each Pub_id, and the row that is returned should be the one with the higher Count and when there is more than one row with the highest count, I need the one with the higher price_id.
I have populated a table with this data...
pub_id, price_id, count
7, 59431, 5
22, 39964, 4
39, 112831, 3
39, 120715, 2
47, 95359, 2
74, 142825, 5
74, 106688, 5
74, 37514, 1
and this is what I need to return...
pub_id, price_id, count
7, 59431, 5
22, 39964, 4
39, 112831, 3
47, 95359, 2
74, 142825, 5
;WITH T
AS (SELECT *,
ROW_NUMBER() OVER (PARTITION BY pub_id
ORDER BY [count] DESC, price_id DESC) AS rn
FROM your_table)
SELECT pub_id,
[count],
price_id
FROM T
WHERE rn=1
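Do you want something like this?
select t.pub_id,
       t.[count],
       max(t.price_id) as price_id
from table_name t
join (select pub_id,
             max([count]) as max_count -- highest count per pub_id
      from table_name
      group by pub_id) der_tab
  on der_tab.pub_id = t.pub_id
 and der_tab.max_count = t.[count]
group by t.pub_id,
         t.[count]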
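The tie-breaking rule (highest count first, then highest price_id) can also be written without window functions, as a NOT EXISTS test for "no better row in the same pub_id". The sketch below tries that against the question's data using Python's built-in sqlite3 module rather than SQL Server. The table name PubPrices and the column name cnt (used instead of count) are assumptions made for the example.

```python
import sqlite3

def best_row_per_pub():
    """Per pub_id, keep the row with the highest cnt, then highest price_id."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE PubPrices (pub_id INT, price_id INT, cnt INT);
        INSERT INTO PubPrices VALUES
          (7, 59431, 5), (22, 39964, 4), (39, 112831, 3), (39, 120715, 2),
          (47, 95359, 2), (74, 142825, 5), (74, 106688, 5), (74, 37514, 1);
    """)
    rows = con.execute("""
        SELECT p.pub_id, p.price_id, p.cnt
        FROM PubPrices p
        WHERE NOT EXISTS (
            SELECT 1
            FROM PubPrices q
            WHERE q.pub_id = p.pub_id
              -- q beats p on count, or ties on count and wins on price_id
              AND (q.cnt > p.cnt
                   OR (q.cnt = p.cnt AND q.price_id > p.price_id))
        )
        ORDER BY p.pub_id
    """).fetchall()
    con.close()
    return rows

for row in best_row_per_pub():
    print(row)
```

For pub_id 74 the two rows with count 5 tie, and the one with price_id 142825 survives because no other row in that group beats it on either criterion.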
