Using `DISTINCT ON` in Snowflake - snowflake-cloud-data-platform

I have a below query where I need to do a DISTINCT ON the allowed_id column from the union result, as is possible in PostgreSQL. I have read that Snowflake uses similar kind of PostgreSQL but DISTINCT ON didn't work.
select distinct on (allowed_id), * from (
select listagg(distinct id) as allowed_id, count (people) as totalpeople ,max(score) as maxscore , min(score) as minscore, 'n' as type from tableA
where userid = 123
union
select listagg(distinct id) as allowed_id, count (people) as totalpeople, max(elscore) as maxscore , min(elscore) as minscore, 'o' as type from tableB
where userid = 123
union
select listagg(distinct id) as allowed_id, null, null , null , 'j' as type from tableC
where userid = 123
union
select listagg(distinct id) as allowed_id, null, null , null , 'a' as type from tableD
where userid = 123
)

Snowflake does not support "DISTINCT ON", but you can use QUALIFY and ROW_NUMBER to produce the same result:
SELECT * from (
select * from values (123,11,12,'a' ) as tableA (allowed_id, col2, col3, table_name)
union all
select * from values (123,21,22,'b' ) as tableA (allowed_id, col2, col3, table_name)
union all
select * from values (123,31,32,'c' ) as tableA (allowed_id, col2, col3, table_name)
union all
select * from values (123,41,42,'d' ) as tableA (allowed_id, col2, col3, table_name)
)
where allowed_id = 123
QUALIFY ROW_NUMBER() OVER (PARTITION BY allowed_id ORDER BY allowed_id) = 1 ;
Please check:
https://docs.snowflake.net/manuals/sql-reference/constructs/qualify.html
https://docs.snowflake.net/manuals/sql-reference/functions/row_number.html

Related

Deactivate duplicate data based on two group by columns with check not null

I have this following query to return duplicate rows by 2 columns but it returns rows with NULL values. I would like to return rows with NOT NULL for both columns. Also how can I deactivate duplicate record with old ID?
With deactivaterows as (
Select t.*, count(*) over (partition by col1, col2) as cnt from TABLE1 t)
Select * from deactivaterows where cnt > 1;
You want row_number not count(*):
with deactivaterows as
(
select t.*, row_number() over(partition by t.col1, t.col2 order by (select null)) as rn
from TABLE1
where t.col1 is not null and t.col2 is not null
)
select * from deactivaterows where rn > 1
I would do it this way:
select col1, col2, count(*)
from TABLE1
where col1 is not null and col2 is not null
group by col1, col2
having count(*) > 1
The having command is great, it filters results after the group by and the select. You can always throw in a where regardless, and this will compare before you start grouping, counting or selecting anything.
Alternatively, you can certainly modify your existing solution by throwing in a where clause without issue:
With deactivaterows as (
Select t.*, count(*) over (partition by col1, col2) as cnt from TABLE1 t
where col1 is not null and col2 is not null)
Select * from deactivaterows where cnt > 1;

Inserting multiple rows in SQL?

I have WindowTable with following data :
SELECT Id FROM WindowTable WHERE OwnerRef=12
Id
----
25000
25001
25003
25004
25005
25006
25007
25008
I want to Insert 3 row per each WindowTable Row in ActionTable
Like this :
Id WindowsRef ActionName ActionName2
-----------------------------------------------
1 25000 'Add' 'E'
2 25000 'DELETE' 'H'
3 25000 'UPDATE' 'B'
4 25001 'ADD' 'E'
5 25001 'DELETE' 'H'
6 25001 'Update' 'B'
. . .
. . .
ActionTable.Id is not identity column
Or:
insert into ActionTable (Id, WindowsRe, ActionName, ActionName2)
select
isnull((select max(at.Id) from ActionTable at), 0) +
row_number() over (order by w.Id, a.Action),
w.Id, a.Action, a.Action2
from WindowsTable w
cross join
(
select 'Add' as Action, 'E' as Action2
union all select 'Delete', 'H'
union all select 'Update', 'B'
) a
UPD: Fixed misstyping. Thanks #Hua_Trung for comment
UPD2: Added ActionTable.Id generation
UPD3: Added ActionName2
Something like:
insert into ActionTable(WindowsRef, ActionName)
select id WindowsRef, 'Add'
from WindowsTable
union all
select id WindowsRef, 'DELETE'
from WindowsTable
union all
select id WindowsRef, 'UPDATE'
from WindowsTable
(Assuming ActionTable.Id is an identity column or otherwise database generated.)
To generate id values as well
insert into ActionTable(id, WindowsRef, ActionName)
select
(isnull(select max(id) from ActionTable, 0)
+ row_number() over (order by x.WindowsRef, x.ActionName)
) id,
x.WindowsRef, x.ActionName
from (
select id WindowsRef, 'Add' ActionName
from WindowsTable
union all
select id WindowsRef, 'DELETE' ActionName
from WindowsTable
union all
select id WindowsRef, 'UPDATE' ActionName
from WindowsTable
) x
one more approach using apply.Using sample data from Praveen ND
select *,row_number() over (order by id) as id from #WindowTable
cross apply
(
values('add','E'),
('delete','h'),
('update','b')
)b(action,action2)
This will help you to create the script for inserting to ActionTable
DECLARE #WindowTable TABLE (ID INT)
INSERT INTO #WindowTable VALUES
(25001),
(25003),
(25004),
(25005),
(25006),
(25007),
(25008)
SELECT 'INSERT INTO ActionTable (WindowsRe,ActionName) VALUES ('+ CAST(ID AS NVARCHAR(MAX))+',ADD)' FROM #WindowTable
UNION
SELECT 'INSERT INTO ActionTable (WindowsRe,ActionName) VALUES ('+ CAST(ID AS NVARCHAR(MAX))+',UPDATE)' FROM #WindowTable
UNION
SELECT 'INSERT INTO ActionTable (WindowsRe,ActionName) VALUES ('+ CAST(ID AS NVARCHAR(MAX))+',DELETE)' FROM #WindowTable
Note : Considering ID in ActionTable as IDENTITY.
IF ID in ActionTable is not an IDENTITY :
Try to make use of below Query :
DECLARE #WindowTable TABLE (ID INT)
INSERT INTO #WindowTable VALUES
(25001),
(25003),
(25004),
(25005),
(25006),
(25007),
(25008)
DECLARE #id INT =1;
DECLARE #id1 INT = (SELECT COUNT(*) FROM #WindowTable)
DECLARE #id2 INT = (SELECT 2 *COUNT(*) FROM #WindowTable)
SELECT 'INSERT INTO ActionTable (ID,WindowsRe,ActionName,ActionName2) VALUES ('+CAST(ROW_NUMBER() OVER(ORDER BY #id) AS NVARCHAR(MAX))+','+ CAST(ID AS NVARCHAR(MAX))+',ADD,''E'')' FROM #WindowTable
UNION
SELECT 'INSERT INTO ActionTable (ID,WindowsRe,ActionName,ActionName2) VALUES ('+CAST(#id1+ ROW_NUMBER() OVER(ORDER BY #id1) AS NVARCHAR(MAX))+','+ CAST(ID AS NVARCHAR(MAX))+',UPDATE,''B'')' FROM #WindowTable
UNION
SELECT 'INSERT INTO ActionTable (ID,WindowsRe,ActionName,ActionName2) VALUES ('+CAST(#id2+ ROW_NUMBER() OVER(ORDER BY #id2) AS NVARCHAR(MAX))+','+ CAST(ID AS NVARCHAR(MAX))+',DELETE,''H'')' FROM #WindowTable
Hope this helps
Test Data
;WITH cte_TestData(WindowsRef) AS
(
SELECT 25000 UNION ALL
SELECT 25001 UNION ALL
SELECT 25003 UNION ALL
SELECT 25004 UNION ALL
SELECT 25005 UNION ALL
SELECT 25006 UNION ALL
SELECT 25007 UNION ALL
SELECT 25008
)
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS ID,
a.WindowsRef,
b.ActionName,
b.ActionName2
FROM cte_TestData a
CROSS JOIN (
SELECT 'Add' AS ActionName,'E' AS ActionName2 UNION ALL
SELECT 'DELETE','H' UNION ALL
SELECT 'UPDATE','B') b
ORDER BY a.WindowsRef
Against Actual Data
;WITH cte_TestData(WindowsRef) AS
(
SELECT Id
FROM WindowTable
WHERE OwnerRef=12
)
,cte_Action AS
(
SELECT 'Add' AS ActionName,'E' AS ActionName2 UNION ALL
SELECT 'DELETE','H' UNION ALL
SELECT 'UPDATE','B'
)
--INSERT INTO <DestinationTable>
/*
- Replace <DestinationTable> with Target Table Name
- If the destination table has data the the ID has to be incremented accordingly.
-- In that case define a variable, get MAX of that ID and add that to the below
auto generated ID to preserve the sequence.
--Better yet, use Identity column as your ID Column
*/
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS ID,
a.WindowsRef,
b.ActionName,
b.ActionName2
FROM cte_TestData a
CROSS JOIN cte_Action b
--WHERE NOT EXISTS (SELECT 1 FROM <DestinationTable> c WHERE c.WindowsRef = a.WindowsRef)
/*
Enable to ensure duplicate records are not inserted, Replace <DestinationTable> with Target Table Name
*/
ORDER BY a.WindowsRef

Two columns into one using alternate rows

I have a table with two columns like this:
A 1
B 2
C 3
D 4
E 5
etc.
I want to get them into one column, with each column's data in alternate rows of the new column like this:
A
1
B
2
C
3
D
4
E
5
etc.
I would use a UNION ALL but here is the UNPIVOT alternative:
CREATE TABLE #Table1(letter VARCHAR(10),Id VARCHAR(10))
INSERT INTO #Table1(letter ,Id )
SELECT 'A',1 UNION ALL
SELECT 'B',2 UNION ALL
SELECT 'C',3 UNION ALL
SELECT 'D',4 UNION ALL
SELECT 'E',5
SELECT [value]
FROM #Table1
UNPIVOT
(
[value] FOR [Column] IN ([Id], [letter])
) UNPVT
DROP TABLE #Table1;
The tricky part is the data in alternate rows
select col2
from
( select col1, 1 as flag, col1 from tab
union all
select col1, 2, col2 from tab
) as dt
order by col1, flag
But why do you try to do this at all?
Try this :
WITH CTE AS
(
SELECT Col1Name Name,(ROW_NUMBER() OVER(ORDER BY(SELECT NULL))) RN
FROM TableName
UNION
SELECT Col2Name Name,(ROW_NUMBER() OVER(ORDER BY(SELECT NULL)))+1 RN
FROM TableName
)
SELECT Name
FROM CTE
ORDER BY RN
CREATE TABLE #Table1(Value VARCHAR(10),Id VARCHAR(10))
INSERT INTO #Table1(Value ,Id )
SELECT 'A',1 UNION ALL
SELECT 'B',2 UNION ALL
SELECT 'C',3 UNION ALL
SELECT 'D',4 UNION ALL
SELECT 'E',5
;WITH _CTE (Name) AS
(
SELECT Value [Name]
FROM #Table1
UNION ALL
SELECT Id [Name]
FROM #Table1
)
SELECT * FROM _CTE

How to find the maximum value in join without using if in sql stored procedure

I have a two tables like below
A
Id Name
1 a
2 b
B
Id Name
1 t
6 s
My requirement is to find the maximum id from table and display the name and id for that maximum without using case and if.
i findout the maximum by using below query
SELECT MAX(id)
FROM (SELECT id,name FROM A
UNION
SELECT id,name FROM B) as c
I findout the maximum 6 using the above query.but i can't able to find the name.I tried the below query but it's not working
SELECT MAX(id)
FROM (SELECT id,name FROM A
UNION
SELECT id,name FROM B) as c
How to find the name?
Any help will be greatly appreciated!!!
First combine the tables, since you need to search both. Next, determine the id you need. JOIN the id back with the temporarily created table to retreive the name that belongs to that id:
WITH tmpTable AS (
SELECT id,name FROM A
UNION
SELECT id,name FROM B
)
, max AS (
SELECT MAX(id) id
FROM tmpTable
)
SELECT t.id, t.name
FROM max m
JOIN tmpTable t ON m.id = t.id
You could use ROW_NUMBER(). You have to UNION ALL TableA and TableB first.
WITH TableA(Id, Name) AS(
SELECT 1, 'a' UNION ALL
SELECT 2, 'b'
)
,TableB(Id, Name) AS(
SELECT 1, 't' UNION ALL
SELECT 6, 's'
)
,Combined(Id, Name) AS(
SELECT * FROM TableA UNION ALL
SELECT * FROM TableB
)
,CTE AS(
SELECT *, RN = ROW_NUMBER() OVER(ORDER BY ID DESC) FROM Combined
)
SELECT Id, Name
FROM CTE
WHERE RN = 1
Just order by over the union and take first row:
SELECT TOP 1 * FROM (SELECT * FROM A UNION SELECT * FROM B) x
ORDER BY ID DESC
This won't show ties though.
For you stated that you used SQL Server 2008. Therefore,I used FULL JOIN and NESTED SELECT to get what your looking for. See below:
SELECT
(SELECT
1,
ISNULL(A.Id,B.Id)Id
FROM A FULL JOIN B ON A.Id=B.Id) AS Id,
(SELECT
1,
ISNULL(A.Name,B.Name)Name
FROM A FULL JOIN B ON A.Id=B.Id) AS Name
It's possible to use ROW_NUMBER() or DENSE_RANK() functions to get new numiration by Id, and then select value with newly created orderId equal to 1
Use:
ROW_NUMBER() to get only one value (even if there are some repetitions of max id)
DENSE_RANK() to get all equal max id values
Here is an example:
DECLARE #tb1 AS TABLE
(
Id INT
,[Name] NVARCHAR(255)
)
DECLARE #tb2 AS TABLE
(
Id INT
,[Name] NVARCHAR(255)
)
INSERT INTO #tb1 VALUES (1, 'A');
INSERT INTO #tb1 VALUES (7, 'B');
INSERT INTO #tb2 VALUES (4, 'C');
INSERT INTO #tb1 VALUES (7, 'D');
SELECT * FROM
(SELECT Id, Name, ROW_NUMBER() OVER (ORDER BY Id DESC) AS [orderId]
FROM
(SELECT Id, Name FROM #tb1
UNION
SELECT Id, Name FROM #tb2) as tb3) AS TB
WHERE [orderId] = 1
SELECT * FROM
(SELECT Id, Name, DENSE_RANK() OVER (ORDER BY Id DESC) AS [orderId]
FROM
(SELECT Id, Name FROM #tb1
UNION
SELECT Id, Name FROM #tb2) as tb3) AS TB
WHERE [orderId] = 1
Results are:

How to select from duplicate rows from a table?

I have the following table:
CREATE TABLE TEST(ID TINYINT NULL, COL1 CHAR(1))
INSERT INTO TEST(ID,COL1) VALUES (1,'A')
INSERT INTO TEST(ID,COL1) VALUES (2,'B')
INSERT INTO TEST(ID,COL1) VALUES (1,'A')
INSERT INTO TEST(ID,COL1) VALUES (1,'B')
INSERT INTO TEST(ID,COL1) VALUES (1,'B')
INSERT INTO TEST(ID,COL1) VALUES (2,'B')
I would like to select duplicate rows from that table. How can I select them?
I tried the following:
SELECT TEST.ID,TEST.COL1
FROM TEST WHERE TEST.ID IN
(SELECT ID
FROM TEST WHERE TEST.COL1 IN
(SELECT COL1
FROM TEST WHERE TEST.ID IN
(SELECT ID
FROM TEST
GROUP BY ID
HAVING COUNT(*) > 1)
GROUP BY COL1
HAVING COUNT(*) > 1)
GROUP BY ID
HAVING COUNT(*) > 1)
Where's the error? What do I need to modify?
And I would like it to show as:
ID COL1
---- ----
1 A
1 A
1 B
1 B
(4 row(s) affected)
SELECT id, col1
FROM Test
GROUP BY id, col1
HAVING COUNT(*) > 1
when you use
SELECT id, col1, COUNT(*) AS cnt
FROM Test
GROUP BY id, col1
HAVING COUNT(*) > 1
you practically have all duplicate rows and how often they appear. You can't identify them individually either way.
A slower way would be:
SELECT id, col1
FROM Test T
WHERE (SELECT COUNT(*)
FROM Test I
WHERE I.id = T.id AND I.col1 = T.col1) > 1
Using Sql Server 2005+ and CTE you could try
;WITH Dups AS (
SELECT *,
ROW_NUMBER() OVER(PARTITION BY ID, Col1 ORDER BY ID, Col1) Rnum
FROM #TEST t
)
SELECT *
FROM Dups
WHERE Rnum > 1
OR just a standard
SELECT ID,
Col1,
COUNT(1) Cnt
FROM #TEST
GROUP BY ID,
Col1
HAVING COUNT(1) > 1
EDIT:
Display duplicate rows
SELECT t.*
FROM #Test t INNER JOIN
(
SELECT ID,
Col1,
COUNT(1) Cnt
FROM #TEST
GROUP BY ID,
Col1
HAVING COUNT(1) > 1
) dups ON t.ID = dups.ID
AND t.Col1 = dups.Col1
Every row in that set of data is a duplicate
select id, col1, count(*)
from test
group by id, col1
shows this
if you want to exclude the 2,B rows you need to do it explicitly
eg
SELECT id, col1
FROM Test
WHERE NOT (id = 2 and col1 = 'B')
SELECT t.*
FROM TEST t
INNER JOIN (
SELECT ID,COL1
from test
GROUP BY ID,COL1
HAVING COUNT(*) > 1
)
AS t2
ON t2.ID = t.ID AND t2.COL1 =t.COL1
order by t.ID,t.COL1

Resources