Group row id for all unique rows in SQL Server

I would like to generate a group row id for each set of unique rows. Any hint or suggestion would be helpful. I am using SQL Server 2017.

You can use the window function ROW_NUMBER(), which numbers the rows inside each (col1, col2) group, as shown below:
Select col1
, col2
, ROW_NUMBER() OVER (PARTITION BY col1, col2 ORDER BY col1, col2) [Group ID]
from <YourTableName>

In your case, use the ROW_NUMBER function with PARTITION BY on the column names:
SELECT *, ROW_NUMBER() OVER (PARTITION BY COL1 ORDER BY COL1) AS RowNum FROM #T
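If the goal is a single id shared by every row with the same values (rather than a counter that restarts inside each group), DENSE_RANK() without a PARTITION BY may be closer to what is wanted. The following is only a sketch under that assumption; the inline sample rows are made up for illustration, and col1/col2 follow the answer above.

```sql
-- Hypothetical sample data; only the col1/col2 names come from the answer above.
SELECT col1,
       col2,
       -- counter that restarts at 1 inside each (col1, col2) group
       ROW_NUMBER() OVER (PARTITION BY col1, col2 ORDER BY (SELECT NULL)) AS RowInGroup,
       -- one id per distinct (col1, col2) combination, shared by its duplicates
       DENSE_RANK() OVER (ORDER BY col1, col2) AS GroupId
FROM (VALUES ('A', 1), ('A', 1), ('B', 2), ('B', 3)) AS t(col1, col2);
```

Here the two ('A', 1) rows both get GroupId 1 while RowInGroup counts them 1 and 2, so which function fits depends on which of the two numberings is needed.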

Related

Snowflake order by

Is there any way I can select from a table and order by a column that is not in the select list?
select col1 from table order by col2
This works in TSQL, but doesn't appear to be allowed in Snowflake.
Yes, this is possible:
CREATE OR REPLACE TABLE tab AS
SELECT 1 AS col1, 'B' AS col2 UNION ALL
SELECT 2, 'A';
SELECT col1
FROM tab
ORDER BY col2;

SQL Server - How to set Row ID for duplicate or similar content based on insertion first. Select query

Priority is the desired output column.
The Group column contains duplicate values.
How can I produce this with a SQL query?
One way to achieve the desired result with this data is to use the DENSE_RANK() function, as below:
select *, dense_rank() over (order by [Group]) as Priority
from tab
order by No
For arbitrary Group values, where the priority should follow the order of first insertion, please try the following:
;with cte as
(
select [Group], ROW_NUMBER() over (order by No_min) as rn
from
(
select [Group], min([No]) No_min
from tab
group by [Group]
)t
)
select t.*, x.rn as [Priority]
from cte x
join tab t on t.[Group] = x.[Group]
order by 1
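The same idea can also be sketched without a join, assuming the tab table and its [No] and [Group] columns from the answers above: compute each group's smallest [No] with a windowed MIN(), then rank the groups by that value. The FirstNo alias is introduced here only for illustration.

```sql
-- Sketch: FirstNo is a hypothetical helper column, not part of the original table.
SELECT t.*,
       -- groups are ranked by the [No] of their earliest row
       DENSE_RANK() OVER (ORDER BY t.FirstNo) AS Priority
FROM (
    SELECT *,
           MIN([No]) OVER (PARTITION BY [Group]) AS FirstNo
    FROM tab
) AS t
ORDER BY t.[No];
```

The derived table is needed because a window function cannot be nested directly inside another window function's ORDER BY.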

SQL Server: select all duplicate rows where col1+col2 exists more than once

I have a table with around 300,000 rows. 225 rows have been added to this table daily from March 16, 2015 to July 09, 2015.
My problem is that, for the last week or so, some duplicate rows have been entered in the table (i.e. more than 225 per day).
Now I want to select (and ultimately delete!) all the duplicate rows, that is, rows where the same siteID + reportID combination exists more than once for one value of the Date column.
Example is attached in the screenshot:
When ROW_NUMBER() is used with a PARTITION BY clause, it lets the SQL developer select duplicate rows in a table.
Please check the SQL tutorial on how to delete duplicate rows in a SQL table.
The query below is copied from that article and applied to your requirement:
;WITH DUPLICATES AS
(
SELECT *,
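-- [Date] is part of the partition so that only repeats within the same day count as duplicates
RN = ROW_NUMBER() OVER (PARTITION BY siteID, ReportID, [Date] ORDER BY [Date])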
FROM myTable
)
DELETE FROM DUPLICATES WHERE RN > 1
I hope it helps.
When you want to filter duplicated rows, I suggest this type of query:
SELECT *
FROM (
SELECT *, ROW_NUMBER() OVER (PARTITION BY Col1, Col2 ORDER BY Col3) As seq
FROM yourTable) dt
WHERE (seq > 1)
Like this:
SELECT *
FROM (
SELECT *, ROW_NUMBER() OVER (PARTITION BY siteID, reportID, [Date] ORDER BY ID) As seq
FROM yourTable) dt
WHERE (seq > 1)
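Before deleting anything, it may help to check how many copies each combination has. This is a minimal sketch that assumes the same yourTable, siteID, reportID and [Date] names used in the query above; the Copies alias is made up for illustration.

```sql
-- List every siteID + reportID + Date combination that occurs more than once.
SELECT siteID,
       reportID,
       [Date],
       COUNT(*) AS Copies   -- hypothetical alias: number of rows sharing the combination
FROM yourTable
GROUP BY siteID, reportID, [Date]
HAVING COUNT(*) > 1
ORDER BY [Date], siteID, reportID;
```

If this returns no rows, the ROW_NUMBER() queries above should not find anything with seq > 1 either.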

Use percentile_cont with a "group by" statement in T-SQL

I'd like to use the percentile_cont function to get median values in T-SQL. However, I also need to get mean values as well. I'd like to do something like the following:
SELECT CustomerID,
AVG(Expenditure) AS MeanSpend,
percentile_cont(.5) WITHIN GROUP(ORDER BY Expenditure) OVER() AS MedianSpend
FROM Customers
GROUP BY CustomerID
Can this be accomplished? I know I can use the OVER clause to group the percentile_cont results...
but then I'm stuck using two queries, am I not?
Just figured it out... you have to drop the GROUP BY and give both aggregate functions an OVER clause.
SELECT CustomerID,
AVG(Expenditure) OVER(PARTITION BY CustomerID) AS MeanSpend,
percentile_cont(.5) WITHIN GROUP(ORDER BY Expenditure) OVER(PARTITION BY CustomerID) AS MedianSpend
FROM Customers
In SQL Server, percentile_cont is only available as a window function, so it cannot act as a "group by" aggregate. Window functions return the aggregated value on every row. One way is to use "select distinct" to get rid of the duplicate rows. Just make sure you partition each window function by the non-aggregated columns (groupId in this example).
--Generate test data
SELECT TOP(10)
value.number%3 AS groupId
, value.number AS number
INTO #data
FROM master.dbo.spt_values AS value
WHERE value."type" = 'P'
ORDER BY NEWID()
;
--View test data
SELECT * FROM #data ORDER BY groupId,number;
--CALCULATE MEDIAN
SELECT DISTINCT
groupId
, AVG(number) OVER(PARTITION BY groupId) AS mean
, percentile_cont(.5) WITHIN GROUP(ORDER BY number) OVER(PARTITION BY groupId) AS median
FROM #data
;
--Clean up
DROP TABLE #data;
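If keeping the GROUP BY is preferred over SELECT DISTINCT, one possible variation is to compute the windowed median in a derived table and then aggregate over it. This is only a sketch using the Customers, CustomerID and Expenditure names from the question; the derived-table alias c is arbitrary.

```sql
SELECT CustomerID,
       AVG(Expenditure) AS MeanSpend,
       -- the median is identical on every row of a customer, so MAX just picks it
       MAX(MedianSpend) AS MedianSpend
FROM (
    SELECT CustomerID,
           Expenditure,
           PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY Expenditure)
               OVER (PARTITION BY CustomerID) AS MedianSpend
    FROM Customers
) AS c
GROUP BY CustomerID;
```

Both forms should return one row per customer; the choice is mostly a matter of which reads more clearly.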

Generate Row Serial Numbers in SQL Query

I have a customer transaction table. I need to create a query that includes a serial number pseudo column. The serial number should be automatically reset and start over from 1 upon change in customer ID.
Now, I am familiar with the row_number() function in SQL. This doesn't exactly solve my problem because, to the best of my knowledge, the serial number will not be reset when the order of the rows changes.
I want to do this in a single query (SQL Server) and without having to go through any temporary table usage etc. How can this be done?
Sometimes we don't want to apply any ordering to the result set just to add a serial number. But ROW_NUMBER() requires an ORDER BY clause. So we can apply a simple trick to avoid ordering the result set:
SELECT ROW_NUMBER() OVER(ORDER BY (SELECT 1)) AS ItemNo, ItemName FROM ItemMastetr
This way we don't need to apply an ORDER BY to the result set; we just add ItemNo to it. Note that the numbers are then assigned in an arbitrary order that SQL Server does not guarantee.
select
ROW_NUMBER() Over (Order by CustomerID) As [S.N.],
CustomerID ,
CustomerName,
Address,
City,
State,
ZipCode
from Customers;
Based on your question, I'm not certain whether you want numbered rows that keep their numbers even if the underlying data changes (and gives a different ordering). But if you just want numbered rows that reset on a change in customer ID, then try using the PARTITION BY clause of row_number():
row_number() over(partition by CustomerID order by CustomerID)
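Because every row in a partition has the same CustomerID, ordering by CustomerID alone leaves the numbering inside each customer arbitrary. A sketch of a more predictable version follows; the CustomerTransactions table and its TransactionDate and TransactionID columns are assumptions, since the question does not show the schema.

```sql
-- Hypothetical table and column names (CustomerTransactions, TransactionDate, TransactionID).
SELECT ROW_NUMBER() OVER (PARTITION BY CustomerID
                          ORDER BY TransactionDate, TransactionID) AS SerialNo,
       CustomerID,
       TransactionDate,
       TransactionID
FROM CustomerTransactions
ORDER BY CustomerID, SerialNo;   -- SerialNo restarts at 1 for each customer
```

Any column (or combination) that is unique within a customer would work as the tie-breaker in the ORDER BY.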
Implementing Serial Numbers Without Ordering Any of the Columns
Demo SQL script:
IF OBJECT_ID('Tempdb..#TestTable') IS NOT NULL
DROP TABLE #TestTable;
CREATE TABLE #TestTable (Names VARCHAR(75), Random_No INT);
INSERT INTO #TestTable (Names,Random_No) VALUES
('Animal', 363)
,('Bat', 847)
,('Cat', 655)
,('Duet', 356)
,('Eagle', 136)
,('Frog', 784)
,('Ginger', 690);
SELECT * FROM #TestTable;
There are many ways to implement serial numbers in SQL Server. Here we use the simple ROW_NUMBER function to generate them.
ROW_NUMBER() is a window function that numbers rows sequentially (for example 1, 2, 3, …). The value is temporary and is calculated when the query runs. It must have an OVER clause with ORDER BY, so the ORDER BY clause cannot simply be omitted. But we can write it as below:
SQL script:
IF OBJECT_ID('Tempdb..#TestTable') IS NOT NULL
DROP TABLE #TestTable;
CREATE TABLE #TestTable (Names VARCHAR(75), Random_No INT);
INSERT INTO #TestTable (Names,Random_No) VALUES
('Animal', 363)
,('Bat', 847)
,('Cat', 655)
,('Duet', 356)
,('Eagle', 136)
,('Frog', 784)
,('Ginger', 690);
SELECT Names,Random_No,ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS SERIAL_NO FROM #TestTable;
In the above query, we can also use SELECT 1, SELECT 'ABC', or SELECT '' instead of SELECT NULL. The result would be the same.
SELECT ROW_NUMBER() OVER (ORDER BY ColumnName1) As SrNo, ColumnName1, ColumnName2 FROM TableName
select ROW_NUMBER() over (order by pk_field ) as srno
from TableName
Using Common Table Expression (CTE)
WITH CTE AS(
SELECT ROW_NUMBER() OVER(ORDER BY CustomerId) AS RowNumber,
Customers.*
FROM Customers
)
SELECT * FROM CTE
I found one solution for MySQL: it's easy to add a new SrNo column, a kind of temporary auto-increment column, with the following query:
SELECT @ab:=@ab+1 as SrNo, tablename.* FROM tablename, (SELECT @ab:= 0)
AS ab
-- Returns the integers from @Start to @End (CREATE OR ALTER needs SQL Server 2016 SP1 or later)
CREATE OR ALTER function dbo.FN_ReturnNumberRows(@Start int, @End int) returns @Numbers table (Number int) as
begin
insert into @Numbers
select n = ROW_NUMBER() OVER (ORDER BY n)+@Start-1 from (
-- the cross joins only serve to produce enough rows to number
select top (@End-@Start+1) 1 as n from information_schema.columns as A
cross join information_schema.columns as B
cross join information_schema.columns as C
cross join information_schema.columns as D
cross join information_schema.columns as E) X
return
end
GO
select * from dbo.FN_ReturnNumberRows(10,9999)
