I want to write an update script for the following table.
Id int,
Title nvarchar(100),
ProgramId int,
EventId int,
SortOrder int
I want to set the SortOrder column to 1 through N, as sorted by the Id column. However, I want the number to restart when either ProgramId or EventId changes. That is, I'd like the numbering sequence 1...N for each row with the same ProgramId and EventId values, and then restart the numbering for the next ProgramId and EventId values.
I know I could use ROW_NUMBER to get a row number based on the current sorting, but I don't see how I could restart the number when one of those other two columns changes. Is this even possible?
Like this:
;WITH cte As
(
SELECT *,
ROW_NUMBER() OVER (PARTITION BY ProgramId, EventId ORDER BY Id) As RN
FROM YourTable
)
UPDATE cte
SET SortOrder = RN
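To see the restart behaviour on concrete rows, here is a minimal sketch that runs the same window expression in SQLite through Python's sqlite3 module. The sample rows are made up, and SQLite is only a stand-in: it cannot UPDATE through a CTE the way SQL Server can, so this sketch writes the numbers back by Id instead.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for SQL Server here.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE YourTable (Id INTEGER PRIMARY KEY, Title TEXT, "
    "ProgramId INT, EventId INT, SortOrder INT)"
)
conn.executemany(
    "INSERT INTO YourTable (Id, Title, ProgramId, EventId) VALUES (?, ?, ?, ?)",
    [(1, "a", 1, 1), (2, "b", 1, 1), (3, "c", 1, 2),
     (4, "d", 2, 1), (5, "e", 1, 1), (6, "f", 2, 1)],
)

# Same window expression as in the answer: the numbering restarts
# for every distinct (ProgramId, EventId) pair and follows Id order.
numbered = conn.execute(
    "SELECT Id, ROW_NUMBER() OVER "
    "(PARTITION BY ProgramId, EventId ORDER BY Id) AS RN "
    "FROM YourTable"
).fetchall()

# SQLite cannot update through a CTE, so write the numbers back by key.
conn.executemany(
    "UPDATE YourTable SET SortOrder = ? WHERE Id = ?",
    [(rn, row_id) for row_id, rn in numbered],
)

sort_orders = conn.execute(
    "SELECT Id, ProgramId, EventId, SortOrder FROM YourTable ORDER BY Id"
).fetchall()
for row in sort_orders:
    print(row)
```

Rows 1, 2 and 5 share ProgramId 1 and EventId 1, so they get 1, 2 and 3, while every other pair starts again at 1.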
I am using ROW_NUMBER() inside a CTE to find duplicates. I am using ROW_NUMBER() because I also want a column that shows how many duplicate rows are present in the table.
The code below returns only rows with a row number greater than 1, that is, row numbers 2 and 3. How can I also include row number 1 of the duplicate records?
If I remove T>1, then the output also contains records that have no duplicates, such as the record with Part '0020R5',100.
DDL:
DROP TABLE #TEST
CREATE TABLE #TEST
(
PART VARCHAR(30),
ALTPART int
)
INSERT #TEST
SELECT '15-AB78',100 UNION ALL
SELECT '15-AB78',110 UNION ALL
SELECT '16-A9-1',100 UNION ALL
SELECT '16-A9-1',110 UNION ALL
SELECT '16-B97-2',100 UNION ALL
SELECT '16-B97-2',110 UNION ALL
SELECT '0020R5',100
Query:
WITH TEST(PART,ALTPART,T) AS
(
SELECT PART,ALTPART,ROW_NUMBER() OVER (PARTITION BY PART ORDER BY ALTPART ASC) AS T FROM #TEST
)
SELECT PART,ALTPART,T
FROM TEST
WHERE T>1
ORDER BY PART
GO
Current output:
'15-AB78',110,2
'16-A9-1',110,2
'16-B97-2',110,2
Expected Result:
'15-AB78',100,1
'15-AB78',110,2
'16-A9-1',100,1
'16-A9-1',110,2
'16-B97-2',100,1
'16-B97-2',110,2
You need to add another window function to count the number of duplicates in each group.
Something along these lines:
WITH TEST(PART,ALTPART,DuplicateNum, DuplicateCnt) AS
(
SELECT PART,ALTPART,
-- Number each duplicate
ROW_NUMBER() OVER (PARTITION BY PART ORDER BY ALTPART ASC) AS DuplicateNum,
-- Count duplicates
COUNT(*) OVER (PARTITION BY PART) AS DuplicateCnt
FROM #TEST
)
SELECT PART, ALTPART, DuplicateNum
FROM TEST
WHERE DuplicateCnt > 1
ORDER BY PART
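As a quick check of that idea against the sample data from the question, here is a sketch using Python's sqlite3 module. SQLite is an assumption standing in for SQL Server, and the CTE name Numbered is made up for the example; the two window functions are the ones described above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TEST (PART TEXT, ALTPART INT)")
conn.executemany("INSERT INTO TEST VALUES (?, ?)", [
    ("15-AB78", 100), ("15-AB78", 110),
    ("16-A9-1", 100), ("16-A9-1", 110),
    ("16-B97-2", 100), ("16-B97-2", 110),
    ("0020R5", 100),
])

duplicates = conn.execute("""
    WITH Numbered AS (
        SELECT PART, ALTPART,
               -- position of the row inside its PART group
               ROW_NUMBER() OVER (PARTITION BY PART ORDER BY ALTPART) AS DuplicateNum,
               -- size of the PART group, repeated on every row of the group
               COUNT(*) OVER (PARTITION BY PART) AS DuplicateCnt
        FROM TEST
    )
    SELECT PART, ALTPART, DuplicateNum
    FROM Numbered
    WHERE DuplicateCnt > 1
    ORDER BY PART, DuplicateNum
""").fetchall()

for row in duplicates:
    print(row)
```

Filtering on the group size rather than on the row number keeps row 1 of each duplicate group and still drops '0020R5', which appears only once.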
I'm trying to create a report that ranks duplicate records. The customer wants to merge a large number of duplicate records that came about from a migration.
I need the ranking so that my report can show which record should be the "main" record, i.e. the record that will have missing data pulled into it.
The duplicate definition is pretty simple:
If the email addresses are the same then it is always a duplicate, if
the emails do not match, then the first name, surname, and mobile must
match.
The ranking will be based on a whole bunch of columns in the table, so:
email address isn't NULL = 50
phone number isn't NULL = 20
etc. Whichever record gets the highest number in its duplicate group becomes the main record. This is where I am having issues: I can't seem to find a way to get an incremental number for each duplicate set. This is some of the code I have so far:
( I took out some of the rank columns in the temp table and CTE expression to shorten it )
DECLARE @tmp_Duplicates TABLE (
tmp_personID INT
, tmp_Firstname NVARCHAR(100)
, tmp_Surname NVARCHAR(100)
, tmp_HomeEmail NVARCHAR(300)
, tmp_MobileNumber NVARCHAR(100)
--- Ratings
, tmp_HomeEmail_Rating INT
--- Groupings
, tmp_GroupNumber INT
)
;WITH cteDupes AS
(
SELECT ROW_NUMBER() OVER(PARTITION BY personHomeEmail ORDER BY personID DESC) AS RND,
ROW_NUMBER() OVER(PARTITION BY personHomeEmail ORDER BY personId) AS RNA,
p.personID, p.PersonFirstName, p.PersonSurname, p.PersonHomeEMail
, personMobileTelephone
FROM tblCandidate c INNER JOIN tblPerson p ON c.candidateID = p.personID
)
INSERT INTO @tmp_Duplicates
SELECT PersonID, PersonFirstName, PersonSurname, PersonHomeEMail, personMobileTelephone
, 10, RND
FROM cteDupes
WHERE RNA + RND > 2
ORDER BY personID, PersonFirstName, PersonSurname
SELECT * FROM @tmp_Duplicates
This gives me the results I want, but the group number isn't showing how I need it:
What I need is for each group to be an incremental value:
I have a table called MyHistory. It has about 1000 rows, and the performance is crappy at best.
What I want to do is select each row along with the change in value relative to the next row. This is probably a bad example.
This is the MyHistory structure: ID int, DateTimeColumn datetime, ValueResult decimal(4,2)
My table has the following data:
ID|DateTimeColumn|ValueResult
1|8/1/2005 1:01:29 PM|2
1|8/1/2006 1:01:29 PM|3
1|8/1/2007 1:01:29 PM|5
1|8/1/2008 1:01:29 PM|9
What I want to do is select out of this the following data
ID|DateTimeColumn|ValueResult|ChangeValue
1|8/1/2008 1:01:29 PM|9|4
1|8/1/2007 1:01:29 PM|5|2
1|8/1/2006 1:01:29 PM|3|1
1|8/1/2005 1:01:29 PM|2|
You'll notice that the ID is unchanged and the datetime column is now sorted descending. That's the easy part. But how do I join the table to itself (in order to calculate the difference in value) based on which datetime comes next?
Thanks!
So, the task is:
to order records by DateTimeColumn descending,
to set sequence number for each record to identify next record,
to calculate required difference in value.
This is one of many possible solutions:
-- Use CTE to make intermediate table with sequence numbers - ranks
;WITH a (rank, ID, DateTimeColumn, ValueResult) AS
(
select rank() OVER (ORDER BY DateTimeColumn DESC) as rank, ID, DateTimeColumn, ValueResult
from MyHistory
)
-- Select all resulting columns
select a1.ID,
a1.DateTimeColumn,
a1.ValueResult,
a1.ValueResult - a2.ValueResult as ChangeValue -- Difference between current record and next one
from a a1
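left join a a2 -- left join keeps the oldest record, which has no next one (ChangeValue is NULL)
on a2.rank = a1.rank + 1 -- linking next record to each one
order by a1.rank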
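The steps above can be tried on the four sample rows with a sketch like the following, run in SQLite through Python's sqlite3 module (an assumption; the thread is about SQL Server). Two things differ from the query as posted and are my own choices: ROW_NUMBER() replaces rank() so that equal timestamps cannot leave gaps in the sequence, and the self-join is a LEFT JOIN so that the oldest row, which has no next row, still appears. The column name seq is made up, and the dates are stored as ISO text so that they sort correctly in SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE MyHistory (ID INT, DateTimeColumn TEXT, ValueResult NUMERIC)"
)
conn.executemany("INSERT INTO MyHistory VALUES (?, ?, ?)", [
    (1, "2005-08-01 13:01:29", 2),
    (1, "2006-08-01 13:01:29", 3),
    (1, "2007-08-01 13:01:29", 5),
    (1, "2008-08-01 13:01:29", 9),
])

changes = conn.execute("""
    WITH a AS (
        -- sequence number per row, newest first
        SELECT ROW_NUMBER() OVER (ORDER BY DateTimeColumn DESC) AS seq,
               ID, DateTimeColumn, ValueResult
        FROM MyHistory
    )
    SELECT a1.ID, a1.DateTimeColumn, a1.ValueResult,
           a1.ValueResult - a2.ValueResult AS ChangeValue
    FROM a AS a1
    LEFT JOIN a AS a2 ON a2.seq = a1.seq + 1  -- a2 is the next (older) row
    ORDER BY a1.seq
""").fetchall()

for row in changes:
    print(row)
```

If the table held more than one ID, the window and the join condition would presumably also need to be restricted to the same ID (PARTITION BY ID and a1.ID = a2.ID).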
I'm hoping for a cleaner way to do something that I know how to do one way. I want to retrieve the UserId for the MAX ID value as well as that MAX ID value. Let's say I have a table with data like this:
ID UserId Value
1 10 'Foo'
2 15 'Blah'
3 10 'Blech'
4 20 'Qwerty'
I want to retrieve:
ID UserId
4 20
I know I could do this like so:
SELECT
t.ID,
t.UserID
FROM
(
SELECT MAX(ID) as [MaxID]
FROM table
) as m
JOIN table as t ON m.MaxID = t.ID
I'm only vaguely familiar with the ROW_NUMBER(), RANK() and other similar methods and I can't help believing that this scenario could benefit from some such method to get rid of joining back to the table.
You can definitely use ROW_NUMBER for something like this:
with t1Rank as
(
select *
, t1Rank = row_number() over (order by ID desc)
from t1
)
select ID, UserID
from t1Rank
where t1Rank = 1
The advantage with this approach is you can bring Value (or other fields as required) into the result set, too. Plus you can tweak the ordering/grouping as required.
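A small sketch of both points, using the sample rows from the question in SQLite via Python's sqlite3 module (an assumption; the original targets SQL Server). The CTE name ranked and the column rn are made up for the example. The first query brings Value along with the top row, and the second changes the grouping to get the newest row per user.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (ID INT, UserId INT, Value TEXT)")
conn.executemany("INSERT INTO t1 VALUES (?, ?, ?)", [
    (1, 10, "Foo"), (2, 15, "Blah"), (3, 10, "Blech"), (4, 20, "Qwerty"),
])

# Row with the highest ID, including the Value column.
latest = conn.execute("""
    WITH ranked AS (
        SELECT ID, UserId, Value,
               ROW_NUMBER() OVER (ORDER BY ID DESC) AS rn
        FROM t1
    )
    SELECT ID, UserId, Value FROM ranked WHERE rn = 1
""").fetchone()

# Tweaked grouping: the highest-ID row for each UserId.
latest_per_user = conn.execute("""
    WITH ranked AS (
        SELECT ID, UserId, Value,
               ROW_NUMBER() OVER (PARTITION BY UserId ORDER BY ID DESC) AS rn
        FROM t1
    )
    SELECT ID, UserId, Value FROM ranked WHERE rn = 1 ORDER BY UserId
""").fetchall()

print(latest)
print(latest_per_user)
```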
You could also just do it with a sub-query like this:
SELECT ID ,
UserID
FROM table
WHERE ID = ( SELECT MAX(ID)
FROM table
);
SELECT TOP 1 ID, UserID FROM <table> ORDER BY ID DESC
If I have a SQL statement such as:
SELECT TOP 5
*
FROM Person
WHERE Name LIKE 'Sm%'
ORDER BY ID DESC
PRINT @@ROWCOUNT
-- shows '5'
Is there any way to get a value like @@ROWCOUNT that is the actual count of all of the rows that match the query, without re-issuing the query sans the TOP 5?
The actual problem is a much more complex and intensive query that performs beautifully since we can use TOP n or SET ROWCOUNT n, but then we cannot get the total count that is required to display paging information in the UI correctly. Presently we have to re-issue the query with a @Count = COUNT(ID) instead of *.
Whilst this doesn't exactly meet your requirement (in that the total count isn't returned as a variable), it can be done in a single statement:
;WITH rowCTE
AS
(
SELECT *
,ROW_NUMBER() OVER (ORDER BY ID DESC) AS rn1
,ROW_NUMBER() OVER (ORDER BY ID ASC) AS rn2
FROM Person
WHERE Name LIKE 'Sm%'
)
SELECT *
,(rn1 + rn2) - 1 as totalCount
FROM rowCTE
WHERE rn1 <=5
The totalCount column will have the total number of rows matching the where filter.
It would be interesting to see how this stacks up performance-wise against two queries on a decent-sized data-set.
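The reason the sum works: with N matching rows, the row at position k counting from the top is at position N - k + 1 counting from the bottom, so rn1 + rn2 - 1 = N on every row. This relies on the ORDER BY column being unique, otherwise the two numberings need not mirror each other. A sketch with made-up names, run in SQLite through Python's sqlite3 module as a stand-in for SQL Server:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Person (ID INTEGER PRIMARY KEY, Name TEXT)")
# Hypothetical data: seven names match 'Sm%', two do not.
names = ["Smith", "Smythe", "Smart", "Small", "Smiley",
         "Smoot", "Smee", "Jones", "Brown"]
conn.executemany("INSERT INTO Person (Name) VALUES (?)", [(n,) for n in names])

page = conn.execute("""
    WITH rowCTE AS (
        SELECT ID, Name,
               ROW_NUMBER() OVER (ORDER BY ID DESC) AS rn1,
               ROW_NUMBER() OVER (ORDER BY ID ASC)  AS rn2
        FROM Person
        WHERE Name LIKE 'Sm%'
    )
    SELECT ID, Name, rn1 + rn2 - 1 AS totalCount
    FROM rowCTE
    WHERE rn1 <= 5          -- the "TOP 5" part
    ORDER BY ID DESC
""").fetchall()

for row in page:
    print(row)
```

Only five rows come back, but each one carries the count of all seven matches.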
You'll have to run another COUNT() query:
SELECT TOP 5
*
FROM Person
WHERE Name LIKE 'Sm%'
ORDER BY ID DESC
DECLARE @r int
SELECT
@r=COUNT(*)
FROM Person
WHERE Name LIKE 'Sm%'
select @r
Something like this may do it:
SELECT TOP 5
*
FROM Person
cross join (select count(*) HowMany
from Person
WHERE Name LIKE 'Sm%') tot
WHERE Name LIKE 'Sm%'
ORDER BY ID DESC
The subquery returns one row with one column containing the full count; the cross join includes it with all rows returned by the "main" query; and "SELECT *" would include the new column HowMany.
Depending on your needs, the next step might be to filter out that column from your return set. One way would be to load the data from the query into a temp table, and then return just the desired columns, and get rowcount from the HowMany column from any row.
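Here is a rough sketch of that flow in Python's sqlite3 module, with SQLite standing in for SQL Server (so LIMIT 5 takes the place of TOP 5) and a Python list playing the role of the temp table. The sample names are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Person (ID INTEGER PRIMARY KEY, Name TEXT)")
conn.executemany(
    "INSERT INTO Person (Name) VALUES (?)",
    [(n,) for n in ["Smith", "Smythe", "Smart", "Small",
                    "Smiley", "Smoot", "Jones"]],
)

rows = conn.execute("""
    SELECT Person.ID, Person.Name, tot.HowMany
    FROM Person
    CROSS JOIN (SELECT COUNT(*) AS HowMany
                FROM Person
                WHERE Name LIKE 'Sm%') AS tot
    WHERE Person.Name LIKE 'Sm%'
    ORDER BY Person.ID DESC
    LIMIT 5
""").fetchall()

# Read the total from any row, then strip the helper column.
# An empty page means nothing matched, so the total is 0.
total_count = rows[0][2] if rows else 0
page = [(person_id, name) for person_id, name, _ in rows]

print(total_count)
print(page)
```

One thing to keep in mind with this approach is that the filter is written twice, so the two WHERE clauses have to be kept in sync.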