Tough SQL Rank query - sql-server

Scenario: a primary user table plus a separate audit table that tracks changes to the user's name. Each time a user is added or part of their name is edited, we write a row to the audit table.
I'm trying to write a query that pulls the most recent former name:
select nsh.AuthEmail, nsh.UserID, nsh.name_lastnamefirst, t.FormerName, t.RankOrder
from (
Select
an.AuditNameID, nsh.AuthEmail, nsh.UserID, nsh.name_lastnamefirst,
FormerName = CASE WHEN RTRIM(an.LastName) <> RTRIM(nsh.LastName) OR RTRIM(an.FirstName) <> RTRIM(nsh.FirstName) OR RTRIM(an.Suffix) <> RTRIM(nsh.Suffix) OR RTRIM(an.MaidenName)<>RTRIM(nsh.MaidenName) THEN LTRIM(an.LastName + ' ' + an.Suffix + ', ' + an.FirstName + ' ' + ISNULL(an.MiddleName,''))
ELSE null
END,
RANK() over (partition by an.UserID order by an.AuditNameID DESC) RankOrder
From [dbo].[AuditName] an
INNER JOIN dbo.StudentPrograms p ON an.UserID = p.UserID
INNER JOIN dbo.NameScalarHelper nsh ON p.UserID = nsh.UserID
WHERE p.SiteProgramID = 139 AND p.IsActive =1
) t
RIGHT OUTER JOIN dbo.NameScalarHelper nsh ON nsh.UserID = t.UserID
where FormerName is not null
The problem is that I can't figure out how to return the row from the audit table that is ranked just below the top, because the top-ranked row is the current data. Any ideas?

Looking at your requirement, you basically need to return the latest name in your audit table that doesn't match the current one.
I think you could use OUTER APPLY to achieve that:
SELECT *
FROM [dbo].[StudentPrograms] AS SP
INNER JOIN [dbo].[NameScalarHelper] AS NSH
ON NSH.UserID = SP.UserID
OUTER APPLY (
SELECT TOP (1) *
FROM [dbo].[AuditName] AS AN
WHERE AN.UserID = SP.UserID
AND (
RTRIM(AN.LastName) <> RTRIM(NSH.LastName)
OR RTRIM(AN.FirstName) <> RTRIM(NSH.FirstName)
OR RTRIM(AN.Suffix) <> RTRIM(NSH.Suffix)
OR RTRIM(AN.MaidenName) <> RTRIM(NSH.MaidenName)
)
ORDER BY AuditNameID DESC
) AS AN
WHERE SP.SiteProgramID = 139
AND SP.IsActive = 1;
This will find the latest name in your audit table that doesn't match the current one.
By the way, I'd strongly suggest cleaning up your database and removing any leading/trailing spaces, so that you don't need LTRIM() or RTRIM() in your WHERE clause and SQL Server can make use of indexes. Read this article for more details. With the data cleaned up, the query simplifies to:
SELECT *
FROM [dbo].[StudentPrograms] AS SP
INNER JOIN [dbo].[NameScalarHelper] AS NSH
ON NSH.UserID = SP.UserID
OUTER APPLY (
SELECT TOP (1) *
FROM [dbo].[AuditName] AS AN
WHERE AN.UserID = SP.UserID
AND (
AN.LastName <> NSH.LastName
OR AN.FirstName <> NSH.FirstName
OR AN.Suffix <> NSH.Suffix
OR AN.MaidenName <> NSH.MaidenName
)
ORDER BY AuditNameID DESC
) AS AN
WHERE SP.SiteProgramID = 139
AND SP.IsActive = 1;
I was trying to understand the way you store data and replicated a tiny example:
DECLARE @User TABLE
(
UserID INT
, FirstName VARCHAR(50)
, LastName VARCHAR(50)
);
DECLARE @Audit TABLE
(
AuditID INT IDENTITY(1, 1)
, UserID INT
, FirstName VARCHAR(50)
, LastName VARCHAR(50)
);
INSERT INTO @User (UserID, FirstName, LastName)
VALUES (1, 'Ben', 'White');
INSERT INTO @Audit (UserID, FirstName, LastName)
VALUES (1, 'Ben', 'White');
SELECT *
FROM @User AS U
OUTER APPLY (
SELECT TOP (1) *
FROM @Audit AS A
WHERE A.UserID = U.UserID
AND (
A.FirstName <> U.FirstName
OR A.LastName <> U.LastName
)
ORDER BY A.AuditID DESC
) AS A;
UPDATE U
SET U.LastName = 'Whiter'
FROM @User AS U
WHERE U.UserID = 1;
INSERT INTO @Audit (UserID, FirstName, LastName)
VALUES (1, 'Ben', 'Whiter');
SELECT *
FROM @User AS U
OUTER APPLY (
SELECT TOP (1) *
FROM @Audit AS A
WHERE A.UserID = U.UserID
AND (
A.FirstName <> U.FirstName
OR A.LastName <> U.LastName
)
ORDER BY A.AuditID DESC
) AS A;
UPDATE U
SET U.LastName = 'Whitest'
FROM @User AS U
WHERE U.UserID = 1;
INSERT INTO @Audit (UserID, FirstName, LastName)
VALUES (1, 'Ben', 'Whitest');
SELECT *
FROM @User AS U
OUTER APPLY (
SELECT TOP (1) *
FROM @Audit AS A
WHERE A.UserID = U.UserID
AND (
A.FirstName <> U.FirstName
OR A.LastName <> U.LastName
)
ORDER BY A.AuditID DESC
) AS A;
INSERT INTO @User (UserID, FirstName, LastName)
VALUES (2, 'Tom', 'Brooks');
INSERT INTO @Audit (UserID, FirstName, LastName)
VALUES (2, 'Tom', 'Brooks');
SELECT *
FROM @User AS U
OUTER APPLY (
SELECT TOP (1) *
FROM @Audit AS A
WHERE A.UserID = U.UserID
AND (
A.FirstName <> U.FirstName
OR A.LastName <> U.LastName
)
ORDER BY A.AuditID DESC
) AS A;
I assumed that when you create a user, you also add a record to the Audit table for consistency, and that each update is logged to the Audit table as well. Finally, I added another user and ran the query again.
Here is the output of each query:
User was created:
UserID FirstName LastName AuditID UserID FirstName LastName
------ --------- -------- ------- ------ --------- --------
1 Ben White null null null null
The user's last name was changed for the first time:
UserID FirstName LastName AuditID UserID FirstName LastName
------ --------- -------- ------- ------ --------- --------
1 Ben Whiter 1 1 Ben White
The user's last name was changed a second time:
UserID FirstName LastName AuditID UserID FirstName LastName
------ --------- -------- ------- ------ --------- --------
1 Ben Whitest 2 1 Ben Whiter
A new user has been added:
UserID FirstName LastName AuditID UserID FirstName LastName
------ --------- -------- ------- ------ --------- --------
1 Ben Whitest 2 1 Ben Whiter
2 Tom Brooks null null null null
Everything else is just formatting, which you should not do in SQL Server; it belongs in the application layer.
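For anyone who wants to try the "latest audit row that differs from the current name" idea without a SQL Server instance, here is a minimal sketch using Python's built-in sqlite3 module. SQLite has no OUTER APPLY or TOP, so a correlated subquery with ORDER BY ... LIMIT 1 stands in for them. The table names Users and Audit and the helper latest_former_names are made up for this illustration and mirror the tiny example above; they are not the asker's real schema.

```python
import sqlite3


def latest_former_names(conn):
    """Return (UserID, current LastName, former LastName) per user.

    The former name is None when no audit row differs from the current name.
    """
    return conn.execute(
        """
        SELECT u.UserID,
               u.LastName,
               (SELECT a.LastName
                  FROM Audit AS a
                 WHERE a.UserID = u.UserID
                   AND (a.FirstName <> u.FirstName OR a.LastName <> u.LastName)
                 ORDER BY a.AuditID DESC   -- newest differing audit row first
                 LIMIT 1) AS FormerLastName
          FROM Users AS u
         ORDER BY u.UserID
        """
    ).fetchall()


# Hypothetical sample data: Ben was renamed twice, Tom never.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Users (UserID INTEGER PRIMARY KEY, FirstName TEXT, LastName TEXT);
    CREATE TABLE Audit (AuditID INTEGER PRIMARY KEY AUTOINCREMENT,
                        UserID INTEGER, FirstName TEXT, LastName TEXT);
    INSERT INTO Users VALUES (1, 'Ben', 'Whitest'), (2, 'Tom', 'Brooks');
    INSERT INTO Audit (UserID, FirstName, LastName) VALUES
        (1, 'Ben', 'White'), (1, 'Ben', 'Whiter'),
        (1, 'Ben', 'Whitest'), (2, 'Tom', 'Brooks');
    """
)
rows = latest_former_names(conn)
print(rows)
```

Filtering on "differs from the current name" rather than on a fixed rank keeps the result correct even if the audit table happens to contain several rows with the current name.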

Looking at your query, I'm guessing you are getting back all of a person's past names, since you have no filter on the RankOrder that you've created. Your current name should have RankOrder 1, I assume, so your most recent previous name would be ranked 2. A window function's alias can't be referenced in the WHERE clause of the same SELECT that defines it, so add the filter to the outer query's WHERE clause like this:
select nsh.AuthEmail, nsh.UserID, nsh.name_lastnamefirst, t.FormerName, t.RankOrder
from (
Select
an.AuditNameID, nsh.AuthEmail, nsh.UserID, nsh.name_lastnamefirst,
FormerName = CASE WHEN RTRIM(an.LastName) <> RTRIM(nsh.LastName) OR RTRIM(an.FirstName) <> RTRIM(nsh.FirstName) OR RTRIM(an.Suffix) <> RTRIM(nsh.Suffix) OR RTRIM(an.MaidenName)<>RTRIM(nsh.MaidenName) THEN LTRIM(an.LastName + ' ' + an.Suffix + ', ' + an.FirstName + ' ' + ISNULL(an.MiddleName,''))
ELSE null
END,
RANK() over (partition by an.UserID order by an.AuditNameID DESC) RankOrder
From [dbo].[AuditName] an
INNER JOIN dbo.StudentPrograms p ON an.UserID = p.UserID
INNER JOIN dbo.NameScalarHelper nsh ON p.UserID = nsh.UserID
WHERE p.SiteProgramID = 139 AND p.IsActive =1
) t
RIGHT OUTER JOIN dbo.NameScalarHelper nsh ON nsh.UserID = t.UserID
where t.FormerName is not null and t.RankOrder = 2
Let me know if I am missing something.
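A rank computed by a window function can only be filtered one level up from the SELECT that defines it. The sketch below shows that shape with Python's built-in sqlite3 module standing in for SQL Server (it assumes an SQLite build with window functions, 3.25 or later). The trimmed-down AuditName table and its sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE AuditName (AuditNameID INTEGER PRIMARY KEY, UserID INTEGER, LastName TEXT);
    INSERT INTO AuditName VALUES
        (1, 1, 'White'), (2, 1, 'Whiter'), (3, 1, 'Whitest'), (4, 2, 'Brooks');
    """
)

# The inner query assigns the rank; the outer query is the first place
# where RankOrder exists as a column and can appear in a WHERE clause.
second_latest = conn.execute(
    """
    SELECT UserID, LastName
      FROM (SELECT UserID,
                   LastName,
                   RANK() OVER (PARTITION BY UserID
                                ORDER BY AuditNameID DESC) AS RankOrder
              FROM AuditName) AS t
     WHERE RankOrder = 2
     ORDER BY UserID
    """
).fetchall()
print(second_latest)
```

User 2 has a single audit row, so it has no rank-2 row and drops out of the result; a LEFT JOIN back to the user table would be needed to keep such users.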

Related

SQL Server: How to show list of manager name separated by comma

I have to show each employee's name together with his manager hierarchy, separated by commas.
If Ram is the top manager, Ram is Sam's manager, and Sam is Rahim's manager, then I would like output like this:
Desired output
EMP Name Manager's Name
--------- ---------------
Rahim Sam,Ram
Sam Ram
Ram No manager
I have a script that shows an employee and his manager's name. Here is the script:
;WITH EmployeeCTE AS
(
Select ID,Name,MgrID, 0 as level FROM #Employee
WHERE ID=3
UNION ALL
Select r.ID,r.Name,r.MgrID, level+1 as level
FROM #Employee r
JOIN EmployeeCTE p on r.ID = p.MgrID
)
Select e1.Name
,ISNULL(e2.Name,'Top BOSS') as [Manager Name]
,row_number() over (order by e1.level desc) as [Level]
from EmployeeCTE e1
left join EmployeeCTE e2 on e1.MgrID=e2.ID
Output
Name Manager Name Level
Simon Top BOSS 1
Katie Simon 2
John Katie 3
I also know how to build a comma-separated list. Here is a sample script:
SELECT
PNAME,
STUFF
(
(
SELECT ',' + Mname
FROM Myproducts M
WHERE M.PNAME = P.PNAME
ORDER BY Mname
FOR XML PATH('')
), 1, 1, ''
) AS Models
FROM
Myproducts p
GROUP BY PNAME
Can someone tell me how I could merge the two scripts to get the desired output? Thanks.
CREATE TABLE #EMP (
EmpID INT
, ManagerID INT
, Name NVARCHAR(50) NULL
);
INSERT INTO #EMP (EmpID, ManagerID, Name)
VALUES
( 1, NULL, 'John')
, (2, 1, 'Katie')
, (3, 2, 'Simon');
SELECT *
FROM
#EMP;
WITH a AS (
SELECT
EmpID
, Name
, ManagerID
, CONVERT(NVARCHAR(MAX),'') AS ManagerChain
FROM
#Emp
WHERE
ManagerID IS NULL
UNION ALL
SELECT
e.EmpID
, e.Name
, e.ManagerID
, CASE
WHEN a.ManagerChain ='' THEN a.Name
ELSE CONCAT(a.Name, CONCAT(',',a.ManagerChain))
END
FROM
#Emp e
JOIN a ON e.ManagerID = a.EmpID
)
SELECT
a.Name
, IIF(a.ManagerChain='','No Manager',a.ManagerChain) AS ManagerChain
FROM
a;
DROP TABLE #EMP;
Assuming a table structure of
DECLARE @Employee TABLE(
ID INT,
Name VARCHAR(10),
MgrID INT)
INSERT INTO @Employee
VALUES (1,'Ram',NULL),
(2,'Sam',1),
(3,'Rahim',2);
You can use
WITH EmployeeCTE
AS (SELECT ID,
Name,
MgrID,
0 AS level,
CAST('No manager' AS VARCHAR(8000)) AS [Managers Name]
FROM @Employee
WHERE MgrID IS NULL
UNION ALL
SELECT r.ID,
r.Name,
r.MgrID,
level + 1 AS level,
CAST(P.Name + CASE
WHEN level > 0
THEN ',' + [Managers Name]
ELSE ''
END AS VARCHAR(8000))
FROM @Employee r
JOIN EmployeeCTE p
ON r.MgrID = p.ID)
SELECT *
FROM EmployeeCTE
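The same top-down idea (start at the employee with no manager, then prepend each parent's name while walking down) can be tried out with Python's built-in sqlite3 module. This is only a sketch: SQLite spells the construct WITH RECURSIVE and concatenates with ||, and the Employee table and manager_chains helper are names invented for the example.

```python
import sqlite3


def manager_chains(conn):
    """Return (Name, comma-separated managers, nearest manager first) per employee."""
    return conn.execute(
        """
        WITH RECURSIVE chain (ID, Name, Managers) AS (
            -- Anchor: the top manager has an empty chain.
            SELECT ID, Name, ''
              FROM Employee
             WHERE MgrID IS NULL
            UNION ALL
            -- Step: the direct manager's name goes in front of his own chain.
            SELECT e.ID,
                   e.Name,
                   CASE WHEN c.Managers = '' THEN c.Name
                        ELSE c.Name || ',' || c.Managers END
              FROM Employee AS e
              JOIN chain AS c ON e.MgrID = c.ID
        )
        SELECT Name,
               CASE WHEN Managers = '' THEN 'No manager' ELSE Managers END
          FROM chain
         ORDER BY ID DESC
        """
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Employee (ID INTEGER PRIMARY KEY, Name TEXT, MgrID INTEGER);
    INSERT INTO Employee VALUES (1, 'Ram', NULL), (2, 'Sam', 1), (3, 'Rahim', 2);
    """
)
chains = manager_chains(conn)
for name, managers in chains:
    print(name, managers)
```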

Turn many to many relationship into single row

I have the following tables
Users
Id
FirstName
LastName
Sample Data
1,'Peter','Smith'
2,'John','Como'
Phones
Id
UserId
PhoneTypeId
Phone
ContactName
Sample data
1,1,4,'555-555-5551','Peter'
2,1,4,'555-555-5552','Paul'
3,1,4,'555-555-5553','Nancy'
4,1,4,'555-555-5554','Hellen'
PhoneTypes
Id
Type
with sample data
1 Home
2 Work
3 Cell
4 Emergency
I have to create following result
UserId, UserFirstName, UserLastName, FirstEmergencyContactName, FirstEmergencyContactPhone, SecondEmergencyContactName, SecondEmergencyContactPhone, ThirdEmergencyContactName, ThirdEmergencyContactPhone, FourthEmergencyContactName, FourthEmergencyContactPhone, FifthEmergencyContactName, FifthEmergencyContactPhone
How can I create a single row for every user with emergency contacts? Some of the users might have one emergency contact and others might have many, but I need only five of them.
This is called table pivoting. Since you want no more than 5 results, you can use conditional aggregation with row_number:
select id, firstname, lastname,
max(case when rn = 1 then contactname end) emergency_contact1,
max(case when rn = 1 then phone end) emergency_phone1,
max(case when rn = 2 then contactname end) emergency_contact2,
max(case when rn = 2 then phone end) emergency_phone2,
...
from (
select u.id, u.firstname, u.lastname, p.phone, p.contactname,
row_number() over (partition by u.id order by p.phonetypeid) rn
from users u
join phones p on u.id = p.userid
) t
group by id, firstname, lastname
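To see conditional aggregation over row numbers in action, here is a small sketch using Python's built-in sqlite3 module in place of SQL Server (window functions require SQLite 3.25 or later). It is cut down to two contact slots, filters on PhoneTypeId = 4 and numbers contacts by Phones.Id; those choices, the LEFT JOIN that keeps users without contacts, and the helper name emergency_contacts are assumptions made for the illustration.

```python
import sqlite3


def emergency_contacts(conn):
    """One row per user with the first two emergency contacts spread into columns."""
    return conn.execute(
        """
        SELECT UserId, FirstName, LastName,
               MAX(CASE WHEN rn = 1 THEN ContactName END) AS Contact1,
               MAX(CASE WHEN rn = 1 THEN Phone END)       AS Phone1,
               MAX(CASE WHEN rn = 2 THEN ContactName END) AS Contact2,
               MAX(CASE WHEN rn = 2 THEN Phone END)       AS Phone2
          FROM (SELECT u.Id AS UserId, u.FirstName, u.LastName,
                       p.ContactName, p.Phone,
                       ROW_NUMBER() OVER (PARTITION BY u.Id ORDER BY p.Id) AS rn
                  FROM Users AS u
                  LEFT JOIN Phones AS p
                    ON p.UserId = u.Id AND p.PhoneTypeId = 4) AS t
         GROUP BY UserId, FirstName, LastName
         ORDER BY UserId
        """
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Users (Id INTEGER PRIMARY KEY, FirstName TEXT, LastName TEXT);
    CREATE TABLE Phones (Id INTEGER PRIMARY KEY, UserId INTEGER, PhoneTypeId INTEGER,
                         Phone TEXT, ContactName TEXT);
    INSERT INTO Users VALUES (1, 'Peter', 'Smith'), (2, 'John', 'Como');
    INSERT INTO Phones VALUES
        (1, 1, 4, '555-555-5551', 'Peter'),
        (2, 1, 4, '555-555-5552', 'Paul'),
        (3, 1, 4, '555-555-5553', 'Nancy');
    """
)
contacts = emergency_contacts(conn)
print(contacts)
```

Each CASE picks the value from exactly one numbered row, and MAX collapses the group to that single non-NULL value; contacts beyond the listed slots are simply ignored.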
You can also use PIVOT with a hard-coded column list instead of dynamic SQL, because you need only 5 contacts/phones. Example below:
;WITH cte AS (
SELECT p.UserId,
FirstName,
LastName,
CAST(ContactName as nvarchar(100)) as ContactName,
CAST(Phone as nvarchar(100)) as ContactPhone,
CAST(ROW_NUMBER() OVER (PARTITION BY p.UserId ORDER BY pt.Id) as nvarchar(100)) as RN
FROM Users u
INNER JOIN Phones p
ON p.UserId = u.Id
INNER JOIN PhoneTypes pt
ON pt.Id = p.PhoneTypeId
WHERE pt.Id = 4
)
SELECT *
FROM (
SELECT UserId,
FirstName,
LastName,
[Columns]+RN as [Columns],
[Values]
FROM cte
UNPIVOT (
[Values] FOR [Columns] IN (ContactName, ContactPhone)
) as unp
) as t
PIVOT (
MAX([Values]) FOR [Columns] IN (ContactName1,ContactPhone1,ContactName2,ContactPhone2,ContactName3,ContactPhone3,
ContactName4,ContactPhone4,ContactName5,ContactPhone5)
) as pvt
Output:
UserId FirstName LastName ContactName1 ContactPhone1 ContactName2 ContactPhone2 ContactName3 ContactPhone3 ContactName4 ContactPhone4 ContactName5 ContactPhone5
1 Peter Smith Peter 555-555-5551 Paul 555-555-5552 Nancy 555-555-5553 Hellen 555-555-5554 NULL NULL
2 John Cono Harry 555-555-5555 William 555-555-5556 John 555-555-5557 NULL NULL NULL NULL
I added some more contacts to the sample data.

Join Based On Column Value (Best Match)

I have 3 columns to base my JOIN on -> ID, Account, Cust. There can be multiple rows containing the same ID value.
I want to prioritise my JOIN on 1) ID, 2) Account, 3) Cust.
So in the example below, the UserCode that should be populated in #UserData should be 'u11z' as all columns contain a value.
How do I do this? Below my code to date...
UPDATE #UserData
SET UserCode = ur.UserCode
FROM #UserData uA
INNER JOIN UserReference ur
ON uA.ID = ur.ID
AND ((ua.Account = ur.Account) OR (ur.Account = ur.Account))
AND ((ua.Cust = ur.Cust) OR (ur.Cust = ur.Cust))
UserReference TABLE:
Cust Account ID UserCode
234 NULL 9A2346 u12x
234 Test 9A2346 u11z
NULL NULL 9A2346 u30s
#UserData TABLE:
Cust Account ID UserCode
234 Test 9A2346 NULL
Thanks!
You can try the following. I joined tables, counted the number of matches, and ranked them. Then select rank 1.
; with userCte (userCodeA, userCodeB, rank)
as
(
select a.usercode, b.usercode,
rank() over (partition by a.id order by case when a.cust = b.cust then 1 else 0 end +
case when a.account = b.account then 1 else 0 end +
case when a.id = b.id then 1 else 0 end desc) as rank
from userdata a
join userreference b
on a.id = b.id or a.account = b.account or a.cust = b.cust
)
select * from userCte
--update userCte
--set userCodeA = userCodeB
where rank = 1
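A variation on the same scoring idea, sketched with Python's built-in sqlite3 module in place of SQL Server: join on ID only, score each candidate row, and keep the best one. Weighting an Account match above a Cust match is my reading of the priority order in the question, and partitioning by ID assumes each ID appears once in UserData; the helper name best_match is hypothetical.

```python
import sqlite3


def best_match(conn):
    """Return (ID, UserCode) of the highest-scoring reference row per UserData ID."""
    return conn.execute(
        """
        SELECT ID, UserCode
          FROM (SELECT d.ID,
                       r.UserCode,
                       ROW_NUMBER() OVER (
                           PARTITION BY d.ID
                           ORDER BY (CASE WHEN d.Account = r.Account THEN 2 ELSE 0 END)
                                  + (CASE WHEN d.Cust = r.Cust THEN 1 ELSE 0 END) DESC
                       ) AS rn
                  FROM UserData AS d
                  JOIN UserReference AS r ON r.ID = d.ID) AS ranked
         WHERE rn = 1
        """
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE UserReference (Cust INTEGER, Account TEXT, ID TEXT, UserCode TEXT);
    CREATE TABLE UserData (Cust INTEGER, Account TEXT, ID TEXT, UserCode TEXT);
    INSERT INTO UserReference VALUES
        (234, NULL, '9A2346', 'u12x'),
        (234, 'Test', '9A2346', 'u11z'),
        (NULL, NULL, '9A2346', 'u30s');
    INSERT INTO UserData VALUES (234, 'Test', '9A2346', NULL);
    """
)
match = best_match(conn)
print(match)
```

A comparison against NULL is never true, so reference rows with NULL in Account or Cust score lower without any special handling.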
Is this what you want? It is difficult to understand what you are asking for.
USE tempdb;
CREATE TABLE UserReference
(
ID VARCHAR(255) NULL
, Account VARCHAR(255) NULL
, Cust INT NULL
, UserCode VARCHAR(255)
);
INSERT INTO UserReference VALUES ('9A2346', NULL, 234, 'A');
INSERT INTO UserReference VALUES ('9A2346', 'TEST', 234, 'B');
INSERT INTO UserReference VALUES ('9A2346', NULL, NULL, 'C');
DECLARE @UserData TABLE
(
ID VARCHAR(255) NULL
, Account VARCHAR(255) NULL
, Cust INT NULL
, UserCode VARCHAR(255)
);
INSERT INTO @UserData
SELECT UR.ID, UR.Account, UR.Cust, NULL
FROM dbo.UserReference UR;
UPDATE @UserData
SET UserCode = ur.UserCode
FROM @UserData uA
INNER JOIN UserReference ur
ON uA.ID = ur.ID
AND ua.Account = ur.Account
AND ua.Cust = ur.Cust;
SELECT *
FROM @UserData;
Results of the last SELECT :
If I understood your question correctly...
If a NULL in any column of a row should lower that row's priority, you can use a query like the one below to check for NULL values in a row. This might not be a complete answer, but it is a possible approach:
SELECT count(*)
FROM TableName
WHERE Col1 IS NULL and Col2 is null

Going through tables and replacing IDs resulting from duplicates in dimension table

I have a dimension Users table that unfortunately has a bunch of duplicate records. See screenshot.
I have thousands of users and 5 tables referencing the duplicates. I want to delete records with "bad" UserIDs. I want to go through the 5 dependencies and update bad UserIds with "good" (circled in red).
What would be a good approach to this?
Here's what I did to get the above screenshot...
SELECT UserID
,userIds.FirstName
,userIds.LastName
,dupTable.Email
,dupTable.Username
,dupTable.DupCount
FROM dbo.DimUsers AS userIds
LEFT OUTER JOIN
(SELECT FirstName
,LastName
,Email
,UserName
,DupCount
FROM
(SELECT FirstName
,LastName
,UserName
,Email
,COUNT(*) AS DupCount -- we're finding duplications by matches on FirstName,
-- last name, UserName AND Email. All four fields must match
-- to find a dupe. More confidence from this.
FROM dbo.DimUsers
GROUP BY FirstName
,LastName
,UserName
,Email
HAVING COUNT(*) > 1) AS userTable -- any count more than 1 is a dupe
WHERE LastName NOT LIKE 'NULL' -- exclude entries with literally NULL names
AND FirstName NOT LIKE 'NULL'
)AS dupTable
ON dupTable.FirstName = userIds.FirstName -- to get the userIds of dupes, we LEFT JOIN the original table
AND dupTable.LastName = userIds.LastName -- on four fields to increase our confidence
AND dupTable.Email = userIds.Email
AND dupTable.Username = userIds.Username
WHERE DupCount IS NOT NULL -- ignore NULL dupcounts, these are not dupes
This code should work. It is written for one dependency table, but you can use the same logic to update the other 4 tables.
update t
set UserID = MinUserID.UserID
from
DimUsersChild1 t
inner join DimUsers on DimUsers.UserID = t.UserID
inner join (
select min(UserID) UserID, FirstName, LastName, UserName, Email
from DimUsers
group by
FirstName, LastName, UserName, Email
) MinUserID on
MinUserID.FirstName = DimUsers.FirstName and
MinUserID.LastName = DimUsers.LastName and
MinUserID.UserName = DimUsers.UserName and
MinUserID.Email = DimUsers.Email
select * from DimUsersChild1;
delete t1
from
DimUsers t
inner join DimUsers t1 on t1.FirstName = t.FirstName and
t1.LastName = t.LastName and
t1.UserName = t.UserName and
t1.Email = t.Email
where
t.UserID < t1.UserID
select * from DimUsers;
Here is a working demo
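The two steps (repoint child rows at the surviving UserID, then delete the extra dimension rows) can be rehearsed on throwaway data before touching the real tables. Below is a sketch using Python's built-in sqlite3 module instead of SQL Server; the child table FactOrders, the sample rows and the helper merge_duplicate_users are invented for the example, and the four name/email columns are assumed to be non-NULL.

```python
import sqlite3


def merge_duplicate_users(conn):
    """Repoint child rows at the lowest UserID of each duplicate group, then delete the extras."""
    with conn:  # one transaction: both statements commit together or roll back together
        conn.execute(
            """
            UPDATE FactOrders
               SET UserID = COALESCE(
                       (SELECT MIN(k.UserID)
                          FROM DimUsers AS d
                          JOIN DimUsers AS k
                            ON k.FirstName = d.FirstName AND k.LastName = d.LastName
                           AND k.UserName = d.UserName AND k.Email = d.Email
                         WHERE d.UserID = FactOrders.UserID),
                       UserID)  -- leave the row alone if no user matches
            """
        )
        conn.execute(
            """
            DELETE FROM DimUsers
             WHERE UserID NOT IN (SELECT MIN(UserID)
                                    FROM DimUsers
                                   GROUP BY FirstName, LastName, UserName, Email)
            """
        )


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE DimUsers (UserID INTEGER PRIMARY KEY, FirstName TEXT, LastName TEXT,
                           UserName TEXT, Email TEXT);
    CREATE TABLE FactOrders (OrderID INTEGER PRIMARY KEY, UserID INTEGER);
    INSERT INTO DimUsers VALUES
        (1, 'Ann', 'Lee', 'alee', 'a@x.com'),
        (2, 'Ann', 'Lee', 'alee', 'a@x.com'),
        (3, 'Bob', 'Ray', 'bray', 'b@x.com');
    INSERT INTO FactOrders VALUES (10, 1), (11, 2), (12, 3);
    """
)
merge_duplicate_users(conn)
orders = conn.execute("SELECT OrderID, UserID FROM FactOrders ORDER BY OrderID").fetchall()
user_ids = [row[0] for row in conn.execute("SELECT UserID FROM DimUsers ORDER BY UserID")]
print(orders, user_ids)
```

Running the update before the delete matters: once the duplicate rows are gone, there is nothing left to map the "bad" UserIDs to.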

Multiple COUNT(*) with join

I have to COUNT some rows from multiple tables. Before I can do multiple COUNT I will have to subselect. The problem here is that I need to JOIN some values in order to get the right result.
SELECT
sponsor.Name As SponsorName,
COUNT(participants.[Table]) AS ParticipantCount,
( SELECT
COUNT(guestcards.[Table])
FROM
guestcards
WHERE
guestcards.EventID = @EventID
AND
guestcards.[Table] = @Table
AND
guestcards.SponsorID = participants.SponsorID
-- Here lies the problem.
-- I will need to check up on another value to ensure I get the right rows, but participants.SponsorID is not here because of no join :-(
) AS GuestParticipantCount
FROM
participants
LEFT JOIN
sponsor
ON
sponsor.ID = participants.SponsorID
WHERE
participants.EventID = @EventID
AND
participants.[Table] = @Table
GROUP BY
sponsor.Name
Guestcards table holds: sponsorid, eventid, tablename
Participants table holds: sponsorid, eventid, tablename
Sponsor table holds: id, name
I need to count how many "Participants" and how many "Guestcards" there are in a particular event. The participants have a table (where they should sit) and so do the guestcards. I need to check whether they sit at the same "table".
So I need to count how many are sitting at table "A1" or table "A2" etc.
The result I am after is like:
"Sponsor Name has 5 participants and 3 guestcards. They sit on A1"
I hope I made myself clear.
Here's the exact equivalent of your query (grouping on sponsor.Name):
SELECT s.name,
COALESCE(SUM(participantCount), 0),
COALESCE(SUM(guestcardsCount), 0)
FROM (
SELECT sponsorId, COUNT(*) AS participantCount
FROM participants
WHERE eventId = @eventId
AND [table] = @table
GROUP BY
sponsorId
) p
FULL JOIN
(
SELECT sponsorId, COUNT(*) AS guestcardsCount
FROM guestcards
WHERE eventId = @eventId
AND [table] = @table
GROUP BY
sponsorId
) g
ON g.sponsorId = p.sponsorId
FULL JOIN
sponsor s
ON s.id = COALESCE(p.sponsorId, g.sponsorId)
GROUP BY
s.name
However, I believe you want something simpler:
SELECT s.name AS sponsorName, participantCount, guestcardsCount
FROM sponsor s
CROSS APPLY
(
SELECT COUNT(*) AS participantCount
FROM participants
WHERE sponsorId = s.id
AND eventId = @eventId
AND [table] = @table
) p
CROSS APPLY
(
SELECT COUNT(*) AS guestcardsCount
FROM guestcards
WHERE sponsorId = s.id
AND eventId = @eventId
AND [table] = @table
) g
Update:
SELECT s.name,
COALESCE(participantCount, 0),
COALESCE(guestcardsCount, 0)
FROM (
SELECT sponsorId, COUNT(*) AS participantCount
FROM participants
WHERE eventId = @eventId
AND [table] = @table
GROUP BY
sponsorId
) p
FULL JOIN
(
SELECT sponsorId, COUNT(*) AS guestcardsCount
FROM guestcards
WHERE eventId = @eventId
AND [table] = @table
GROUP BY
sponsorId
) g
ON g.sponsorId = p.sponsorId
JOIN sponsor s
ON s.id = COALESCE(p.sponsorId, g.sponsorId)
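The key point in these queries is that each table is counted on its own before being joined, so the two counts cannot multiply each other. A small sketch with Python's built-in sqlite3 module (standing in for SQL Server) shows the shape; it uses LEFT JOINs from sponsor instead of FULL JOIN, names the seating column tablename as in the question's column list, and the sample rows and helper counts_by_sponsor are invented.

```python
import sqlite3


def counts_by_sponsor(conn, event_id, table_name):
    """Return (sponsor name, participant count, guestcard count) for one event and table."""
    return conn.execute(
        """
        SELECT s.name,
               COALESCE(p.cnt, 0) AS participantCount,
               COALESCE(g.cnt, 0) AS guestcardsCount
          FROM sponsor AS s
          LEFT JOIN (SELECT sponsorid, COUNT(*) AS cnt
                       FROM participants
                      WHERE eventid = :event AND tablename = :tbl
                      GROUP BY sponsorid) AS p ON p.sponsorid = s.id
          LEFT JOIN (SELECT sponsorid, COUNT(*) AS cnt
                       FROM guestcards
                      WHERE eventid = :event AND tablename = :tbl
                      GROUP BY sponsorid) AS g ON g.sponsorid = s.id
         WHERE p.cnt IS NOT NULL OR g.cnt IS NOT NULL  -- skip sponsors absent from this table
         ORDER BY s.name
        """,
        {"event": event_id, "tbl": table_name},
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE sponsor (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE participants (sponsorid INTEGER, eventid INTEGER, tablename TEXT);
    CREATE TABLE guestcards (sponsorid INTEGER, eventid INTEGER, tablename TEXT);
    INSERT INTO sponsor VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO participants VALUES (1, 1, 'A1'), (1, 1, 'A1'), (1, 1, 'A2');
    INSERT INTO guestcards VALUES (1, 1, 'A1'), (2, 1, 'A1'), (2, 1, 'A1'), (2, 1, 'A1');
    """
)
a1_counts = counts_by_sponsor(conn, 1, "A1")
print(a1_counts)
```

Joining participants and guestcards directly and counting afterwards would pair every participant with every guestcard of the same sponsor, which inflates both numbers.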

Resources