UPDATE query to synchronize duplicated records - sql-server

I have a number of duplicate records in a table. A highly simplified example:
name, emailaddress, importantid
John Smith, john@smith.com, NULL
John Smith, john@smith.com, 12345
John Smith, john@smith.com, NULL
The problem comes later when another table is joined to this one: it may be joined to one of the records that doesn't have the importantid I need.
I'm looking to update the table so that, for each email address, it finds the first record where importantid is not null and then updates the other records with that id, so that all duplicate accounts end up with the important id.
How could I do that?
Thanks

SQL Fiddle Demo
UPDATE a
SET a.importantid = b.importantid
FROM test AS a
JOIN (SELECT emailaddress, max(importantid) as importantid
FROM test
GROUP BY emailaddress) AS b
ON a.emailaddress = b.emailaddress;
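The grouped-MAX idea above can be tried out without a SQL Server instance. The following is a minimal sketch using Python's built-in sqlite3 module as a stand-in, so the UPDATE is written as a correlated subquery rather than T-SQL's UPDATE ... FROM. The jane@roe.com row is a hypothetical addition to show that a group with no id at all simply stays NULL.

```python
import sqlite3

# Hypothetical in-memory copy of the example table (SQLite, not SQL Server).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (name TEXT, emailaddress TEXT, importantid INTEGER)")
conn.executemany(
    "INSERT INTO test VALUES (?, ?, ?)",
    [
        ("John Smith", "john@smith.com", None),
        ("John Smith", "john@smith.com", 12345),
        ("John Smith", "john@smith.com", None),
        ("Jane Roe", "jane@roe.com", None),  # hypothetical: no id anywhere in this group
    ],
)

# MAX() ignores NULLs, so each email group yields its non-NULL id if one exists.
# Only rows that are currently NULL are touched.
conn.execute(
    """
    UPDATE test
    SET importantid = (SELECT MAX(b.importantid)
                       FROM test AS b
                       WHERE b.emailaddress = test.emailaddress)
    WHERE importantid IS NULL
    """
)
conn.commit()

rows = conn.execute(
    "SELECT emailaddress, importantid FROM test ORDER BY emailaddress"
).fetchall()
for row in rows:
    print(row)
```

Note that MAX picks the largest id rather than the "first" one, so if a single email address can carry several different non-NULL ids you would need to decide which one should win.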

UPDATE t1
SET importantid = (SELECT MAX(t2.importantid)
FROM yourTable t2
WHERE t2.emailaddress = t1.emailaddress AND t2.importantid IS NOT NULL)
FROM yourTable t1
WHERE t1.importantid IS NULL;

Related

Replacing data in one table with data in another table using a unique ID

I'm using Access 2016 to view data from a table on our SQL server. I have a massive audit log where the record being viewed is represented by a "FolderID" field. I have another table that has values for the FolderID (represented as "fid") along with columns identifying the record's name and other ID numbers.
I want to be able to replace the FolderID value in the first table with CUSTOMER_NAME value from the second table so I know what's being viewed at a glance.
I've tried googling different join techniques to build a query that will accomplish this, but my google-fu is weak or I'm just not caffeinated enough today.
Table 1.
EventTime EventType FolderID
4/4/2019 1:23:39 PM A 12345
Table 2
fid acc Other_ID Third_ID CUSTOMER_NAME
12345 0 9875 12345678 Doe, John
Basically I want to query Table 2 to search for fid using the value in Table 1 for FolderID, and I want it to respond with the CUSTOMER_NAME associated with the FolderID/fid. The result would look like:
EventTime EventType FolderID
4/4/2019 1:23:39 PM A Doe, John
I'm stupid because I thought I was too smart to use the freaking Query Wizard. When I did, and it prompted me to create relationships and actually think about what I was doing, it came up with this.
SELECT [table1].EventTime, [table1].EventType, [table1].FolderID, [table1].ObjRef, [table1].AreaID, [table1].FileID, [table2].CUSTOMER_NAME, [table2].fid FROM [table2]
LEFT JOIN [table1] ON [table2].[fid] = [table1].[FolderID];
You can run this query and check if it helps.
Select EventTime, EventType , CUSTOMER_NAME AS FolderID FROM Table1, Table2 Where Table1.FolderID = Table2.fid;
Basically, 'AS' is doing what you want here as you can rename your column to whatever you want.
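For a quick check of the join-and-alias approach, here is a small sketch in Python with the built-in sqlite3 module (an assumption for illustration; the original setup is Access over SQL Server). It uses a LEFT JOIN from the audit log so that events whose FolderID has no match in Table2 are still listed. The second Table1 row and the COALESCE fallback are hypothetical additions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Table1 (EventTime TEXT, EventType TEXT, FolderID INTEGER);
    CREATE TABLE Table2 (fid INTEGER, acc INTEGER, Other_ID INTEGER,
                         Third_ID INTEGER, CUSTOMER_NAME TEXT);
    INSERT INTO Table1 VALUES ('4/4/2019 1:23:39 PM', 'A', 12345);
    INSERT INTO Table1 VALUES ('4/4/2019 2:00:00 PM', 'B', 99999); -- hypothetical unmatched row
    INSERT INTO Table2 VALUES (12345, 0, 9875, 12345678, 'Doe, John');
    """
)

# LEFT JOIN keeps every audit event; COALESCE falls back to the raw id
# when no customer name is found for that FolderID.
result = conn.execute(
    """
    SELECT t1.EventTime, t1.EventType,
           COALESCE(t2.CUSTOMER_NAME, CAST(t1.FolderID AS TEXT)) AS FolderID
    FROM Table1 AS t1
    LEFT JOIN Table2 AS t2 ON t2.fid = t1.FolderID
    ORDER BY t1.FolderID
    """
).fetchall()
for row in result:
    print(row)
```

With an inner join (or the comma join in the answer above), the unmatched event would silently disappear from the result.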

SQL join conditional either or not both?

I have 3 tables that I'm joining and 2 variables that I'm using in one of the joins.
What I'm trying to do is figure out how to join based on either of the statements but not both.
Here's the current query:
SELECT DISTINCT
WR.Id,
CAL.Id as 'CalendarId',
T.[First Of Month],
T.[Last of Month],
WR.Supervisor,
WR.cd_Manager as [Manager], --Added to search by the Manager--
WR.[Shift] as 'ShiftId'
INTO #Workers
FROM #T T
--Calendar
RIGHT JOIN [dbo].[Calendar] CAL
ON CAL.StartDate <= T.[Last of Month]
AND CAL.EndDate >= T.[First of Month]
--Workers
--This is the problem join
RIGHT JOIN [dbo].[Worker_Filtered]WR
ON WR.Supervisor IN (SELECT Id FROM [dbo].[User] WHERE FullName IN (@Supervisors))
OR (WR.Supervisor IN (SELECT Id FROM [dbo].[User] WHERE FullName IN (@Supervisors))
AND WR.cd_Manager IN (SELECT Id FROM [dbo].[User] WHERE FullName IN (@Manager))) --Added to search by the Manager--
AND WR.[Type] = '333E7907-EB80-4021-8CDB-5380F0EC89FF' --internal
WHERE CAL.Id = WR.Calendar
AND WR.[Shift] IS NOT NULL
What I want to do is either have the result based on the Worker_Filtered table matching the @Supervisor or (but not both) have it matching both the @Supervisor and @Manager.
The way it is now if it matches either condition it will be returned. This should be limiting the returned results to Workers that have both the Supervisor and Manager which would be a smaller data set than if they only match the Supervisor.
UPDATE
The query that I have above is part of a greater whole that pulls data for a supervisor's workers.
I want to also limit it to managers that are under a particular supervisor.
For example, if @Supervisor = John Doe and @Manager = Jane Doe, and John has 9 workers, 8 of whom are under Jane's management, then I would expect the end result to show only 8 workers for each month. With the current query, it is still showing all 9 for each month.
If I change part of the RIGHT JOIN to:
WR.Supervisor IN (SELECT Id FROM [dbo].[User] WHERE FullName IN (@Supervisors))
AND WR.cd_Manager IN (SELECT Id FROM [dbo].[User] WHERE FullName IN (@Manager))
Then it just returns 12 rows of NULL.
UPDATE 2
Sorry, this has taken so long to get a sample up. I could not get SQL Fiddle to work for SQL Server 2008/2014 so I am using rextester instead:
Sample
This shows the results as 108 lines. But what I want to show is just the first 96 lines.
UPDATE 3
I have made a slight update to the Sample. This does get the results that I want. I can set @Manager to NULL and it will pull all 108 records, or I can have the correct Manager name in there and it'll only pull those that match both Supervisor and Manager.
However, I'm doing this with an IF ELSE and I was hoping to avoid doing that as it duplicates code for the insert into the Worker table.
The description of expected results in update 3 makes it all clear now, thanks. Your 'problem' join needs to be:
RIGHT JOIN Worker_Filtered wr on (wr.Supervisor in (@Supervisors)
and case when @Manager is null then 1
else case when wr.Manager in (@Manager) then 1 else 0 end
end = 1)
By the way, I don't know what you are expecting the IN (@Supervisors) to achieve, but if you're hoping to supply a comma-separated list of supervisors as a single string and have wr.Supervisor match any one of them, then you're going to be disappointed. This query works exactly the same if you have = @Supervisors instead.
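The "optional filter" behaviour (a NULL manager means no manager filter) can be sketched in isolation. The example below uses Python's built-in sqlite3 module as a stand-in for SQL Server and expresses the same condition with OR instead of nested CASE. The worker table, its columns, and the count_workers helper are hypothetical; the data mirrors the 9-workers, 8-under-Jane example from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE worker (id INTEGER, supervisor TEXT, manager TEXT)")

# Hypothetical data: 9 workers under John Doe, 8 of them managed by Jane Doe.
workers = [(i, "John Doe", "Jane Doe") for i in range(1, 9)]
workers.append((9, "John Doe", "Someone Else"))
conn.executemany("INSERT INTO worker VALUES (?, ?, ?)", workers)


def count_workers(supervisor, manager=None):
    """Count a supervisor's workers; manager=None disables the manager filter."""
    sql = """
        SELECT COUNT(*) FROM worker
        WHERE supervisor = :supervisor
          AND (:manager IS NULL OR manager = :manager)
    """
    params = {"supervisor": supervisor, "manager": manager}
    return conn.execute(sql, params).fetchone()[0]


all_count = count_workers("John Doe")                   # no manager filter
filtered_count = count_workers("John Doe", "Jane Doe")  # supervisor and manager
print(all_count, filtered_count)
```

This keeps a single query for both cases, which is what avoids duplicating the insert logic in an IF ELSE.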

update table with values from 2 other tables

I would like to find the nearest Employee for my Customer and update the order table accordingly. I tried a query that throws an error. Can anyone suggest what I am doing wrong? The SELECT statement on its own works fine, but something is wrong with the UPDATE.
I have 3 tables
Customer_Master
Customer_ID Customer_Name WHID Cust_Location
Cust100001 Subash WH10001 0xE6100000010C1B2E724F57172A408449F1F109685340
Cust100002 Naresh WH10002 0xE6100000010CBE30992A18152A4093AAED26F8675340
Employee_Master
Emp_ID Emp_name WHID Emp_Location
Emp100001 Prakash WH10001 0xE6100000010C363B527DE7172A4069C36169E0675340
Emp100002 Suresh WH10002 0xE6100000010C98C3EE3B86172A4064E597C118685340
Emp100003 Vincent WH10001 0xE6100000010CE5B8533A58172A4090DD054A0A685340
Emp100004 Paul WH10002 0xE6100000010C2EE6E786A6142A40A0A696ADF5675340
Order_Tran
Order_ID Cust_ID Emp_ID
ORD19847 Cust100001 ?????
ORD19856 Cust100002 ?????
I have the location of the customer and of each employee in the master tables. Now I want to set Emp_ID in the Order_Tran table to the employee who is nearest to the customer's location, for customer Cust100001.
I tried the query below, which produces an error:
Update Order_Tran Set Emp_ID=(Select Top (1) Emp_ID, Employee_Master.Emp_Location.STDistance(Customer_Master.Cust_Location) AS DistanceApart FROM Customer_Master, Employee_Master WHERE Customer_ID = 'Cust100001'
and Customer_Master.WHID = Employee_Master.WHID ORDER BY DistanceApart);
Try selecting a single value in your sub-query (to stop the error) and adding an outer WHERE clause (to ensure only the orders for the specified customer are updated). Note that the customer column in Order_Tran is Cust_ID, not Customer_ID.
UPDATE Order_Tran
SET Emp_ID = (SELECT TOP (1) Employee_Master.Emp_ID
FROM Customer_Master, Employee_Master
WHERE Customer_Master.Customer_ID = 'Cust100001'
AND Customer_Master.WHID = Employee_Master.WHID
ORDER BY Employee_Master.Emp_Location.STDistance(Customer_Master.Cust_Location))
WHERE Cust_ID = 'Cust100001';

T-SQL for Updating Rows with same value in a column

I have a table, let's say called FavoriteFruits, that has NAME, FRUIT, and GUID for columns. The table is already populated with names and fruits. So let's say:
NAME FRUIT GUID
John Apple NULL
John Orange NULL
John Grapes NULL
Peter Canteloupe NULL
Peter Grapefruit NULL
Ok, now I want to update the GUID column with a new GUID (using NEWID()), but I want to have the same GUID per distinct name. So I want all the Johns to have the same GUID, and I want both the Peters to have the same GUID, but with a GUID different from the one used for the Johns. So now it would look something like this:
NAME FRUIT GUID
John Apple f6172268-78b7-4c2b-8cd7-7a5ca20f6a01
John Orange f6172268-78b7-4c2b-8cd7-7a5ca20f6a01
John Grapes f6172268-78b7-4c2b-8cd7-7a5ca20f6a01
Peter Canteloupe e3b1851c-1927-491a-803e-6b3bce9bf223
Peter Grapefruit e3b1851c-1927-491a-803e-6b3bce9bf223
Can I do that in an update statement without having to use a cursor? If so can you please give an example?
Thanks guys...
Updating through a CTE won't work because NEWID() will be evaluated per row. A table variable would work:
You should be able to use a table variable as a source from which to update the data. This is untested, but it'll look something like:
DECLARE @n TABLE (Name varchar(10), Guid uniqueidentifier);
INSERT @n
SELECT Name, newid() AS Guid
FROM FavoriteFruits
GROUP BY Name;
UPDATE f
SET f.Guid = n.Guid
FROM @n n
JOIN FavoriteFruits f ON f.Name = n.Name;
So that populates a variable with a GUID per name, then joins it back to the original table and updates accordingly.
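The same two-step idea (materialize one GUID per name first, then join back) can be sketched outside SQL Server. This example uses Python's built-in sqlite3 and uuid modules as stand-ins; the guid_by_name dictionary is a hypothetical substitute for the table variable, and uuid.uuid4() plays the part of NEWID().

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE FavoriteFruits (NAME TEXT, FRUIT TEXT, GUID TEXT)")
conn.executemany(
    "INSERT INTO FavoriteFruits (NAME, FRUIT) VALUES (?, ?)",
    [
        ("John", "Apple"),
        ("John", "Orange"),
        ("John", "Grapes"),
        ("Peter", "Canteloupe"),
        ("Peter", "Grapefruit"),
    ],
)

# Step 1: generate exactly one GUID per distinct name and keep it fixed.
names = [r[0] for r in conn.execute("SELECT DISTINCT NAME FROM FavoriteFruits")]
guid_by_name = {name: str(uuid.uuid4()) for name in names}

# Step 2: apply the stored GUIDs back to every row with that name.
conn.executemany(
    "UPDATE FavoriteFruits SET GUID = ? WHERE NAME = ?",
    [(guid, name) for name, guid in guid_by_name.items()],
)
conn.commit()

rows = conn.execute("SELECT NAME, FRUIT, GUID FROM FavoriteFruits").fetchall()
for row in rows:
    print(row)
```

The key point is that the GUIDs are generated once and stored before the update runs, so nothing is re-evaluated per row.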
To clarify comments re a table expression in the USING clause of a MERGE statement.
The following won't work because it'll evaluate per row:
MERGE INTO FavoriteFruits
USING (
SELECT NAME, NEWID() AS GUID
FROM FavoriteFruits
GROUP
BY NAME
) AS source
ON source.NAME = FavoriteFruits.NAME
WHEN MATCHED THEN
UPDATE
SET GUID = source.GUID;
But the following, using a table variable, will work:
DECLARE @n TABLE
(
NAME VARCHAR(10) NOT NULL UNIQUE,
GUID UNIQUEIDENTIFIER NOT NULL UNIQUE
);
INSERT INTO @n (NAME, GUID)
SELECT NAME, NEWID()
FROM FavoriteFruits
GROUP
BY NAME;
MERGE INTO FavoriteFruits
USING @n AS source
ON source.NAME = FavoriteFruits.NAME
WHEN MATCHED THEN
UPDATE
SET GUID = source.GUID;
There's a single-statement solution too, which, however, has some limitations. The idea is to use OPENQUERY(), like this:
UPDATE FavoriteFruits
SET GUID = n.GUID
FROM (
SELECT NAME, GUID
FROM OPENQUERY(
linkedserver,
'SELECT NAME, NEWID() AS GUID FROM database.schema.FavoriteFruits GROUP BY NAME'
)
) n
WHERE FavoriteFruits.NAME = n.NAME
This solution requires you to create a linked server that points back to the same instance. Another limitation is that you can't use this method on table variables or local temporary tables (global temporary tables will work, as will 'normal' tables).

T-SQL How To: Compare and List Duplicate Entries in a Table

SQL Server 2000. Single table has a list of users that includes a unique user ID and a non-unique user name.
I want to search the table and list out any users that share the same non-unique user name. For example, my table looks like this:
ID User Name Name
== ========= ====
0 parker Peter Parker
1 parker Mary Jane Parker
2 heroman Joseph (Joey) Carter Jones
3 thehulk Bruce Banner
What I want to do is do a SELECT and have the result set be:
ID User Name Name
== ========= ====
0 parker Peter Parker
1 parker Mary Jane Parker
from my table.
I'm not a T-SQL guru. I can do the basic joins and such, but I'm thinking there must be an elegant way of doing this. Barring elegance, there must be ANY way of doing this.
I appreciate any methods that you can help me with on this topic. Thanks!
---Dan---
One way
select t1.* from Table t1
join(
select username from Table
group by username
having count(username) >1) t2 on t1.username = t2.username
The simplest way I can think of to do this uses a sub-query:
select * from username un1 where exists
(select null from username un2
where un1.user_name = un2.user_name and un1.id <> un2.id);
The sub-query selects all names that have more than one row with that name; the outer query then selects all the rows matching those names.
SELECT T.*
FROM T
, (SELECT Dupe_candidates.USERNAME
FROM T AS Dupe_candidates
GROUP BY Dupe_candidates.USERNAME
HAVING count(*)>1
) Dupes
WHERE T.USERNAME=Dupes.USERNAME
You can try the following:
SELECT *
FROM dbo.Person as p1
WHERE
(SELECT COUNT(*) FROM dbo.Person AS p2 WHERE p2.UserName = p1.UserName) > 1;
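To see the GROUP BY ... HAVING approach return the expected rows, here is a small sketch using Python's built-in sqlite3 module as a stand-in for SQL Server. The users table and its column names are hypothetical; the data is the sample from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [
        (0, "parker", "Peter Parker"),
        (1, "parker", "Mary Jane Parker"),
        (2, "heroman", "Joseph (Joey) Carter Jones"),
        (3, "thehulk", "Bruce Banner"),
    ],
)

# The derived table finds user names that occur more than once;
# joining back to users returns every row carrying one of those names.
dupes = conn.execute(
    """
    SELECT u.id, u.username, u.name
    FROM users AS u
    JOIN (SELECT username
          FROM users
          GROUP BY username
          HAVING COUNT(*) > 1) AS d
      ON d.username = u.username
    ORDER BY u.id
    """
).fetchall()
for row in dupes:
    print(row)
```

The EXISTS and COUNT(*) sub-query variants above should give the same rows; which one performs best depends on the indexes available.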
