SQL identify duplicate and update - sql-server

I need help with the issue below. I have a customer table, CustA, with the columns custid, firstname, surname, phone1, phone2, and lastupdateddate. The table contains duplicate records. A record in CustA is considered a duplicate when firstname, surname, and (phone1 or phone2) are duplicated:
custid  firstname  surname  phone1  phone2  lastupdateddate
1000    Sam        Son      334566  NULL    1-jan-2016
1001    sam        son      NULL    334566  1-feb-2016
I have used a CTE that numbers the rows, partitioning by firstname, lastname, phone1, phone2. But the OR condition on phone1 or phone2 remains a challenge in the CTE query. Please share your thoughts. I appreciate it.

The trick here is COALESCE:
WITH cte AS
(
    -- COUNT needs an argument in T-SQL; COUNT(*) counts every row in the partition
    SELECT COUNT(*) OVER (PARTITION BY firstname, surname, COALESCE(phone1, phone2)) AS cnt, *
    FROM yourtable
)
SELECT *
FROM cte
WHERE cnt > 1
Though if it isn't the case that one of the two is always NULL, you can use a CASE expression to ensure that the values are presented in a consistent order:
WITH cte AS
(
    SELECT COUNT(*) OVER (
               PARTITION BY firstname,
                            surname,
                            CASE WHEN phone1 < phone2 THEN phone1 ELSE phone2 END,
                            CASE WHEN phone1 < phone2 THEN phone2 ELSE phone1 END) AS cnt,
           *
    FROM yourtable
)
SELECT *
FROM cte
WHERE cnt > 1

This one will also give you the list of dupes for each row (the custid<>A.custid filter is optional; it keeps a row from listing itself):
Declare @YourTable table (custid int,firstname varchar(50),surname varchar(50),phone1 varchar(25),phone2 varchar(25),lastupdate date)
Insert into @YourTable values
(1000,'Sam','Son' ,'334566',NULL ,'1-jan-2016'),
(1001,'sam','son' ,NULL ,'334566','1-feb-2016'),
(1003,'sam','son' ,NULL ,NULL ,'2-feb-2016'),
(1002,'Not','ADupe',NULL ,NULL ,'1-feb-2016')
Select A.*
      ,B.Dupes
From @YourTable A
Cross Apply (Select Dupes=(Select Stuff((Select Distinct ',' + cast(custid as varchar(25))
                                          From @YourTable
                                          Where custid<>A.custid
                                            and firstname=A.firstname
                                            and surname =A.surname
                                            and (IsNull(A.phone1,'') in (IsNull(phone1,''),IsNull(phone2,'')) or IsNull(A.phone2,'') in (IsNull(phone1,''),IsNull(phone2,'')) )
                                          For XML Path ('')),1,1,'')
                           )
            ) B
Where Dupes is not null
Returns
custid  firstname  surname  phone1  phone2  lastupdate  Dupes
1000    Sam        Son      334566  NULL    2016-01-01  1001,1003
1001    sam        son      NULL    334566  2016-02-01  1000,1003
1003    sam        son      NULL    NULL    2016-02-02  1000,1001

Related

Generate list of mismatched data element Combinations between two sources

We receive data on a weekly and monthly basis with information regarding customers. We also sometimes have the same information stored from another source. The two sources sometimes provide contradictory information regarding customers.
How would I write a query which tells me the mismatched CustomerId and corresponding Vehicle? For example, CustomerId 947623 is associated with Kia in the vendor extract [Table 1] whereas we have the same customer stored as related to Hyundai [Table 2].
Table 1: Data received from the vendor.
CustomerId  FirstName  LastName  Vehicle  MiscColumns
027548      Jane       Doe       Honda    MiscData
947623      John       Smith     Kia      MiscData
549816      Erin       Woods     Chevy    MiscData
739232      Henry      Jackson   Ford     MiscData
Table 2: Internal data records
CustomerId  FirstName  LastName  Vehicle  MiscColumns
027548      Jane       Doe       Honda    MiscData
947623      John       Smith     Hyundai  MiscData
549816      Erin       Woods     Chevy    MiscData
739232      Henry      Jackson   Ford     MiscData

Please try the following solution.
It will work starting from SQL Server 2016 onwards.
-- DDL and sample data population, start
DECLARE @TableA TABLE (CustomerId CHAR(6) PRIMARY KEY, FirstName VARCHAR(100), LastName VARCHAR(100), Vehicle VARCHAR(100));
DECLARE @TableB TABLE (CustomerId CHAR(6) PRIMARY KEY, FirstName VARCHAR(100), LastName VARCHAR(100), Vehicle VARCHAR(100));
INSERT INTO @TableA (CustomerId, FirstName, LastName, Vehicle) VALUES
('027548', 'Jane', 'Doe', 'Honda'),
('947623', 'John', 'Smith', 'Kia'),
('549816', 'Erin', 'Woods', 'Chevy'),
('739232', 'Henry', 'Jackson', 'Ford');
INSERT INTO @TableB (CustomerId, FirstName, LastName, Vehicle) VALUES
('027548', 'Jane', 'Doe', 'Honda'),
('947623', 'John', 'Smith', 'Hyundai'),
('549816', 'Erin', 'Woods', 'Chevy'),
('739232', 'Henry', 'Jackson', 'Ford');
-- DDL and sample data population, end
-- Unpivot each row into key/value pairs via JSON, then compare the two sources per CustomerId and column
SELECT CustomerId
      ,[key] AS [column]
      ,Org_Value = MAX( CASE WHEN Src=1 THEN Value END)
      ,New_Value = MAX( CASE WHEN Src=2 THEN Value END)
FROM (
      SELECT Src=1
            ,CustomerId
            ,B.*
      FROM @TableA A
      CROSS APPLY ( SELECT [Key]
                          ,COALESCE(Value, '') AS Value
                    FROM OpenJson( (SELECT A.* For JSON Path,Without_Array_Wrapper,INCLUDE_NULL_VALUES))
                  ) AS B
      UNION ALL
      SELECT Src=2
            ,CustomerId
            ,B.*
      FROM @TableB A
      CROSS APPLY ( SELECT [Key]
                          ,COALESCE(Value, '') AS Value
                    FROM OpenJson( (SELECT A.* For JSON Path,Without_Array_Wrapper,INCLUDE_NULL_VALUES))
                  ) AS B
     ) AS A
GROUP BY CustomerId,[key]
HAVING MAX(CASE WHEN Src=1 THEN Value END)
    <> MAX(CASE WHEN Src=2 THEN Value END)
ORDER BY CustomerId,[key];
Output
CustomerId  column   Org_Value  New_Value
947623      Vehicle  Kia        Hyundai
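If only the Vehicle column needs to be compared, as in the question's example, a plain join may be enough. The following is a minimal sketch, not the approach from the answer above; the table variable names @Vendor and @Internal and the trimmed-down column list are assumptions made for illustration.

```sql
-- Hypothetical table variables holding just the columns needed for the comparison.
DECLARE @Vendor   TABLE (CustomerId CHAR(6) PRIMARY KEY, Vehicle VARCHAR(100));
DECLARE @Internal TABLE (CustomerId CHAR(6) PRIMARY KEY, Vehicle VARCHAR(100));

INSERT INTO @Vendor (CustomerId, Vehicle) VALUES
    ('027548', 'Honda'), ('947623', 'Kia'), ('549816', 'Chevy'), ('739232', 'Ford');
INSERT INTO @Internal (CustomerId, Vehicle) VALUES
    ('027548', 'Honda'), ('947623', 'Hyundai'), ('549816', 'Chevy'), ('739232', 'Ford');

-- Pair up rows by CustomerId and keep only those where the two sources disagree.
SELECT v.CustomerId,
       v.Vehicle AS VendorVehicle,
       i.Vehicle AS InternalVehicle
FROM @Vendor AS v
JOIN @Internal AS i ON i.CustomerId = v.CustomerId
WHERE v.Vehicle <> i.Vehicle;
-- Returns one row: 947623, Kia, Hyundai
```

Note that `<>` evaluates to unknown when either side is NULL, so a NULL on one side is not reported as a mismatch, and the inner join skips customers that appear in only one source.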

Get value of two columns when pivoting

I have a table that has entries like
Acct Nurse EntryDateTime DBCode Answer FormSeq
123 Sally 9/8/2020 09:22 Code1 Ans1 0001
123 Jim 9/8/2020 10:25 Code1 Ans2 0001
123 Sally 9/8/2020 09:15 Code2 C2Ans1 0001
I have a query that is pivoting this to get the answer from the last entry based on DBCode and EntryDateTime that works great. What I need to do is get the NURSE as well as the answer.
So my row would be
Acct Code1 Code1Nurse Code2 Code2Nurse
123 Ans2 Jim C2Ans1 Sally
Is there a way to do this? I would need the nurse for each unique DBCode
Here is my pivot code:
SELECT * FROM (
SELECT
[AcctNumber],
[Answer],
[DBCode],
[EntryDate],
[FormCode],[FormSeq]
FROM V_FAC_MULTIAPP_FORM_WITH_HOURLY
) MultiApp
PIVOT (
MAX( [Answer])
FOR [DBCode]
IN (
[AST],
[SDRM],
[SDRF],[SDAAS],[SDDCT],[SDDAS],[SDABY],[SDDAT],[SDTRCC],[SDADMTW],[Prptic],[Pdcrt]
)
) AS PivotTable WHERE EntryDate >='8/15/2020' and FormCode='LL003' ORDER BY EntryDate

You could use conditional aggregation.
Data
drop table if exists #tTEST;
go
select * INTO #tTEST from (values
(123, 'Sally', '9/8/2020 09:22', 'Code1', 'Ans1', '0001'),
(123, 'Jim', '9/8/2020 10:25', 'Code1', 'Ans2', '0001'),
(123, 'Sally', '9/8/2020 09:15', 'Code2', 'C2Ans1', '0001')) V(Acct, Nurse, EntryDateTime, DBCode, Answer, FormSeq);
Query
;with rn_cte as (
select *, row_number() over (partition by DBCode order by EntryDateTime desc) rn
from #tTEST)
select Acct,
max(case when DBCode='Code1' and rn=1 then Answer else null end) Code1,
max(case when DBCode='Code1' and rn=1 then Nurse else null end) Code1Nurse,
max(case when DBCode='Code2' and rn=1 then Answer else null end) Code2,
max(case when DBCode='Code2' and rn=1 then Nurse else null end) Code2Nurse
from rn_cte
group by Acct;
Output
Acct Code1 Code1Nurse Code2 Code2Nurse
123 Ans2 Jim C2Ans1 Sally

Here is something to play with. It would take some tinkering, but you could make the query dynamic to build out the columns to select based upon the DBCodes you have in your table...
IF OBJECT_ID('tempdb..#V_FAC_MULTIAPP_FORM_WITH_HOURLY') IS NOT NULL
DROP TABLE #V_FAC_MULTIAPP_FORM_WITH_HOURLY;
CREATE TABLE #V_FAC_MULTIAPP_FORM_WITH_HOURLY
(
Acct INT,
Nurse VARCHAR(20),
EntryDateTime DATETIME,
DBCode VARCHAR(10),
Answer VARCHAR(10),
FormSeq VARCHAR(10)
)
INSERT #V_FAC_MULTIAPP_FORM_WITH_HOURLY
VALUES
(123,'Sally','9/8/2020 09:22','Code1','Ans1','0001'),
(123,'Jim','9/8/2020 10:25','Code1','Ans2','0001'),
(123,'Sally','9/8/2020 09:15','Code2','C2Ans1','0001');
WITH Top_Row_Per_DBCode_By_EntryDate AS
(
SELECT ROW_NUMBER() OVER (PARTITION BY DBCode ORDER BY EntryDateTime DESC) AS top_row,
Acct,
DBCode,
Nurse+':'+Answer AS Answer
FROM #V_FAC_MULTIAPP_FORM_WITH_HOURLY
), Filtered_CTE AS
(
SELECT *
FROM Top_Row_Per_DBCode_By_EntryDate
WHERE top_row = 1
)
SELECT Acct,
Substring([Code1],CHARINDEX(':',[Code1])+1,LEN([Code1])-CHARINDEX(':',[Code1])) AS Code1,
Substring([Code1],0,CHARINDEX(':',[Code1])) AS Code1Nurse,
Substring([Code2],CHARINDEX(':',[Code2])+1,LEN([Code2])-CHARINDEX(':',[Code2])) AS Code2,
Substring([Code2],0,CHARINDEX(':',[Code2])) AS Code2Nurse
FROM Filtered_CTE
PIVOT
(
MAX( [Answer]) FOR [DBCode]
IN ([Code1],[Code2])
) AS PivotTable
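The "make the query dynamic" idea could look roughly like the sketch below. It is an illustration under assumptions, not a finished solution: the temp table #Entries is a hypothetical stand-in for the view, the column list is built with FOR XML PATH (which would entity-encode characters such as & or < in a DBCode), and each pivoted column still holds the combined Nurse:Answer string rather than two separate columns.

```sql
-- Hypothetical temp table mirroring the sample rows from the question.
IF OBJECT_ID('tempdb..#Entries') IS NOT NULL
    DROP TABLE #Entries;
CREATE TABLE #Entries
(
    Acct          INT,
    Nurse         VARCHAR(20),
    EntryDateTime DATETIME,
    DBCode        VARCHAR(10),
    Answer        VARCHAR(10)
);
INSERT #Entries VALUES
    (123, 'Sally', '2020-09-08T09:22:00', 'Code1', 'Ans1'),
    (123, 'Jim',   '2020-09-08T10:25:00', 'Code1', 'Ans2'),
    (123, 'Sally', '2020-09-08T09:15:00', 'Code2', 'C2Ans1');

DECLARE @cols NVARCHAR(MAX), @sql NVARCHAR(MAX);

-- Build a comma-separated, bracket-quoted list of the distinct DBCode values,
-- e.g. [Code1],[Code2]. QUOTENAME guards against odd characters in the codes.
SELECT @cols = STUFF((
        SELECT DISTINCT ',' + QUOTENAME(DBCode)
        FROM #Entries
        FOR XML PATH('')
    ), 1, 1, '');

-- Assemble the pivot: keep the latest row per Acct and DBCode, then pivot
-- the combined "Nurse:Answer" value into one column per DBCode.
SET @sql = N'
WITH latest AS
(
    SELECT Acct, DBCode, Nurse + '':'' + Answer AS NurseAnswer,
           ROW_NUMBER() OVER (PARTITION BY Acct, DBCode
                              ORDER BY EntryDateTime DESC) AS rn
    FROM #Entries
)
SELECT Acct, ' + @cols + N'
FROM (SELECT Acct, DBCode, NurseAnswer FROM latest WHERE rn = 1) AS src
PIVOT (MAX(NurseAnswer) FOR DBCode IN (' + @cols + N')) AS p;';

EXEC sys.sp_executesql @sql;
```

The temp table is visible inside the dynamic batch because it was created in the same session. Splitting each value back into answer and nurse columns would require generating the SUBSTRING expressions dynamically as well.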

Another way to show ID in pivot

I have the following code. Is there another way to show ID in pivot besides what I have? What I have does not look very efficient:
create table #salary(
id int
, fNAme varchar(10)
, salary int
);
insert into #salary(id, fName, salary)
values(1,'jim',1000)
,(2,'mike',2000)
,(3,'tim',500)
,(1,'jim',300)
,(2,'mike',400)
,(3,'tim',250)
select 'salary' as salary, id, [jim], [mike], [tim]
from (
select id, fNAme, salary
from #salary
) x
PIVOT (
sum(salary) for fNAme in ([jim], [mike], [tim])
) as pvt
Output:
salary id jim mike tim
salary 1 1300 NULL NULL
salary 2 NULL 2400 NULL
salary 3 NULL NULL 750

If you are looking for an alternative approach you can consider conditional aggregation:
select
'salary' as salary
, id
, sum(case when fName = 'jim' then salary else null end) as jim
, sum(case when fName = 'mike' then salary else null end) as mike
, sum(case when fName = 'tim' then salary else null end) as tim
from
#salary
group by id

How to update column with if-else-if condition

I am new to SQL Server and still in the learning phase. I want to perform the following task.
I have two tables, Table1 and Table2. I want to loop over the rows of Table1 and check whether their values match any row of Table2.
Table1:
ID Name Nationality DOB Priority
--------------------------------------------
1 Sujan Nepali 1996 NULL
2 Sujan Nepali 1999 NULL
3 Sujan Chinese 1996 NULL
4 Sujan Chinese 1888 NULL
Table 2:
ID Name Nationality DOB Address Rank
---------------------------------------------------
1 Sujan Nepali 1996 Kathmandu 1
In Table1, the row with ID 1 matches Table2 on every shared column, so its Priority should be updated to 1.
For ID 2, Name and Nationality match but DOB differs, so Priority should be 2.
For ID 3, Name and DOB (the year) match Table2, so Priority should be 3.
For ID 4, only Name matches, so Priority should be 4.
Expected Output:
Table1:
ID Name Nationality DOB Priority
---------------------------------------------
1 Sujan Nepali 1996 1
2 Sujan Nepali 1999 2
3 Sujan Chinese 1996 3
4 Sujan Chinese 1888 4
I have used CASE, but I need to do this with an IF ... ELSE IF condition. Any help would be appreciated.

I guess you need something like this. Join the two tables on Name, score each row by which columns match, and update the priority from the resulting row number.
declare @table1 table (ID int, Name varchar(100), Nationality varchar(100), DOB int, Priority int)
insert into @table1
values
(1, 'Sujan', 'Nepali', 1996, NULL)
, (2, 'Sujan', 'Nepali', 1999, NULL)
, (3, 'Sujan', 'Chinese', 1996, NULL)
, (4, 'Sujan', 'Chinese', 1888, NULL)
declare @table2 table (ID int, Name varchar(100), Nationality varchar(100), DOB int, Address varchar(100), Rank int)
insert into @table2 values (1, 'Sujan', 'Nepali', 1996, 'Kathmandu', 1)
;with cte as (
select
a.*, rnk = row_number() over (order by case when a.Name = b.Name then 100 else 0 end
+ case when a.Nationality = b.Nationality then 10 else 0 end
+ case when a.DOB = b.DOB then 1 else 0 end desc)
from
@table1 a
join @table2 b on a.Name = b.Name
)
update cte
set priority = rnk
select * from @table1

Unless this is homework I see no reason to use if constructs.
update table1 set priority = (
select min(case
when table2.id = table1.id and ... then 1
when ... then 2
when ... then 3
...
end)
from table2
)
Just make sure the branches go in order of highest to lowest priority.
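For the sample data in the question, the CASE idea could be filled in roughly as follows. This is a sketch under assumptions rather than the answerer's exact query: the conditions are taken from the four rules in the question, the rows are matched on Name with a join and GROUP BY instead of a correlated subquery, and the CTE name matched is hypothetical.

```sql
-- Table variables mirroring the sample data from the question.
DECLARE @Table1 TABLE (ID INT, Name VARCHAR(100), Nationality VARCHAR(100), DOB INT, Priority INT);
DECLARE @Table2 TABLE (ID INT, Name VARCHAR(100), Nationality VARCHAR(100), DOB INT,
                       Address VARCHAR(100), [Rank] INT);

INSERT INTO @Table1 VALUES
    (1, 'Sujan', 'Nepali',  1996, NULL),
    (2, 'Sujan', 'Nepali',  1999, NULL),
    (3, 'Sujan', 'Chinese', 1996, NULL),
    (4, 'Sujan', 'Chinese', 1888, NULL);
INSERT INTO @Table2 VALUES (1, 'Sujan', 'Nepali', 1996, 'Kathmandu', 1);

-- For each Table1 row, take the best (lowest) priority over all Table2 rows
-- that share its Name. Branches run from strongest to weakest match.
WITH matched AS
(
    SELECT t1.ID,
           MIN(CASE
                   WHEN t1.Nationality = t2.Nationality AND t1.DOB = t2.DOB THEN 1
                   WHEN t1.Nationality = t2.Nationality THEN 2
                   WHEN t1.DOB = t2.DOB THEN 3
                   ELSE 4
               END) AS Priority
    FROM @Table1 AS t1
    JOIN @Table2 AS t2 ON t2.Name = t1.Name
    GROUP BY t1.ID
)
UPDATE t1
SET Priority = m.Priority
FROM @Table1 AS t1
JOIN matched AS m ON m.ID = t1.ID;

SELECT ID, Name, Nationality, DOB, Priority
FROM @Table1
ORDER BY ID;
-- Priorities for IDs 1 to 4 come out as 1, 2, 3, 4
```

Rows in Table1 whose Name has no match in Table2 are left with a NULL Priority, because the inner join drops them.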

blank entries separately counted within a "partition by" clause

I am working with SQL Server 2008.
I have a table (MyTable) which contains two columns: IDCustomer and PhoneNumber. Some of the IDCustomers do not have a PhoneNumber, i.e. the corresponding PhoneNumber entry is blank (PhoneNumber is a varchar variable).
Here I give the first entries of my Table:
IDCUstomer PhoneNumber
22
13 911
10 118
8
17 112
.... ....
I am evaluating how many times every distinct Phone Number appears using this statement:
select
PhoneNumber,
RN = ROW_NUMBER() OVER (PARTITION BY PhoneNumber ORDER BY PhoneNumber ASC)
FROM MyTable
I am intentionally not using
select PhoneNumber,
count(PhoneNumber)
from MyTable
group by PhoneNumber
because in order to achieve my final result (which is not the topic of the question) I need to use the former expression.
My question is on the result obtained using the former expression (the one with the partition by). In fact, I would expect this result:
PhoneNumber RN
2
112 1
118 1
911 1
because I know I will obtain it using the query with the group by.
But instead I get:
PhoneNumber RN
1
2
112 1
118 1
911 1
so it looks like the blank rows are counted separately and progressively. I have checked that the same happens with more than two blank entries. For instance, if I have ten blank PhoneNumbers, the result of the first query is ten blank entries in the first column and RN growing from 1 to 10 in the second column.
Hence, I would like to ask you if you know why the results are not the ones I was expecting. Am I not obtaining the expected results since I am missing something or making any mistake?
Thank you in advance.

Try this; it may help you:
;WITH cte(IDCUstomer,PhoneNumber)
AS
(
SELECT 22, NULL UNION ALL
SELECT 13, 911 UNION ALL
SELECT 10, 118 UNION ALL
SELECT 8, NULL UNION ALL
SELECT 17, 112
)
SELECT ISNULL(CAST(PhoneNumber AS VARCHAR(10)), '') AS PhoneNumber
,RN
FROM (
SELECT PhoneNumber
,RN = ROW_NUMBER() OVER (
PARTITION BY PhoneNumber ORDER BY IDCUstomer DESC
)
,ROW_NUMBER() OVER (
ORDER BY (
SELECT NULL
)
) - 1 AS Seq
FROM cte
) DT
WHERE dt.Seq > 0
Result
PhoneNumber RN
2
112 1
118 1
911 1

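If you need to stick with window partitions instead of GROUP BY, change ROW_NUMBER to COUNT(IDCustomer) OVER (PARTITION BY PhoneNumber). ROW_NUMBER numbers the rows inside each partition (1, 2, ...), which is why the blank entries appear to be counted progressively, whereas COUNT returns the partition's total on every row.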
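A minimal sketch of that windowed COUNT, assuming the blank phone numbers are stored as empty strings; the table variable @MyTable and the alias Cnt are made up for illustration, and DISTINCT is added only to collapse the repeated rows.

```sql
-- Hypothetical table variable with the first rows shown in the question.
DECLARE @MyTable TABLE (IDCustomer INT, PhoneNumber VARCHAR(8));
INSERT INTO @MyTable (IDCustomer, PhoneNumber) VALUES
    (22, ''), (13, '911'), (10, '118'), (8, ''), (17, '112');

-- COUNT ... OVER puts the partition's total on every row of that partition;
-- DISTINCT then leaves one row per phone number.
SELECT DISTINCT
       PhoneNumber,
       COUNT(IDCustomer) OVER (PARTITION BY PhoneNumber) AS Cnt
FROM @MyTable
ORDER BY PhoneNumber;
-- Rows returned: ('', 2), ('112', 1), ('118', 1), ('911', 1)
```

Without DISTINCT the query returns one row per customer, each carrying its partition's count, which is often what is wanted when the count feeds a later step.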

As explained in the comments...
declare @table table (IDCUstomer int, Phonenumber varchar(8))
insert into @table
values
(1,' '),
(2,' '),
(3,'123'),
(4,'456')
select
Phonenumber,
ROW_NUMBER() over (partition by Phonenumber order by Phonenumber) as RN
from @table
group by Phonenumber
If you have NULL values in the data, you can use this:
declare @table table (IDCUstomer int, Phonenumber varchar(8))
insert into @table
values
(1,' '),
(2,' '),
(3,'123'),
(4,'456'),
(5,NULL)
select
Phonenumber,
ROW_NUMBER() over (partition by Phonenumber order by Phonenumber) as RN
from
(select isnull(Phonenumber,'') as Phonenumber from @table) x
group by Phonenumber
