How to properly join when comparing the difference between two tables pertaining to specific fields

How to properly join when comparing the difference between two tables pertaining to specific fields - sql-server

I'm having problems writing this query. So I'm comparing a temp table to a table within our database. I want to find any records that don't have the same Case_Number the same Id_Number combination between the two table. The query I am using only provides me one or the other depending on how I join them. If I join by Case_Number, it returns the Case_Number records that do not match between the two tables. If I join by Id_Number, it will return the Id_Numbers that do not match between the two tables.. Is there a way to join by both Case_Number and ID_Number so that the query returns both? I would also like to know if it would be possible for me to include an "If Exist" to the query? Code below
SELECT T1.Case_Number, T1.Id_Number, T1.FirstDate, T1.LastDate, T2.Case_Number, T2.Id_Number, T2.FirstDate, T2.LastDate
FROM dbo.table T
inner join #TempTable T2
on T1.Id_Number = T2.Id_Number
--on T1.Case_Number = T2.Case_Number
where T1.LastDate is null
and T1.Case_Number <> T2.Case_Number
OR T1.Id_number <> T2.Id_Number

Use or in inner join statement
SELECT T1.Case_Number,
T1.Id_Number,
T1.FirstDate,
T1.LastDate,
T2.Case_Number,
T2.Id_Number,
T2.FirstDate,
T2.LastDate
FROM dbo.table T
INNER JOIN #TempTable T2 ON (T1.Id_Number = T2.Id_Number
OR T1.Case_Number = T2.Case_Number)
WHERE T1.LastDate IS NULL
AND T1.Case_Number <> T2.Case_Number
OR T1.Id_number <> T2.Id_Number

This should find both - firsst colum tells you what is the case
(
SELECT 'in Table not in Temp' as R,
T1.Case_Number as 'T1.Case_Number',
T1.Id_Number as 'T1.Id_Number',
T1.FirstDate as 'T1.FirstDate',
T1.LastDate as 'T1.LastDate',
null as 'T2.Case_Number',
null as 'T2.Id_Number',
null as 'T2.FirstDate',
null as 'T2.LastDate'
FROM dbo.table T1
where not exists (
select 1
from #TempTable T2
where T1.Case_Number == T2.CaseNumber
and T1.Id_Number == T2.Id_Number)
)
UNION ALL
(
SELECT 'in temp, not in list' as R
null as 'T1.Case_Number',
null as 'T1.Id_Number',
null as 'T1.FirstDate',
null as 'T1.LastDate',
T2.Case_Number as 'T2.Case_Number',
T2.Id_Number as 'T2.Id_Number',
T2.FirstDate as 'T2.FirstDate',
T2.LastDate as 'T2.LastDate'
FROM #TempTable T2
where not exists (
select 1
from dbo.table T1
where T1.Case_Number == T2.CaseNumber
and T1.Id_Number == T2.Id_Number )
)
Add the T1.LastDate is null - not sure where you want it.
You probably need to tweak the column names , I switched T1/T2 for 2nd statement so column will be the same table all the time, but names have duplicates.

Related

Select distinct on declared table

I need to create concatenated query will get data with checking data in other tables.
Firstly I declared (and filled it) two tables with list of ID I'll check in other tables. On next step (2) I declared new table and filled it values I get by some checking params. This table contains only one column (ID).
After filling if I execute SELECT DISTINCT query from this table I'll get really unique ID. It's OK.
But on next step (3) I declare more one table and filling it by 3 tables. Of course it contains many duplicates. But I must create this query for checking and concatenating. And after that if I execute select distinct h from #NonUnicalConcat it returns many duplicate IDs.
What I did wrong? Where is an error?
USE CurrentBase;
--STEP 1
DECLARE #TempCSTable TABLE(TempTableIDColumn int);
INSERT INTO #TempCSTable VALUES('3'),('4');
DECLARE #TempCVTable TABLE(TempTableIDColumn int);
INSERT INTO #TempCVTable VALUES('2'),('13');
--STEP 2
DECLARE #TempIdTable TABLE(id int);
INSERT INTO #TempIdTable
SELECT TT1.ID
FROM Table1 AS TT1
LEFT OUTER JOIN Table2 ON Table2.ID = TT1.OptionalColumn
LEFT OUTER JOIN Table3 AS TT2 ON TT2.ID = TT1.OptionalColumn
LEFT OUTER JOIN Table4 AS TT3 ON TT3.ID = TT2.OptionalColumn
WHERE TT1.ValueDate > '2020-06-30'
AND TT1.ValueDate < '2020-08-04'
AND TT1.OptBool = '1'
AND TT1.OptBool2 = '0'
AND EXISTS
(
SELECT Table5.ID
FROM Table5
WHERE Table5.ID = TT1.ID
AND Table5.CV IN
(
SELECT TempTableIDColumn
FROM #TempCVTable
)
AND Table5.OptBool = '1'
)
AND EXISTS
(
SELECT Table6.ID
FROM Table6
WHERE Table6.IID = TT3.ID
AND Table6.CS IN
(
SELECT TempTableIDColumn
FROM #TempCSTable
)
);
SELECT distinct * FROM #TempIdTable;--this code realy select distinct
--STEP 3
DECLARE #NonUnicalConcat TABLE(c int, s int, h int);
INSERT INTO #NonUnicalConcat
SELECT TT1.TempTableIDColumn AS cc,
TT2.TempTableIDColumn AS ss,
TT3.id AS hh
FROM #TempCVTable AS TT1,
#TempCSTable AS TT2,
#TempIdTable AS TT3
WHERE NOT EXISTS
(
SELECT HID
FROM OtherBase.dbo.Table1
WHERE HID = TT3.id
AND CS = TT2.TempTableIDColumn
AND CV = TT1.TempTableIDColumn
);
select distinct h from #NonUnicalConcat;--this code return many duplicates

Update records SQL?

First when I started this project seemed very simple. Two tables, field tbl1_USERMASTERID in Table 1 should be update from field tbl2_USERMASTERID Table 2. After I looked deeply in Table 2, there is no unique ID that I can use as a key to join these two tables. Only way to match the records from Table 1 and Table 2 is based on FIRST_NAME, LAST_NAME AND DOB. So I have to find records in Table 1 where:
tbl1_FIRST_NAME equals tbl2_FIRST_NAME
AND
tbl1_LAST_NAME equals tbl2_LAST_NAME
AND
tbl1_DOB equals tbl2_DOB
and then update USERMASTERID field. I was afraid that this can cause some duplicates and some users will end up with USERMASTERID that does not belong to them. So if I find more than one record based on first,last name and dob those records would not be updated. I would like just to skip and leave them blank. That way I wouldn't populate invalid USERMASTERID. I'm not sure what is the best way to approach this problem, should I use SQL or ColdFusion (my server side language)? Also how to detect more than one matching record?
Here is what I have so far:
UPDATE Table1 AS tbl1
LEFT OUTER JOIN Table2 AS tbl2
ON tbl1.dob = tbl2.dob
AND tbl1.fname = tbl2.fname
AND tbl1.lname = tbl2.lname
SET tbl1.usermasterid = tbl2.usermasterid
WHERE LTRIM(RTRIM(tbl1.usermasterid)) = ''
Here is query where I tried to detect duplicates:
SELECT DISTINCT
tbl1.FName,
tbl1.LName,
tbl1.dob,
COUNT(*) AS count
FROM Table1 AS tbl1
LEFT OUTER JOIN Table2 AS tbl2
ON tbl1.dob = tbl2.dob
AND tbl1.FName = tbl2.first
AND tbl1.LName = tbl2.last
WHERE LTRIM(RTRIM(tbl1.usermasterid)) = ''
AND LTRIM(RTRIM(tbl1.first)) <> ''
AND LTRIM(RTRIM(tbl1.last)) <> ''
AND LTRIM(RTRIM(tbl1.dob)) <> ''
GROUP BY tbl1.FName,tbl1.LName,tbl1.dob
Some data after I tested query above:
First Last DOB Count
John Cook 2008-07-11 2
Kate Witt 2013-06-05 1
Deb Ruis 2016-01-22 1
Mike Bennet 2007-01-15 1
Kristy Cruz 1997-10-20 1
Colin Jones 2011-10-13 1
Kevin Smith 2010-02-24 1
Corey Bruce 2008-04-11 1
Shawn Maiers 2016-08-28 1
Alenn Fitchner 1998-05-17 1
If anyone have idea how I can prevent/skip updating duplicate records or how to improve this query please let me know. Thank you.

You could check for and avoid duplicate matches using with common_table_expression (Transact-SQL)
along with row_number()., like so:
with cte as (
select
t.fname
, t.lname
, t.dob
, t.usermasterid
, NewUserMasterId = t2.usermasterid
, rn = row_number() over (partition by t.fname, t.lname, t.dob order by t2.usermasterid)
from table1 as t
inner join table2 as t2 on t.dob = t2.dob
and t.fname = t2.fname
and t.lname = t2.lname
and ltrim(rtrim(t.usermasterid)) = ''
)
--/* confirm these are the rows you want updated
select *
from cte as t
where t.NewUserMasterId != ''
and not exists (
select 1
from cte as i
where t.dob = i.dob
and t.fname = i.fname
and t.lname = i.lname
and i.rn>1
);
--*/
/* update those where only 1 usermasterid matches this record
update t
set t.usermasterid = t.NewUserMasterId
from cte as t
where t.NewUserMasterId != ''
and not exists (
select 1
from cte as i
where t.dob = i.dob
and t.fname = i.fname
and t.lname = i.lname
and i.rn>1
);
--*/
I use the cte to extract out the sub query for readability. Per the documentation, a common table expression (cte):
Specifies a temporary named result set, known as a common table expression (CTE). This is derived from a simple query and defined within the execution scope of a single SELECT, INSERT, UPDATE, or DELETE statement.
Using row_number() to assign a number for each row, starting at 1 for each partition of t.fname, t.lname, t.dob. Having those numbered allows us to check for the existence of duplicates with the not exists() clause with ... and i.rn>1

You could use a CTE to filter out the duplicates from Table1 before joining:
; with CTE as (select *
, count(ID) over (partition by LastName, FirstName, DoB) as IDs
from Table1)
update a
set a.ID = b.ID
from Table2 a
left join CTE b
on a.FirstName = b.FirstName
and a.LastName = b.LastName
and a.Dob = b.Dob
and b.IDs = 1
This will work provided there are no exact duplicates (same demographics and same ID) in table 1. If there are exact duplicates, they will also be excluded from the join, but you can filter them out before the CTE to avoid this.

Please try below SQL:
UPDATE Table1 AS tbl1
INNER JOIN Table2 AS tbl2
ON tbl1.dob = tbl2.dob
AND tbl1.fname = tbl2.fname
AND tbl1.lname = tbl2.lname
LEFT JOIN Table2 AS tbl3
ON tbl3.dob = tbl2.dob
AND tbl3.fname = tbl2.fname
AND tbl3.lname = tbl2.lname
AND tbl3.usermasterid <> tbl2.usermasterid
SET tbl1.usermasterid = tbl2.usermasterid
WHERE LTRIM(RTRIM(tbl1.usermasterid)) = ''
AND tbl3.usermasterid is null

Display common values in two tables that are not present in the third table

The three tables in question are:
Table A - relevant columns are TimeTicket and IdAddress
Table B - relevant columns are CommunicationNumber, TimeCreate and IdAddress.
Table C - relevant columns are CommunicationNumber, LastCalled, NextCall
Table C is created by a join of TableA and TableB on IdAddress
INSERT INTO tblC ([CommunicationNumber], [LastCalled] ,[NextCall])
SELECT T2.CommunicationNumber, T2.TimeCreate, T1.TimeTicket
FROM tblA T1
INNER JOIN tblB T2
ON T1.IdAddress = T2.IdAddress AND T2.CommunicationNumber IS NOT NULL
That's one part of the process, and that's fine.
Now, when there is new data in Table A and Table B, I want to update the data entries in Table C. However, I want to ignore the values from Table A and Table B that I have already entered into Table C.
To achieve this, I used NOT EXISTS and wrote a query that looks like this.
INSERT INTO tblC ([CommunicationNumber], [LastCalled] ,[NextCall])
SELECT T2.CommunicationNumber, T2.TimeCreate, T1.TimeTicket
FROM tblA T1
INNER JOIN tblB T2
ON T1.IdAddress = T2.IdAddress AND T2.CommunicationNumber IS NOT NULL
WHERE NOT EXISTS (SELECT T3.CommunicationNumber
FROM [dbo].[tblPhoneLogRep] T3
WHERE T1.TimeTicket <> T3.NextCall AND T2.TimeCreate <> T3.LastCalled AND T2.CommunicationNumber <> T3.CommunicationNumber)
However, this query always returns an empty set.
Could someone please explain to me what is it that I am doing incorrectly?

Try using the EXCEPT set operator:
INSERT INTO tblC ([CommunicationNumber], [LastCalled] ,[NextCall])
SELECT T2.CommunicationNumber, T2.TimeCreate, T1.TimeTicket
FROM tblA T1
INNER JOIN tblB T2
ON T1.IdAddress = T2.IdAddress AND T2.CommunicationNumber IS NOT NULL
EXCEPT
SELECT CommunicationNumber, LastCalled, NextCall FROM tblC
To fix your existing query, you would need to change your <> operators to = operators, like so:
INSERT INTO tblC ([CommunicationNumber], [LastCalled] ,[NextCall])
SELECT T2.CommunicationNumber, T2.TimeCreate, T1.TimeTicket
FROM tblA T1
INNER JOIN tblB T2
ON T1.IdAddress = T2.IdAddress AND T2.CommunicationNumber IS NOT NULL
WHERE NOT EXISTS (SELECT 1
FROM tblC
WHERE T1.TimeTicket = tblC.NextCall AND T2.TimeCreate = tblC.LastCalled AND T2.CommunicationNumber = tblC.CommunicationNumber)
Personally, I think the EXCEPT syntax is more clear though.

Your issue is that you are essentially using a double negative. You are saying NOT EXISTS and you are setting your WHERE criteria to <>. I think it would work out if you either used EXISTS or change you criteria =.

Conditional JOIN Statement SQL Server

Is it possible to do the following:
IF [a] = 1234 THEN JOIN ON TableA
ELSE JOIN ON TableB
If so, what is the correct syntax?

I think what you are asking for will work by joining the Initial table to both Option_A and Option_B using LEFT JOIN, which will produce something like this:
Initial LEFT JOIN Option_A LEFT JOIN NULL
OR
Initial LEFT JOIN NULL LEFT JOIN Option_B
Example code:
SELECT i.*, COALESCE(a.id, b.id) as Option_Id, COALESCE(a.name, b.name) as Option_Name
FROM Initial_Table i
LEFT JOIN Option_A_Table a ON a.initial_id = i.id AND i.special_value = 1234
LEFT JOIN Option_B_Table b ON b.initial_id = i.id AND i.special_value <> 1234
Once you have done this, you 'ignore' the set of NULLS. The additional trick here is in the SELECT line, where you need to decide what to do with the NULL fields. If the Option_A and Option_B tables are similar, then you can use the COALESCE function to return the first NON NULL value (as per the example).
The other option is that you will simply have to list the Option_A fields and the Option_B fields, and let whatever is using the ResultSet to handle determining which fields to use.

This is just to add the point that query can be constructed dynamically based on conditions.
An example is given below.
DECLARE #a INT = 1235
DECLARE #sql VARCHAR(MAX) = 'SELECT * FROM [sourceTable] S JOIN ' + IIF(#a = 1234,'[TableA] A ON A.col = S.col','[TableB] B ON B.col = S.col')
EXEC(#sql)
--Query will be
/*
SELECT * FROM [sourceTable] S JOIN [TableB] B ON B.col = S.col
*/

You can solve this with union
select a, b
from tablea
join tableb on tablea.a = tableb.a
where b = 1234
union
select a, b
from tablea
join tablec on tablec.a = tableb.a
where b <> 1234

I disagree with the solution suggesting 2 left joins. I think a table-valued function is more appropriate so you don't have all the coalescing and additional joins for each condition you would have.
CREATE FUNCTION f_GetData (
#Logic VARCHAR(50)
) RETURNS #Results TABLE (
Content VARCHAR(100)
) AS
BEGIN
IF #Logic = '1234'
INSERT #Results
SELECT Content
FROM Table_1
ELSE
INSERT #Results
SELECT Content
FROM Table_2
RETURN
END
GO
SELECT *
FROM InputTable
CROSS APPLY f_GetData(InputTable.Logic) T

I think it will be better to think about your query in a different way and treat them more like sets.
I do believe if you make two separate queries then join them using UNION, It will be much better in performance and more readable.

Finding difference between 2 tables in MS Access or SQL Server

I have 2 Excel files which I imported into MS Access as two tables. These two tables are identical but imported on different dates.
Now, how can I find out what rows and what fields are updated on the later date? Any help would be highly appreciated.

Finding Inserted records is easy
select * from B where not exists (select 1 from A where A.pk=B.pk)
Finding Deleted records is just as easy
select * from A where not exists (select 1 from B where A.pk=B.pk)
Finding Updated records is a pain. The following rigorous query assumes you have nullable columns and it should work in all situations.
select B.*
from B
inner join A on B.pk=A.pk
where A.col1<>B.col1 or (IsNull(A.col1) and not IsNull(B.col1)) or (not IsNull(A.col1) and IsNull(B.col1))
or A.col2<>B.col2 or (IsNull(A.col2) and not IsNull(B.col2)) or (not IsNull(A.col2) and IsNull(B.col2))
or A.col3<>B.col3 or (IsNull(A.col3) and not IsNull(B.col3)) or (not IsNull(A.col3) and IsNull(B.col3))
etc...
If the columns are defined as NOT NULL then the query is much simper, just remove all the NULL tests.
If the columns are nullable but you can identify a value that will never appear in the data, then use a simple comparison like:
Nz(A.col1,neverAppearingValue)<>Nz(B.col1,neverAppearingValue)

I believe this should be as simple as running a query like this:
SELECT *
FROM Table1
JOIN Table2
ON Table1.ID = Table2.ID AND Table1.Date != Table2.Date

One way to do this is by unpivoting both tables, so you get a new table with , , . Note, though, that you have to take types into account.
For example, the following gets differences in fields:
with oldt as (select id, col, val
from <old table> t
unpivot (val for col in (<column list>)) unpvt
),
newt as (select id, col, val
from <new table> t
unpivot (val for col in (<column list>)) unpvit
)
select *
from oldt full outer join newt on oldt.id = newt.id
where oldt.id is null or newt.id is null
The alternative way with a join is rather cumbersome. This version shows whether columns are added, deleted, and which columns changed if any:
select *
from (select coalesce(oldt.id, newt.id) as id,
(case when oldt.id is null and newt.id is not null then 'ADDED'
when oldt.id is not null and newt.id is null then 'DELETED'
else 'SAME'
end) as stat,
(case when oldt.col1 <> newt.col1 or oldt.col1 is null and newt.col1 is null
then 1 else 0 end) as diff_col1,
(case when oldt.col2 <> newt.col2 or oldt.col2 is null and newt.col2 is null
then 1 else 0 end) as diff_col2,
...
from <old table> oldt full outer join <new table> newt on oldt.id = newt.id
) c
where status in ('ADDED', 'DELETED') or
(diff_col1 + diff_col2 + ... ) > 0
It does have the advantage of working for any data types.

(Select * from OldTable Except Select *from NewTable)
Union All
(Select * from NewTable Except Select *from OldTable)

Develop Reference

c reactjs sql-server angularjs arrays wpf database batch-file google-app-engine silverlight

How to properly join when comparing the difference between two tables pertaining to specific fields - sql-server

Related

Select distinct on declared table

Update records SQL?

Display common values in two tables that are not present in the third table

Conditional JOIN Statement SQL Server

Finding difference between 2 tables in MS Access or SQL Server

Categories

Resources