Display common values in two tables that are not present in the third table - sql-server

The three tables in question are:
Table A - relevant columns are TimeTicket and IdAddress
Table B - relevant columns are CommunicationNumber, TimeCreate and IdAddress.
Table C - relevant columns are CommunicationNumber, LastCalled, NextCall
Table C is created by a join of TableA and TableB on IdAddress
INSERT INTO tblC ([CommunicationNumber], [LastCalled] ,[NextCall])
SELECT T2.CommunicationNumber, T2.TimeCreate, T1.TimeTicket
FROM tblA T1
INNER JOIN tblB T2
ON T1.IdAddress = T2.IdAddress AND T2.CommunicationNumber IS NOT NULL
That's one part of the process, and that's fine.
Now, when there is new data in Table A and Table B, I want to update the data entries in Table C. However, I want to ignore the values from Table A and Table B that I have already entered into Table C.
To achieve this, I used NOT EXISTS and wrote a query that looks like this.
INSERT INTO tblC ([CommunicationNumber], [LastCalled] ,[NextCall])
SELECT T2.CommunicationNumber, T2.TimeCreate, T1.TimeTicket
FROM tblA T1
INNER JOIN tblB T2
ON T1.IdAddress = T2.IdAddress AND T2.CommunicationNumber IS NOT NULL
WHERE NOT EXISTS (SELECT T3.CommunicationNumber
FROM [dbo].[tblPhoneLogRep] T3
WHERE T1.TimeTicket <> T3.NextCall AND T2.TimeCreate <> T3.LastCalled AND T2.CommunicationNumber <> T3.CommunicationNumber)
However, this query always returns an empty set.
Could someone please explain to me what is it that I am doing incorrectly?

Try using the EXCEPT set operator:
INSERT INTO tblC ([CommunicationNumber], [LastCalled] ,[NextCall])
SELECT T2.CommunicationNumber, T2.TimeCreate, T1.TimeTicket
FROM tblA T1
INNER JOIN tblB T2
ON T1.IdAddress = T2.IdAddress AND T2.CommunicationNumber IS NOT NULL
EXCEPT
SELECT CommunicationNumber, LastCalled, NextCall FROM tblC
To fix your existing query, you would need to change your <> operators to = operators, like so:
INSERT INTO tblC ([CommunicationNumber], [LastCalled] ,[NextCall])
SELECT T2.CommunicationNumber, T2.TimeCreate, T1.TimeTicket
FROM tblA T1
INNER JOIN tblB T2
ON T1.IdAddress = T2.IdAddress AND T2.CommunicationNumber IS NOT NULL
WHERE NOT EXISTS (SELECT 1
FROM tblC
WHERE T1.TimeTicket = tblC.NextCall AND T2.TimeCreate = tblC.LastCalled AND T2.CommunicationNumber = tblC.CommunicationNumber)
Personally, I think the EXCEPT syntax is more clear though.

Your issue is that you are essentially using a double negative. You are saying NOT EXISTS and you are setting your WHERE criteria to <>. I think it would work out if you either used EXISTS or change you criteria =.

Related

Many left join on the same table

I have a table in 2 different database. Table1 in DataBase1 and Table2 in database2.
Table1 and Table2 have the same columns with different rows content.
each row correspond to a parcel number (TShipping_Tracking or TShipping_Reference or TShipping_OrderRef or TShipping_Barcode) and an ID (TShipping_ID). Remarque: for each parcel (row) only 1 of the 4 column , related to parcel number, listed above is not null
These are the schema of the table in each database:
create table Database1..Table1 (TShipping_ID varchar(50),TShipping_Tracking varchar(50),TShipping_Reference varchar(50),TShipping_OrderRef varchar(50),TShipping_Barcode varchar(50))
create table Database2..Table2 (TShipping_ID varchar(50),TShipping_Tracking varchar(50),TShipping_Reference varchar(50),TShipping_OrderRef varchar(50),TShipping_Barcode varchar(50))
Moreover, I have the table Database1..Reject having the same columns as Table1 (and Table2) exept TShipping_ID:
create table Database1..Table3(TShipping_Tracking varchar(50),TShipping_Reference varchar(50),TShipping_OrderRef varchar(50),TShipping_Barcode varchar(50))
I want to retreive TShipping_ID of the parcel that does not exist in Database1
I did the following query but it has a very bad response time:
select isnull(isnull(isnull(D2t1 .TShipping_ID,D2t2.TShipping_ID),D2t3.tshipping_id),D2t4.tshipping_id) as TShipping_ID
from Database1..Table3 D1t3
left join Database1..Table1 D1t1 on D1t3.TShipping_tracking=D1t1.TShipping_tracking
left join Database1..Table1 D1t2 on D1t3.TShipping_Reference=D1t2.TShipping_Reference
left join Database1..Table1 D1t3 on D1t3.TShipping_OrderRef=D1t3.TShipping_OrderRef
left join Database1..Table1 D1t4 on D1t3.TShipping_barcode=D1t4.TShipping_barcode
left join Database2..Table2 D2t1 on D1t3.TShipping_tracking=D2t1.TShipping_tracking
left join Database2..Table2 D2t2 on D1t3.TShipping_Reference=D2t2.TShipping_Reference
left join Database2..Table2 D2t3 on D1t3.TShipping_OrderRef=D2t3.TShipping_OrderRef
left join Database2..Table2 D2t4 on D1t3.TShipping_barcode=D2t4.TShipping_barcode
where D1t1.TShipping_Tracking is null and D1t2.TShipping_Reference is null and D1t3.TShipping_OrderRef is null and D1t4.TShipping_BarCode is null
Does anyone has a better way to do it?
Thanks
Assuming that Table1.TShipping_ID is a NOT NULL field:
SELECT t2.TShipping_ID
FROM Table3 t3
LEFT JOIN Table2 t2
ON t2.TShipping_Tracking = t3.TShipping_Tracking
OR t2.TShipping_Reference = t3.TShipping_Reference
OR t2.TShipping_OrderRef = t3.TShipping_OrderRef
OR t2.TShipping_Barcode = t3.TShipping_Barcode
LEFT JOIN Table2 t1
ON t1.TShipping_Tracking = t3.TShipping_Tracking
OR t1.TShipping_Reference = t3.TShipping_Reference
OR t1.TShipping_OrderRef = t3.TShipping_OrderRef
OR t1.TShipping_Barcode = t3.TShipping_Barcode
WHERE t1.TShipping_ID IS NULL
This will still be slow as hell, not the least of which because it's pretty close to a CROSS JOIN, but you don't supply enough information about what you're doing or what the key fields of the tables are to make me think you can replace the OR with AND in the join conditions. I suspect that may be the case, however.

Conditional JOIN Statement SQL Server

Is it possible to do the following:
IF [a] = 1234 THEN JOIN ON TableA
ELSE JOIN ON TableB
If so, what is the correct syntax?
I think what you are asking for will work by joining the Initial table to both Option_A and Option_B using LEFT JOIN, which will produce something like this:
Initial LEFT JOIN Option_A LEFT JOIN NULL
OR
Initial LEFT JOIN NULL LEFT JOIN Option_B
Example code:
SELECT i.*, COALESCE(a.id, b.id) as Option_Id, COALESCE(a.name, b.name) as Option_Name
FROM Initial_Table i
LEFT JOIN Option_A_Table a ON a.initial_id = i.id AND i.special_value = 1234
LEFT JOIN Option_B_Table b ON b.initial_id = i.id AND i.special_value <> 1234
Once you have done this, you 'ignore' the set of NULLS. The additional trick here is in the SELECT line, where you need to decide what to do with the NULL fields. If the Option_A and Option_B tables are similar, then you can use the COALESCE function to return the first NON NULL value (as per the example).
The other option is that you will simply have to list the Option_A fields and the Option_B fields, and let whatever is using the ResultSet to handle determining which fields to use.
This is just to add the point that query can be constructed dynamically based on conditions.
An example is given below.
DECLARE #a INT = 1235
DECLARE #sql VARCHAR(MAX) = 'SELECT * FROM [sourceTable] S JOIN ' + IIF(#a = 1234,'[TableA] A ON A.col = S.col','[TableB] B ON B.col = S.col')
EXEC(#sql)
--Query will be
/*
SELECT * FROM [sourceTable] S JOIN [TableB] B ON B.col = S.col
*/
You can solve this with union
select a, b
from tablea
join tableb on tablea.a = tableb.a
where b = 1234
union
select a, b
from tablea
join tablec on tablec.a = tableb.a
where b <> 1234
I disagree with the solution suggesting 2 left joins. I think a table-valued function is more appropriate so you don't have all the coalescing and additional joins for each condition you would have.
CREATE FUNCTION f_GetData (
#Logic VARCHAR(50)
) RETURNS #Results TABLE (
Content VARCHAR(100)
) AS
BEGIN
IF #Logic = '1234'
INSERT #Results
SELECT Content
FROM Table_1
ELSE
INSERT #Results
SELECT Content
FROM Table_2
RETURN
END
GO
SELECT *
FROM InputTable
CROSS APPLY f_GetData(InputTable.Logic) T
I think it will be better to think about your query in a different way and treat them more like sets.
I do believe if you make two separate queries then join them using UNION, It will be much better in performance and more readable.

Get a collision free hash for a specific query or a view with SQL Server 2008

I am working on a project where I need to synchronize data from our system to an external system. What I want to achieve, is to periodically send only changed items (rows) from a custom query. This query looks like this (but with many more columns) :
SELECT T1.field1,
T1.field2,
T1.field2,
T1.field3,
CASE WHEN T1.field4 = 'some-value' THEN 1 ELSE 0 END,
T2.field1,
T3.field1,
T4.field1
FROM T1
INNER JOIN T2 ON T2.pk = T2.fk
INNER JOIN T3 ON T3.pk = T2.fk
INNER JOIN T4 ON T4.pk = T2.fk
I want to avoid to have to compare every field one to one between synchronizations. I came with the idea that I could generate a hash for every row from my query, and compare this with the hash from the previous synchronization, which will return only the changed rows. I am aware of the CHECKSUM function, but it is very collision-prone and might miss changes sometimes. However I like the way I could just make a temp table and use CHECKSUM(*), which makes maintenance easier (not having to add fields in the query and in the CHECKSUM) :
SELECT T1.field1,
T1.field2,
T1.field2,
T1.field3,
CASE WHEN T1.field4 = 'some-value' THEN 1 ELSE 0 END,
T2.field1,
T3.field1,
T4.field1
INTO #tmp
FROM T1
INNER JOIN T2 ON T2.pk = T2.fk
INNER JOIN T3 ON T3.pk = T2.fk
INNER JOIN T4 ON T4.pk = T2.fk;
-- get all columns from the query, plus a hash of the row
SELECT *, CHECKSUM(*)
FROM #tmp;
I am aware of HASHBYTES function (which supports sha1, md5, which are less prone to collisions), but it only accept varchar or varbinary, not a list of columns or * the way CHECKSUM does. Having to cast/convert every column from the query is a pain in the ... and opens the door to errors (forget to include a new field for instance)
I also noticed Change Data Capture and Change Tracking features of SQL Server, but they all seems complicated and overkill for what I am doing.
So my question : is there an other method to generate a hash from a query or a temp table that meets my criterias ?
If not, is there an other way to achieve this kind of work (to sync differences from a query)
I found a way to do exactly what I wanted, thanks to the FOR XML clause :
SELECT T1.field1,
T1.field2,
T1.field2,
T1.field3,
CASE WHEN T1.field4 = 'some-value' THEN 1 ELSE 0 END,
T2.field1,
T3.field1,
T4.field1
INTO #tmp
FROM T1
INNER JOIN T2 ON T2.pk = T2.fk
INNER JOIN T3 ON T3.pk = T2.fk
INNER JOIN T4 ON T4.pk = T2.fk;
-- get all columns from the query, plus a hash of the row (converted in an hex string)
SELECT T.*, CONVERT(VARCHAR(100), HASHBYTES('sha1', (SELECT T.* FOR XML RAW)), 2) AS sHash
FROM #tmp AS T;

SQL Server stored procedure delete statement

I wish to delete a result a select statement returns, reason I'm doing this is because I have relationships between tables and if I delete from the top-most table its children rows in other tables have to be deleted, too.
Can anyone correct this stored procedure for me please?
ALTER proc [dbo].[storedprocname]
(#Parameter uniqueidentifier = '00000000-0000-0000-0000-000000000000')
AS BEGIN
DELETE FROM TableOne
WHERE IDOne IN
(SELECT
IDOne,
DescOne, IndexOne,
IDTwo,
QuestionTwo, ControlTypeTwo, IndexTwo,
IDThree,
DescThree, IndexThree,
QuestionFour,
OptionFour
FROM
TableOne
INNER JOIN
TableTwo ON TableTwo.CatID = TableOne.IDOne
INNER JOIN
TableThree ON TableThree.Question = TableTwo.IDTwo
LEFT OUTER JOIN
TableFour ON TableFour.Question = TableThree.IDThree
WHERE
TableOne.IDOne = #Parameter)
END
ALTER proc [dbo].[storedprocname]
(#Parameter uniqueidentifier = '00000000-0000-0000-0000-000000000000')
AS BEGIN
DELETE FROM TableOne
WHERE IDOne IN
(SELECT
IDOne
FROM
TableOne
INNER JOIN
TableTwo ON TableTwo.CatID = TableOne.IDOne
INNER JOIN
TableThree ON TableThree.Question = TableTwo.IDTwo
LEFT OUTER JOIN
TableFour ON TableFour.Question = TableThree.IDThree
WHERE
TableOne.IDOne = #Parameter)
END
Since you want to delete all rows where IDOne is in a list of possible values - then you need to make sure the subquery after the IN (...) also returns a single column which can be used to compare! After all, you cannot compare a single IDOne value to the whole list of columns that you're currently returning ....
Try something like this:
ALTER proc [dbo].[storedprocname]
(#Parameter uniqueidentifier = '00000000-0000-0000-0000-000000000000')
AS BEGIN
DELETE FROM TableOne
WHERE IDOne IN
(SELECT
IDOne
FROM
TableOne
INNER JOIN
TableTwo ON TableTwo.CatID = TableOne.IDOne
INNER JOIN
TableThree ON TableThree.Question = TableTwo.IDTwo
LEFT OUTER JOIN
TableFour ON TableFour.Question = TableThree.IDThree
WHERE
TableOne.IDOne = #Parameter)
END
I don't know (your question is too vague and not clear enough) whether all those JOIN's inside the subquery are really needed .... you might need to tweak that to your requirements.

Update a table using JOIN in SQL Server?

I want to update a column in a table making a join on other table e.g.:
UPDATE table1 a
INNER JOIN table2 b ON a.commonfield = b.[common field]
SET a.CalculatedColumn= b.[Calculated Column]
WHERE
b.[common field]= a.commonfield
AND a.BatchNO = '110'
But it is complaining :
Msg 170, Level 15, State 1, Line 2
Line 2: Incorrect syntax near 'a'.
What is wrong here?
You don't quite have SQL Server's proprietary UPDATE FROM syntax down. Also not sure why you needed to join on the CommonField and also filter on it afterward. Try this:
UPDATE t1
SET t1.CalculatedColumn = t2.[Calculated Column]
FROM dbo.Table1 AS t1
INNER JOIN dbo.Table2 AS t2
ON t1.CommonField = t2.[Common Field]
WHERE t1.BatchNo = '110';
If you're doing something silly - like constantly trying to set the value of one column to the aggregate of another column (which violates the principle of avoiding storing redundant data), you can use a CTE (common table expression) - see here and here for more details:
;WITH t2 AS
(
SELECT [key], CalculatedColumn = SUM(some_column)
FROM dbo.table2
GROUP BY [key]
)
UPDATE t1
SET t1.CalculatedColumn = t2.CalculatedColumn
FROM dbo.table1 AS t1
INNER JOIN t2
ON t1.[key] = t2.[key];
The reason this is silly, is that you're going to have to re-run this entire update every single time any row in table2 changes. A SUM is something you can always calculate at runtime and, in doing so, never have to worry that the result is stale.
Try it like this:
UPDATE a
SET a.CalculatedColumn= b.[Calculated Column]
FROM table1 a INNER JOIN table2 b ON a.commonfield = b.[common field]
WHERE a.BatchNO = '110'
Answer given above by Aaron is perfect:
UPDATE a
SET a.CalculatedColumn = b.[Calculated Column]
FROM Table1 AS a
INNER JOIN Table2 AS b
ON a.CommonField = b.[Common Field]
WHERE a.BatchNo = '110';
Just want to add why this problem occurs in SQL Server when we try to use alias of a table while updating that table, below mention syntax will always give error:
update tableName t
set t.name = 'books new'
where t.id = 1
case can be any if you are updating a single table or updating while using join.
Although above query will work fine in PL/SQL but not in SQL Server.
Correct way to update a table while using table alias in SQL Server is:
update t
set t.name = 'books new'
from tableName t
where t.id = 1
Hope it will help everybody why error came here.
MERGE table1 T
USING table2 S
ON T.CommonField = S."Common Field"
AND T.BatchNo = '110'
WHEN MATCHED THEN
UPDATE
SET CalculatedColumn = S."Calculated Column";
UPDATE mytable
SET myfield = CASE other_field
WHEN 1 THEN 'value'
WHEN 2 THEN 'value'
WHEN 3 THEN 'value'
END
From mytable
Join otherTable on otherTable.id = mytable.id
Where othertable.somecolumn = '1234'
More alternatives here.
Seems like SQL Server 2012 can handle the old update syntax of Teradata too:
UPDATE a
SET a.CalculatedColumn= b.[Calculated Column]
FROM table1 a, table2 b
WHERE
b.[common field]= a.commonfield
AND a.BatchNO = '110'
If I remember correctly, 2008R2 was giving error when I tried similar query.
I find it useful to turn an UPDATE into a SELECT to get the rows I want to update as a test before updating. If I can select the exact rows I want, I can update just those rows I want to update.
DECLARE #expense_report_id AS INT
SET #expense_report_id = 1027
--UPDATE expense_report_detail_distribution
--SET service_bill_id = 9
SELECT *
FROM expense_report_detail_distribution erdd
INNER JOIN expense_report_detail erd
INNER JOIN expense_report er
ON er.expense_report_id = erd.expense_report_id
ON erdd.expense_report_detail_id = erd.expense_report_detail_id
WHERE er.expense_report_id = #expense_report_id
Another approach would be to use MERGE
;WITH cteTable1(CalculatedColumn, CommonField)
AS
(
select CalculatedColumn, CommonField from Table1 Where BatchNo = '110'
)
MERGE cteTable1 AS target
USING (select "Calculated Column", "Common Field" FROM dbo.Table2) AS source ("Calculated Column", "Common Field")
ON (target.CommonField = source."Common Field")
WHEN MATCHED THEN
UPDATE SET target.CalculatedColumn = source."Calculated Column";
-Merge is part of the SQL Standard
-Also I'm pretty sure inner join updates are non deterministic..
Similar question here where the answer talks about that
http://ask.sqlservercentral.com/questions/19089/updating-two-tables-using-single-query.html
I think, this is what you are looking for.
UPDATE
Table1
SET
Table1.columeName =T1.columeName * T2.columeName
FROM
Table1 T1
INNER JOIN Table2 T2
ON T1.columeName = T2.columeName;
I had the same issue.. and you don't need to add a physical column.. cuz now you will have to maintain it..
what you can do is add a generic column in the select query:
EX:
select tb1.col1, tb1.col2, tb1.col3 ,
(
select 'Match' from table2 as tbl2
where tbl1.col1 = tbl2.col1 and tab1.col2 = tbl2.col2
)
from myTable as tbl1
Aaron's approach above worked perfectly for me. My update statement was slightly different because I needed to join based on two fields concatenated in one table to match a field in another table.
--update clients table cell field from custom table containing mobile numbers
update clients
set cell = m.Phone
from clients as c
inner join [dbo].[COSStaffMobileNumbers] as m
on c.Last_Name + c.First_Name = m.Name
Those who are using MYSQL
UPDATE table1 INNER JOIN table2 ON table2.id = table1.id SET table1.status = 0 WHERE table1.column = 20
Try:
UPDATE table1
SET CalculatedColumn = ( SELECT [Calculated Column]
FROM table2
WHERE table1.commonfield = [common field])
WHERE BatchNO = '110'

Resources