Update records SQL? - sql-server

First when I started this project seemed very simple. Two tables, field tbl1_USERMASTERID in Table 1 should be update from field tbl2_USERMASTERID Table 2. After I looked deeply in Table 2, there is no unique ID that I can use as a key to join these two tables. Only way to match the records from Table 1 and Table 2 is based on FIRST_NAME, LAST_NAME AND DOB. So I have to find records in Table 1 where:
tbl1_FIRST_NAME equals tbl2_FIRST_NAME
AND
tbl1_LAST_NAME equals tbl2_LAST_NAME
AND
tbl1_DOB equals tbl2_DOB
and then update USERMASTERID field. I was afraid that this can cause some duplicates and some users will end up with USERMASTERID that does not belong to them. So if I find more than one record based on first,last name and dob those records would not be updated. I would like just to skip and leave them blank. That way I wouldn't populate invalid USERMASTERID. I'm not sure what is the best way to approach this problem, should I use SQL or ColdFusion (my server side language)? Also how to detect more than one matching record?
Here is what I have so far:
UPDATE Table1 AS tbl1
LEFT OUTER JOIN Table2 AS tbl2
ON tbl1.dob = tbl2.dob
AND tbl1.fname = tbl2.fname
AND tbl1.lname = tbl2.lname
SET tbl1.usermasterid = tbl2.usermasterid
WHERE LTRIM(RTRIM(tbl1.usermasterid)) = ''
Here is query where I tried to detect duplicates:
SELECT DISTINCT
tbl1.FName,
tbl1.LName,
tbl1.dob,
COUNT(*) AS count
FROM Table1 AS tbl1
LEFT OUTER JOIN Table2 AS tbl2
ON tbl1.dob = tbl2.dob
AND tbl1.FName = tbl2.first
AND tbl1.LName = tbl2.last
WHERE LTRIM(RTRIM(tbl1.usermasterid)) = ''
AND LTRIM(RTRIM(tbl1.first)) <> ''
AND LTRIM(RTRIM(tbl1.last)) <> ''
AND LTRIM(RTRIM(tbl1.dob)) <> ''
GROUP BY tbl1.FName,tbl1.LName,tbl1.dob
Some data after I tested query above:
First Last DOB Count
John Cook 2008-07-11 2
Kate Witt 2013-06-05 1
Deb Ruis 2016-01-22 1
Mike Bennet 2007-01-15 1
Kristy Cruz 1997-10-20 1
Colin Jones 2011-10-13 1
Kevin Smith 2010-02-24 1
Corey Bruce 2008-04-11 1
Shawn Maiers 2016-08-28 1
Alenn Fitchner 1998-05-17 1
If anyone have idea how I can prevent/skip updating duplicate records or how to improve this query please let me know. Thank you.

You could check for and avoid duplicate matches using with common_table_expression (Transact-SQL)
along with row_number()., like so:
with cte as (
select
t.fname
, t.lname
, t.dob
, t.usermasterid
, NewUserMasterId = t2.usermasterid
, rn = row_number() over (partition by t.fname, t.lname, t.dob order by t2.usermasterid)
from table1 as t
inner join table2 as t2 on t.dob = t2.dob
and t.fname = t2.fname
and t.lname = t2.lname
and ltrim(rtrim(t.usermasterid)) = ''
)
--/* confirm these are the rows you want updated
select *
from cte as t
where t.NewUserMasterId != ''
and not exists (
select 1
from cte as i
where t.dob = i.dob
and t.fname = i.fname
and t.lname = i.lname
and i.rn>1
);
--*/
/* update those where only 1 usermasterid matches this record
update t
set t.usermasterid = t.NewUserMasterId
from cte as t
where t.NewUserMasterId != ''
and not exists (
select 1
from cte as i
where t.dob = i.dob
and t.fname = i.fname
and t.lname = i.lname
and i.rn>1
);
--*/
I use the cte to extract out the sub query for readability. Per the documentation, a common table expression (cte):
Specifies a temporary named result set, known as a common table expression (CTE). This is derived from a simple query and defined within the execution scope of a single SELECT, INSERT, UPDATE, or DELETE statement.
Using row_number() to assign a number for each row, starting at 1 for each partition of t.fname, t.lname, t.dob. Having those numbered allows us to check for the existence of duplicates with the not exists() clause with ... and i.rn>1

You could use a CTE to filter out the duplicates from Table1 before joining:
; with CTE as (select *
, count(ID) over (partition by LastName, FirstName, DoB) as IDs
from Table1)
update a
set a.ID = b.ID
from Table2 a
left join CTE b
on a.FirstName = b.FirstName
and a.LastName = b.LastName
and a.Dob = b.Dob
and b.IDs = 1
This will work provided there are no exact duplicates (same demographics and same ID) in table 1. If there are exact duplicates, they will also be excluded from the join, but you can filter them out before the CTE to avoid this.

Please try below SQL:
UPDATE Table1 AS tbl1
INNER JOIN Table2 AS tbl2
ON tbl1.dob = tbl2.dob
AND tbl1.fname = tbl2.fname
AND tbl1.lname = tbl2.lname
LEFT JOIN Table2 AS tbl3
ON tbl3.dob = tbl2.dob
AND tbl3.fname = tbl2.fname
AND tbl3.lname = tbl2.lname
AND tbl3.usermasterid <> tbl2.usermasterid
SET tbl1.usermasterid = tbl2.usermasterid
WHERE LTRIM(RTRIM(tbl1.usermasterid)) = ''
AND tbl3.usermasterid is null

Related

I want to update one table data based on the join condition with another table in SQL server

I tried to update a column in one table from another table using join condition. I tried all possible ways and its showing SQL command not properly ended.I am trying t his in sql server.
update a
set a.col4=b.col4
from dummy_jd a
join --select * from dummy_jd a,
(select col1,col2,col3,col4,col5,sum(reported_amount) reported_amount
from dummy_jd_2
where col5=3843
and col3='abc'
and col1=4
and col2=3002
group by col1,col2,col3,col4,col5
) b on (a.col2=b.col5 and a.col4=b.col3)
where a.col1=9;
This is which i tried
Try below -
update a set a.col4=b.col4 from dummy_jd a
join
(select col1,col2,col3,col4,col5,sum(reported_amount) reported_amount
from dummy_jd_2 where col5=3843 and col3='abc' and col1=4 and col2=3002
group by col1,col2,col3,col4,col5
)b on a.col2=b.col5 and a.col4=b.col3
where a.col1=9
Thanks to everyone i achieved it by using for loopand joins below is the query i used
for i in (select * from dummy_jd1)
loop
update dummy_jd a set a.col4 = (select distinct col4 from dummy_jd_2 b
where /*(jnl_ref_no, bill_ref_no) in (select
jnl_ref_no,bill_ref_no from dummy_jd1) and*/
a.col2 = b.col5
and a.col4 = b.col3
and b.col5 = i.col2
and b.col3 = i.col3
and id_type = 4
--and id_value in(3002,3019)
and reported_amount <> 0
group by col1,col2,col3,col4,col5)
where a.id_type = 9
and a.col2=i.col2
and a.col4=i.col3;

How to optimize this SQL Server query

I have following query, which is working as expected but taking approx 3 seconds to execute. Reason is large number of records. Can somebody please suggest any steps in order to improve performance?
Explanation :
Check to see value using Comp id and Default_Comp = 1
If not found, ignore the Default_Comp and check only based on Comp id
Still not found, ignore the join with table 2 and try to get by Comp id.
My code:
DECLARE #Finished_Comp VARCHAR(MAX) = NULL;
SELECT #Finished_Comp = MIN(tbl2.Finished_Comp)
FROM Table1 tbl1
INNER JOIN Table2 tbl2 ON tbl1.Sav_ID = tbl2.Sav_ID
WHERE Comp_ID = #Comp_ID AND tbl1.Default_Comp = 1
IF #Finished_Comp IS NULL
BEGIN
SELECT #Finished_Comp = MIN(tbl2.Finished_Comp)
FROM Table1 tbl1
INNER JOIN Table2 tbl2 ON tbl1.Sav_ID = tbl2.Sav_ID
WHERE Comp_ID = #Comp_ID
END
IF #Finished_Comp IS NULL
BEGIN
SELECT #Finished_Comp = MIN(Finished_Comp)
FROM Table1 tbl1
WHERE Comp_ID = #Comp_ID AND #Finished_Comp != ''
END
I tried to use COALESCE, but it's returning wrong results for Finished_Comp
You say in the comments
I strongly believe the query can be changed to some extent so that no
multiple queries need to be executed.
Yes you're right.
SELECT #Finished_Comp = COALESCE(MIN(CASE WHEN tbl1.Default_Comp = 1 THEN tbl2.Finished_Comp END),
MIN(tbl2.Finished_Comp),
MIN(CASE WHEN tbl1.Finished_Comp <> '' THEN tbl1.Finished_Comp END))
FROM Table1 tbl1
LEFT JOIN Table2 tbl2
ON tbl1.Sav_ID = tbl2.Sav_ID
WHERE tbl1.Comp_ID = #Comp_IDV
But at best this will only reduce execution time to a third of current (for the case that all three queries need to be executed).
You should consider adding indexes on
Table1 - Comp_ID, Sav_ID INCLUDE (Default_Comp, Finished_Comp)
Table2 - Sav_ID INCLUDE (Finished_Comp)
For potentially much larger improvements.

Updating DISTINCTROW in SQL Server [duplicate]

What would the syntax be to convert this MS Access query to run in SQL Server as it doesn't have a DistinctRow keyword
UPDATE DISTINCTROW [MyTable]
INNER JOIN [AnotherTable] ON ([MyTable].J5BINB = [AnotherTable].GKBINB)
AND ([MyTable].J5BHNB = [AnotherTable].GKBHNB)
AND ([MyTable].J5BDCD = [AnotherTable].GKBDCD)
SET [AnotherTable].TessereCorso = [MyTable].[J5F7NR];
DISTINCTROW [MyTable] removes duplicate MyTable entries from the results. Example:
select distinctrow items
items.item_number, items.name
from items
join orders on orders.item_id = items.id;
In spite of the join getting you the same item_number and name multiple times when there is more than one order for it, DISTINCTROW reduces this to one row per item. So the whole join is merely for assuring that you only select items for which exist at least one order. You don't find DISTINCTROW in any other DBMS as far as I know. Probably because it is not needed. When checking for existence, we use EXISTS of course (or IN for that matter).
You are joining MyTable and AnotherTable and expect for some reason to get the same MyTable record multifold for one AnotherTable record, so you use DISTINCTROW to only get it once. Your query would (hopefully) fail if you got two different MyTable records for one AnotherTable record.
What the update does is:
update anothertable
set tesserecorso = (select top 1 j5f7nr from mytable where mytable.j5binb = anothertable.gkbinb and ...)
where exists (select * from mytable where mytable.j5binb = anothertable.gkbinb and ...)
But this uses about the same subquery twice. So we'd want to update from a query instead.
The easiest way to get one result record per <some columns> in a standard SQL query is to aggregate data:
select *
from anothertable a
join
(
select j5binb, j5bhnb, j5bdcd, max(j5f7nr) as j5f7nr
from mytable
group by j5binb, j5bhnb, j5bdcd
) m on m.j5binb = a.gkbinb and m.j5bhnb = a.gkbhnb and m.j5bdcd = a.gkbdcd;
How to write an updateble query is different from one DBMS to another. Here is the final update statement for SQL-Server:
update a
set a.tesserecorso = m.j5f7nr
from anothertable a
join
(
select j5binb, j5bhnb, j5bdcd, max(j5f7nr) as j5f7nr
from mytable
group by j5binb, j5bhnb, j5bdcd
) m on m.j5binb = a.gkbinb and m.j5bhnb = a.gkbhnb and m.j5bdcd = a.gkbdcd;
The DISTINCTROW predicate in MS Access SQL removes duplicates across all fields of a table in join statements and not just the selected fields of query (which DISTINCT in practically all SQL dialects do). So consider selecting all fields in a derived table with DISTINCT predicate:
UPDATE [AnotherTable]
SET [AnotherTable].TessereCorso = main.[J5F7NR]
FROM
(SELECT DISTINCT m.* FROM [MyTable] m) As main
INNER JOIN [AnotherTable]
ON (main.J5BINB = [AnotherTable].GKBINB)
AND (main.J5BHNB = [AnotherTable].GKBHNB)
AND (main.J5BDCD = [AnotherTable].GKBDCD)
Another variant of the query.. (Too lazy to get the original tables).
But like the query above updates 35 rows =, so does this one
UPDATE [Albi-Anagrafe-Associati]
SET
[Albi-Anagrafe-Associati].CRegDitte = [055- Registri ditte].[CRegDitte],
[Albi-Anagrafe-Associati].NIscrTribunale = [055- Registri ditte].[NIscrTribunale],
[Albi-Anagrafe-Associati].NRegImprese = [055- Registri ditte].[NRegImprese]
FROM [055- Registri ditte]
WHERE EXISTS(
SELECT *
FROM [055- Registri ditte]-- [Albi-Anagrafe-Associati]
WHERE ([055- Registri ditte].GIBINB = [Albi-Anagrafe-Associati].GKBINB)
AND ([055- Registri ditte].GIBHNB = [Albi-Anagrafe-Associati].GKBHNB)
AND ([055- Registri ditte].GIBDCD = [Albi-Anagrafe-Associati].GKBDCD))
Update [AnotherTable]
Set [AnotherTable].TessereCorso = MyTable.[J5F7NR]
From [AnotherTable]
Inner Join
(
Select Distinct [J5BINB],[5BHNB],[J5BDCD]
,(Select Top 1 [J5F7NR] From MyTable) as [J5F7NR]
,[J5BHNB]
From MyTable
)as MyTable
On (MyTable.J5BINB = [AnotherTable].GKBINB)
AND (MyTable.J5BHNB = [AnotherTable].GKBHNB)
AND (MyTable.J5BDCD = [AnotherTable].GKBDCD)

Finding difference between 2 tables in MS Access or SQL Server

I have 2 Excel files which I imported into MS Access as two tables. These two tables are identical but imported on different dates.
Now, how can I find out what rows and what fields are updated on the later date? Any help would be highly appreciated.
Finding Inserted records is easy
select * from B where not exists (select 1 from A where A.pk=B.pk)
Finding Deleted records is just as easy
select * from A where not exists (select 1 from B where A.pk=B.pk)
Finding Updated records is a pain. The following rigorous query assumes you have nullable columns and it should work in all situations.
select B.*
from B
inner join A on B.pk=A.pk
where A.col1<>B.col1 or (IsNull(A.col1) and not IsNull(B.col1)) or (not IsNull(A.col1) and IsNull(B.col1))
or A.col2<>B.col2 or (IsNull(A.col2) and not IsNull(B.col2)) or (not IsNull(A.col2) and IsNull(B.col2))
or A.col3<>B.col3 or (IsNull(A.col3) and not IsNull(B.col3)) or (not IsNull(A.col3) and IsNull(B.col3))
etc...
If the columns are defined as NOT NULL then the query is much simper, just remove all the NULL tests.
If the columns are nullable but you can identify a value that will never appear in the data, then use a simple comparison like:
Nz(A.col1,neverAppearingValue)<>Nz(B.col1,neverAppearingValue)
I believe this should be as simple as running a query like this:
SELECT *
FROM Table1
JOIN Table2
ON Table1.ID = Table2.ID AND Table1.Date != Table2.Date
One way to do this is by unpivoting both tables, so you get a new table with , , . Note, though, that you have to take types into account.
For example, the following gets differences in fields:
with oldt as (select id, col, val
from <old table> t
unpivot (val for col in (<column list>)) unpvt
),
newt as (select id, col, val
from <new table> t
unpivot (val for col in (<column list>)) unpvit
)
select *
from oldt full outer join newt on oldt.id = newt.id
where oldt.id is null or newt.id is null
The alternative way with a join is rather cumbersome. This version shows whether columns are added, deleted, and which columns changed if any:
select *
from (select coalesce(oldt.id, newt.id) as id,
(case when oldt.id is null and newt.id is not null then 'ADDED'
when oldt.id is not null and newt.id is null then 'DELETED'
else 'SAME'
end) as stat,
(case when oldt.col1 <> newt.col1 or oldt.col1 is null and newt.col1 is null
then 1 else 0 end) as diff_col1,
(case when oldt.col2 <> newt.col2 or oldt.col2 is null and newt.col2 is null
then 1 else 0 end) as diff_col2,
...
from <old table> oldt full outer join <new table> newt on oldt.id = newt.id
) c
where status in ('ADDED', 'DELETED') or
(diff_col1 + diff_col2 + ... ) > 0
It does have the advantage of working for any data types.
(Select * from OldTable Except Select *from NewTable)
Union All
(Select * from NewTable Except Select *from OldTable)

Update a table using JOIN in SQL Server?

I want to update a column in a table making a join on other table e.g.:
UPDATE table1 a
INNER JOIN table2 b ON a.commonfield = b.[common field]
SET a.CalculatedColumn= b.[Calculated Column]
WHERE
b.[common field]= a.commonfield
AND a.BatchNO = '110'
But it is complaining :
Msg 170, Level 15, State 1, Line 2
Line 2: Incorrect syntax near 'a'.
What is wrong here?
You don't quite have SQL Server's proprietary UPDATE FROM syntax down. Also not sure why you needed to join on the CommonField and also filter on it afterward. Try this:
UPDATE t1
SET t1.CalculatedColumn = t2.[Calculated Column]
FROM dbo.Table1 AS t1
INNER JOIN dbo.Table2 AS t2
ON t1.CommonField = t2.[Common Field]
WHERE t1.BatchNo = '110';
If you're doing something silly - like constantly trying to set the value of one column to the aggregate of another column (which violates the principle of avoiding storing redundant data), you can use a CTE (common table expression) - see here and here for more details:
;WITH t2 AS
(
SELECT [key], CalculatedColumn = SUM(some_column)
FROM dbo.table2
GROUP BY [key]
)
UPDATE t1
SET t1.CalculatedColumn = t2.CalculatedColumn
FROM dbo.table1 AS t1
INNER JOIN t2
ON t1.[key] = t2.[key];
The reason this is silly, is that you're going to have to re-run this entire update every single time any row in table2 changes. A SUM is something you can always calculate at runtime and, in doing so, never have to worry that the result is stale.
Try it like this:
UPDATE a
SET a.CalculatedColumn= b.[Calculated Column]
FROM table1 a INNER JOIN table2 b ON a.commonfield = b.[common field]
WHERE a.BatchNO = '110'
Answer given above by Aaron is perfect:
UPDATE a
SET a.CalculatedColumn = b.[Calculated Column]
FROM Table1 AS a
INNER JOIN Table2 AS b
ON a.CommonField = b.[Common Field]
WHERE a.BatchNo = '110';
Just want to add why this problem occurs in SQL Server when we try to use alias of a table while updating that table, below mention syntax will always give error:
update tableName t
set t.name = 'books new'
where t.id = 1
case can be any if you are updating a single table or updating while using join.
Although above query will work fine in PL/SQL but not in SQL Server.
Correct way to update a table while using table alias in SQL Server is:
update t
set t.name = 'books new'
from tableName t
where t.id = 1
Hope it will help everybody why error came here.
MERGE table1 T
USING table2 S
ON T.CommonField = S."Common Field"
AND T.BatchNo = '110'
WHEN MATCHED THEN
UPDATE
SET CalculatedColumn = S."Calculated Column";
UPDATE mytable
SET myfield = CASE other_field
WHEN 1 THEN 'value'
WHEN 2 THEN 'value'
WHEN 3 THEN 'value'
END
From mytable
Join otherTable on otherTable.id = mytable.id
Where othertable.somecolumn = '1234'
More alternatives here.
Seems like SQL Server 2012 can handle the old update syntax of Teradata too:
UPDATE a
SET a.CalculatedColumn= b.[Calculated Column]
FROM table1 a, table2 b
WHERE
b.[common field]= a.commonfield
AND a.BatchNO = '110'
If I remember correctly, 2008R2 was giving error when I tried similar query.
I find it useful to turn an UPDATE into a SELECT to get the rows I want to update as a test before updating. If I can select the exact rows I want, I can update just those rows I want to update.
DECLARE #expense_report_id AS INT
SET #expense_report_id = 1027
--UPDATE expense_report_detail_distribution
--SET service_bill_id = 9
SELECT *
FROM expense_report_detail_distribution erdd
INNER JOIN expense_report_detail erd
INNER JOIN expense_report er
ON er.expense_report_id = erd.expense_report_id
ON erdd.expense_report_detail_id = erd.expense_report_detail_id
WHERE er.expense_report_id = #expense_report_id
Another approach would be to use MERGE
;WITH cteTable1(CalculatedColumn, CommonField)
AS
(
select CalculatedColumn, CommonField from Table1 Where BatchNo = '110'
)
MERGE cteTable1 AS target
USING (select "Calculated Column", "Common Field" FROM dbo.Table2) AS source ("Calculated Column", "Common Field")
ON (target.CommonField = source."Common Field")
WHEN MATCHED THEN
UPDATE SET target.CalculatedColumn = source."Calculated Column";
-Merge is part of the SQL Standard
-Also I'm pretty sure inner join updates are non deterministic..
Similar question here where the answer talks about that
http://ask.sqlservercentral.com/questions/19089/updating-two-tables-using-single-query.html
I think, this is what you are looking for.
UPDATE
Table1
SET
Table1.columeName =T1.columeName * T2.columeName
FROM
Table1 T1
INNER JOIN Table2 T2
ON T1.columeName = T2.columeName;
I had the same issue.. and you don't need to add a physical column.. cuz now you will have to maintain it..
what you can do is add a generic column in the select query:
EX:
select tb1.col1, tb1.col2, tb1.col3 ,
(
select 'Match' from table2 as tbl2
where tbl1.col1 = tbl2.col1 and tab1.col2 = tbl2.col2
)
from myTable as tbl1
Aaron's approach above worked perfectly for me. My update statement was slightly different because I needed to join based on two fields concatenated in one table to match a field in another table.
--update clients table cell field from custom table containing mobile numbers
update clients
set cell = m.Phone
from clients as c
inner join [dbo].[COSStaffMobileNumbers] as m
on c.Last_Name + c.First_Name = m.Name
Those who are using MYSQL
UPDATE table1 INNER JOIN table2 ON table2.id = table1.id SET table1.status = 0 WHERE table1.column = 20
Try:
UPDATE table1
SET CalculatedColumn = ( SELECT [Calculated Column]
FROM table2
WHERE table1.commonfield = [common field])
WHERE BatchNO = '110'

Resources