Verifying Syntax of a Massive SQL Update Command - sql-server

I'm new to SQL Server and am doing some cleanup of our transaction database. However, to accomplish the last step, I need to update a column in one table of one database with the value from another column in another table from another database.
I found a SQL update code snippet and re-wrote it for our own needs but would love someone to give it a once over before I hit the execute button since the update will literally affect hundreds of thousands of entries.
So here are the two databases:
Database 1: Movement
Table 1: ItemMovement
Column 1: LongDescription (datatype: text / up to 40 char)
Database 2: Item
Table 2: ItemRecord
Column 2: Description (datatype: text / up to 20 char)
Goal: set Column 1 from db1 to the value of Column 2 from db2.
Here is the code snippet:
update table1
set table1.longdescription = table2.description
from movement..itemmovement as table1
inner join item..itemrecord as table2 on table1.itemcode = table2.itemcode
where table1.longdescription <> table2.description
I added the last "where" line to prevent SQL from updating the column where it already matches the source table.
This should execute faster and just update the columns that have garbage. But as it stands, does this look like it will run? And lastly, is it a straightforward process, using SQL Server 2005 Express to just backup the entire Movement db before I execute? And if it messes up, just restore it?
Alternatively, is it even necessary to re-cast the tables as table1 and table 2? Is it valid to execute a SQL query like this:
update movement..itemmovement
set itemmovement.longdescription = itemrecord.description
from movement..itemmovement
inner join item..itemrecord on itemmovement.itemcode = itemrecord.itemcode
where itemmovement.longdescription <> itemrecord.description
Many thanks in advance!

You don't necessarily need to alias your tables, but I recommend you do: it makes for faster typing and reduces the chance of a typo.
update m
set m.longdescription = i.description
from movement..itemmovement as m
inner join item..itemrecord as i on m.itemcode = i.itemcode
where m.longdescription <> i.description
In the above query I have shortened the alias using m for itemmovement and i for itemrecord.
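Before running an update of this size, it can help to count the rows it would touch and to run it inside a transaction, so the result can be checked before it becomes permanent. The following is only a sketch under that assumption, reusing the tables and aliases from the query above; whether to commit or roll back is left to whoever runs it.

```sql
-- Preview: how many rows would the update change?
select count(*) as rows_to_update
from movement..itemmovement as m
inner join item..itemrecord as i on m.itemcode = i.itemcode
where m.longdescription <> i.description;

-- Run the update inside an explicit transaction.
begin transaction;

update m
set m.longdescription = i.description
from movement..itemmovement as m
inner join item..itemrecord as i on m.itemcode = i.itemcode
where m.longdescription <> i.description;

-- @@rowcount holds the number of rows the update just changed;
-- compare it with the preview count.
select @@rowcount as rows_updated;

-- If the numbers look right:  commit transaction;
-- Otherwise:                  rollback transaction;
```

Note that the `<>` comparison skips rows where longdescription is NULL, and that the open transaction holds locks on the updated rows until it is committed or rolled back.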
When a large number of records are to be updated and there's a question of whether it will succeed, always make a copy in a test database (residing on a test server) and try it out there. In this case, one of the safest bets would be to first create a new field called longdescription_test. You can create it with SQL Server Management Studio Express (SSMS) or with the command below:
use movement;
alter table itemmovement add longdescription_test varchar(100);
The syntax here says alter table itemmovement and add a new column called longdescription_test with datatype of varchar(100). If you create a new column using SSMS, in the background, SSMS will run the same alter table statement to create a new column.
You can then execute
update m
set m.longdescription_test = i.description
from movement..itemmovement as m
inner join item..itemrecord as i on m.itemcode = i.itemcode
where m.longdescription <> i.description
Check data in longdescription_test randomly. You can actually do a spot check faster by running:
select * from movement..itemmovement
where longdescription <> longdescription_test
and longdescription_test is not null
If information in longdescription_test looks good, you can change your update statement to set m.longdescription = i.description and run the query again.
It is easier to just create a copy of your itemmovement table before you do the update. To make a copy, you can just do:
use movement;
select * into itemmovement_backup from itemmovement;
If the update does not succeed as desired, you can truncate itemmovement and copy the data back from itemmovement_backup.
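Copying the data back could look roughly like the following. This is a sketch, not a tested restore procedure: it assumes itemmovement has no identity column and is not referenced by foreign keys, and both caveats are noted in the comments.

```sql
use movement;

begin transaction;

-- Remove the rows changed by the failed update.
-- truncate table fails if foreign keys reference itemmovement;
-- in that case "delete from itemmovement;" is the fallback.
truncate table itemmovement;

-- Copy the saved rows back. If the table has an identity column,
-- this needs an explicit column list and
-- "set identity_insert itemmovement on" first.
insert into itemmovement
select * from itemmovement_backup;

commit transaction;
```

Restoring into the original table, rather than renaming the backup, matters because select ... into copies only the data: indexes and constraints on itemmovement are not carried over to itemmovement_backup.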

Zedfoxus provided a GREAT explanation on this and I appreciate it. It is an excellent reference for next time. After reading over some syntax examples, I was confident enough to run the second SQL update query from my OP. Luckily, the data here is not really "live", so there was little risk of damaging anything, even during operating hours. Given the nature of the data, the update executed perfectly, updating all 345,000 entries!

Related

SQL Server Linked Server Update - terrible performance

In my SQL Server 2012 database, I have a linked server reference to a second SQL Server database that I need to pull records from and update accordingly.
I have the following update statement that I am trying to run:
UPDATE
Linked_Tbl
SET
Transferred = 1
FROM
MyLinkedServer.dbo.MyTable Linked_Tbl
JOIN
MyTable Local_Tbl ON Local_Tbl.LinkedId = Linked_Tbl.Id
JOIN
MyOtherTable Local_Tbl2 ON Local_Tbl.LocalId = Local_Tbl2.LocalId
Which I had to stop after an hour of running as it was still executing.
I've read online and found solutions stating that the best solution is to create a stored procedure on the Linked Server itself to execute the update statement rather than run it over the wire.
The problems I have are:
I don't have the ability to create any procedures on the other server.
Even if I could create that procedure, I would need to pass through all the Ids to the stored procedure for the update and I'm not sure how to do that efficiently with thousands of Ids (this, obviously, is the smaller of the issues, though since I can't create that procedure in the first place).
I'm hoping there are other solutions people may have managed to come up with given that it's often the case you don't have permissions to make changes to a different server.
Any ideas??
I am not sure whether it will perform better, but you can try:
UPDATE
Linked_Tbl
SET
Transferred = 1
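-- OPENQUERY takes a linked server name and a pass-through query (OPENDATASOURCE expects a provider name and connection string instead)
FROM OPENQUERY([MyLinkedServer], 'select Id, LocalId, Transferred from remotedb.dbo.MyTable') AS Linked_Tbl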
JOIN MyTable Local_Tbl
ON Local_Tbl.LinkedId = Linked_Tbl.Id
JOIN MyOtherTable Local_Tbl2
ON Local_Tbl.LocalId = Local_Tbl2.LocalId

How to create a "Ghost Table" in SQL Server based off of other tables?

I need to create a "ghost" table in SQL Server, which doesn't actually exist but is a result set of a SQL Query. Pseudo code is below:
SELECT genTbl_col1, genTbl_col2
FROM genTbl;
However, "genTbl" is actually:
SELECT table1.col AS genTbl_col1,
table2.col AS genTbl_col2
FROM table1 INNER JOIN table2 ON (...)
In other words, I need that every time a query is run on the server trying to select from "genTbl", it simply creates a result set from the query and treats it like a real table.
The situation is that I have a software that runs queries on a database. I need to modify it, but I cannot change the software itself, so I need to trick it into thinking it can actually query "genTbl", when it actually doesn't exist but is simply a query of other tables.
To clarify, the query would have to be a sort of procedure, available by default in the database (i.e. every time there is a query for "genTbl").
Use #TMP
SELECT genTbl_col1, genTbl_col2
INTO #TMP FROM genTbl;
It exists only in the current session. You can also use ##TMP, a global temporary table that is visible to all sessions.

SQL Server: update two tables using data from temp table for 200K rows

I have 200K+ rows of data in an xls file and, as per the requirement, I need to update two database tables using that data.
I know how to copy data from xls into a SQL Server table; however, I am struggling with the approach for updating the database tables.
I could not think of any approach other than writing a cursor, and I don't want to go that way, as updating 200K+ rows with a cursor may eat up the transaction log and will take a long time to finish.
Can someone help me with what else could be done to accomplish this?
Use the following techniques.
1 - Import the data into a staging table. The import/export tool is one way to do this. The target table should be in a throwaway or staging database.
http://technet.microsoft.com/en-us/library/ms141209.aspx
2 - Make sure that the data types between the EXCEL data and TABLE data are the same.
3 - Make sure the existing target [TRG_TABLE] table has a primary key. Make sure the Excel data loaded into the [SRC_TABLE] table has the same key. You can add a non-clustered index to speed up the JOIN in the UPDATE statement.
4 - Add a [FLAG] column as INT NULL to [TRG_TABLE] with an ALTER TABLE command.
5 - Make sure a full backup is done before and after the large UPDATE. You can also use a DATABASE SNAPSHOT. The key point is to have a roll back plan in place if needed.
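Steps 3 to 5 might look something like the sketch below. It uses the [TRG_DB], [TRG_TABLE], [SRC_TABLE], ID and [FLAG] names from the batch script that follows and assumes [SRC_TABLE] is reachable from [TRG_DB]; the index name and the backup file path are placeholders made up for illustration.

```sql
USE [TRG_DB];
GO
-- Step 3: index the join key on the staging table
-- (IX_SRC_TABLE_ID is a hypothetical index name).
CREATE NONCLUSTERED INDEX IX_SRC_TABLE_ID ON [SRC_TABLE] (ID);
GO
-- Step 4: add the nullable flag column used to mark processed rows.
ALTER TABLE [TRG_TABLE] ADD [FLAG] INT NULL;
GO
-- Step 5: take a full backup before the update
-- (the file path is a placeholder).
BACKUP DATABASE [TRG_DB]
TO DISK = N'C:\Backups\TRG_DB_before_update.bak';
GO
```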
-- Select correct db
USE [TRG_DB]
GO
-- Set to simple mode
ALTER DATABASE [TRG_DB] SET RECOVERY SIMPLE;
GO
-- Update in batches
DECLARE @VAR_ROWS INT = 1;
WHILE (@VAR_ROWS > 0)
BEGIN
-- Update fields and flag on join
UPDATE TOP (10000) T
SET
T.FLD1 = S.FLD1,
-- ... Etc
T.FLAG = 1
FROM [TRG_TABLE] T JOIN [SRC_TABLE] S ON T.ID = S.ID
WHERE T.[FLAG] IS NULL
-- How many rows updated
SET @VAR_ROWS = @@ROWCOUNT;
-- WAL -> flush log entries to data file
CHECKPOINT;
END
-- Set to full mode
ALTER DATABASE [TRG_DB] SET RECOVERY FULL;
GO
In summary, I gave you all the tools to do the job. Just modify them for your particular occurrence.
PS: Here is working code from my blog on large deletes. Same logic applies.
http://craftydba.com/?p=3079
PPS: I did not check the sample code for syntax. That is left up for you.

Stored Procedure for Updating a Column in Sql Server

I have a requirement to update a column with multiple values. The query looks like below.
Update table1 set column1 = (
select value from table2 where table1.column0 = table2.column
)
Is there any generalised stored procedure for a requirement like the above?
Short of creating a statement as a string and using the "execute" statement, I don't know of one. Generally, "execute" is frowned upon as it's a potential injection attack point.
Why would you want to update one table with information that is easily available in another? Seems like you are just guaranteeing that you are going to have to run this query every single time you perform an update, insert or delete against the camsnav table. Otherwise how are you going to keep them in sync?
Also, if you cannot guarantee that the sub-query will return exactly one row, it is probably safer to use the SQL Server-specific and proprietary update format:
UPDATE f SET nav = n.nav
FROM camsfolio AS f
INNER JOIN camsnav AS n
ON f.schcode = n.schcode;
SQL Server doesn't use "generalised stored procedures" for this kind of thing. It's up to you to build your own SP, composed using an appropriate parameterized UPDATE statement.
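As a rough illustration of what such a hand-built procedure could look like, here is a sketch that wraps the join-style UPDATE from the answer above. The procedure name, the @schcode parameter and its INT type are assumptions made for the example; only the camsfolio and camsnav names come from the thread.

```sql
-- Hypothetical procedure name; table and column names follow the answer above.
CREATE PROCEDURE dbo.usp_UpdateFolioNav
    @schcode INT = NULL   -- assumed type; NULL means "update every matching row"
AS
BEGIN
    SET NOCOUNT ON;

    UPDATE f
    SET nav = n.nav
    FROM camsfolio AS f
    INNER JOIN camsnav AS n
        ON f.schcode = n.schcode
    WHERE @schcode IS NULL OR f.schcode = @schcode;
END;
GO

-- Usage: update a single scheme, then every scheme.
EXEC dbo.usp_UpdateFolioNav @schcode = 101;
EXEC dbo.usp_UpdateFolioNav;
```

Because the value is passed as a parameter rather than concatenated into a string, this avoids the injection concern raised about "execute".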

SQL 2005 copy single column between databases

I'm still fairly new to T-SQL and SQL 2005. I need to import a column of integers from a table in database1 to a identical table (only missing the column I need) in database2. Both are sql 2005 databases. I've tried the built in import command in Server Management Studio but it's forcing me to copy the entire table. This causes errors due to constraints and 'read-only' columns (whatever 'read-only' means in sql2005). I just want to grab a single column and copy it to a table.
There must be a simple way of doing this. Something like:
INSERT INTO database1.myTable columnINeed
SELECT columnINeed from database2.myTable
Inserting won't do it, since it'll attempt to insert new rows at the end of the table. What it sounds like you're trying to do is add a column to the existing rows.
I'm not sure if the syntax is exactly right but, if I understood you then this will do what you're after.
Create the column allowing nulls in database2.
Perform an update:
UPDATE database2.dbo.tablename
SET database2.dbo.tablename.colname = database1.dbo.tablename.colname
FROM database2.dbo.tablename INNER JOIN database1.dbo.tablename ON database2.dbo.tablename.keycol = database1.dbo.tablename.keycol
There is a simple way very much like this, as long as both databases are on the same server. The fully qualified name is dbname.owner.table. Normally the owner is dbo, and leaving it out by writing ".." makes SQL Server fall back to your default schema (usually dbo), so...
INSERT INTO Database1..MyTable
(ColumnList)
SELECT FieldsIWant
FROM Database2..MyTable
First create the column if it doesn't exist:
ALTER TABLE database2..targetTable
ADD targetColumn int null -- or whatever column definition is needed
Then, if you can run this on SQL Server 2008 or later (MERGE does not exist in SQL Server 2005), you can use the MERGE statement.
The MERGE statement has the advantage of handling every situation in one statement: rows missing from the destination (inserts), rows missing from the source (deletes), and matching rows (updates), with everything done atomically in a single transaction. Example:
MERGE database2..targetTable AS t
-- the source query must return the key used in the ON clause
USING (SELECT PrimaryKeyCol, sourceColumn FROM sourceDatabase1..sourceTable) AS s
ON t.PrimaryKeyCol = s.PrimaryKeyCol -- or whatever the match should be based on
WHEN MATCHED THEN
UPDATE SET t.targetColumn = s.sourceColumn
WHEN NOT MATCHED THEN
INSERT (targetColumn, [other columns ...]) VALUES (s.sourceColumn, [other values ..]);
The MERGE statement was introduced to solve cases like yours and I recommend using it. It does in one statement what would otherwise take several separate SQL statements, without the added complexity.
You could also use a cursor. Assuming you want to iterate all the records in the first table and populate the second table with new rows then something like this would be the way to go:
DECLARE @FirstField nvarchar(100)
DECLARE ACursor CURSOR FOR
SELECT FirstField FROM FirstTable
OPEN ACursor
FETCH NEXT FROM ACursor INTO @FirstField
WHILE @@FETCH_STATUS = 0
BEGIN
INSERT INTO SecondTable ( SecondField ) VALUES ( @FirstField )
FETCH NEXT FROM ACursor INTO @FirstField
END
CLOSE ACursor
DEALLOCATE ACursor
Note that MERGE is only available in SQL Server 2008 and later, NOT in SQL Server 2005.
insert into Test2.dbo.MyTable (MyValue) select MyValue from Test1.dbo.MyTable
This is assuming a great deal. First that the destination database is empty. Second that the other columns are nullable. You may need an update instead. To do that you will need to have a common key.
