Correct error handling when dropping and adding columns - sql-server

I have a table that is currently using a couple of columns named DateFrom and DateTo. I'm trying to replace them with a single NewDate column, populated for existing rows with the value from DateFrom.
I need good error/transaction handling because, if the change fails, I don't want a half-converted table; I want to revert.
I've tried a number of things but can't get it to work properly. Any help is appreciated as I'm far from experienced with this.
I started with
BEGIN TRAN
ALTER TABLE TableName
ADD NewDate DATETIME
IF @@ERROR = 0 AND @@TRANCOUNT = 1
UPDATE TableName
SET NewDate = DateFrom
....
This fails immediately as NewDate is not currently a column in the table. Fine, so I add a GO in there. This breaks it into two batches and it now runs, but it makes the @@ERROR check pointless. I also can't use a local variable, as those are lost after GO as well. Ideally I'd like to use TRY...CATCH to avoid checking errors after each statement, but I can't use a GO with that as it needs to be one batch.
None of the articles I've found talk about this situation (error handling with GO). So the question is: Is there any way I can get the transaction-with-error-handling approach I'm looking for when adding and updating a column (which seems to necessitate a GO somewhere)?
Or am I going to have to settle for doing it in several batches, without the ability to roll back to my original table if anything goes wrong?

Why are you worried about creating the new column in the transaction? Just create the column and then populate it. You don't even need an explicit tran when populating it. If it fails (which is very unlikely), just do the update again.
I would do the following steps
Add new column
Update new column
Check to see if the data in the new column looks correct
Drop the old columns no longer needed (you may want to check where these columns are being used before dropping them e.g. are they used in any stored procedures, reports, front-end application code)
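A minimal sketch of those steps, using the table and column names from the question (run it against a test copy first):

-- step 1: add the new column as nullable so existing rows are unaffected
ALTER TABLE TableName ADD NewDate DATETIME NULL
GO

-- step 2: populate it from the old column
UPDATE TableName SET NewDate = DateFrom
GO

-- step 3: spot-check the result before dropping anything
SELECT TOP (100) DateFrom, DateTo, NewDate FROM TableName

-- step 4: only after confirming the data and checking dependencies, drop the old columns
ALTER TABLE TableName DROP COLUMN DateFrom, DateTo
GO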
Also, it is worth adding more context to your question. I assume you are testing a script against a test database and will later apply the script to a prod database. Is the prod database very big? Very busy? Mission critical? Backed up on a schedule?


Stored procedure to update different columns

I have an API that I'm trying to read that gives me just the updated field. I'm trying to take that and update my tables using a stored procedure. So far the only way I have been able to figure out how to do this is with dynamic SQL, but I would prefer not to do that if there is a way to avoid it.
If it were just a couple of columns, I'd just write a proc for each, but we are talking about 100 fields, and any of them could be updated together. One ticket might just need a timestamp updated, the next might need a timestamp and who modified it, while the one after might just need a note.
Everything I've read and been taught has told me that dynamic SQL is bad, and while I'll write it if I have to, I'd prefer to have a proc.
You can perhaps do something like this, updating only the rows whose values differ:
UPDATE OLDTABLE
SET OLDRECORDS = NEWTABLE.NEWRECORDS
FROM OLDTABLE
JOIN NEWTABLE
    ON OLDTABLE.PRIMARYKEY = NEWTABLE.PRIMARYKEY
WHERE OLDTABLE.OLDRECORDS <> NEWTABLE.NEWRECORDS
The best way to solve your problem is using MERGE:
Performs insert, update, or delete operations on a target table based on the results of a join with a source table. For example, you can synchronize two tables by inserting, updating, or deleting rows in one table based on differences found in the other table.
As you can see, your update could be more complex, but more efficient as well. Using MERGE takes some practice, but once you start to use it you'll reach for it again and again.
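For illustration, a rough sketch of what a MERGE might look like here (the table and column names are placeholders, not taken from the question):

MERGE dbo.TargetTable AS t
USING dbo.SourceTable AS s
    ON t.Id = s.Id
WHEN MATCHED THEN
    UPDATE SET t.SomeColumn = s.SomeColumn,
               t.ModifiedAt = s.ModifiedAt
WHEN NOT MATCHED BY TARGET THEN
    INSERT (Id, SomeColumn, ModifiedAt)
    VALUES (s.Id, s.SomeColumn, s.ModifiedAt);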
I am not sure how your business logic works that determines what columns are updated at what time. If there are separate business functions that require updating different but consistent columns per function, you will probably want to have individual update statements for each function. This will ensure that each process updates only the columns that it needs to update.
On the other hand, if your API is such that you really don't know ahead of time what needs to be updated, then building a dynamic SQL query is a good idea.
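If you do go the dynamic route, a rough sketch of that idea (the parameters, dirty flags, and table name here are hypothetical) would build only the SET clauses that are actually needed and still parameterize the values via sp_executesql:

DECLARE @sets nvarchar(max) = N'';

IF @TimestampDirty = 1  SET @sets += N', LastUpdated = @Timestamp';
IF @ModifiedByDirty = 1 SET @sets += N', ModifiedBy = @ModifiedBy';

IF @sets <> N''
BEGIN
    -- STUFF strips the leading ', '; values are passed as parameters, not concatenated
    DECLARE @sql nvarchar(max) =
        N'UPDATE dbo.Tickets SET ' + STUFF(@sets, 1, 2, '') +
        N' WHERE TicketId = @TicketId;';

    EXEC sp_executesql @sql,
        N'@Timestamp datetime2, @ModifiedBy nvarchar(100), @TicketId int',
        @Timestamp = @Timestamp, @ModifiedBy = @ModifiedBy, @TicketId = @TicketId;
END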
Another option is to build a save proc that sets every user-configurable field. As long as the calling process has all of that data, it can call the save procedure and pass every updateable column. There is no harm in an UPDATE MyTable SET MyCol = @MyCol with the same value on each side.
Note that even if all of the values are the same, any rowversion (or timestamp) column will still be updated, if present.
With our software, the tables that users can edit have a widely varying range of columns. We chose to create a single save procedure for each table that has all of the update-able columns as parameters. The calling processes (our web servers) have all the required columns in memory. They pass all of the columns on every call. This performs fine for our purposes.
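A hedged sketch of that single save-procedure pattern (the table and column names are made up; in practice the parameter list covers every updateable column):

CREATE PROCEDURE dbo.SaveTicket
    @TicketId   int,
    @Title      nvarchar(200),
    @Notes      nvarchar(max),
    @ModifiedBy nvarchar(100)
AS
BEGIN
    SET NOCOUNT ON;

    -- every updateable column is passed on every call;
    -- unchanged columns are simply written back with their current values
    UPDATE dbo.Tickets
    SET Title      = @Title,
        Notes      = @Notes,
        ModifiedBy = @ModifiedBy
    WHERE TicketId = @TicketId;
END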

Conditional SQL block evaluated even when it won't be executed

I'm working on writing a migration script for a database, and am hoping to make it idempotent, so we can safely run it any number of times without fear of it altering the database (/ migrating data) beyond the first attempt.
Part of this migration involves removing columns from a table, but inserting that data into another table first. To do so, I have something along these lines.
IF EXISTS
    (SELECT * FROM sys.columns
     WHERE object_id = OBJECT_ID('TableToBeModified')
     AND name = 'ColumnToBeDropped')
BEGIN
    CREATE TABLE MigrationTable (
        Id int,
        ColumnToBeDropped varchar
    );
    INSERT INTO MigrationTable
        (Id, ColumnToBeDropped)
    SELECT Id, ColumnToBeDropped
    FROM TableToBeModified;
END
The first time through, this works fine, since the column still exists. However, on subsequent attempts it fails, because the column no longer exists. I understand that the entire script is evaluated, and I could instead put the inner contents into an EXEC statement, but is that really the best solution to this problem, or is there another, still potentially "validity enforced", option?
I understand that the entire script is evaluated, and I could instead put the inner contents into an EXEC statement, but is that really the best solution to this problem
Yes. There are several scenarios in which you would want to push off parsing/validation due to dependencies elsewhere in the script. I will even sometimes put things into an EXEC when there is no current problem, to ensure that there won't be one as either the rest of the script or the environment changes due to additional modifications made after the current rollout script was developed. As a minor benefit, it helps break things up visually.
While there can be permissions issues related to broken ownership chaining when using dynamic SQL, that is rarely a concern for a rollout script, and not a problem I have ever run into.
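For example, taking the script from the question, the statement that references the soon-to-be-dropped column can be pushed into an EXEC so it is only compiled if the branch actually runs (a sketch, not tested against your schema):

IF EXISTS
    (SELECT * FROM sys.columns
     WHERE object_id = OBJECT_ID('TableToBeModified')
     AND name = 'ColumnToBeDropped')
BEGIN
    CREATE TABLE MigrationTable (
        Id int,
        ColumnToBeDropped varchar
    );

    -- deferred: this text is not parsed against the table until the branch executes
    EXEC('INSERT INTO MigrationTable (Id, ColumnToBeDropped)
          SELECT Id, ColumnToBeDropped
          FROM TableToBeModified;');
END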
If we are not sure whether the script will work (especially when migrating a database), then for queries that change data I would run the script inside BEGIN TRAN, check that the result is as expected, and only then COMMIT TRAN; otherwise ROLLBACK, which discards the transaction.

SQL Server wiped my table after (incorrectly) creating a new column .. what the heck happened?

I added a new column to an existing table in the SQL Server Management Studio table designer. Type INT, not null. Didn't set a default value.
I generated a change script and ran it, it errored out with a warning that the new column does not allow nulls, and no default value was being set. It said "0 rows affected".
Data was still there, and for some reason my new column was visible in the "columns" folder on the database tree on the left of SSMS even though it said "0 rows affected" and failed to make the database change.
Because the new column was visible in the list, I thought I would go ahead and update all rows and add a value in.
UPDATE MyTable SET NewColumn = 0
Boom.. table wiped clean. Every row deleted.
This is a big problem because it was on a production database that wasn't being backed up unbeknownst to me. But.. recoverable with some manual entry, so not the end of the world.
Does anyone know what could have happened here, and what was going on internally that caused my update statement to wipe out every row in the table?
An UPDATE statement can't delete rows unless there is a trigger that performs the delete afterward, and you say the table has no triggers.
So it had to be the scenario I laid out for you in my comment: The rows did not get loaded properly to the new table, and the old table was dropped.
Note that it is even possible for it to have looked right for you, where the rows did get loaded at one point--if the transaction was not committed, and then (for example) later when your session was terminated the transaction was automatically rolled back. The transaction could have been rolled back for other reasons, too.
Also, I may have gotten the order incorrect: it may create the new table under a new name, load the rows, drop the old table, and rename the new one. In this case, you may have been querying the wrong table to find out if the data had been loaded. I can't remember off the top of my head right now which way the table designer structures its scripts--there's more than one way to skin this cat.
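From memory, the designer-generated change script follows roughly this pattern (a simplified sketch with made-up table and column names, not the exact script SSMS emits):

BEGIN TRANSACTION;

-- build a replacement table that includes the new NOT NULL column
CREATE TABLE dbo.Tmp_MyTable (
    Id          int NOT NULL,
    ExistingCol varchar(50) NOT NULL,
    NewColumn   int NOT NULL   -- no default, which is why the copy below can fail
);

-- copy the existing rows; this is the step that errors out when NewColumn has no default
INSERT INTO dbo.Tmp_MyTable (Id, ExistingCol)
SELECT Id, ExistingCol FROM dbo.MyTable;

-- swap the tables
DROP TABLE dbo.MyTable;
EXECUTE sp_rename 'dbo.Tmp_MyTable', 'MyTable', 'OBJECT';

COMMIT;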

How to execute original command in an instead of trigger

We are doing a DB2 migration to SQL Server and there are a number of BEFORE insert/update triggers that we need to migrate. We can take care of the insert case simply enough with an INSTEAD OF INSERT trigger that ends with "INSERT INTO TableName SELECT * FROM inserted".
However, the update is harder, as you can't just issue a command like "UPDATE TableName SELECT * FROM inserted". Instead, the only option we have found is to declare variables for each of the incoming columns and then use those in UPDATE TableName SET ColumnName = @col1, etc. Unfortunately, this would require quite a bit of manual work, and I would like to find a more automatable solution.
Some questions:
1) Is there a way you can issue an update using inserted from the trigger, without knowing the specific column information?
2) Is there a way to write a loop in the trigger that would automatically step through the inserted columns, and update those to the database?
3) Is there a way to get access to the original command that caused the trigger? So I can do an EXEC @command and take care of things that way?
Any help would be greatly appreciated!
Thanks!
Bob
You must specify the column names in an UPDATE
You could loop through the metadata of the target table (in sys.columns) and build an UPDATE statement dynamically, but dynamic SQL executes in its own scope, so it would not be able to access the inserted and deleted tables directly. Although you can work around this by copying the data into local temp tables (#inserted) first, it seems like a very awkward approach in general
There is no way to access the original UPDATE statement
But I'm not sure what you're really trying to achieve. Your question implies that the trigger does the original INSERT or UPDATE anyway without modifying any data. If that's really the case, you might want to explain what the purpose of your trigger is because there may be an alternative, easier way to do whatever it is that it's doing.
I'm also a bit confused by your statement that you have to "declare variables for each of the incoming columns, and then use those in the UPDATE TableName SET ColumnName = @col1, etc.". Triggers in SQL Server always fire once per statement, so you normally do an UPDATE with a join to inserted to handle the case where the UPDATE affects more than one row.
You might also find the UPDATE() or COLUMNS_UPDATED() functions useful for limiting your trigger code to process only those columns that were really updated.
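For reference, a minimal sketch of an INSTEAD OF UPDATE trigger that re-issues the update with a join to inserted (the table, key, and column names are hypothetical):

CREATE TRIGGER trg_MyTable_InsteadOfUpdate
ON dbo.MyTable
INSTEAD OF UPDATE
AS
BEGIN
    SET NOCOUNT ON;

    -- any "before update" logic migrated from the DB2 trigger goes here

    -- then perform the original update, joining to inserted so that
    -- multi-row updates are handled correctly
    UPDATE t
    SET t.Col1 = i.Col1,
        t.Col2 = i.Col2
    FROM dbo.MyTable AS t
    JOIN inserted AS i
        ON i.Id = t.Id;
END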

Updating a column with its current value

I have a stored proc that should conditionally update a bunch of fields in the same table. Conditionally, because for each field I also pass a "dirty" flag, and a field should be updated only if its flag is set to 1.
So I'm going to do the following:
create proc update
@field1 nvarchar(1000), @field1Dirty bit, ...other fields...
as
begin
update mytable
set field1 = case when @field1Dirty = 1 then @field1 else field1 end,
... same for other fields
end
go
Question - is SQL Server (2008) smart enough to not physically update a field if it has been assigned its own value, as in the case where @field1Dirty = 0?
Question - is SQL Server (2008) smart enough to not physically update a field if it has been assigned its own value, as in the case where @field1Dirty = 0?
No. You should add a WHERE clause that says ... WHERE field <> the value you are updating it to.
This doesn't seem like a big deal at first, but in truth it can create a massive amount of overhead. One example: think about triggers. If the statement updates every row in the table, the trigger will fire for every row. YIKES, that's a lot of needless code execution, especially if that code is, say, copying updated rows to a logging table. I'm sure you get the idea.
Remember, you're updating the field; it just happens to be the same value it was before. It's actually good that this happens, because it means you can still count the field as modified (think timestamps, etc.). If updating a field to the same value didn't count as modifying the row, you wouldn't know when someone inadvertently (or deliberately) tried to change data.
Update due to comments:
See the documentation for the COALESCE function.
Example:
For handling null parameter values in your stored procedure
UPDATE Table SET My_Field = COALESCE(@Variable, My_Field)
This doesn't get around what I was talking about before with the field being updated to the same value, but it does allow you to check parameter and conditionally update the field.
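Putting that into the procedure from the question, a one-field sketch of the WHERE-clause idea (the @id key parameter is hypothetical, and NULL-to-value changes are treated as changes):

UPDATE mytable
SET field1 = @field1
WHERE id = @id
  AND @field1Dirty = 1
  AND (field1 <> @field1
       OR (field1 IS NULL AND @field1 IS NOT NULL)
       OR (field1 IS NOT NULL AND @field1 IS NULL));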
SQL doesn't check the value before writing to it. It will overwrite it anyway.
SQL Server will perform the update. The row is written as an entire row, so if even one column has its FieldxDirty flag set to 1, the update is required anyway. There's no optimization to be gained in the SET clause.
@Kevin's answer will help more than optimizing the SET clause.
Sorry to come here with an opinion, but I have nowhere else to write :-)
There should at least be some kind of "hint" to tell the UPDATE statement not to update a column to the same value.
There are at least 2 reasons I can think of:
1st: the value to update to can be a complicated expression, and it is a waste of execution time (not to mention a maintenance burden when the expression changes) to repeat it in the WHERE clause. Think also of NULL values!
Ex. UPDATE X SET A = B WHERE ISNULL(A,'') <> ISNULL(B,'')
2nd: we have a synchronized mirroring scenario where the "backup" server is physically located in another part of the city. This means that a write to disk is only committed once the backup server has also performed the write. There is a huge time difference between writing and skipping the write. When the developers created the application, they worked in a test environment without mirroring. Most of the UPDATE statements did not actually change the values, but in the test environment that did not matter. After deploying the application to production with mirroring, we would really love to have that "only changed values" hint. Reading the original value and checking it takes no time compared to writing.
