In SQL Server 2008, I have a scenario where I have a table with complex validation upon insert / update. This includes needing to temp convert an XML input into a table in order to validate its data against a permanent table.
However, I also have the scenario where I will often update simple integer columns that require no validation. From what I have read here, it seems that SQL Server is going to return the entire row in the temp in-memory "inserted" table, not just the affected columns, when I perform an update. If this is so, then it means for every simply integer update I perform, complex XML validation will be needlessly done.
Do I understand this correctly and if so, how do I get around this short of requiring inserts / updates via a stored proc?
Yes, first off, the triggers will fire for EVERY INSERT or UPDATE operation - you cannot limit that to only fire when certain columns will be affected. You can check inside the trigger to see whether or not certain columns have been affected and make decisions based on that - but you cannot prevent the trigger from firing in the first place.
And secondly, yes, when the trigger fires, you will have ALL columns of the underlying table in the INSERTED and/or DELETED pseudo tables.
One way you might want to change this is by moving the large XML column into a separate table and put that big heavy XML validation trigger only on that table. In that case, if you update or insert into the base table only, you'll get less data, less validation logic. Only when you insert or update into the XML table, you'll get the whole big validation going.
Related
I have an API that i'm trying to read that gives me just the updated field. I'm trying to take that and update my tables using a stored procedure. So far the only way I have been able to figure out how to do this is with dynamic SQL but i would prefer to not do that if there is a way not to.
If it was just a couple columns, I'd just write a proc for each but we are talking about 100 fields and any of them could be updated together. One ticket might just need a timestamp updated at this time, but the next ticket might be a timestamp and who modified it while the next one might just be a note.
Everything I've read and have been taught have told me that dynamic SQL is bad and while I'll write it if I have too, I'd prefer to have a proc.
YOU CAN PERHAPS DO SOMETHING LIKE THIS:::
IF EXISTS (SELECT * FROM NEWTABLE NOT IN (SELECT * FROM OLDTABLE))
BEGIN
UPDATE OLDTABLE
SET OLDTABLE.OLDRECORDS = NEWTABLE.NEWRECORDS
WHERE OLDTABLE.PRIMARYKEY= NEWTABLE.PRIMARYKEY
END
The best way to solve your problem is using MERGE:
Performs insert, update, or delete operations on a target table based on the results of a join with a source table. For example, you can synchronize two tables by inserting, updating, or deleting rows in one table based on differences found in the other table.
As you can see your update could be more complex but more efficient as well. Using MERGE requires some proficiency, but when you start to use it you'll use it with pleasure again and again.
I am not sure how your business logic works that determines what columns are updated at what time. If there are separate business functions that require updating different but consistent columns per function, you will probably want to have individual update statements for each function. This will ensure that each process updates only the columns that it needs to update.
On the other hand, if your API is such that you really don't know ahead of time what needs to be updated, then building a dynamic SQL query is a good idea.
Another option is to build a save proc that sets every user-configurable field. As long as the calling process has all of that data, it can call the save procedure and pass every updateable column. There is no harm in having a UPDATE MyTable SET MyCol = #MyCol with the same values on each side.
Note that even if all of the values are the same, the rowversion (or timestampcolumns) will still be updated, if present.
With our software, the tables that users can edit have a widely varying range of columns. We chose to create a single save procedure for each table that has all of the update-able columns as parameters. The calling processes (our web servers) have all the required columns in memory. They pass all of the columns on every call. This performs fine for our purposes.
My question is a little bit theoretical because I don't have any concrete working example. But I think it's worth to answer it.
What is the proper way to write insert-triggers in SQL Server?
Let's say I create a trigger like this (more or less pseudocode)
CREATE TRIGGER MY_TRIGGER
ON MY_TABLE
FOR INSERT AS
DECLARE #myVariable;
DECLARE InsertedRows CURSOR FAST_FORWARD FOR SELECT A_COLUMN FROM INSERTED;
OPEN InsertedRows;
FETCH NEXT FROM InsertedRows INTO #NewOrderCode;
...
INSERT INTO ANOTHER_TABLE (
CODE,
DATE_INSERTED
) VALUES (
#myVariable,
GETDATE()
);
...etc
Now what if someone else create another trigger on the same table and that trigger would change some columns on inserted rows? Something like this
CREATE TRIGGER ANOTHER_TRIGGER
ON MY_TABLE
FOR INSERT AS
UPDATE MY_TABLE
SET A_COLUMN = something
WHERE ID IN (SELECT ID FROM INSERTED);
...etc
Then my trigger (if fired after the another trigger) operates on wrong data, because INSERTED data are not the same as the real inserted data in the table which have been changed with the other trigger right?
Summary:
Trigger A updates new inserted rows on table T, trigger B then operates on dirty data because the update from trigger A is not visible in the INSERTED pseudo table which trigger B operates on. BUT if the trigger B would operate directly on the table instead of on the pseudo table INSERTED, it would see updated data by trigger A.
Is that true? Should I always work with the data from the table itself and not from the INSERTED table?
I'd usually recommend against having multiple triggers. For just two, you can, if you want to, define what order you want them to run in. Once you have a few more though, you have no control over the order in which the non-first, non-last triggers run.
It also increasingly makes it difficult just to reason about what's happening during insert.
I'd instead recommend having a single trigger per-table, per-action, that accomplishes all tasks that should happen for that action. If you're concerned about the size of the code that results, that's usually an indication that you ought to be moving that code out of the trigger all together - triggers should be fast and light.
Instead, you should start thinking about having the trigger just record an action and then use e.g. service broker or a SQL Server job that picks up those records and performs additional processing. Importantly, it does that within its own transactions rather than delaying the original INSERT.
I would also caution against the current code you're showing in example 1. Rather than using a cursor and inserting rows one by one, consider writing an INSERT ... SELECT statement that references inserted directly and inserts all new rows into the other table.
One thing you should absolutely avoid in a trigger is using a CURSOR!
A trigger should be very nimble, small, fast - and a cursor is anything but! After all, it's being executed in the context of the transaction that caused it to fire. Don't delay completion of that transaction unnecessarily!
You need to also be aware that Inserted will contain multiple rows and write your trigger accordingly, but please use set-based techniques - not cursors and while loops - to keep your trigger quick and fast.
Don't do heavy lifting, time-consuming work in a trigger - just updating a few columns, or making an entry into another table - that's fine - NO heavy lifting! and no e-mail sending etc!
My Personal Guide to SQL Trigger Happiness
The trigger should be light and fast. Expensive triggers make for a slow database for EVERYBODY (and not incidentally unhappiness for everybody concerned including the trigger author)
One trigger operation table combo please. That is at most one insert trigger on the foo table. Though the same trigger for multiple operations on a table is not necessarily bad.
Don't forget that the inserted and deleted tables may contain more than a single row or even no rows at all. A happy trigger (and more importantly happy database users and administrators) will be well-behaved no matter how many rows are involved in the operation.
Do not Not NOT NOT ever use cursors in triggers. Server-side cursors are usually an abuse of good practice though there are rare circumstances where their use is justified. A trigger is NEVER one of them. Prefer instead a series of set-oriented DML statements to anything resembling a trigger.
Remember there are two classes of triggers - AFTER triggers and INSTEAD OF triggers. Consider this when writing a trigger.
Never overlook that triggers (AFTER or INSTEAD OF) begin execution with ##trancount one greater than the context where the statement that fired them runs at.
Prefer declarative referential integrity (DRI) over triggers as a means of keeping data in the database consistent. Some application integrity rules require triggers. But DRI has come a long way over the years and functions like row_number() make triggers less necessary.
Triggers are transactional. If you tried to do a circular update as you've described, it should result in a deadlock - the first update will block the second from completing.
While looking at this code though, you're trying to cursor through the INSERTED pseudo-table to do the inserts - nothing in the example requires that behaviour. If you just insert directly from the full INSERTED table you'd get a definite improvement, and also less firings of your second trigger.
I have to insert one record per tables across 30 tables. The data coming from some other System. I have to insert data in the tables for the first time, then if any update happened, then I need to update tables in the SQL Server. I have two options:
a) I can check timestamp for individual table rows and update if the timestamp is greater.
b) Everytime I can stateway delete records and insert data.
Which one will be faster in SQL Server Database? Is there any other option to address the situatation?
If you are not changing the index fields of the record, the stategy of trying to update first and then insert is usually faster than drop/insert as you don't force the database into updating a bunch of index info.
If using Sql2008+ you should be using the merge command, as it explictly handles the update/insert condition cleanly and clearly
ADDED
I should also add that is you know the usage pattern in rarely update (i.e., 90% insert), you may have a case when drop/insert in faster than update/insert -- depends on lots of details. Regardless, merge is the clear winner if using 2008+
I generally like drop and re-insert. I find it to be cleaner and easier to code. However, if this is happening very frequently and you're worried about concurrency issues, you're probably better off with option 1.
Also, another thing to factor in is how often does the timestamp check fail (where you don't have to insert nor update). If 99% of data is redundant/outdated data, you're probably better off with option 1 regardless.
In Microsoft SQL Server :
I've added an insert trigger my table ACCOUNTS that does an insert into table BLAH based upon the inserted values.
The inserts come from only one place, and those happen one at a time. (By that, I mean, that there's never two inserts in a transaction - two web users could, theoretically click submit and have their inserts done in a near-simulataneous way.)
Do I need to adapt the trigger to handle more than one row being in inserted, the special table created for triggers - or does each individual insert transaction launch the trigger separately?
Each insert calls the trigger. However, if a single insert adds more than one row the trigger is only called once, so your trigger has to be able to handle multiple records.
The granularity is at the INSERT statement level not at the transaction level.
So no, if you have two transactions inserting into the same table they will each call the trigger ATOMICALLY.
BOb
in your situation each insert happens in its own transaction and fires off the trigger individually, so you should be fine. if there was ever a circumstance where you had two inserts within the same transaction you would have to modify the trigger to do either a set based insert from the 'inserted' table or some kind of cursor if additional processing is necessary.
If you do only one insert in a transaction, I don't see any reason for more rows to be in inserted, except if there was a possibility of recursive trigger calls.
Still, it could cause you troubles if you'd change the behavior of your application in future and you forget to change the triggers. So just to be sure, I would rather implement the trigger as if it could contain multiple rows in inserted.
I am trying to do an audit history by adding triggers to my tables and inserting rows intto my Audit table. I have a stored procedure that makes doing the inserts a bit easier because it saves code; I don't have to write out the entire insert statement, but I instead execute the stored procedure with a few parameters of the columns I want to insert.
I am not sure how to execute a stored procedure for each of the rows in the "inserted" table. I think maybe I need to use a cursor, but I'm not sure. I've never used a cursor before.
Since this is an audit, I am going to need to compare the value for each column old to new to see if it changed. If it did change I will execute the stored procedure that adds a row to my Audit table.
Any thoughts?
I would trade space for time and not do the comparison. Simply push the new values to the audit table on insert/update. Disk is cheap.
Also, I'm not sure what the stored procedure buys you. Can't you do something simple in the trigger like:
insert into dbo.mytable_audit
(select *, getdate(), getdate(), 'create' from inserted)
Where the trigger runs on insert and you are adding created time, last updated time, and modification type fields. For an update, it's a little tricker since you'll need to supply named parameters as the created time shouldn't be updated
insert into dbo.mytable_audit (col1, col2, ...., last_updated, modification)
(select *, getdate(), 'update' from inserted)
Also, are you planning to audit only successes or failures as well? If you want to audit failures, you'll need something other than triggers I think since the trigger won't run if the transaction is rolled back -- and you won't have the status of the transaction if the trigger runs first.
I've actually moved my auditing to my data access layer and do it in code now. It makes it easier to both success and failure auditing and (using reflection) is pretty easy to copy the fields to the audit object. The other thing that it allows me to do is give the user context since I don't give the actual user permissions to the database and run all queries using a service account.
If your database needs to scale past a few users this will become very expensive. I would recommend looking into 3rd party database auditing tools.
There is already a built in function UPDATE() which tells you if a column has changed (but it is over the entire set of inserted rows).
You can look at some of the techniques in Paul Nielsen's AutoAudit triggers which are code generated.
What it does is check both:
IF UPDATE(<column_name>)
INSERT Audit (...)
SELECT ...
FROM Inserted
JOIN Deleted
ON Inserted.KeyField = Deleted.KeyField -- (AutoAudit does not support multi-column primary keys, but the technique can be done manually)
AND NOT (Inserted.<column_name> = Deleted.<column_name> OR COALESCE(Inserted.<column_name>, Deleted.<column_name>) IS NULL)
But it audits each column change as a separate row. I use it for auditing changes to configuration tables. I am not currently using it for auditing heavy change tables. (But in most transactional systems I've designed, rows on heavy activity tables are typically immutable, you don't have a lot of UPDATEs, just a lot of INSERTs - so you wouldn't even need this kind of auditing). For instance, orders or ledger entries are never changed, and shopping carts are disposable - neither would have this kind of auditing. On low volume change tables, like customer, you can use this kind of auditing.
Jeff,
I agree with Zodeus..a good option is to use a 3rd tool.
I have used auditdatabase (FREE)web tool that generates audit triggers (you do not need to write a single line of TSQL code)
Another good tools is Apex SQL Audit but..it's not free.
I hope this helps you,
F. O'Neill