INSTEAD OF trigger in SQL Server loses SCOPE_IDENTITY? - sql-server

I have a table where I created an INSTEAD OF trigger to enforce some business rules.
The issue is that when I insert data into this table, SCOPE_IDENTITY() returns a NULL value, rather than the actual inserted identity.
Insert + Scope code
INSERT INTO [dbo].[Payment]([DateFrom], [DateTo], [CustomerId], [AdminId])
VALUES ('2009-01-20', '2009-01-31', 6, 1)
SELECT SCOPE_IDENTITY()
Trigger:
CREATE TRIGGER [dbo].[TR_Payments_Insert]
ON [dbo].[Payment]
INSTEAD OF INSERT
AS
BEGIN
-- SET NOCOUNT ON added to prevent extra result sets from
-- interfering with SELECT statements.
SET NOCOUNT ON;
IF NOT EXISTS(SELECT 1 FROM dbo.Payment p
INNER JOIN Inserted i ON p.CustomerId = i.CustomerId
WHERE (i.DateFrom >= p.DateFrom AND i.DateFrom <= p.DateTo) OR (i.DateTo >= p.DateFrom AND i.DateTo <= p.DateTo)
) AND NOT EXISTS (SELECT 1 FROM Inserted p
INNER JOIN Inserted i ON p.CustomerId = i.CustomerId
WHERE (i.DateFrom <> p.DateFrom AND i.DateTo <> p.DateTo) AND
((i.DateFrom >= p.DateFrom AND i.DateFrom <= p.DateTo) OR (i.DateTo >= p.DateFrom AND i.DateTo <= p.DateTo))
)
BEGIN
INSERT INTO dbo.Payment (DateFrom, DateTo, CustomerId, AdminId)
SELECT DateFrom, DateTo, CustomerId, AdminId
FROM Inserted
END
ELSE
BEGIN
ROLLBACK TRANSACTION
END
END
The code worked before the creation of this trigger. I am using LINQ to SQL in C#. I don't see a way of changing SCOPE_IDENTITY to @@IDENTITY. How do I make this work?

Use @@identity instead of scope_identity().
While scope_identity() returns the last created id in the current scope, @@identity returns the last created id in the current session.
The scope_identity() function is normally recommended over @@identity, as you usually don't want triggers to interfere with the id, but in this case you do.

Since you're on SQL 2008, I would highly recommend using the OUTPUT clause instead of one of the custom identity functions. SCOPE_IDENTITY currently has some issues with parallel queries that cause me to recommend against it entirely. @@IDENTITY does not, but it's still not as explicit or as flexible as OUTPUT. Plus, OUTPUT handles multi-row inserts. Have a look at the BOL article, which has some great examples.
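For illustration, here is a minimal sketch of the OUTPUT pattern (the identity column name PaymentId is an assumption; note that when an INSTEAD OF trigger intercepts the insert, OUTPUT reflects the values the statement supplied rather than what the trigger actually wrote, so this works best once the trigger is an AFTER trigger as suggested further down):
DECLARE @NewIds TABLE (PaymentId int);
INSERT INTO dbo.Payment (DateFrom, DateTo, CustomerId, AdminId)
OUTPUT inserted.PaymentId INTO @NewIds (PaymentId)
VALUES ('2009-01-20', '2009-01-31', 6, 1);
SELECT PaymentId FROM @NewIds; -- one row per inserted record, even for multi-row inserts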

I was having serious reservations about using @@identity, because it can return the wrong answer.
But there is a workaround to force @@identity to have the scope_identity() value.
Just for completeness, first I'll list a couple of other workarounds for this problem I've seen on the web:
Make the trigger return a rowset. Then, in a wrapper SP that performs the insert, do INSERT Table1 EXEC sp_ExecuteSQL ... to yet another table. Then scope_identity() will work. This is messy because it requires dynamic SQL which is a pain. Also, be aware that dynamic SQL runs under the permissions of the user calling the SP rather than the permissions of the owner of the SP. If the original client could insert to the table, he should still have that permission, just know that you could run into problems if you deny permission to insert directly to the table.
If there is another candidate key, get the identity of the inserted row(s) using those keys. For example, if Name has a unique index on it, then you can insert, then select the (max for multiple rows) ID from the table you just inserted to using Name. While this may have concurrency problems if another session deletes the row you just inserted, it's no worse than in the original situation if someone deleted your row before the application could use it.
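A hypothetical sketch of that second workaround, assuming the table has an Id identity column and a unique index on Name (all names here are made up for illustration):
INSERT INTO dbo.MyTable (Name) VALUES ('Widget');
SELECT MAX(Id) AS Id -- MAX() covers the multi-row case, as described above
FROM dbo.MyTable
WHERE Name = 'Widget';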
Now, here's how to definitively make your trigger safe for @@Identity to return the correct value, even if your SP or another trigger inserts to an identity-bearing table after the main insert.
Also, please put comments in your code about what you are doing and why so that future visitors to the trigger don't break things or waste time trying to figure it out.
CREATE TRIGGER TR_MyTable_I ON MyTable INSTEAD OF INSERT
AS
SET NOCOUNT ON
DECLARE @MyTableID int
INSERT MyTable (Name, SystemUser)
SELECT I.Name, System_User
FROM Inserted I
SET @MyTableID = Scope_Identity()
INSERT AuditTable (SystemUser, Notes)
SELECT System_User, 'Added Name ' + I.Name
FROM Inserted I
-- The following statement MUST be last in this trigger. It resets @@Identity
-- to be the same as the earlier Scope_Identity() value.
SELECT MyTableID INTO #Trash FROM MyTable WHERE MyTableID = @MyTableID
Normally, the extra insert to the audit table would break everything, because since it has an identity column, @@Identity would return that value instead of the one from the insertion to MyTable. However, the final select creates a new @@Identity value that is the correct one, based on the Scope_Identity() that we saved from earlier. This also proofs it against any possible additional AFTER trigger on the MyTable table.
Update:
I just noticed that an INSTEAD OF trigger isn't necessary here. This does everything you were looking for:
CREATE TRIGGER dbo.TR_Payments_Insert ON dbo.Payment FOR INSERT
AS
SET NOCOUNT ON;
IF EXISTS (
SELECT *
FROM
Inserted I
INNER JOIN dbo.Payment P ON I.CustomerID = P.CustomerID
WHERE
I.DateFrom < P.DateTo
AND P.DateFrom < I.DateTo
) ROLLBACK TRAN;
This of course allows scope_identity() to keep working. The only drawback is that a rolled-back insert on an identity table does consume the identity values used (the identity value is still incremented by the number of rows in the insert attempt).
I've been staring at this for a few minutes and don't have absolute certainty right now, but I think this preserves the meaning of an inclusive start time and an exclusive end time. If the end time was inclusive (which would be odd to me) then the comparisons would need to use <= instead of <.

Main problem: the trigger and Entity Framework work in different scopes.
The problem is that if you generate the new PK value in the trigger, it is in a different scope. Thus the command returns zero rows and EF will throw an exception.
The solution is to add the following SELECT statement at the end of your Trigger:
SELECT * FROM deleted UNION ALL
SELECT * FROM inserted;
In place of *, you can list all the column names, including:
SELECT IDENT_CURRENT('tablename') AS <IdentityColumnname>
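A concrete sketch of that final SELECT for the Payment table from the question (the column list and Id alias are assumptions; also note that IDENT_CURRENT is neither scope- nor session-safe, so under concurrency it can report another session's value):
SELECT IDENT_CURRENT('dbo.Payment') AS [Id],
       [DateFrom], [DateTo], [CustomerId], [AdminId]
FROM inserted;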

Like araqnid commented, the trigger seems to roll back the transaction when a condition is met. You can do that more easily with an AFTER INSERT trigger:
CREATE TRIGGER [dbo].[TR_Payments_Insert]
ON [dbo].[Payment]
AFTER INSERT
AS
BEGIN
SET NOCOUNT ON;
IF <Condition>
BEGIN
ROLLBACK TRANSACTION
END
END
Then you can use SCOPE_IDENTITY() again, because the INSERT is no longer done in the trigger.
The condition itself seems to let two identical rows past, if they're in the same insert. With the AFTER INSERT trigger, you can rewrite the condition like:
IF EXISTS(
SELECT *
FROM dbo.Payment a
LEFT JOIN dbo.Payment b
ON a.Id <> b.Id
AND a.CustomerId = b.CustomerId
AND (a.DateFrom BETWEEN b.DateFrom AND b.DateTo
OR a.DateTo BETWEEN b.DateFrom AND b.DateTo)
WHERE b.Id is NOT NULL)
And it will catch duplicate rows, because now it can differentiate them based on Id. It also works if you delete a row and replace it with another row in the same statement.
Anyway, if you want my advice, move away from triggers altogether. As you can see even for this example they are very complex. Do the insert through a stored procedure. They are simpler and faster than triggers:
create procedure dbo.InsertPayment
@DateFrom datetime, @DateTo datetime, @CustomerId int, @AdminId int
as
BEGIN TRANSACTION
IF NOT EXISTS (
SELECT *
FROM dbo.Payment
WHERE CustomerId = @CustomerId
AND (@DateFrom BETWEEN DateFrom AND DateTo
OR @DateTo BETWEEN DateFrom AND DateTo))
BEGIN
INSERT into dbo.Payment
(DateFrom, DateTo, CustomerId, AdminId)
VALUES (@DateFrom, @DateTo, @CustomerId, @AdminId)
END
COMMIT TRANSACTION

A little late to the party, but I was looking into this issue myself. A workaround is to create a temp table in the calling procedure where the insert is being performed, insert the scope identity into that temp table from inside the instead of trigger, and then read the identity value out of the temp table once the insertion is complete.
In procedure:
CREATE table #temp ( id int )
... insert statement ...
select id from #temp
-- (you can add sorting and top 1 selection for extra safety)
drop table #temp
In instead of trigger:
-- this check covers you for any inserts that don't want an identity value returned (and therefore don't provide a temp table)
IF OBJECT_ID('tempdb..#temp') is not null
begin
insert into #temp(id)
values
(SCOPE_IDENTITY())
end
You probably want to call it something other than #temp for safety's sake (something long and random enough that no one else would be using it: #temp1234235234563785635).


Automatic bulk insert / update with multiple variables

Hi,
Update
Thank you guys for your answers and hints. I understand that my first attempt was wrong - so maybe the better approach is to describe my problem in words rather than posting a botched trigger.
Table A is a list of all clients. For each client there are multiple orders (alongside other information in that table that isn't needed here):
CLIENT  ORDER  OPTIONAL
A       1      NO
A       2      YES
A       3      NO
B       16818  YES
B       342    YES
I need to insert all OPTIONAL=NO orders into table B in an automatic bulk process. It is possible that an order is changed from OPTIONAL=NO to OPTIONAL=YES, and therefore I need a solution that handles not only inserts but also updates.
Thank you very much!
Best practice for triggers:
Don't create them unless you have no other option.
It's usually best to separate insert and update triggers, sometimes they can be combined.
NOCOUNT stops spurious messages going back to the client, XACT_ABORT means any error automatically rolls back the transaction.
Check if there are any relevant rows, if not bail out early.
Also check whether a column is present in the UPDATE statement using the UPDATE() function, if relevant (this doesn't mean the row has actually changed).
In an UPDATE trigger, make sure you compare inserted and deleted rows for actual changes.
Most important: be aware that inserted and deleted may contain multiple rows.
CREATE TRIGGER Trg_TableA_OPTIONAL_ins
ON TableA
AFTER INSERT
AS
SET NOCOUNT, XACT_ABORT ON;
IF (NOT EXISTS (SELECT 1 FROM inserted WHERE OPTIONAL = 'NO'))
RETURN;
INSERT TableB (CLIENT, [ORDER])
SELECT CLIENT, [ORDER]
FROM inserted
WHERE OPTIONAL = 'NO';
GO
CREATE TRIGGER Trg_TableA_OPTIONAL_upd
ON TableA
AFTER UPDATE
AS
SET NOCOUNT, XACT_ABORT ON;
IF (NOT EXISTS (SELECT 1 FROM inserted))
RETURN;
INSERT TableB (CLIENT, [ORDER])
SELECT CLIENT, [ORDER]
FROM (
SELECT CLIENT, [ORDER], OPTIONAL
FROM inserted
WHERE OPTIONAL = 'NO'
EXCEPT
SELECT CLIENT, [ORDER], OPTIONAL
FROM deleted
) i
EXCEPT
SELECT CLIENT, [ORDER]
FROM TableB;
DELETE FROM b
FROM TableB b
JOIN (
SELECT CLIENT, [ORDER], OPTIONAL
FROM inserted
WHERE OPTIONAL = 'YES'
EXCEPT
SELECT CLIENT, [ORDER], OPTIONAL
FROM deleted
) i
ON b.CLIENT = i.CLIENT AND b.[ORDER] = i.[ORDER];
GO

Trigger AFTER INSERT, UPDATE, DELETE to call stored procedure with table name and primary key

For a sync process, my SQL Server database should record a list of items that have changed - table name and primary key.
The DB already has a table and stored procedure to do this:
EXEC @ErrCode = dbo.SyncQueueItem "tableName", 1234;
I'd like to add triggers to a table to call this stored procedure on INSERT, UPDATE, DELETE. How do I get the key? What's the simplest thing that could possibly work?
CREATE TABLE new_employees
(
id_num INT IDENTITY(1,1),
fname VARCHAR(20),
minit CHAR(1),
lname VARCHAR(30)
);
GO
IF OBJECT_ID ('dbo.sync_new_employees','TR') IS NOT NULL
DROP TRIGGER sync_new_employees;
GO
CREATE TRIGGER sync_new_employees
ON new_employees
AFTER INSERT, UPDATE, DELETE
AS
DECLARE @Key Int;
DECLARE @ErrCode Int;
-- How to get the key???
SELECT @Key = 12345;
EXEC @ErrCode = dbo.SyncQueueItem "new_employees", @Key;
GO
The way to access the records changed by the operation is by using the Inserted and Deleted pseudo-tables that are provided to you by SQL Server.
Inserted contains any inserted records, or any updated records with their new values.
Deleted contains any deleted records, or any updated records with their old values.
More Info
When writing a trigger, to be safe, one should always code for the case when multiple records are acted upon. Unfortunately if you need to call a SP that means a loop - which isn't ideal.
The following code shows how this could be done for your example, and includes a method of detecting whether the operation is an Insert/Update/Delete.
declare @Key int, @ErrCode int, @Action varchar(6);
declare @Keys table (id int, [Action] varchar(6));
insert into @Keys (id, [Action])
select coalesce(I.id_num, D.id_num)
, case when I.id_num is not null and D.id_num is not null then 'Update' when I.id_num is not null then 'Insert' else 'Delete' end
from Inserted I
full join Deleted D on I.id_num = D.id_num;
while exists (select 1 from @Keys) begin
select top 1 @Key = id, @Action = [Action] from @Keys;
exec @ErrCode = dbo.SyncQueueItem 'new_employees', @Key;
delete from @Keys where id = @Key;
end
Further: In addition to solving your specified problem, it's worth noting a couple of points regarding the bigger picture.
As @Damien_The_Unbeliever points out, there are built-in mechanisms to accomplish change tracking which will perform much better.
If you still wish to handle your own change tracking, it would perform better if you could arrange it such that you handle the entire recordset in one go as opposed to carrying out a row-by-row operation. There are 2 ways to accomplish this: a) Move your change tracking code inside the trigger and don't use an SP. b) Use a "User Defined Table Type" to pass the record-set of changes to the SP, as sketched below.
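A rough sketch of option (b), with all names (the table type, the dbo.SyncQueue queue table, and the set-based procedure) made up purely for illustration:
CREATE TYPE dbo.ChangedKeyList AS TABLE (id int NOT NULL, [Action] varchar(6) NOT NULL);
GO
CREATE PROCEDURE dbo.SyncQueueItems
    @TableName sysname,
    @Keys dbo.ChangedKeyList READONLY
AS
-- one set-based insert instead of one procedure call per row
INSERT INTO dbo.SyncQueue (TableName, KeyValue, [Action])
SELECT @TableName, id, [Action] FROM @Keys;
GO
-- Inside the trigger, the whole set then goes across in one call instead of a row-by-row loop:
-- DECLARE @k dbo.ChangedKeyList;
-- INSERT @k (id, [Action]) SELECT ... FROM Inserted ... ;
-- EXEC dbo.SyncQueueItems 'new_employees', @k;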
You should use the Magic Table to get the data.
Usually, inserted and deleted tables are called Magic Tables in the context of a trigger. There are Inserted and Deleted magic tables in SQL Server. These tables are automatically created and managed by SQL Server internally to hold recently inserted, deleted and updated values during DML operations (Insert, Update and Delete) on a database table.
Inserted magic table
The Inserted table holds the recently inserted values, in other words, new data values. Hence recently added records are inserted into the Inserted table.
Deleted magic table
The Deleted table holds the recently deleted or updated values, in other words, old data values. Hence the old updated and deleted records are inserted into the Deleted table.
You can use the inserted and deleted magic tables to get the value of id_num:
SELECT top 1 @Key = id_num from inserted
Note: This code sample will only work for a single record in the insert scenario. For bulk insert/update scenarios you need to fetch the records from the inserted and deleted tables into a temp table or table variable and then loop through it to pass each key to your procedure, or you can pass a table variable to your procedure and handle the multiple records there.
A DML trigger should operate on set data, else only one row will be processed. It can be something like this. And of course use the magic tables inserted and deleted.
CREATE TRIGGER dbo.tr_employees
ON dbo.employees --the table from Northwind database
AFTER INSERT,DELETE,UPDATE
AS
BEGIN
-- SET NOCOUNT ON added to prevent extra result sets from
-- interfering with SELECT statements.
SET NOCOUNT ON;
declare @tbl table (id int identity(1,1), delId int, insId int)
--Use "magic tables" inserted and deleted
insert @tbl(delId, insId)
select d.EmployeeID, i.EmployeeID
from inserted i --empty when "delete"
full join deleted d --empty when "insert"
on i.EmployeeID=d.EmployeeID
declare @id int, @key int, @action char(1)
select top 1 @id=id, @key=isnull(delId, insId),
@action=case
when delId is null then 'I'
when insId is null then 'D'
else 'U' end --just in case you need the operation executed
from @tbl
--do something for each row
while @id is not null --instead of cursor
begin
--do the main action
--exec dbo.sync 'employees', @key, @action
--remove processed row
delete @tbl where id=@id
--refill @variables (reset @id first so the loop ends once @tbl is empty)
set @id = null
select top 1 @id=id, @key=isnull(delId, insId),
@action=case
when delId is null then 'I'
when insId is null then 'D'
else 'U' end --just in case you need the operation executed
from @tbl
end
END
Not the best solution, but just a direct answer to the question:
SELECT @Key = COALESCE(d.id_num, i.id_num)
FROM inserted i
FULL JOIN deleted d ON d.id_num = i.id_num;
Also not the best way (if not the worst) (do not try this at home), but at least it will help with multiple values:
DECLARE @Key INT, @ErrCode INT;
DECLARE triggerCursor CURSOR LOCAL FAST_FORWARD READ_ONLY
FOR SELECT COALESCE(i.id_num, d.id_num) AS [id_num]
FROM inserted i
FULL JOIN deleted d ON d.id_num = i.id_num
WHERE (
COALESCE(i.fname,'')<>COALESCE(d.fname,'')
OR COALESCE(i.minit,'')<>COALESCE(d.minit,'')
OR COALESCE(i.lname,'')<>COALESCE(d.lname,'')
)
;
OPEN triggerCursor;
FETCH NEXT FROM triggerCursor INTO @Key;
WHILE @@FETCH_STATUS = 0
BEGIN
EXEC @ErrCode = dbo.SyncQueueItem 'new_employees', @Key;
FETCH NEXT FROM triggerCursor INTO @Key;
END
CLOSE triggerCursor;
DEALLOCATE triggerCursor;
Better way to use trigger based "value-change-tracker":
INSERT INTO [YourTableHistoryName] (id_num, fname, minit, lname, WhenHappened)
SELECT COALESCE(i.id_num,d.id_num) AS [id_num]
,i.fname,i.minit,i.lname,CURRENT_TIMESTAMP AS [WhenHappened]
FROM inserted i
FULL JOIN deleted d ON d.id_num = i.id_num
WHERE ( COALESCE(i.fname,'')<>COALESCE(d.fname,'')
OR COALESCE(i.minit,'')<>COALESCE(d.minit,'')
OR COALESCE(i.lname,'')<>COALESCE(d.lname,'')
)
;
The best (in my opinion) way to track changes is to use Temporal tables (SQL Server 2016+)
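For reference, a minimal system-versioned table sketch (SQL Server 2016+ syntax; the history table name is arbitrary):
CREATE TABLE dbo.new_employees
(
    id_num INT IDENTITY(1,1) PRIMARY KEY,
    fname VARCHAR(20),
    minit CHAR(1),
    lname VARCHAR(30),
    ValidFrom datetime2 GENERATED ALWAYS AS ROW START NOT NULL,
    ValidTo datetime2 GENERATED ALWAYS AS ROW END NOT NULL,
    PERIOD FOR SYSTEM_TIME (ValidFrom, ValidTo)
)
WITH (SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.new_employees_History));
-- Point-in-time query, no triggers required:
SELECT * FROM dbo.new_employees FOR SYSTEM_TIME AS OF '2020-01-01T00:00:00';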
inserted/deleted in triggers will generate as many rows as touched and calling a stored proc per key would require a cursor or similar approach per row.
You should check out timestamp/rowversion in SQL Server. You could add that column to all the tables in question (not null, auto-incrementing, unique within the database for each table/row, etc.).
You could add a unique index on that column to all tables you added the column.
@@DBTS is the current timestamp; you can store today's @@DBTS and tomorrow scan all tables from that value to the current @@DBTS. timestamp/rowversion is incremented for all updates and inserts, but it won't track deletes; for deletes you can have a delete-only trigger and insert the keys into a different table.
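A hedged sketch of that rowversion scan, with the table and column names from the question used only as an example:
ALTER TABLE dbo.new_employees ADD RowVer rowversion NOT NULL;
CREATE UNIQUE INDEX IX_new_employees_RowVer ON dbo.new_employees (RowVer);
-- Persist the high-water mark somewhere after each sync; shown as a variable here:
DECLARE @LastSync binary(8) = 0x0000000000000000;
-- Next run: everything inserted or updated since the last scan (deletes still need the delete trigger):
SELECT id_num, fname, minit, lname
FROM dbo.new_employees
WHERE RowVer > @LastSync AND RowVer <= @@DBTS;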
Change data capture or change tracking could do this more easily, but if there are heavy volumes on the server or a large number of data loads or partition switches, scanning the transaction log becomes a bottleneck, and in some cases you will have to remove change data capture to keep the transaction log from growing indefinitely.

Reset SCOPE_IDENTITY()

I have a stored procedure that first inserts some data into a temp table and then inserts a row into another table. I am calling Scope_Identity() after the second insert to pick up the newly inserted record Identity.
If the second insert does nothing due to a join, I want to check the Scope_Identity and raise an exception. But Scope_Identity is returning the last identity created from the temp table insert before the second insert.
Is there a way to reset SCOPE_IDENTITY before calling the second insert, or a better way to determine if the second insert didn't actually insert anything?
Check @@ROWCOUNT immediately after the 2nd insert. If it is 0 then no rows were inserted.
INSERT INTO YourTable
SELECT ...
IF (@@ROWCOUNT = 0)
BEGIN
RAISERROR('Nothing inserted',16,1)
RETURN
END
Martin Smith's answer totally answers your question.
This is apparently the only page on the internet asking how to reset the Scope_Identity().
I believe this is vital for anyone working with T-SQL.
I am leaving this answer for anyone who came here (like me) looking for the identity that was inserted by the previous insert statement (and not the last randomly successful identity insert).
This is what I came up with:
SET @SomeID = (CASE WHEN @@ROWCOUNT > 0 THEN SCOPE_IDENTITY() ELSE NULL END)
I think other answers given may be more practical, but I did want to record my finding here in case it helps someone some day. (This is in SQL Server 2005; not sure whether this behavior persists in newer versions.)
The basis of the trick is an exploitation of the following property (from Books Online's documentation of @@IDENTITY): "After an INSERT, SELECT INTO, or bulk copy statement is completed . . . If the statement did not affect any tables with identity columns, @@IDENTITY returns NULL." Although I can't find it explicitly stated, it appears that this behavior applies to SCOPE_IDENTITY() as well. So, we complete an INSERT statement that does not affect any tables with identity columns:
CREATE TABLE NoIdentity (notId BIT NOT NULL)
-- An insert that actually inserts sets SCOPE_IDENTITY():
INSERT INTO YourTable (name)
SELECT 'a'
WHERE 1 = 1 -- simulate a join that yields rows
SELECT @@identity, SCOPE_IDENTITY()
-- 14, 14 (or similar)
-- The problem: an insert that doesn't insert any rows leaves SCOPE_IDENTITY() alone.
INSERT INTO YourTable (name)
SELECT 'a'
WHERE 1 = 0 -- simulate a join that yields no rows
SELECT @@identity, SCOPE_IDENTITY()
-- Still 14, 14 . . . how do we know we didn't insert any rows?
-- Now for the trick:
INSERT INTO NoIdentity (notId)
SELECT 0
WHERE 1 = 0 -- we don't actually need to insert any rows for this to work
SELECT @@identity, SCOPE_IDENTITY()
-- NULL, NULL. Magic!
INSERT INTO YourTable (name)
SELECT 'a'
WHERE 1 = 0 -- simulate a join that yields no rows
SELECT @@identity, SCOPE_IDENTITY()
-- Still NULL, NULL since we didn't insert anything. But if we had, it would be non-NULL.
-- We can tell the difference!
So, for your case, it would seem that you could do
INSERT INTO NoIdentity (notId)
SELECT 0
WHERE 1 = 0
to reset SCOPE_IDENTITY() before performing your second INSERT.
Having considered several alternatives, I find myself liking a riff on @BenThul's answer to a related question:
DECLARE @result TABLE (id INT NOT NULL)
INSERT INTO YourTable (name)
OUTPUT INSERTED.id INTO @result (id)
SELECT 'a'
WHERE 1 = 0 -- simulate a join result
SELECT CASE
WHEN (SELECT COUNT(1) FROM @result) = 1 THEN (SELECT TOP 1 id FROM @result)
ELSE -1
END
As you can see from my final SELECT CASE..., in my situation I was trying to end up with a single INT NOT NULL that would help me understand whether a row was inserted (in which case I wanted its ID) or not. (I would not recommend being in this situation in the first place, if possible!) What you would do with @result depends on what you need to do.
I like that the relationship between the INSERT and @result is explicit and unlikely to be contaminated by other intervening operations I might not be thinking about. I also like that @result naturally handles cases with more than one row inserted.
MikeTeeVee's answer, combined with Martin Smith's answer, is very powerful.
Here is my merged use:
BEGIN TRY
INSERT INTO YourTable
SELECT ...
SELECT @SomeID = (CASE WHEN @@ROWCOUNT > 0 THEN SCOPE_IDENTITY() ELSE NULL END)
IF (@SomeID IS NULL)
BEGIN
RAISERROR('Nothing inserted',16,1)
END
END TRY
BEGIN CATCH
/* Handle stuff here - In my case I had several inserts
- some could not happen and I did not raise errors for them
- Some had to make a Transaction to rollback
*/
END CATCH

SQL - Inserting and Updating Multiple Records at Once

I have a stored procedure that is responsible for inserting or updating multiple records at once. I want to perform this in my stored procedure for the sake of performance.
This stored procedure takes in a comma-delimited list of permit IDs and a status. The permit IDs are stored in a variable called @PermitIDs. The status is stored in a variable called @Status. I have a user-defined function that converts this comma-delimited list of permit IDs into a table. I need to go through each of these IDs and do either an insert or update into a table called PermitStatus.
If a record with the permit ID does not exist, I want to add a record. If it does exist, I want to update the record with the given @Status value. I know how to do this for a single ID, but I do not know how to do it for multiple IDs. For single IDs, I do the following:
-- Determine whether to add or edit the PermitStatus
DECLARE @count int
SET @count = (SELECT Count(ID) FROM PermitStatus WHERE [PermitID]=@PermitID)
-- If no records were found, insert the record, otherwise update
IF @count = 0
BEGIN
INSERT INTO
PermitStatus
(
[PermitID],
[UpdatedOn],
[Status]
)
VALUES
(
@PermitID,
GETUTCDATE(),
1
)
END
ELSE
UPDATE
PermitStatus
SET
[UpdatedOn]=GETUTCDATE(),
[Status]=@Status
WHERE
[PermitID]=@PermitID
How do I loop through the records in the Table returned by my user-defined function to dynamically insert or update the records as needed?
create a split function, and use it like:
SELECT
*
FROM YourTable y
INNER JOIN dbo.splitFunction(@Parameter) s ON y.ID=s.Value
I prefer the number table approach
For this method to work, you need to do this one time table setup:
SELECT TOP 10000 IDENTITY(int,1,1) AS Number
INTO Numbers
FROM sys.objects s1
CROSS JOIN sys.objects s2
ALTER TABLE Numbers ADD CONSTRAINT PK_Numbers PRIMARY KEY CLUSTERED (Number)
Once the Numbers table is set up, create this function:
CREATE FUNCTION [dbo].[FN_ListToTableAll]
(
@SplitOn char(1) --REQUIRED, the character to split the @List string on
,@List varchar(8000)--REQUIRED, the list to split apart
)
RETURNS TABLE
AS
RETURN
(
----------------
--SINGLE QUERY-- --this WILL return empty rows
----------------
SELECT
ROW_NUMBER() OVER(ORDER BY number) AS RowNumber
,LTRIM(RTRIM(SUBSTRING(ListValue, number+1, CHARINDEX(@SplitOn, ListValue, number+1)-number - 1))) AS ListValue
FROM (
SELECT @SplitOn + @List + @SplitOn AS ListValue
) AS InnerQuery
INNER JOIN Numbers n ON n.Number < LEN(InnerQuery.ListValue)
WHERE SUBSTRING(ListValue, number, 1) = @SplitOn
);
GO
GO
You can now easily split a CSV string into a table and join on it:
select * from dbo.FN_ListToTableAll(',','1,2,3,,,4,5,6777,,,')
OUTPUT:
RowNumber ListValue
----------- ----------
1 1
2 2
3 3
4
5
6 4
7 5
8 6777
9
10
11
(11 row(s) affected)
To make what you need work, do the following:
--this would be the existing table
DECLARE @OldData table (RowID int, RowStatus char(1))
INSERT INTO @OldData VALUES (10,'z')
INSERT INTO @OldData VALUES (20,'z')
INSERT INTO @OldData VALUES (30,'z')
INSERT INTO @OldData VALUES (70,'z')
INSERT INTO @OldData VALUES (80,'z')
INSERT INTO @OldData VALUES (90,'z')
--these would be the stored procedure input parameters
DECLARE @IDList varchar(500)
,@StatusList varchar(500)
SELECT @IDList='10,20,30,40,50,60'
,@StatusList='A,B,C,D,E,F'
--stored procedure local variable
DECLARE @InputList table (RowID int, RowStatus char(1))
--convert input parameters into a table
INSERT INTO @InputList
(RowID,RowStatus)
SELECT
i.ListValue,s.ListValue
FROM dbo.FN_ListToTableAll(',',@IDList) i
INNER JOIN dbo.FN_ListToTableAll(',',@StatusList) s ON i.RowNumber=s.RowNumber
--update all old existing rows
UPDATE o
SET RowStatus=i.RowStatus
FROM @OldData o WITH (UPDLOCK, HOLDLOCK) --to avoid race condition when there is high concurrency as per @emtucifor
INNER JOIN @InputList i ON o.RowID=i.RowID
--insert only the new rows
INSERT INTO @OldData
(RowID, RowStatus)
SELECT
i.RowID, i.RowStatus
FROM @InputList i
LEFT OUTER JOIN @OldData o ON i.RowID=o.RowID
WHERE o.RowID IS NULL
--display the old table
SELECT * FROM @OldData order BY RowID
OUTPUT:
RowID RowStatus
----------- ---------
10 A
20 B
30 C
40 D
50 E
60 F
70 z
80 z
90 z
(9 row(s) affected)
EDIT: thanks to @Emtucifor (click here for the tip about the race condition), I have included the locking hints in my answer to prevent race condition problems when there is high concurrency.
There are various methods to accomplish the parts you are asking about.
Passing Values
There are dozens of ways to do this. Here are a few ideas to get you started:
Pass in a string of identifiers and parse it into a table, then join.
SQL 2008: Join to a table-valued parameter
Expect data to exist in a predefined temp table and join to it
Use a session-keyed permanent table
Put the code in a trigger and join to the INSERTED and DELETED tables in it.
Erland Sommarskog provides a wonderful, comprehensive discussion of lists in SQL Server. In my opinion, the table-valued parameter in SQL 2008 is the most elegant solution for this.
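As a rough sketch of the table-valued parameter route (the type and procedure names are illustrative; from ADO.NET you would pass a DataTable as SqlDbType.Structured):
CREATE TYPE dbo.PermitIdList AS TABLE (PermitID int NOT NULL PRIMARY KEY);
GO
CREATE PROCEDURE dbo.PermitStatusUpsertTvp
    @PermitIDs dbo.PermitIdList READONLY,
    @Status int
AS
SELECT P.PermitID FROM @PermitIDs P; -- join straight to the parameter, no CSV parsing needed
GO
DECLARE @ids dbo.PermitIdList;
INSERT @ids VALUES (123), (124), (125);
EXEC dbo.PermitStatusUpsertTvp @PermitIDs = @ids, @Status = 1;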
Upsert/Merge
Perform a separate UPDATE and INSERT (two queries, one for each set, not row-by-row).
SQL 2008: MERGE.
An Important Gotcha
However, one thing that no one else has mentioned is that almost all upsert code, including SQL 2008 MERGE, suffers from race condition problems when there is high concurrency. Unless you use HOLDLOCK and other locking hints depending on what's being done, you will eventually run into conflicts. So you either need to lock, or respond to errors appropriately (some systems with huge transactions per second have used the error-response method successfully, instead of using locks).
One thing to realize is that different combinations of lock hints implicitly change the transaction isolation level, which affects what type of locks are acquired. This changes everything: which other locks are granted (such as a simple read), the timing of when a lock is escalated to update from update intent, and so on.
I strongly encourage you to read more detail on these race condition problems. You need to get this right.
Conditional Insert/Update Race Condition
“UPSERT” Race Condition With MERGE
Example Code
CREATE PROCEDURE dbo.PermitStatusUpdate
@PermitIDs varchar(8000), -- or (max)
@Status int
AS
SET NOCOUNT, XACT_ABORT ON -- see note below
BEGIN TRAN
DECLARE @Permits TABLE (
PermitID int NOT NULL PRIMARY KEY CLUSTERED
)
INSERT @Permits
SELECT Value FROM dbo.Split(@PermitIDs) -- split function of your choice
UPDATE S
SET
UpdatedOn = GETUTCDATE(),
Status = @Status
FROM
PermitStatus S WITH (UPDLOCK, HOLDLOCK)
INNER JOIN @Permits P ON S.PermitID = P.PermitID
INSERT PermitStatus (
PermitID,
UpdatedOn,
Status
)
SELECT
P.PermitID,
GetUTCDate(),
@Status
FROM @Permits P
WHERE NOT EXISTS (
SELECT 1
FROM PermitStatus S
WHERE P.PermitID = S.PermitID
)
COMMIT TRAN
RETURN @@ERROR;
Note: XACT_ABORT helps guarantee the explicit transaction is closed following a timeout or unexpected error.
To confirm that this handles the locking problem, open several query windows and execute an identical batch like so:
WAITFOR TIME '11:00:00' -- use a time in the near future
EXEC dbo.PermitStatusUpdate @PermitIDs = '123,124,125,126', @Status = 1
All of these different sessions will execute the stored procedure in nearly the same instant. Check each session for errors. If none exist, try the same test a few times more (since it's possible to not always have the race condition occur, especially with MERGE).
The writeups at the links I gave above give even more detail than I did here, and also describe what to do for the SQL 2008 MERGE statement as well. Please read those thoroughly to truly understand the issue.
Briefly, with MERGE, no explicit transaction is needed, but you do need to use SET XACT_ABORT ON and use a locking hint:
SET NOCOUNT, XACT_ABORT ON;
MERGE dbo.Table WITH (HOLDLOCK) AS TableAlias
...
This will prevent concurrency race conditions causing errors.
I also recommend that you do error handling after each data modification statement.
If you're using SQL Server 2008, you can use table valued parameters - you pass in a table of records into a stored procedure and then you can do a MERGE.
Passing in a table valued parameter would remove the need to parse CSV strings.
Edit:
ErikE has raised the point about race conditions, please refer to his answer and linked articles.
If you have SQL Server 2008, you can use MERGE. Here's an article describing this.
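A hedged MERGE sketch for this upsert, reusing the @Permits table variable and the @Status parameter from the example procedure above; the WITH (HOLDLOCK) hint follows the race-condition advice given earlier:
MERGE dbo.PermitStatus WITH (HOLDLOCK) AS target
USING @Permits AS src
    ON target.PermitID = src.PermitID
WHEN MATCHED THEN
    UPDATE SET UpdatedOn = GETUTCDATE(), Status = @Status
WHEN NOT MATCHED THEN
    INSERT (PermitID, UpdatedOn, Status)
    VALUES (src.PermitID, GETUTCDATE(), @Status);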
You should be able to do your insert and your update as two set based queries.
The code below was based on a data load procedure that I wrote a while ago that took data from a staging table and inserted or updated it into the main table.
I've tried to make it match your example, but you may need to tweak this (and create a table valued UDF to parse your CSV into a table of ids).
-- Update where the join on PermitStatus matches
Update
status
Set
[UpdatedOn]=GETUTCDATE(),
[Status]=staging.Status
From
PermitStatus status
Join
StagingTable staging
On
staging.PermitId = status.PermitId
-- Insert the new records, based on the Where Not Exists
Insert
PermitStatus(UpdatedOn, Status, PermitId)
Select
GETUTCDATE(), staging.Status, staging.PermitId
From
StagingTable staging
Where Not Exists
(
Select 1 from PermitStatus status
Where status.PermitId = staging.PermitId
)
Essentially you have an upsert stored procedure (eg. UpsertSinglePermit)
(like the code you have given above) for dealing with one row.
So the steps I see are to create a new stored procedure (UpsertNPermits) which does
a) Parse input string into n record entries (each record contains permit id and status)
b) Foreach entry in above, invoke UpsertSinglePermit

SQL Server Express 2005 - updating 2 tables and atomicity?

First off, I want to start by saying I am not an SQL programmer (I'm a C++/Delphi guy), so some of my questions might be really obvious. So pardon my ignorance :o)
I've been charged with writing a script that will update certain tables in a database based on the contents of a CSV file. I have it working it would seem, but I am worried about atomicity for one of the steps:
One of the tables contains only one field - an int which must be incremented each time, but from what I can see is not defined as an identity for some reason. I must create a new row in this table, and insert that row's value into another newly-created row in another table.
This is how I did it (as part of a larger script):
DECLARE @uniqueID INT,
@counter INT,
@maxCount INT
SELECT @maxCount = COUNT(*) FROM tempTable
SET @counter = 1
WHILE (@counter <= @maxCount)
BEGIN
SELECT @uniqueID = MAX(id) FROM uniqueIDTable <----Line 1
INSERT INTO uniqueIDTable VALUES (@uniqueID + 1) <----Line 2
SELECT @uniqueID = @uniqueID + 1
UPDATE TOP(1) tempTable
SET userID = @uniqueID
WHERE userID IS NULL
SET @counter = @counter + 1
END
GO
First of all, am I correct using a "WHILE" construct? I couldn't find a way to achieve this with a simple UPDATE statement.
Second of all, how can I be sure that no other operation will be carried out on the database between Lines 1 and 2 that would insert a value into the uniqueIDTable before I do? Is there a way to "synchronize" operations in SQL Server Express?
Also, keep in mind that I have no control over the database design.
Thanks a lot!
You can do the whole 9 yards in one single statement:
WITH cteUsers AS (
SELECT t.*
, ROW_NUMBER() OVER (ORDER BY userID) as rn
, COALESCE(m.id,0) as max_id
FROM tempTable t WITH(UPDLOCK)
JOIN (
SELECT MAX(id) as id
FROM uniqueIDTable WITH (UPDLOCK)
) as m ON 1=1
WHERE userID IS NULL)
UPDATE cteUsers
SET userID = rn + max_id
OUTPUT INSERTED.userID
INTO uniqueIDTable (id);
You get the MAX(id), lock the uniqueIDTable, compute sequential userIDs for users with NULL userID by using ROW_NUMBER(), update the tempTable and insert the new ids into uniqueIDTable. All in one operation.
For performance you need an index on uniqueIDTable(id) and an index on tempTable(userID).
SQL is all about set oriented operations, WHILE loops are the code smell of SQL.
You need a transaction to ensure atomicity and you need to move the select and insert into one statement or do the select with an updlock to prevent two people from running the select at the same time, getting the same value and then trying to insert the same value into the table.
Basically
DECLARE @MaxValTable TABLE (MaxID int)
BEGIN TRANSACTION
BEGIN TRY
INSERT INTO uniqueIDTable (id)
OUTPUT inserted.id INTO @MaxValTable
SELECT MAX(id) + 1 FROM uniqueIDTable
UPDATE TOP(1) tempTable
SET userID = (SELECT MaxID FROM @MaxValTable)
WHERE userID IS NULL
COMMIT TRANSACTION
END TRY
BEGIN CATCH
ROLLBACK TRANSACTION
RAISERROR('Error occurred updating tempTable', 16, 1) -- more detail here is good
END CATCH
That said, using an identity would make things far simpler. This is a potential concurrency problem. Is there any way you can change the column to be identity?
Edit: Ensuring that only one connection at a time will be able to insert into the uniqueIDtable. Not going to scale well though.
Edit: Table variable's better than exclusive table lock. If need be, this can be used when inserting users as well.
