SQL Server Express 2005 - updating 2 tables and atomicity? - sql-server

First off, I want to start by saying I am not an SQL programmer (I'm a C++/Delphi guy), so some of my questions might be really obvious. So pardon my ignorance :o)
I've been charged with writing a script that will update certain tables in a database based on the contents of a CSV file. It seems to be working, but I am worried about atomicity in one of the steps:
One of the tables contains only one field - an int which must be incremented each time, but from what I can see is not defined as an identity for some reason. I must create a new row in this table, and insert that row's value into another newly-created row in another table.
This is how I did it (as part of a larger script):
DECLARE @uniqueID INT,
        @counter INT,
        @maxCount INT
SELECT @maxCount = COUNT(*) FROM tempTable
SET @counter = 1
WHILE (@counter <= @maxCount)
BEGIN
    SELECT @uniqueID = MAX(id) FROM uniqueIDTable      <----Line 1
    INSERT INTO uniqueIDTable VALUES (@uniqueID + 1)   <----Line 2
    SELECT @uniqueID = @uniqueID + 1
    UPDATE TOP(1) tempTable
    SET userID = @uniqueID
    WHERE userID IS NULL
    SET @counter = @counter + 1
END
GO
First of all, am I correct in using a "WHILE" construct? I couldn't find a way to achieve this with a simple UPDATE statement.
Second of all, how can I be sure that no other operation will be carried out on the database between Lines 1 and 2 that would insert a value into the uniqueIDTable before I do? Is there a way to "synchronize" operations in SQL Server Express?
Also, keep in mind that I have no control over the database design.
Thanks a lot!

You can do the whole 9 yards in one single statement:
WITH cteUsers AS (
    SELECT t.*
        , ROW_NUMBER() OVER (ORDER BY userID) as rn
        , COALESCE(m.id, 0) as max_id
    FROM tempTable t WITH (UPDLOCK)
    JOIN (
        SELECT MAX(id) as id
        FROM uniqueIDTable WITH (UPDLOCK)
    ) as m ON 1=1
    WHERE userID IS NULL)
UPDATE cteUsers
SET userID = rn + max_id
OUTPUT INSERTED.userID
INTO uniqueIDTable (id);
You get the MAX(id), lock the uniqueIDTable, compute sequential userIDs for users with a NULL userID using ROW_NUMBER(), update tempTable, and insert the new ids into uniqueIDTable. All in one operation.
For performance you need an index on uniqueIDTable(id) and an index on tempTable(userID).
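For reference, a minimal sketch of those two indexes (the index names are hypothetical):
-- Suggested supporting indexes; names are illustrative only.
CREATE INDEX IX_uniqueIDTable_id ON uniqueIDTable (id);
CREATE INDEX IX_tempTable_userID ON tempTable (userID);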
SQL is all about set-oriented operations; WHILE loops are a code smell in SQL.

You need a transaction to ensure atomicity, and you need to either move the select and insert into one statement or do the select with an UPDLOCK, to prevent two sessions from running the select at the same time, getting the same value, and then trying to insert the same value into the table.
Basically
DECLARE @MaxValTable TABLE (MaxID int)
BEGIN TRANSACTION
BEGIN TRY
    INSERT INTO uniqueIDTable (id)
    OUTPUT inserted.id INTO @MaxValTable (MaxID)
    SELECT MAX(id) + 1 FROM uniqueIDTable
    UPDATE TOP(1) tempTable
    SET userID = (SELECT MaxID FROM @MaxValTable)
    WHERE userID IS NULL
    COMMIT TRANSACTION
END TRY
BEGIN CATCH
    ROLLBACK TRANSACTION
    RAISERROR('Error occurred updating tempTable', 16, 1) -- more detail here is good
END CATCH
That said, using an identity would make things far simpler. This is a potential concurrency problem. Is there any way you can change the column to be identity?
Edit: this ensures that only one connection at a time will be able to insert into the uniqueIDTable. It's not going to scale well, though.
Edit: a table variable is better than an exclusive table lock. If need be, this can be used when inserting users as well.
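For illustration, a minimal sketch of the identity suggestion above (the table and variable names are hypothetical, and it assumes you were free to rebuild the table, which the asker says they are not):
-- Hypothetical rebuild: with an IDENTITY column each insert
-- generates its own value atomically, no explicit locking needed.
CREATE TABLE uniqueIDTable_new (id INT IDENTITY(1,1) PRIMARY KEY);

DECLARE @newID INT;
INSERT INTO uniqueIDTable_new DEFAULT VALUES;
SET @newID = SCOPE_IDENTITY(); -- the freshly generated id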

Related

SQL Server custom counter stored procedure creating dupes

I created a stored procedure to implement rate limiting on my API, this is called about 5-10k times a second and each day I'm noticing dupes in the counter table.
It looks up the API key being passed in, then checks the counter table for the ID and date combination using an "UPSERT": if it finds a row it does an UPDATE of [count]+1, and if not it INSERTs a new row.
There is no primary key in the counter table.
Here is the stored procedure:
USE [omdb]
GO
/****** Object: StoredProcedure [dbo].[CheckKey] Script Date: 6/17/2017 10:39:37 PM ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
ALTER PROCEDURE [dbo].[CheckKey] (
    @apikey AS VARCHAR(10)
)
AS
BEGIN
    SET NOCOUNT ON;
    DECLARE @userID as int
    DECLARE @limit as int
    DECLARE @curCount as int
    DECLARE @curDate as Date = GETDATE()
    SELECT @userID = id, @limit = limit FROM [users] WHERE apiKey = @apikey
    IF @userID IS NULL
    BEGIN
        --Key not found
        SELECT 'False' as [Response], 'Invalid API key!' as [Reason]
    END
    ELSE
    BEGIN
        --Key found
        BEGIN TRANSACTION Upsert
        MERGE [counter] AS t
        USING (SELECT @userID AS ID) AS s
        ON t.[ID] = s.[ID] AND t.[date] = @curDate
        WHEN MATCHED THEN UPDATE SET t.[count] = t.[count] + 1
        WHEN NOT MATCHED THEN INSERT ([ID], [date], [count]) VALUES (@userID, @curDate, 1);
        COMMIT TRANSACTION Upsert
        SELECT @curCount = [count] FROM [counter] WHERE ID = @userID AND [date] = @curDate
        IF @limit IS NOT NULL AND @curCount > @limit
        BEGIN
            SELECT 'False' as [Response], 'Request limit reached!' as [Reason]
        END
        ELSE
        BEGIN
            SELECT 'True' as [Response], NULL as [Reason]
        END
    END
END
I also think some blocking is happening since I introduced this SP.
The dupes aren't breaking anything, but I'm curious if it's something fundamentally wrong with my code or if I should setup a constraint in the table to prevent this. Thanks
Update 6/23/17: I dropped the MERGE statement and tried using @@ROWCOUNT, but it also caused dupes:
BEGIN TRANSACTION Upsert
UPDATE [counter] SET [count] = [count] + 1 WHERE [ID] = @userID AND [date] = @curDate
IF @@ROWCOUNT = 0 AND @@ERROR = 0
    INSERT INTO [counter] ([ID], [date], [count]) VALUES (@userID, @curDate, 1)
COMMIT TRANSACTION Upsert
A HOLDLOCK hint on the update statement will avoid the race condition. To prevent deadlocks, I suggest a clustered composite primary key (or unique index) on ID and date.
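For reference, a sketch of that constraint (the constraint name is hypothetical; the question says the counter table currently has no primary key):
-- Clustered composite primary key on (ID, date) to prevent duplicate
-- counter rows and reduce deadlock-prone range scans.
ALTER TABLE [counter]
ADD CONSTRAINT PK_counter PRIMARY KEY CLUSTERED ([ID], [date]);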
The example below incorporates these changes and uses the SET <variable> = <column> = <expression> form of the SET clause to avoid the need for the subsequent SELECT of the final counter value and thereby improve performance.
ALTER PROCEDURE [dbo].[CheckKey]
    @apikey AS VARCHAR(10)
AS
SET NOCOUNT ON;
--SET XACT_ABORT ON is a best practice for procs with explicit transactions
SET XACT_ABORT ON;
DECLARE
    @userID as int
    , @limit as int
    , @curCount as int
    , @curDate as Date = GETDATE();
BEGIN TRY;
    SELECT
        @userID = id
        , @limit = limit
    FROM [users]
    WHERE apiKey = @apikey;
    IF @userID IS NULL
    BEGIN
        --Key not found
        SELECT 'False' as [Response], 'Invalid API key!' as [Reason];
    END
    ELSE
    BEGIN
        --Key found
        BEGIN TRANSACTION Upsert;
        UPDATE [counter] WITH(HOLDLOCK)
        SET @curCount = [count] = [count] + 1
        WHERE
            [ID] = @userID
            AND [date] = @curDate;
        IF @@ROWCOUNT = 0
        BEGIN
            INSERT INTO [counter] ([ID], [date], [count])
            VALUES (@userID, @curDate, 1);
        END;
        IF @limit IS NOT NULL AND @curCount > @limit
        BEGIN
            SELECT 'False' as [Response], 'Request limit reached!' as [Reason]
        END
        ELSE
        BEGIN
            SELECT 'True' as [Response], NULL as [Reason]
        END;
        COMMIT TRANSACTION Upsert;
    END;
END TRY
BEGIN CATCH
    IF @@TRANCOUNT > 0 ROLLBACK;
    THROW;
END CATCH;
GO
Probably not the answer you're looking for, but for a rate-limiting counter I would use a cache like Redis in middleware before hitting the API. Performance-wise it's pretty great, since Redis would have no problem with the load and your DB won't be impacted.
And if you want to keep a history of hits per api key per day in SQL, run a daily task to import yesterday's counts from Redis to SQL.
The data-set would be small enough to get a Redis instance that would cost literally nothing (or close).
It will be the MERGE statement getting into a race condition with itself, i.e. your API is getting called by the same client twice, and both times the MERGE statement finds no row, so each inserts one. MERGE isn't an atomic operation, even though it's reasonable to assume it is. For example, see this bug report for SQL 2008 about MERGE causing deadlocks; the SQL Server team said this is by design.
From your post I think the immediate issue is that your clients will potentially be getting a small number of free hits on your API. For example, if two requests come in and see no row, you'll start with two rows with a count of 1 when you'd actually want one row with a count of 2, and the client could end up getting 1 free API hit that day. If three requests crossed over you'd get three rows with a count of 1, and they could get 2 free API hits, etc.
Edit
So as your link suggests, you've got two categories of options you could explore: first, just trying to get this working in SQL Server; second, other architectural solutions.
For the SQL option, I would do away with the MERGE and consider pre-populating your clients ahead of time, either nightly or less often for several days at a time; this leaves you with a single UPDATE instead of the MERGE/UPDATE-and-INSERT. Then you can confirm both your UPDATE and your SELECT are fully optimised, i.e. they have the necessary indexes and aren't causing scans. Next you could look at tweaking locking so you're only locking at the row level; see this for some more info. For the SELECT you could also look at using NOLOCK, which means you could get slightly incorrect data, but this shouldn't matter in your case since you'll always be using a WHERE that targets a single row.
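A hedged sketch of the pre-population idea (the job scheduling is omitted; table names come from the question, and the date arithmetic is an assumption):
-- Hypothetical nightly job: create tomorrow's counter rows up front so
-- the hot path is always a plain UPDATE and the MERGE/INSERT goes away.
INSERT INTO [counter] ([ID], [date], [count])
SELECT u.id, DATEADD(DAY, 1, CAST(GETDATE() AS date)), 0
FROM [users] u
WHERE NOT EXISTS (SELECT 1 FROM [counter] c
                  WHERE c.[ID] = u.id
                    AND c.[date] = DATEADD(DAY, 1, CAST(GETDATE() AS date)));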
For the non-SQL options, as your link says, you could look at queuing things up; obviously these would be the updates/inserts, so your selects would be seeing old data. This may or may not be acceptable depending on how far apart they are, although you could treat this as an "eventually consistent" solution if you wanted to be strict and charge extra or take off API hits the next day or something. You could also look at caching options to store the counts; this would get more complex if your app is distributed, but there are caching solutions for that. If you went with caching you could choose to not persist anything, but then you'd potentially give away a load of free hits if your site went down, though you'd probably have bigger issues to worry about then anyway!
At a high level, have you considered pursuing the following scenario?
Restructuring: Set the primary key on your table to be a composite of (ID, date). Possibly even better, just use the API key itself instead of the arbitrary ID you're assigning it.
Query A: Do SQL Server's equivalent of "INSERT IGNORE" (it seems there are semantic equivalents for SQL Server, based on a Google search) with the values (ID, TODAY(), 1). You'll also want to specify a WHERE clause that checks the ID actually exists in your API/limits table.
Query B: Update the row with (ID, TODAY()) as its primary key, setting count := count + 1, and in the very same query do an inner join with your limits table, so that in the WHERE clause you can specify that you'll only update the count if count < limit.
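Hedged sketches of the two queries, using the table and column names from the question ([counter](ID, date, count) and [users](id, limit)); SQL Server has no literal INSERT IGNORE, so Query A is written as a conditional insert:
-- Query A (sketch): insert today's row only if it doesn't exist yet
-- and the key maps to a real user.
INSERT INTO [counter] ([ID], [date], [count])
SELECT u.id, CAST(GETDATE() AS date), 1
FROM [users] u
WHERE u.id = @userID
  AND NOT EXISTS (SELECT 1 FROM [counter] c
                  WHERE c.[ID] = @userID
                    AND c.[date] = CAST(GETDATE() AS date));

-- Query B (sketch): increment only while under the limit, joining
-- to the limits table in the same statement.
UPDATE c
SET c.[count] = c.[count] + 1
FROM [counter] c
JOIN [users] u ON u.id = c.[ID]
WHERE c.[ID] = @userID
  AND c.[date] = CAST(GETDATE() AS date)
  AND (u.limit IS NULL OR c.[count] < u.limit);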
If the majority of your requests are valid API requests or rate-limited requests, I would perform queries in the following order on every request:
Run Query B.
If 0 rows updated:
    Run Query A.
    If 0 rows updated:
        Run Query B.
        If 0 rows updated, reject because of rate limit.
        If 1 row updated, continue.
    If 1 row updated:
        continue.
If 1 row updated:
    continue.
If the majority of your requests are invalid API requests, I'd do the following:
Run Query A.
If 0 rows updated:
    Run Query B.
    If 0 rows updated, reject because of rate limit.
    If 1 row updated, continue.
If 1 row updated:
    continue.

SQL Server race condition issue with range lock

I'm implementing a queue in SQL Server (please no discussions about this) and am running into a race condition issue. The T-SQL of interest is the following:
set transaction isolation level serializable
begin tran
declare @RecordId int
declare @CurrentTS datetime2
set @CurrentTS = CURRENT_TIMESTAMP
select top 1 @RecordId = Id from QueuedImportJobs with (updlock)
where Status = @Status and (LeaseTimeout is null or @CurrentTS > LeaseTimeout)
order by Id asc
if @@ROWCOUNT > 0
begin
    update QueuedImportJobs set LeaseTimeout = DATEADD(mi, 5, @CurrentTS), LeaseTicket = newid() where Id = @RecordId
    select * from QueuedImportJobs where Id = @RecordId
end
commit tran
RecordId is the PK and there is also an index on Status,LeaseTimeout.
What I'm basically doing is selecting a record whose lease happens to be expired, while simultaneously pushing the lease time out 5 minutes and setting a new lease ticket.
The problem is that I'm getting deadlocks when I run this code in parallel using a couple of threads. I've debugged it up to the point where I found out that the update statement sometimes gets executed twice for the same record. Now, I was under the impression that the with (updlock) should prevent this (it also happens with xlock btw, not with tablockx). So it actually looks like there is a RangeS-U and a RangeX-X lock on the same range of records, which ought to be impossible.
So what am I missing? I'm thinking it might have something to do with the top 1 clause, or that SQL Server does not know that where Id=@RecordId is actually in the locked range?
(The deadlock graph and simplified table schema were posted as images.)
It looks like the locks are on different HoBTs. Are there multiple indexes on the table?
If so, the select with (updlock) might only take an update lock on one index.
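If that turns out to be the case, one hedged workaround is to force the locking read through a single index; INDEX(1) below conventionally names the clustered index, which is an assumption about this table's indexing:
-- Sketch: take the UPDLOCK via the clustered index so the lock lands
-- on the same HoBT the update will touch. INDEX(1) is an assumption.
select top 1 @RecordId = Id
from QueuedImportJobs with (updlock, index(1))
where Status = @Status and (LeaseTimeout is null or @CurrentTS > LeaseTimeout)
order by Id asc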
Why not just:
DECLARE @t TABLE(Id INT);

UPDATE TOP (1) dbo.QueuedImportJobs
SET LeaseTimeout = DATEADD(MINUTE, 5, CURRENT_TIMESTAMP)
OUTPUT inserted.Id INTO @t
WHERE Status = @Status
AND COALESCE(LeaseTimeout, '19000101') < CURRENT_TIMESTAMP;

SELECT <cols> FROM dbo.QueuedImportJobs
WHERE Id IN (SELECT Id FROM @t);
As an aside, you might want an ORDER BY to ensure the selected row is the first one in the queue according to the desired index order. If the index on Id is clustered, this is probably how it will work anyway, but there is no guarantee unless you say so. This will require a slight re-structuring of the query, since you can't apply ORDER BY (or an index hint) directly on an UPDATE, e.g.:
;WITH x AS
(
    SELECT TOP (1) Id, LeaseTimeout
    FROM dbo.QueuedImportJobs
    WHERE Status = @Status
    AND COALESCE(LeaseTimeout, '19000101') < CURRENT_TIMESTAMP
    ORDER BY Id
)
UPDATE x
SET LeaseTimeout = DATEADD(MINUTE, 5, CURRENT_TIMESTAMP)
OUTPUT inserted.Id INTO @t;

T-SQL Is a sub query for an Update restriction Atomic with the update?

I've got a simple queue implementation in MS SQL Server 2008 R2. Here's the essence of the queue:
CREATE TABLE ToBeProcessed
(
    Id BIGINT IDENTITY(1,1) PRIMARY KEY NOT NULL,
    [Priority] INT DEFAULT(100) NOT NULL,
    IsBeingProcessed BIT DEFAULT(0) NOT NULL,
    SomeData NVARCHAR(MAX) NOT NULL
)
I want to atomically select the top n rows ordered by the priority and the id where IsBeingProcessed is false, and update those rows to say they are being processed. I thought I'd use a combination of UPDATE, TOP, OUTPUT and ORDER BY, but unfortunately you can't use TOP and ORDER BY together in an UPDATE statement.
So I've made an IN clause to restrict the update, and the sub query does the ORDER BY (see below). My question is: is this whole statement atomic, or do I need to wrap it in a transaction?
DECLARE @numberToProcess INT = 2
CREATE TABLE #IdsToProcess
(
Id BIGINT NOT null
)
UPDATE
ToBeProcessed
SET
ToBeProcessed.IsBeingProcessed = 1
OUTPUT
INSERTED.Id
INTO
#IdsToProcess
WHERE
ToBeProcessed.Id IN
(
SELECT TOP(@numberToProcess)
ToBeProcessed.Id
FROM
ToBeProcessed
WHERE
ToBeProcessed.IsBeingProcessed = 0
ORDER BY
ToBeProcessed.Id,
ToBeProcessed.Priority DESC)
SELECT
*
FROM
#IdsToProcess
DROP TABLE #IdsToProcess
Here's some sql to insert some dummy rows:
INSERT INTO ToBeProcessed (SomeData) VALUES (N'');
INSERT INTO ToBeProcessed (SomeData) VALUES (N'');
INSERT INTO ToBeProcessed (SomeData) VALUES (N'');
INSERT INTO ToBeProcessed (SomeData) VALUES (N'');
INSERT INTO ToBeProcessed (SomeData) VALUES (N'');
If I understand the motivation for the question, you want to avoid the possibility that two concurrent transactions could both execute the sub query to get the top N rows to process, then proceed to update the same rows?
In that case I'd use this approach.
;WITH cte As
(
SELECT TOP(@numberToProcess)
*
FROM
ToBeProcessed WITH(UPDLOCK,ROWLOCK,READPAST)
WHERE
ToBeProcessed.IsBeingProcessed = 0
ORDER BY
ToBeProcessed.Id,
ToBeProcessed.Priority DESC
)
UPDATE
cte
SET
IsBeingProcessed = 1
OUTPUT
INSERTED.Id
INTO
#IdsToProcess
I was a bit uncertain earlier whether SQL Server would take U locks when processing your version with the sub query thus blocking two concurrent transactions from reading the same TOP N rows. This does not appear to be the case.
Test Table
CREATE TABLE JobsToProcess
(
    priority INT IDENTITY(1,1),
    isprocessed BIT,
    number INT
)
INSERT INTO JobsToProcess
SELECT TOP (1000000) 0,0
FROM master..spt_values v1, master..spt_values v2
Test Script (Run in 2 concurrent SSMS sessions)
BEGIN TRY
    DECLARE @FinishedMessage VARBINARY(128) = CAST('TestFinished' AS VARBINARY(128))
    DECLARE @SynchMessage VARBINARY(128) = CAST('TestSynchronising' AS VARBINARY(128))
    SET CONTEXT_INFO @SynchMessage

    DECLARE @OtherSpid int
    WHILE (@OtherSpid IS NULL)
        SELECT @OtherSpid = spid
        FROM sys.sysprocesses
        WHERE context_info = @SynchMessage AND spid <> @@SPID
    SELECT @OtherSpid

    DECLARE @increment INT = @@SPID
    DECLARE @number INT = @increment
    WHILE (@number = @increment AND NOT EXISTS(SELECT * FROM sys.sysprocesses WHERE context_info = @FinishedMessage))
        UPDATE JobsToProcess
        SET @number = number += @increment, isprocessed = 1
        WHERE priority = (SELECT TOP 1 priority
                          FROM JobsToProcess
                          WHERE isprocessed = 0
                          ORDER BY priority DESC)

    SELECT *
    FROM JobsToProcess
    WHERE number NOT IN (0, @OtherSpid, @@SPID)

    SET CONTEXT_INFO @FinishedMessage
END TRY
BEGIN CATCH
    SET CONTEXT_INFO @FinishedMessage
    SELECT ERROR_MESSAGE(), ERROR_NUMBER()
END CATCH
Almost immediately execution stops as both concurrent transactions update the same row, so the S locks taken while identifying the TOP 1 priority must get released before a U lock is acquired; the two transactions then proceed to take the row U and X locks in sequence.
If a clustered index is added (ALTER TABLE JobsToProcess ADD PRIMARY KEY CLUSTERED (priority)) then deadlock occurs almost immediately instead: in this case the row S lock doesn't get released, one transaction acquires a U lock on the row and waits to convert it to an X lock, and the other transaction is still waiting to convert its S lock to a U lock.
If the query above is changed to use MIN rather than TOP
WHERE priority = (SELECT MIN(priority)
FROM JobsToProcess
WHERE isprocessed=0
)
Then SQL Server manages to completely eliminate the sub query from the plan and takes U locks all the way.
Every individual T-SQL statement is, according to all my experience and all the documentation I've ever read, supposed to be atomic. What you have there is a single T-SQL statement, ergo it should be atomic and will not require explicit transaction statements. I've used this precise kind of logic many times, and never had a problem with it. I look forward to seeing if anyone has a supportable alternate opinion.
Incidentally, look into the ranking functions, specifically ROW_NUMBER(), for retrieving a set number of items. The syntax is perhaps a tad awkward, but overall they are flexible and powerful tools. (There are about a bazillion Stack Overflow questions and answers that discuss them.)
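For illustration, a sketch of the ROW_NUMBER() variant against the asker's ToBeProcessed table (it reuses @numberToProcess and the #IdsToProcess temp table from the question; whether this takes U locks throughout is not something verified here):
-- Claim the top N unprocessed rows in one statement via ROW_NUMBER().
WITH ranked AS
(
    SELECT Id, IsBeingProcessed,
           ROW_NUMBER() OVER (ORDER BY Id, [Priority] DESC) AS rn
    FROM ToBeProcessed
    WHERE IsBeingProcessed = 0
)
UPDATE ranked
SET IsBeingProcessed = 1
OUTPUT INSERTED.Id
INTO #IdsToProcess
WHERE rn <= @numberToProcess;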

Instead of trigger in SQL Server loses SCOPE_IDENTITY?

I have a table where I created an INSTEAD OF trigger to enforce some business rules.
The issue is that when I insert data into this table, SCOPE_IDENTITY() returns a NULL value, rather than the actual inserted identity.
Insert + Scope code
INSERT INTO [dbo].[Payment]([DateFrom], [DateTo], [CustomerId], [AdminId])
VALUES ('2009-01-20', '2009-01-31', 6, 1)
SELECT SCOPE_IDENTITY()
Trigger:
CREATE TRIGGER [dbo].[TR_Payments_Insert]
ON [dbo].[Payment]
INSTEAD OF INSERT
AS
BEGIN
    -- SET NOCOUNT ON added to prevent extra result sets from
    -- interfering with SELECT statements.
    SET NOCOUNT ON;
    IF NOT EXISTS(SELECT 1 FROM dbo.Payment p
                  INNER JOIN Inserted i ON p.CustomerId = i.CustomerId
                  WHERE (i.DateFrom >= p.DateFrom AND i.DateFrom <= p.DateTo)
                     OR (i.DateTo >= p.DateFrom AND i.DateTo <= p.DateTo)
    ) AND NOT EXISTS (SELECT 1 FROM Inserted p
                      INNER JOIN Inserted i ON p.CustomerId = i.CustomerId
                      WHERE (i.DateFrom <> p.DateFrom AND i.DateTo <> p.DateTo)
                        AND ((i.DateFrom >= p.DateFrom AND i.DateFrom <= p.DateTo)
                          OR (i.DateTo >= p.DateFrom AND i.DateTo <= p.DateTo))
    )
    BEGIN
        INSERT INTO dbo.Payment (DateFrom, DateTo, CustomerId, AdminId)
        SELECT DateFrom, DateTo, CustomerId, AdminId
        FROM Inserted
    END
    ELSE
    BEGIN
        ROLLBACK TRANSACTION
    END
END
The code worked before the creation of this trigger. I am using LINQ to SQL in C#. I don't see a way of changing SCOPE_IDENTITY to @@IDENTITY. How do I make this work?
Use @@IDENTITY instead of SCOPE_IDENTITY().
While SCOPE_IDENTITY() returns the last created id in the current scope, @@IDENTITY returns the last created id in the current session.
The SCOPE_IDENTITY() function is normally recommended over @@IDENTITY, as you usually don't want triggers to interfere with the id, but in this case you do.
Since you're on SQL 2008, I would highly recommend using the OUTPUT clause instead of one of the custom identity functions. SCOPE_IDENTITY currently has some issues with parallel queries that cause me to recommend against it entirely. @@IDENTITY does not, but it's still not as explicit, or as flexible, as OUTPUT. Plus OUTPUT handles multi-row inserts. Have a look at the BOL article, which has some great examples.
I was having serious reservations about using @@IDENTITY, because it can return the wrong answer.
But there is a workaround to force @@IDENTITY to have the SCOPE_IDENTITY() value.
Just for completeness, first I'll list a couple of other workarounds for this problem I've seen on the web:
Make the trigger return a rowset. Then, in a wrapper SP that performs the insert, do INSERT Table1 EXEC sp_ExecuteSQL ... to yet another table. Then scope_identity() will work. This is messy because it requires dynamic SQL which is a pain. Also, be aware that dynamic SQL runs under the permissions of the user calling the SP rather than the permissions of the owner of the SP. If the original client could insert to the table, he should still have that permission, just know that you could run into problems if you deny permission to insert directly to the table.
If there is another candidate key, get the identity of the inserted row(s) using those keys. For example, if Name has a unique index on it, then you can insert, then select the (max for multiple rows) ID from the table you just inserted to using Name. While this may have concurrency problems if another session deletes the row you just inserted, it's no worse than in the original situation if someone deleted your row before the application could use it.
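For completeness, a tiny sketch of that second workaround (it assumes Name has a unique index and a single-row insert; variable names are hypothetical):
DECLARE @NewID int;
INSERT MyTable (Name) VALUES (@Name);
-- Re-read the id via the candidate key, since SCOPE_IDENTITY() is NULL here.
SELECT @NewID = ID FROM MyTable WHERE Name = @Name;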
Now, here's how to definitively make your trigger safe for @@IDENTITY to return the correct value, even if your SP or another trigger inserts to an identity-bearing table after the main insert.
Also, please put comments in your code about what you are doing and why so that future visitors to the trigger don't break things or waste time trying to figure it out.
CREATE TRIGGER TR_MyTable_I ON MyTable INSTEAD OF INSERT
AS
SET NOCOUNT ON

DECLARE @MyTableID int

INSERT MyTable (Name, SystemUser)
SELECT I.Name, System_User
FROM Inserted I

SET @MyTableID = Scope_Identity()

INSERT AuditTable (SystemUser, Notes)
SELECT I.SystemUser, 'Added Name ' + I.Name
FROM Inserted I

-- The following statement MUST be last in this trigger. It resets @@IDENTITY
-- to be the same as the earlier Scope_Identity() value.
SELECT MyTableID INTO #Trash FROM MyTable WHERE MyTableID = @MyTableID
Normally, the extra insert to the audit table would break everything, because, since it has an identity column, @@IDENTITY would return that value instead of the one from the insertion to MyTable. However, the final select creates a new @@IDENTITY value that is the correct one, based on the Scope_Identity() that we saved earlier. This also proofs it against any possible additional AFTER trigger on the MyTable table.
Update:
I just noticed that an INSTEAD OF trigger isn't necessary here. This does everything you were looking for:
CREATE TRIGGER dbo.TR_Payments_Insert ON dbo.Payment FOR INSERT
AS
SET NOCOUNT ON;
IF EXISTS (
SELECT *
FROM
Inserted I
INNER JOIN dbo.Payment P ON I.CustomerID = P.CustomerID
WHERE
I.DateFrom < P.DateTo
AND P.DateFrom < I.DateTo
) ROLLBACK TRAN;
This of course allows scope_identity() to keep working. The only drawback is that a rolled-back insert on an identity table does consume the identity values used (the identity value is still incremented by the number of rows in the insert attempt).
I've been staring at this for a few minutes and don't have absolute certainty right now, but I think this preserves the meaning of an inclusive start time and an exclusive end time. If the end time was inclusive (which would be odd to me) then the comparisons would need to use <= instead of <.
Main problem: the trigger and Entity Framework work in different scopes.
The problem is that if you generate the new PK value in a trigger, it is in a different scope. Thus the command returns zero rows and EF throws an exception.
The solution is to add the following SELECT statement at the end of your Trigger:
SELECT * FROM deleted UNION ALL
SELECT * FROM inserted;
In place of * you can list all the column names, including:
SELECT IDENT_CURRENT('tablename') AS <IdentityColumnname>
Like araqnid commented, the trigger seems to roll back the transaction when a condition is met. You can do that more easily with an AFTER INSERT trigger:
CREATE TRIGGER [dbo].[TR_Payments_Insert]
ON [dbo].[Payment]
AFTER INSERT
AS
BEGIN
    SET NOCOUNT ON;
    IF <Condition>
    BEGIN
        ROLLBACK TRANSACTION
    END
END
Then you can use SCOPE_IDENTITY() again, because the INSERT is no longer done in the trigger.
The condition itself seems to let two identical rows past, if they're in the same insert. With the AFTER INSERT trigger, you can rewrite the condition like:
IF EXISTS(
    SELECT *
    FROM dbo.Payment a
    LEFT JOIN dbo.Payment b
        ON a.Id <> b.Id
       AND a.CustomerId = b.CustomerId
       AND (a.DateFrom BETWEEN b.DateFrom AND b.DateTo
         OR a.DateTo BETWEEN b.DateFrom AND b.DateTo)
    WHERE b.Id IS NOT NULL)
And it will catch duplicate rows, because now it can differentiate them based on Id. It also works if you delete a row and replace it with another row in the same statement.
Anyway, if you want my advice, move away from triggers altogether. As you can see even for this example they are very complex. Do the insert through a stored procedure. They are simpler and faster than triggers:
create procedure dbo.InsertPayment
    @DateFrom datetime, @DateTo datetime, @CustomerId int, @AdminId int
as
BEGIN TRANSACTION
    IF NOT EXISTS (
        SELECT *
        FROM dbo.Payment
        WHERE CustomerId = @CustomerId
        AND (@DateFrom BETWEEN DateFrom AND DateTo
          OR @DateTo BETWEEN DateFrom AND DateTo))
    BEGIN
        INSERT INTO dbo.Payment
            (DateFrom, DateTo, CustomerId, AdminId)
        VALUES (@DateFrom, @DateTo, @CustomerId, @AdminId)
    END
COMMIT TRANSACTION
A little late to the party, but I was looking into this issue myself. A workaround is to create a temp table in the calling procedure where the insert is being performed, insert the scope identity into that temp table from inside the INSTEAD OF trigger, and then read the identity value out of the temp table once the insertion is complete.
In procedure:
CREATE table #temp ( id int )
... insert statement ...
select id from #temp
-- (you can add sorting and top 1 selection for extra safety)
drop table #temp
In instead of trigger:
-- this check covers you for any inserts that don't want an identity value returned (and therefore don't provide a temp table)
IF OBJECT_ID('tempdb..#temp') IS NOT NULL
BEGIN
    INSERT INTO #temp (id)
    VALUES (SCOPE_IDENTITY())
END
You probably want to call it something other than #temp for safety sake (something long and random enough that no one else would be using it: #temp1234235234563785635).

Efficient transaction, record locking

I've got a stored procedure which selects 1 record back. The stored procedure could be called from several different applications on different PCs. The idea is that the stored procedure brings back the next record that needs to be processed; if two applications call the stored proc at the same time, the same record should not be brought back. My query is below; I'm trying to write the query as efficiently as possible (SQL 2008). Can it be done more efficiently than this?
CREATE PROCEDURE GetNextUnprocessedRecord
AS
BEGIN
SET NOCOUNT ON;
--ID of record we want to select back
DECLARE @iID BIGINT
-- Find the next processable record, and mark it as dispatched
-- Must be done in a transaction to ensure no other query can get
-- this record between the read and update
BEGIN TRAN
SELECT TOP 1
@iID = [ID]
FROM
--Don't read locked records, only lock the specific record
[MyRecords] WITH (READPAST, ROWLOCK)
WHERE
[Dispatched] is null
ORDER BY
[Received]
--Mark record as picked up for processing
UPDATE
[MyRecords]
SET
[Dispatched] = GETDATE()
WHERE
[ID] = @iID
COMMIT TRAN
--Select back the specific record
SELECT
[ID],
[Data]
FROM
[MyRecords] WITH (NOLOCK, READPAST)
WHERE
[ID] = @iID
END
Using the READPAST locking hint is correct and your SQL looks OK.
I'd add XLOCK, though, which is also HOLDLOCK/SERIALIZABLE:
...
[MyRecords] WITH (READPAST, ROWLOCK, XLOCK)
...
This means you get the ID, and exclusively lock that row while you carry on and update it.
Edit: add an index on the Dispatched and Received columns to make it quicker. If [ID] (I assume it's the PK) is not clustered, INCLUDE [ID]. And filter the index too, since this is SQL 2008.
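A sketch of that suggestion (the index name is hypothetical):
-- Filtered index (SQL 2008+) covering the queue lookup; INCLUDE [ID]
-- is only needed if [ID] isn't the clustered key.
CREATE NONCLUSTERED INDEX IX_MyRecords_Undispatched
ON MyRecords ([Received])
INCLUDE ([ID])
WHERE [Dispatched] IS NULL;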
You could also use this construct which does it all in one go without XLOCK or HOLDLOCK
UPDATE
    MyRecords
SET
    --record the row ID
    @id = [ID],
    --flag doing stuff
    [Dispatched] = GETDATE()
WHERE
    [ID] = (SELECT TOP 1 [ID] FROM MyRecords WITH (ROWLOCK, READPAST) WHERE Dispatched IS NULL ORDER BY Received)
UPDATE, assign, set in one
You can assign each picker process a unique id, and add columns pickerproc and pickstate to your records. Then
UPDATE MyRecords
SET pickerproc = @myproc, -- @myproc = this picker process's unique id
    pickstate = 'I' -- for 'I'n process
WHERE Id = (SELECT MAX(Id) FROM MyRecords WHERE pickstate = 'A') -- 'A'vailable
That gets you your record in one atomic step, and you can do the rest of your processing at your leisure. Then you can set pickstate to 'C'omplete, 'E'rror, or whatever when it's resolved.
I think Mitch is referring to another good technique where you create a message-queue table and insert the Ids there. There are several SO threads - search for 'message queue table'.
You can keep MyRecords on a "MEMORY" table for faster processing.
