Can I Select and Update at the same time? - sql-server

This is an over-simplified explanation of what I'm working on.
I have a table with a status column. Multiple instances of the application will pull the contents of the first row with a status of NEW, update the status to WORKING, and then go to work on the contents.
It's easy enough to do this with two database calls; first the SELECT then the UPDATE. But I want to do it all in one call so that another instance of the application doesn't pull the same row. Sort of like a SELECT_AND_UPDATE thing.
Is a stored procedure the best way to go?

You could use the OUTPUT clause.
DECLARE @Table TABLE (ID INTEGER, Status VARCHAR(32))
INSERT INTO @Table VALUES (1, 'New')
INSERT INTO @Table VALUES (2, 'New')
INSERT INTO @Table VALUES (3, 'Working')
UPDATE t1
SET Status = 'Working'
OUTPUT Inserted.*
FROM @Table t1
INNER JOIN (
    SELECT TOP 1 ID
    FROM @Table
    WHERE Status = 'New'
) t2 ON t2.ID = t1.ID
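Applied to the scenario in the question, the same OUTPUT technique can claim one NEW row and return it in a single atomic statement. A minimal sketch, assuming a hypothetical dbo.Jobs table (the table and column names are mine, not from the question):

-- Hypothetical queue table; adjust names to your schema
CREATE TABLE dbo.Jobs
(
    JobId   INT IDENTITY PRIMARY KEY,
    Status  VARCHAR(32) NOT NULL,
    Payload NVARCHAR(MAX) NULL
)

-- Claim one NEW row and return its contents in one statement.
-- READPAST makes concurrent workers skip rows that are already
-- locked, so no two instances can claim the same row.
UPDATE TOP (1) dbo.Jobs WITH (ROWLOCK, READPAST)
SET    Status = 'WORKING'
OUTPUT Inserted.JobId, Inserted.Payload
WHERE  Status = 'NEW'

Note that UPDATE TOP (1) claims an arbitrary qualifying row; if you need strict first-row ordering, keep the TOP 1 subquery form shown above.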

Sounds like a queue processing scenario, whereby you want one process only to pick up a given record.
If that is the case, have a look at the answer I provided earlier today which describes how to implement this logic using a transaction in conjunction with UPDLOCK and READPAST table hints:
Row locks - manually using them
Best wrapped up in a sproc.
I'm not sure this is what you are wanting to do, hence I haven't voted to close as duplicate.

Not quite, but you can SELECT ... WITH (UPDLOCK), then UPDATE subsequently. This is as close to an atomic operation as you can get: it tells the database that you are about to update what you previously selected, so it can lock those rows, preventing collisions with other clients. In Oracle and some other databases (MySQL, I think) the syntax is SELECT ... FOR UPDATE.
Note: I think you'll need to ensure the two statements happen within a transaction for it to work.
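Here is a minimal sketch of that two-step pattern, wrapped in a transaction as the note above suggests; the dbo.Jobs table and its columns are placeholders, not from the question:

BEGIN TRANSACTION

DECLARE @id INT

-- UPDLOCK keeps the selected row locked until commit, so no other
-- client can grab it between our SELECT and UPDATE; READPAST makes
-- competing workers skip it instead of blocking on it.
SELECT TOP 1 @id = JobId
FROM   dbo.Jobs WITH (UPDLOCK, ROWLOCK, READPAST)
WHERE  Status = 'NEW'

IF @id IS NOT NULL
    UPDATE dbo.Jobs
    SET    Status = 'WORKING'
    WHERE  JobId = @id

COMMIT TRANSACTION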

You should do three things here:
Lock the row you're working on
Make sure that this and only this row is locked
Do not wait for the locked records: skip to the next ones instead.
To do this, you just issue this:
SELECT TOP 1 *
FROM mytable WITH (ROWLOCK, UPDLOCK, READPAST)
WHERE status = 'NEW'
ORDER BY date

UPDATE …
within a transaction.

A stored procedure is the way to go. You need to look at transactions. SQL Server was born for this kind of thing.

Yes, and maybe use the ROWLOCK hint to keep it isolated from the other threads, e.g.
UPDATE Jobs WITH (ROWLOCK, UPDLOCK, READPAST)
SET Status = 'WORKING'
WHERE JobID =
    (SELECT TOP 1 JobId FROM Jobs WHERE Status = 'NEW')
EDIT: ROWLOCK would be better, as suggested by Quassnoi, but the same idea applies: do the update in one query.

Related

Split field and insert rows in SQL Server trigger, when multiple rows are affected without using a cursor

I have an INSERT trigger on a table, where one field of the table contains a comma-separated list of key-value pairs, each pair separated by a colon (:).
I can select this field with the two values into a temp table easily with this statement:
-- SAMPLE DATA FOR PRESENTATION ONLY
DECLARE @messageIds VARCHAR(2000) = '29708332:55197,29708329:54683,29708331:54589,29708330:54586,29708327:54543,29708328:54539,29708333:54538,29708334:62162,29708335:56798';
SELECT
    SUBSTRING(value, 1, CHARINDEX(':', value) - 1) AS MessageId,
    SUBSTRING(value, CHARINDEX(':', value) + 1, LEN(value)) AS DeviceId
INTO #temp_messages
FROM STRING_SPLIT(@messageIds, ',')
SELECT * FROM #temp_messages
DROP TABLE #temp_messages
The result will look like this
29708332 55197
29708329 54683
29708331 54589
29708330 54586
29708327 54543
29708328 54539
29708333 54538
29708334 62162
29708335 56798
From here I can join the temp table to other tables and insert some of the results into a third table.
Inside the trigger I can get the messageIds with a simple SELECT statement like
DECLARE @messageIds VARCHAR(2000) = (SELECT ProcessMessageIds FROM INSERTED)
Now I create the temp table (as described above) and process my insert:
INSERT INTO <new_table> SELECT col1, col2, ... FROM #temp_messages
JOIN <another_table> ON ...
Unfortunately this will only work for single row inserts. As soon as there is more than one row, my SELECT ProcessMessageIds FROM INSERTED will fail, as there are multiple rows in the INSERTED table.
I can process the rows in a CURSOR but as far as I know CURSORS are a no-go in triggers and I should avoid them whenever it is possible.
Therefore my question is: is there another way to do this without using a CURSOR inside the trigger?
Before we get into the details of the solution, let me point out that you would have no such issues if you normalized your database, as @Larnu pointed out in the comment section of your question.
Your
DECLARE @messageIds VARCHAR(2000) = (SELECT ProcessMessageIds FROM INSERTED)
statement assumes that there will be a single value to be assigned to @messageIds and, as you have pointed out, this is not necessarily true.
Solution 1: Join with INSERTED rather than load it into a variable
INSERT INTO t1
SELECT ...
FROM t2
JOIN T3
ON ...
JOIN INSERTED
ON ...
and then you can reach INSERTED.ProcessMessageIds without issues. This will no longer assume that a single value was used.
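As a sketch, the whole trigger body could look like this; dbo.SourceTable and dbo.TargetTable are hypothetical stand-ins for the real tables in the question:

CREATE TRIGGER trg_SourceTable_Insert
ON dbo.SourceTable
AFTER INSERT
AS
BEGIN
    SET NOCOUNT ON

    -- Set-based: handles any number of inserted rows, no cursor needed
    INSERT INTO dbo.TargetTable (MessageId, DeviceId)
    SELECT SUBSTRING(s.value, 1, CHARINDEX(':', s.value) - 1),
           SUBSTRING(s.value, CHARINDEX(':', s.value) + 1, LEN(s.value))
    FROM   INSERTED i
    CROSS APPLY STRING_SPLIT(i.ProcessMessageIds, ',') s
END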
Solution 2: cursors
You can use a CURSOR, as you have already pointed out, but it's not a very good idea to use cursors inside a trigger, see https://social.msdn.microsoft.com/Forums/en-US/87fd1205-4e27-413d-b040-047078b07756/cursor-usages-in-trigger-in-sql-server?forum=aspsqlserver
Solution 3: insert a single row at a time
While this would not require a change in your trigger, it would require a change in how you insert and it would increase the number of db requests necessary, so I would advise you not to choose this approach.
Solution 4: normalize
See https://www.simplilearn.com/tutorials/sql-tutorial/what-is-normalization-in-sql
If you had a proper table rather than a table of composite values, you would have no such issues and you would have a much easier time processing the message ids in general.
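For illustration, a normalized design might store one pair per row instead of a delimited string; a sketch with invented names:

-- One row per message/device pair instead of a comma-separated column
CREATE TABLE dbo.ProcessMessage
(
    ProcessId INT NOT NULL,   -- FK to the parent row
    MessageId INT NOT NULL,
    DeviceId  INT NOT NULL,
    CONSTRAINT PK_ProcessMessage PRIMARY KEY (ProcessId, MessageId)
)

The trigger (and everything else) could then join on these columns directly, with no string splitting at all.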
Summary
It would be wise to normalize your tables and perform the refactoring that would be needed afterwards. It's a big effort now, but you will enjoy its fruits later. If that's not an option, you can "act as if it was normalized" and choose Solution 1.
As pointed out in the answers, joining with the INSERTED table solved my problem.
SELECT INTAB.Id,
    SUBSTRING(value, 1, CHARINDEX(':', value) - 1) AS MessageId,
    SUBSTRING(value, CHARINDEX(':', value) + 1, LEN(value)) AS DeviceId
FROM INSERTED AS INTAB
CROSS APPLY STRING_SPLIT(ProcessMessageIds, ',')
I never used "CROSS APPLY" before, thank you.

Avoid inserting duplicate records in SQL Server

I haven't been able to find an answer to this. Suppose I have the following table/query:
The table:
create table ##table
(
column1 int,
column2 nvarchar(max)
)
The query (in a real life scenario the condition will be more complex):
declare @shouldInsert bit
set @shouldInsert = case when exists(
    select *
    from ##table
    where column2 = 'test') then 1 else 0 end
--Exaggerating a possible delay:
waitfor delay '00:00:10'
if(@shouldInsert = 0)
    insert into ##table
    values(1, 'test')
If I run this query twice simultaneously, then it's liable to insert duplicate records (enforcing a unique constraint is out of the question because the real-life condition is more involved than mere "column1" uniqueness across the table).
I see two possible solutions:
I run both concurrent transactions in serializable mode, but it will create a deadlock (first a shared lock in the select, then an X-lock in the insert - deadlock).
In the select statement I use the query hints WITH (UPDLOCK, TABLOCK), which will effectively X-lock the entire table, but it will prevent other transactions from reading data (something I'd like to avoid).
Which is more acceptable? Is there a third solution?
Thanks.
If you can, you should put a UNIQUE constraint (or index) on whatever column(s) it is that is defining the uniqueness.
With this, you might still get the "OK, doesn't exist yet" response from the initial check in two separate processes - but one of the two will be first and get its row inserted, while the second will get a "unique constraint violated" exception back from the database.
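A sketch of that approach (an indexed column needs a bounded type, so column2 is NVARCHAR(450) here rather than the question's NVARCHAR(MAX); 2627 is SQL Server's error number for a unique constraint violation):

CREATE TABLE dbo.Example
(
    column1 INT,
    column2 NVARCHAR(450),
    CONSTRAINT UQ_Example_column2 UNIQUE (column2)
)

BEGIN TRY
    INSERT INTO dbo.Example (column1, column2)
    VALUES (1, 'test')
END TRY
BEGIN CATCH
    -- 2627 = unique constraint violation: another session inserted first
    IF ERROR_NUMBER() = 2627
        PRINT 'Row already exists; treating as success';
    ELSE
        THROW;
END CATCH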
Regardless of how "involved" your "real-life condition" is, you have two options: enforce UNIQUE or deal with multiple records. Any work-around will likely be fragile.
For example, your delay hack is pretty useless if you need to add another DB server, or if overwhelming load slows down the execution of individual threads.
One of the ways you could allow for multiple copies of a should-be-unique value is to create another table that can act as a queue and doesn't enforce uniqueness, with a serial worker to dequeue it (a sketch follows below). Or change the data structure to allow 1-to-many and pick the first one when querying. Still a hack, but at least not terribly "creative", and it can't break.
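A sketch of that queue-table variant, with invented names: writers insert freely, and a single serial worker dequeues and applies the uniqueness rule without races:

-- Writers append here with no uniqueness check at all
CREATE TABLE dbo.PendingInsert
(
    QueueId INT IDENTITY PRIMARY KEY,
    column1 INT,
    column2 NVARCHAR(MAX)
)

-- The single worker dequeues one row atomically...
DECLARE @row TABLE (column1 INT, column2 NVARCHAR(MAX))

DELETE TOP (1) FROM dbo.PendingInsert
OUTPUT deleted.column1, deleted.column2 INTO @row

-- ...and applies the "involved" condition serially, race-free
INSERT INTO dbo.MainTable (column1, column2)
SELECT r.column1, r.column2
FROM   @row r
WHERE  NOT EXISTS (SELECT 1 FROM dbo.MainTable m WHERE m.column2 = r.column2)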
declare @shouldInsert bit
set @shouldInsert = case when exists(
    select *
    from ##table
    where column2 = 'test') then 1 else 0 end
--Exaggerating a possible delay:
waitfor delay '00:00:10'
-- #temp mirrors ##table's structure and holds the candidate row
create table #temp (column1 int, column2 nvarchar(max))
if(@shouldInsert = 0)
    insert into #temp
    values(1, 'test')
--if the record is not already in ##table, it will be inserted from #temp into ##table
insert into ##table
select * from #temp
except
select * from ##table

Select and Delete in the same transaction using TOP clause

I have a table to which data is continuously added at a rapid pace.
I need to fetch records from this table and immediately remove them, so I cannot process the same record a second time. And since the data is being added at a fast rate, I need to use the TOP clause so that only a small number of records go to the business logic for processing at a time.
I am using the query below to do this:
BEGIN TRAN readrowdata

SELECT
    top 5 [RawDataId],
    [RawData]
FROM
    [TABLE] with(HOLDLOCK);

WITH q AS
(
    SELECT
        top 5 [RawDataId],
        [RawData]
    FROM
        [TABLE] with(HOLDLOCK)
)
DELETE from q

COMMIT TRANSACTION readrowdata
I am using HOLDLOCK here so new data cannot be inserted into the table while I am performing the SELECT and DELETE operations. I used it because, suppose there are only 3 records in the table: the SELECT statement will get those 3 records, and if a new record gets inserted at the same time, the DELETE statement will delete 4 records. So I would lose 1 record here.
Is the query OK in performance terms? If I can improve it, then please provide me your suggestions.
Thank you
Personally, I'd use a different approach. One with less locking, but also extra information signifying that certain records are currently being processed...
DECLARE @rowsBeingProcessed TABLE (
    id INT
);
WITH rows AS (
    SELECT top 5 [RawDataId], processing_start
    FROM yourTable
    WHERE processing_start IS NULL
)
UPDATE rows
SET processing_start = getDate()
OUTPUT INSERTED.RawDataId INTO @rowsBeingProcessed;
-- Business Logic Here
DELETE yourTable WHERE RawDataId IN (SELECT id FROM @rowsBeingProcessed);
Then you can also add checks like "if a record has been 'beingProcessed' for more than 10 minutes, assume that the business logic failed", etc, etc.
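For instance, a watchdog along these lines (reusing the processing_start column from the sketch above) could return abandoned rows to the pool; the 10-minute cutoff is arbitrary:

-- Rows claimed over 10 minutes ago and never deleted are assumed
-- failed; clearing processing_start makes them claimable again
UPDATE yourTable
SET    processing_start = NULL
WHERE  processing_start IS NOT NULL
  AND  processing_start < DATEADD(MINUTE, -10, GETDATE())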
By locking the table in this way, you force other processes to wait for your transaction to complete. This can have very rapid consequences on scalability and performance - and it tends to be hard to predict, because there's often a chain of components all relying on your database.
If you have multiple clients each running this query, and multiple clients adding new rows to the table, the overall system performance is likely to deteriorate at some times, as each "read" client is waiting for a lock, the number of "write" clients waiting to insert data grows, and they in turn may tie up other components (whatever is generating the data you want to insert).
Diego's answer is on the money - put the data into a variable, and delete matching rows. Don't use locks in SQL Server if you can possibly avoid it!
You can do it very easily with TRIGGERS. The example below shows an approach that avoids holding up other users who are trying to insert data simultaneously.
Data Definition language
CREATE TABLE SampleTable
(
id int
)
Sample Record
insert into SampleTable (id) values (1)
Sample Trigger
CREATE TRIGGER SampleTableTrigger
ON SampleTable AFTER INSERT
AS
IF EXISTS (SELECT id FROM INSERTED)
BEGIN
    SET NOCOUNT ON
    SET XACT_ABORT ON
    BEGIN TRY
        BEGIN TRAN
        -- Return the newly inserted rows to the caller
        SELECT ID FROM INSERTED
        -- Immediately remove them from the table
        DELETE FROM SampleTable WHERE ID IN (SELECT id FROM INSERTED)
        COMMIT TRAN
    END TRY
    BEGIN CATCH
        ROLLBACK TRAN
    END CATCH
END
Hope this is very simple and helpful
If I understand you correctly, you are worried that between your select and your delete, more records would be inserted and the first TOP 5 would be different from the second TOP 5?
If that is so, why don't you load your first select into a temp table or variable (or at least the PKs), do whatever you have to do with your data, and then do your delete based on this table?
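A sketch of that idea, reusing the [RawDataId]/[RawData] columns from the question:

DECLARE @batch TABLE (RawDataId INT PRIMARY KEY, RawData NVARCHAR(MAX))

-- Capture the batch first (by key)...
INSERT INTO @batch (RawDataId, RawData)
SELECT TOP 5 RawDataId, RawData
FROM   [TABLE]
ORDER BY RawDataId

-- ...process the captured rows here...

-- ...then delete exactly those rows, even if new ones arrived meanwhile
DELETE t
FROM   [TABLE] t
INNER JOIN @batch b ON b.RawDataId = t.RawDataId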
I know that it's an old question, but I found a solution here https://www.simple-talk.com/sql/learn-sql-server/the-delete-statement-in-sql-server/:
DECLARE @Output table
(
    StaffID INT,
    FirstName NVARCHAR(50),
    LastName NVARCHAR(50),
    CountryRegion NVARCHAR(50)
);
DELETE ss
OUTPUT DELETED.* INTO @Output
FROM Sales.vSalesPerson sp
INNER JOIN dbo.SalesStaff ss
    ON sp.BusinessEntityID = ss.StaffID
WHERE sp.SalesLastYear = 0;
SELECT * FROM @Output;
Maybe it will be helpful for you.

SQL Server Trigger loop

I would like to know if there is any way I can add a trigger on two tables that will replicate the data to the other.
For example:
I have two users tables, users_V1 and users_V2. When a user is updated by the V1 app, it activates a trigger that updates the user in users_V2 as well.
If I want to add the same trigger on the V2 table in order to update the data in V1 when a user is updated in V2, will it go into an infinite loop? Is there any way to avoid that?
I don't recommend explicitly disabling the trigger during processing - this can cause strange side-effects.
The most reliable way to detect (and prevent) cycles in a trigger is to use CONTEXT_INFO().
Example:
CREATE TRIGGER tr_Table1_Update
ON Table1
FOR UPDATE AS

DECLARE @ctx VARBINARY(128)
SELECT @ctx = CONTEXT_INFO()

IF @ctx = 0xFF
    RETURN

-- Mark the session so nested trigger invocations bail out above
SET CONTEXT_INFO 0xFF

-- Trigger logic goes here
See this link for a more detailed example.
Note on CONTEXT_INFO() in SQL Server 2000:
Context info is supported but apparently the CONTEXT_INFO function is not. You have to use this instead:
SELECT @ctx = context_info
FROM master.dbo.sysprocesses
WHERE spid = @@SPID
Either use TRIGGER_NESTLEVEL() to restrict trigger recursion, or
check the target table whether an UPDATE is necessary at all:
IF (SELECT COUNT(1)
FROM users_V1
INNER JOIN inserted ON users_V1.ID = inserted.ID
WHERE users_V1.field1 <> inserted.field1
OR users_V1.field2 <> inserted.field2) > 0 BEGIN
UPDATE users_V1 SET ...
I had the exact same problem. I tried using CONTEXT_INFO(), but that is a session variable, so it works only the first time! The next time a trigger fires during the session, this won't work. So I ended up using the nest level in each of the affected triggers to exit.
Example:
CREATE TRIGGER tr_Table1_Update
ON Table1
FOR UPDATE AS
BEGIN
    --Prevents second nested call
    IF @@NESTLEVEL > 1 RETURN

    --Trigger logic goes here
END
Note: Or use @@NESTLEVEL > 0 if you want to stop all nested calls
One other note: there seems to be much confusion in this thread about nested calls and recursive calls. The original poster was referring to a nested trigger, where one trigger causes another trigger to fire, which causes the first trigger to fire again, and so on. This is nested but, according to SQL Server, not recursive, because the trigger is not calling/triggering itself directly. Recursion is NOT where "one trigger [is] calling another"; that is nested, but not necessarily recursive. You can test this by enabling/disabling recursion and nesting with the settings mentioned here: blog post on nesting
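For reference, the two behaviours are controlled independently. A sketch (run against your own database; 'nested triggers' is a server-wide setting):

-- Direct recursion (a trigger firing itself) is a per-database setting
ALTER DATABASE CURRENT SET RECURSIVE_TRIGGERS OFF

-- Nested triggers (trigger A firing trigger B) is server-level
EXEC sp_configure 'show advanced options', 1
RECONFIGURE
EXEC sp_configure 'nested triggers', 0
RECONFIGURE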
I'm with the no triggers camp for this particular design scenario. Having said that, with the limited knowledge I have about what your app does and why it does it, here's my overall analysis:
Using a trigger on a table has an advantage of being able to act on all actions on the table. That's it, your main benefit in this case. But that would mean you have users with direct access to the table or multiple access points to the table. I tend to avoid that. Triggers have their place (I use them a lot), but it's one of the last database design tools I use because they tend to not know a lot about their context (generally, a strength) and when used in a place where they do need to know about different contexts and overall use cases, their benefits are weakened.
If both app versions need to trigger the same action, they should both call the same stored proc. The stored proc can ensure that all the appropriate work is done, and when your app no longer needs to support V1, then that part of the stored proc can be removed.
Calling two stored procs in your client code is a bad idea, because this is an abstraction layer of data services which the database can provide easily and consistently, without your application being worried about it.
I prefer to control the interface to the underlying tables more - with either views or UDFs or SPs. Users never get direct access to a table. Another point here is that you could present a single "users" VIEW or UDF coalescing the appropriate underlying tables without the user even knowing about it - perhaps getting to the point where no "synchronization" is necessary at all, since new attributes live in an EAV system (if you need that kind of pathological flexibility) or in some other structure which can still be joined - say, an OUTER APPLY of a UDF, etc.
You're going to have to create some sort of loopback detection within your trigger. Perhaps using an "if exists" statement to see if the record exists before entering it into the next table. It does sound like it will go into an infinite loop the way it's currently set up.
Avoid triggers like the plague... use a stored procedure to add the user. If this requires some design changes, then make them. Triggers are EVIL.
Try something like this (I didn't bother with the CREATE TRIGGER stuff, as you clearly already know how to write that part):
update t
set field1 = i.field1,
    field2 = i.field2
from inserted i
join table1 t on i.id = t.id
where t.field1 <> i.field1 or t.field2 <> i.field2
Nested triggers, that is, one trigger firing another, are limited to 32 levels
In each trigger, just check if the row you wish to insert already exists.
Example
CREATE TRIGGER Table1_Synchronize_Update ON [Table1] FOR UPDATE AS
BEGIN
UPDATE t2
SET LastName = i.LastName
, FirstName = i.FirstName
, ... -- Every relevant field that needs to stay in sync
FROM Table2 t2
INNER JOIN Inserted i ON i.UserID = t2.UserID
WHERE i.LastName <> t2.LastName
OR i.FirstName <> t2.FirstName
OR ... -- Every relevant field that needs to stay in sync
END
CREATE TRIGGER Table1_Synchronize_Insert ON [Table1] FOR INSERT AS
BEGIN
INSERT INTO Table2
SELECT i.*
FROM Inserted i
LEFT OUTER JOIN Table2 t2 ON t2.UserID = i.UserID
WHERE t2.UserID IS NULL
END
CREATE TRIGGER Table2_Synchronize_Update ON [Table2] FOR UPDATE AS
BEGIN
UPDATE t1
SET LastName = i.LastName
, FirstName = i.FirstName
, ... -- Every relevant field that needs to stay in sync
FROM Table1 t1
INNER JOIN Inserted i ON i.UserID = t1.UserID
WHERE i.LastName <> t1.LastName
OR i.FirstName <> t1.FirstName
OR ... -- Every relevant field that needs to stay in sync
END
CREATE TRIGGER Table2_Synchronize_Insert ON [Table2] FOR INSERT AS
BEGIN
INSERT INTO Table1
SELECT i.*
FROM Inserted i
LEFT OUTER JOIN Table1 t1 ON t1.UserID = i.UserID
WHERE t1.UserID IS NULL
END

SQL Server READPAST hint

I'm seeing behavior which looks like the READPAST hint is set on the database itself.
The rub: I don't think this is possible.
We have a table foo (id int primary key identity, name varchar(50) not null unique);
I have several threads which do, basically
id = select id from foo where name = ?
if id == null
insert into foo (name) values (?)
id = select id from foo where name = ?
Each thread is responsible for inserting its own name (no two threads try to insert the same name at the same time). Client is java.
READ_COMMITTED_SNAPSHOT is ON, transaction isolation is specifically set to READ COMMITTED, using Connection.setTransactionIsolation( Connection.TRANSACTION_READ_COMMITTED );
Symptom is that if one thread is inserting, the other thread can't see its row (even rows which were committed to the database before the application started) and tries to insert, but gets a duplicate-key exception from the unique index on name.
Throw me a bone here?
You're at the wrong isolation level. Remember what happens with the snapshot isolation level: if one transaction is making a change, no other concurrent transaction sees it. Other transactions will only see your changes once you have committed, and only if they START after your commit. The solution is to use a different isolation level. Wrap your statements in a transaction and SET TRANSACTION ISOLATION LEVEL SERIALIZABLE. This will ensure that your concurrent transactions work as if they were all run serially, which is what you seem to want here.
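A sketch of that, using the foo table from the question (@name stands in for the bound parameter):

SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
BEGIN TRANSACTION

DECLARE @name VARCHAR(50) = 'example'
DECLARE @id INT

SELECT @id = id FROM foo WHERE name = @name

IF @id IS NULL
BEGIN
    INSERT INTO foo (name) VALUES (@name)
    SET @id = SCOPE_IDENTITY()
END

COMMIT TRANSACTION

Be aware that two sessions racing through this can still deadlock (both take range locks on the SELECT, then both try to INSERT); one will be chosen as the deadlock victim and must retry.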
Sounds like you're not wrapping the select and insert into a transaction?
As a solution, you could:
insert into foo (col1, col2, col3)
select 'a', 'b', 'c'
where not exists (select * from foo where col1 = 'a')
After this, @@ROWCOUNT will be 1 if a row was inserted, so you can check for that.
SELECT SCOPE_IDENTITY()
should do the trick here...
plus wrapping it in a transaction, as the previous poster mentioned.
The moral of this story is fully explained in my blog post "You can't hold onto nothing" but the short version of this is that you want to use the HOLDLOCK hint. I use the pattern:
INSERT INTO dbo.Foo(Name)
SELECT TOP 1
    @name AS Name
FROM (SELECT 1 AS FakeColumn) AS FakeTable
WHERE NOT EXISTS (SELECT * FROM dbo.Foo WITH (HOLDLOCK)
                  WHERE Name = @name)

SELECT ID FROM dbo.Foo WHERE Name = @name
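One hedge worth adding: under heavy concurrency, two sessions can both acquire the shared range lock that HOLDLOCK takes and then deadlock when each tries to insert. Adding UPDLOCK makes the range lock exclusive to one session at a time, so a variant of the same pattern is:

-- Run inside a transaction so the range lock is held until commit
INSERT INTO dbo.Foo (Name)
SELECT TOP 1
    @name AS Name
FROM (SELECT 1 AS FakeColumn) AS FakeTable
WHERE NOT EXISTS (SELECT * FROM dbo.Foo WITH (UPDLOCK, HOLDLOCK)
                  WHERE Name = @name)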
