I have an application in which, when a new user is added to a location, they are assigned a sequential number. So, for example, the first user at Location 01 would be assigned 01-0001, the second user 01-0002, etc.
While it is simple enough for me to find the max user number for that location at any time and add 1, my issue is that I need this to be thread/collision safe.
While it's not super common, I don't want one query to find the MAX() number while another query is in the process of adding that same number at that same moment. (It has happened before in my original app, though only twice in 5 years.)
What would be the best way to go about this? I would prefer not to rely on a unique constraint as that would just throw an error and force the process to try it all again.
You can use:
BEGIN TRAN

SELECT @UserId = MAX(UserId)
FROM   YourTable WITH (UPDLOCK, HOLDLOCK, ROWLOCK)
WHERE  LocationId = @LocationId

-- Increment it, or initialise it if there are no users yet
SET @UserId = ISNULL(@UserId, 0) + 1

INSERT INTO YourTable (UserId, Name)
VALUES (@UserId, @Name)

COMMIT
Only one session at a time can hold an update lock on a resource, so if two concurrent sessions try to insert a row for the same location, one of them will be blocked until the other commits. The HOLDLOCK gives serializable semantics and locks the range containing the max.
This is a potential bottleneck but this is by design because of the requirement that the numbers be sequential. Better performing alternatives such as IDENTITY do not guarantee this. In reality though it sounds as though your application is fairly low usage so this may not be a big problem.
Another possible issue is that an ID will be recycled if the user that is the current max for a location gets deleted. But this will be the same for your existing application, by the sounds of it, so I assume this is either not a problem or just doesn't happen.
You can use a sequence object, described here.
Creating a new sequence is very simple; for example, you can use this code:
create sequence dbo.UserId as int
start with 1
increment by 1;
With a sequence object you don't need to worry about collisions. The sequence always returns the next value each time you fetch it with the NEXT VALUE FOR statement, as in this code:
select next value for dbo.UserId;
The next value will be returned correctly even if you roll back the transaction, or even if you fetch the next value in separate, parallel transactions.
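If the column should always pull from the sequence, you can also bind it as a default constraint so inserts pick up the next value automatically. This is a minimal sketch assuming the `dbo.UserId` sequence above and a hypothetical `YourTable` with a `UserId` column:

```sql
-- Use the sequence as the column default so inserts get
-- the next value without referencing the sequence explicitly
ALTER TABLE YourTable
ADD CONSTRAINT DF_YourTable_UserId
    DEFAULT (NEXT VALUE FOR dbo.UserId) FOR UserId;
```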
Related
I need to generate progressive invoice numbers, avoiding gaps in the sequence.
At first I thought it was quite easy:
SELECT MAX(Docnumber)+1 as NewDocNumber
from InvoicesHeader
but since it takes some time to build the INSERT INTO InvoicesHeader query, another request could arrive in the meantime, assigning the same NewDocNumber to both invoices.
I'm now thinking of avoiding generating the DocNumber in advance, and have changed the query to:
INSERT INTO InvoicesHeader (InvoiceID,..., DocNumber,...)
SELECT #InvoiceID,..., MAX(Docnumber)+1,... FROM InvoicesHeader
but although it (should) solve some problems, it is still not thread safe and still exposed to race conditions.
Adding TABLOCK or UPDLOCK, in this way:
BEGIN TRANSACTION TR1
INSERT INTO InvoicesHeader WITH (TABLOCK)
(InvoiceID,..., DocNumber,...)
SELECT #InvoiceID,..., MAX(Docnumber)+1,... FROM InvoicesHeader
COMMIT TRANSACTION TR1
Will it solve the issue?
Or is it better to use an ISOLATION LEVEL, NEXT VALUE FOR, or some other solution?
SQL Server already gives you thread-safe sequence generation. Read about CREATE SEQUENCE, available starting from SQL Server 2012. It is a good fit here because the sequence value is generated outside the transaction scope:
Sequence numbers are generated outside the scope of the current transaction. They are consumed whether the transaction using the sequence number is committed or rolled back.
You can then get the next value from the sequence. We have been using sequences for generating order numbers and have not found issues when multiple order numbers are generated in parallel.
SELECT NEXT VALUE FOR DocumentSequenceNumber;
Updated, based on comments: if you have four different document types, I would suggest first generating the sequence value and then concatenating it with the specific document type prefix. It will be easier to understand. At the end of the year, you can restart the sequence using ALTER SEQUENCE:
RESTART [ WITH <constant> ] — the next value that will be returned by the sequence object. If provided, the RESTART WITH value must be an integer that is less than or equal to the maximum and greater than or equal to the minimum value of the sequence object. If the WITH value is omitted, the sequence numbering restarts based on the original CREATE SEQUENCE options.
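A minimal sketch of that approach, reusing the DocumentSequenceNumber name from above (the 'INV-' prefix and zero-padding width are assumptions for illustration):

```sql
-- One sequence shared by all document types; restarted yearly
CREATE SEQUENCE dbo.DocumentSequenceNumber AS int
    START WITH 1
    INCREMENT BY 1;

-- Concatenate the document type prefix with the next sequence value
DECLARE @DocNumber varchar(20);
SELECT @DocNumber = 'INV-' +
    RIGHT('00000' + CAST(NEXT VALUE FOR dbo.DocumentSequenceNumber AS varchar(10)), 5);

-- At the end of the year, restart the numbering
ALTER SEQUENCE dbo.DocumentSequenceNumber RESTART WITH 1;
```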
What happens if while a transaction is executing, another client computer does an insert?
I have two client computers and one DB. These computers run the same program. Let's say we have an order table and a salesorder# column; this column is UNIQUE.
If both computers execute at the same time, I know SQL Server will select one of the transactions and make the other one wait. The transaction does the following:
SELECT TOP 1 @ordernumber = salesorder# + 1 FROM [order] ORDER BY salesorder# DESC
INSERT INTO [order] (salesorder#, dateship, [user]) VALUES (@ordernumber, GETDATE(), 1)
I believe that if both happened at the same time, it would just choose one of them, run it completely, and then do the same for the other one. Is that correct?
What happens in a different scenario: the transaction begins, and another INSERT (not a transaction, just an INSERT statement) is requested after the SELECT statement but before the INSERT happens.
What will SQL Server do in that situation? Is this even possible?
One word: DON'T DO THIS!! This WILL fail - for sure.
Assuming you have a table with the highest number of 100 - now you have two (or several) concurrent user requests to do an insert.
Both requests will first read the highest number (100) and each will increment it by 1, so both hold a value of 101 internally. The SELECT statement only takes a quick shared lock, so both SELECTs will succeed and both will read the value of 100. Whether or not they're inside a transaction makes no difference here; just because there's a transaction doesn't stop a second SELECT from happening.
The INSERT will in fact take an exclusive lock, but only on the row being inserted (NOT the whole table!). There's nothing stopping both concurrent requests from inserting their new row, and both will attempt to insert a row with a salesorder# value of 101. If you do have a unique constraint on this column, one of the requests will fail, since the other request has already inserted a value of 101.
This is one of the many reasons you should NOT hand out sequential IDs manually; let the database handle it, by using an INT IDENTITY column or a SEQUENCE object.
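A minimal sketch of the sequence-based alternative for the salesorder# scenario above (the sequence name and starting value are assumptions):

```sql
-- Let the database hand out the numbers atomically
CREATE SEQUENCE dbo.SalesOrderNumber AS int
    START WITH 101
    INCREMENT BY 1;

-- Each concurrent insert gets its own distinct value;
-- there is no read-then-write race window
INSERT INTO [order] (salesorder#, dateship, [user])
VALUES (NEXT VALUE FOR dbo.SalesOrderNumber, GETDATE(), 1);
```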
#marc_s thanks for your help. I believe I found a way to follow your recommendation (since you clearly know what you are talking about) and at the same time have sales numbers the way I wanted.
Our software architecture centers on a gateway. Clients send commands to the gateway, and the gateway communicates with SQL Server. What I will do is prevent the gateway from sending two INSERT commands to the same table at the same time by giving clients different priorities.
For example: if we have clients, let's call them 1, 2, and 3, doing the same action at the same time (I know it is technically almost impossible for this to happen), I can give priority to 1; then once 1 is finished, 2 will execute, and then 3. I know this is not a solution through SQL, but this way I will prevent a failed insert, and I will be able to keep IDENTITY for salesorder#, limiting the times a failed insert would happen to almost zero.
Once again, thanks for all the help and the explanations as to why not to INSERT the way I wanted to before. Any recommended books for reading up on databases?
I have a table called ticket, and it has a field called number and a foreign key called client. The number needs to work much like an auto-field (incrementing by 1 for each new record), except that the client chain needs to be able to specify the starting number. This isn't a unique field, because multiple clients will undoubtedly use the same numbers (e.g. start at 1001). In my application I'm fetching the row with the highest number and using that number + 1 for the next record's number. This all takes place inside a single transaction (the fetching and the saving of the new record). Is it true that I won't have to worry about a ticket ever getting an incorrect (duplicate) number under a high-load situation, or will the transaction protect against that possibility? (Note: I'm using PostgreSQL 9.x.)
Without locking the whole table on every insert/update, no. The way transactions work in PostgreSQL means that new rows created by concurrent transactions never conflict with each other; and that's exactly what would be happening here.
You need to make sure that updates actually cause the same rows to conflict. You would basically need to implement something similar to the mechanism used by PostgreSQL's native sequences.
What I would do is add another column to the table referenced by your client column to hold the last_val of the sequence you'll be using. Each transaction would then look something like this:
BEGIN;
SET TRANSACTION SERIALIZABLE;
UPDATE clients
SET footable_last_val = footable_last_val + 1
WHERE clients.id = :client_id;
INSERT INTO footable(somecol, client_id, number)
VALUES (:somevalue,
:client_id,
(SELECT footable_last_val
FROM clients
WHERE clients.id = :client_id));
COMMIT;
This way, if two transactions race, the UPDATE on the clients table fails with a serialization conflict before the INSERT is reached.
You do have to worry about duplicate numbers.
The typical problematic scenario is: transaction T1 reads N, and creates a new row with N+1. But before T1 commits, another transaction T2 sees N as the max for this client and creates another new row with N+1 => conflict.
There are many ways to avoid this; here is a simple piece of plpgsql code that implements one method, assuming a unique index on (client,number). The solution is to let the inserts run concurrently but in the event of a unique index violation, retry with refreshed values until it's accepted (it's not a busy loop, though, since concurrent inserts are blocked until other transactions commit)
do
$$
begin
loop
BEGIN
-- client number is assumed to be 1234 for the sake of simplicity
insert into the_table(client,number)
select 1234, 1+coalesce(max(number),0) from the_table where client=1234;
exit;
EXCEPTION
when unique_violation then -- nothing (keep looping)
END;
end loop;
end$$;
This example is a bit similar to the UPSERT implementation from the PG documentation.
It's easily transferable into a plpgsql function taking the client id as input.
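A sketch of that transfer into a function, under the assumption that the table is the_table from the loop above (the function name insert_ticket is hypothetical):

```sql
-- Retry-on-conflict insert wrapped in a function;
-- returns the number that was finally accepted
create or replace function insert_ticket(p_client int)
returns int language plpgsql as
$$
declare
  v_number int;
begin
  loop
    begin
      insert into the_table(client, number)
      select p_client, 1 + coalesce(max(number), 0)
      from the_table where client = p_client
      returning number into v_number;
      return v_number;
    exception
      when unique_violation then
        -- another transaction took this number; recompute and retry
    end;
  end loop;
end;
$$;
```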
In SQL Server, if a transaction involving the inserting of a new row gets rolled back, a number is skipped in the identity field.
For example, if the highest ID in the Foos table is 99, then we try to insert a new Foo record but roll back, then ID 100 gets 'used up' and the next Foo row will be numbered 101.
Is there any way this behaviour can be changed so that identity fields are guaranteed to be sequential?
What you are after will never work with identity columns.
They are designed to "give out" a value and forget it, by design, so that they don't cause waits or deadlocks. This property allows IDENTITY columns to be used as a sequence within a highly transactional system without delays or bottlenecks.
Guaranteeing no gaps means there is NO WAY to implement a 100-inserts-per-second system, because a very long queue would form while waiting to find out whether the first insert was going to be rolled back.
For the same reason, you normally do not want this behaviour, nor such a gap-free number sequence, for a high-volume table. However, for very infrequent, single-process tables (such as invoice numbers generated by a single monthly process), it is acceptable to put a transaction around a MAX(number)+1 or similar query, e.g.
declare @next int
update sequence_for_tbl set @next = next = next + 1
-- use @next
SQL Identity (autonumber) is Incremented Even with a Transaction Rollback
Is there a way to keep 2 sequences synchronized in Postgres?
I mean if I have:
table_A_id_seq = 1
table_B_id_seq = 1
if I execute SELECT nextval('table_A_id_seq'::regclass)
I want table_B_id_seq to take the same value as table_A_id_seq,
and obviously it must work the same the other way around.
I need 2 different sequences because I have to hack some constraints I have in Django (and that I cannot solve there).
The two tables must be related in some way? I would encapsulate that relationship in a lookup table containing the sequence and then replace the two tables you expect to be handling with views that use the lookup table.
Just use one sequence for both tables. You can't keep them in sync unless you sync them again over and over. Sequences are not transaction safe: they always roll forward, never backwards, not even on ROLLBACK.
Edit: a single sequence is also not going to work either; it doesn't give you the same number for both tables. Use a single sequence for a single table, and have the other table fetch the correct number with a subquery.
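One way to read that suggestion, sketched with assumed table and column names (table_a, table_b, an id column, and table_a's default sequence):

```sql
-- table_a owns the sequence; table_b reuses a's latest id via a subquery
INSERT INTO table_a (id, payload)
VALUES (nextval('table_a_id_seq'), 'first');

INSERT INTO table_b (id, payload)
VALUES ((SELECT max(id) FROM table_a), 'second');
```

Note this only works if the two inserts happen back to back in the same transaction; under concurrency the subquery has the same read-then-write race discussed elsewhere on this page.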
My first thought when seeing this is why do you really want to do this? This smells a little spoiled, kinda like milk does after being a few days expired.
What is the scenario that requires that these two seq stay at the same value?
Ignoring the "this seems a bit odd" feelings I'm getting in my stomach you could try this:
Put a trigger on table_a that does this on insert.
--set b seq to the value of a.
select setval('table_b_seq',currval('table_a_seq'));
The problem with this approach is that it assumes only an insert into table_a will change the table_a_seq value, and nothing else will be incrementing it. If you can live with that, this may work, in a really hackish fashion that I wouldn't release to production if it were my call.
If you really need this, make it more robust by creating a single interface for incrementing table_a_seq, such as a function, and only allowing manipulation of table_a_seq via that function. That way there is one interface for incrementing table_a_seq, and you should also put
select setval('table_b_seq',currval('table_a_seq')); into that function. Then, no matter what, table_b_seq will always be set equal to table_a_seq. That means revoking the users' grants on table_a_seq and granting them only EXECUTE on the new function.
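A sketch of such a wrapper function (the function name next_shared_id is an assumption, not from the original answer):

```sql
-- Single point of entry for incrementing table_a_seq;
-- keeps table_b_seq equal as a side effect
create or replace function next_shared_id()
returns bigint language plpgsql as
$$
declare
  v_id bigint;
begin
  v_id := nextval('table_a_seq');
  perform setval('table_b_seq', v_id);
  return v_id;
end;
$$;

-- Revoke direct access so the function is the only interface
revoke all on sequence table_a_seq, table_b_seq from public;
```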
You could put an INSERT trigger on Table_A that executes some code that increases Table_B's sequence. Now, every time you insert a new row into Table_A, it will fire off that trigger.
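A minimal sketch of that trigger, matching the sequence names from the question (the function and trigger names are assumptions):

```sql
-- After each insert on table_a, copy a's current sequence value to b's
create or replace function sync_b_seq()
returns trigger language plpgsql as
$$
begin
  perform setval('table_b_id_seq', currval('table_a_id_seq'));
  return new;
end;
$$;

create trigger trg_sync_b_seq
after insert on table_a
for each row execute procedure sync_b_seq();
```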