Ensuring unique numbers from a SQL Server database

I have an application that uses incident numbers (amongst other types of numbers). These numbers are stored in a table called "Number_Setup", which contains the current value of the counter.
When the app generates a new incident, it queries the Number_Setup table and gets the required number counter row (counters can be reset daily, weekly, etc. and are stored as ints). It then increments the counter and updates the row with the new value.
The application is multi-user (approximately 100 users at any one time, as well as SQL jobs that run, grab hundreds of incident records and request an incident number for each). The incident table has some duplicate incident numbers where there should be none.
A stored proc is used to retrieve the next counter.
SELECT @Counter = counter, @ShareId = share_id, @Id = id
FROM Number_Setup
WHERE LinkTo_ID = @LinkToId
  AND Counter_Type = 'I'

IF ISNULL(@ShareId, 0) > 0
BEGIN
    -- use parent counter
    SELECT @Counter = counter, @Id = id
    FROM Number_Setup
    WHERE Id = @ShareId
END

SELECT @NewCounter = @Counter + 1

UPDATE Number_Setup SET Counter = @NewCounter
WHERE id = @Id
I've now surrounded that block with a transaction, but I'm not entirely sure it will fix the problem 100%, as I think shared locks still allow the counter to be read by another session.
Perhaps I can check in the update statement that the counter hasn't changed:
UPDATE Number_Setup SET Counter = @NewCounter
WHERE Counter = @Counter

IF @@ERROR = 0 AND @@ROWCOUNT > 0
    COMMIT TRANSACTION
ELSE
    ROLLBACK TRANSACTION
I'm sure this is a common problem with invoice numbers in financial apps etc.
I can't move the logic into application code and use locking at that level, either.
I've also looked at HOLDLOCK but I'm not sure of its application. Should it be put on the two SELECT statements?
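Something like this is what I have in mind - taking an update lock when reading so another session can't read the same counter value - though it's just a sketch and I'm not sure the hints are placed correctly:
BEGIN TRANSACTION

-- sketch only: lock the counter row for update while it is read
SELECT @Counter = counter, @ShareId = share_id, @Id = id
FROM Number_Setup WITH (UPDLOCK, HOLDLOCK)
WHERE LinkTo_ID = @LinkToId
  AND Counter_Type = 'I'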
How can I ensure no duplicates are created?

The trick is to do the counter update and read in a single atomic operation:
UPDATE Number_Setup SET Counter = Counter + 1
OUTPUT INSERTED.Counter
WHERE id = @Id;
This, though, does not assign the new counter to @NewCounter, but instead returns it as a result set to the client. If you have to assign it, use an intermediate table variable to OUTPUT the new counter INTO:
declare @NewCounter int;
declare @tabCounter table (NewCounter int);

UPDATE Number_Setup SET Counter = Counter + 1
OUTPUT INSERTED.Counter INTO @tabCounter (NewCounter)
WHERE id = @Id;

SELECT @NewCounter = NewCounter FROM @tabCounter;
This solves the problem of making the counter increment atomic. You still have other race conditions in your procedure, because LinkTo_ID and share_id can still be updated after the first SELECT, so you can end up incrementing the counter of the wrong link-to item; that cannot be solved from this code sample alone, as it also depends on the code that actually updates share_id and/or LinkTo_ID.
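Another single-statement option, for what it's worth (a minimal sketch, assuming only one row matches @Id), is to assign the variable directly in the UPDATE; the compound assignment reads and increments the column atomically as well:
declare @NewCounter int;

-- sets the column and the variable in the same atomic statement
UPDATE Number_Setup SET @NewCounter = Counter = Counter + 1
WHERE id = @Id;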
BTW, you should get into the habit of naming your fields with a consistent case, and then using exactly that case in your T-SQL code. Your scripts run fine now only because your server has a case-insensitive collation; if you deploy to a case-sensitive collation server and your scripts don't match the exact case of the field/table names, errors will follow galore.

Have you tried using GUIDs instead of autoincrements as your unique identifier?

If you have the ability to modify your job that gets multiple records, I would change the approach so that your counter is an identity column. Then, when you get the next record, you can just do an insert and read @@IDENTITY for the table. That ensures you get the next number. You would also have to run DBCC CHECKIDENT with RESEED to reset the counter, instead of just updating the table, when you want to reset the sequence. The only issue is that you'd have to do 100 or so inserts as part of your SQL job to get a group of identities. That may be too much overhead, but using an identity column is a guaranteed way to get unique numbers.
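For reference, the reseed step mentioned above looks something like this (a sketch only; the table name Incident_Counter is just a placeholder for whatever table holds the identity):
-- resets the identity so the next insert gets 1 again
-- (assumes the table has previously contained rows)
DBCC CHECKIDENT ('Incident_Counter', RESEED, 0);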

I might be missing something, but it seems like you are trying to reinvent technology that has already been solved by most databases.
Instead of reading and updating the 'Counter' column in the Number_Setup table, why don't you just use an auto-incrementing primary key for your counter? You'll never get a duplicate value for a primary key.
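As a rough sketch of that idea (the table and column names here are made up for illustration), letting the database hand out the numbers would look something like:
CREATE TABLE Incident_Numbers (
    IncidentNumber int IDENTITY(1,1) PRIMARY KEY,
    CreatedAt datetime NOT NULL DEFAULT GETDATE()
);

-- each caller inserts a row and reads back the number it was assigned
INSERT INTO Incident_Numbers DEFAULT VALUES;
SELECT SCOPE_IDENTITY() AS NewIncidentNumber;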

Related

Insert from select or update from select with commit every 1M records

I've already seen a dozen such questions, but most of them get answers that don't apply to my case.
First off, the database I am trying to get the data from sits on a very slow network and is connected to over VPN.
I am accessing it through a database link.
I have full read/write access on my schema tables, but I don't have DBA rights, so I can't create dumps and I don't have grants for creating new tables, etc.
I've been trying to copy the database locally, and all is well except for one table.
It has 6.5 million records and 16 columns.
There was no problem getting 14 of them, but the remaining two are CLOBs with huge XML in them.
The data transfer is so slow it is painful.
I tried:
insert based on select
insert all 14 columns, then update the other 2
create table as select
insert based on a conditional select, so I get only so many records, and commit manually
The issue is mainly that the connection is lost before the transaction finishes (or power loss, or the VPN drops, or a random error, etc.) and all the GBs that have been downloaded are discarded.
As I said, I tried putting in conditions so I only get a few records at a time, but even this is a bit random and requires constant attention from me.
Something like:
Insert into TableA
Select * from TableA@DB_RemoteDB1
WHERE CREATION_DATE BETWEEN to_date('01-Jan-2016') AND to_date('31-DEC-2016')
Sometimes it works, sometimes it doesn't. Just after a few GBs, Toad gets stuck running, but when I look at its throughput it is 0 KB/s or a few bytes/s.
What I am looking for is a loop or a cursor that can be used to get maybe 100,000 or 1,000,000 records at a time, commit them, then go for the rest until it is done.
This is a one-time operation; we need the data locally for testing, so I don't care if it is inefficient, as long as the data is brought in in chunks and a commit saves me from retrieving it again.
I already count about 15 GB of failed downloads over the last 3 days, and my local table still has 0 records, as all my attempts have failed.
Server: Oracle 11g
Local: Oracle 11g
Attempted Clients: Toad/Sql Dev/dbForge Studio
Thanks.
You could do something like:
begin
  loop
    insert into tablea
    select * from tablea@DB_RemoteDB1 a_remote
    where not exists (select null from tablea where id = a_remote.id)
    and rownum <= 100000; -- or whatever number makes sense for you

    exit when sql%rowcount = 0;
    commit;
  end loop;
end;
/
This assumes that there is a primary/unique key you can use to check whether a row in the remote table already exists in the local one - in this example I've used a vague ID column, but replace that with your actual key column(s).
For each iteration of the loop it will identify rows in the remote table which do not exist in the local table - which may be slow, but you've said performance isn't a priority here - and then, via rownum, limit the number of rows being inserted to a manageable subset.
The loop then terminates when no rows are inserted, which means there are no rows left in the remote table that don't exist locally.
This should be restartable, thanks to the commit and the where not exists check. This isn't usually a good approach - it kind of breaks normal transaction handling - but as a one-off, and with your network issues/constraints, it may be necessary.
Toad is right: using bulk collect would be (probably significantly) faster in general, as the query isn't repeated each time around the loop:
declare
  cursor l_cur is
    select * from tablea@dblink3 a_remote
    where not exists (select null from tablea where id = a_remote.id);
  type t_tab is table of l_cur%rowtype;
  l_tab t_tab;
begin
  open l_cur;
  loop
    fetch l_cur bulk collect into l_tab limit 100000;
    forall i in 1..l_tab.count
      insert into tablea values l_tab(i);
    commit;
    exit when l_cur%notfound;
  end loop;
  close l_cur;
end;
/
This time you would change the limit 100000 to whatever number you think is sensible. There is a trade-off here, though, as the PL/SQL table will consume memory, so you may need to experiment a bit to pick that value - you could get errors or affect other users if it's too high. Lower is less of a problem, except that the bulk inserts become slightly less efficient.
But because you have a CLOB column (holding your XML), this won't work for you, as @BobC pointed out; the insert ... select is supported over a DB link, but the collection version will get an error from the fetch:
ORA-22992: cannot use LOB locators selected from remote tables
ORA-06512: at line 10
22992. 00000 - "cannot use LOB locators selected from remote tables"
*Cause: A remote LOB column cannot be referenced.
*Action: Remove references to LOBs in remote tables.

SQL Server: incrementing a non-identity int column by procedure call

I have a column in a DB table which has to be incremented when, let's say, some item is selected. But it can be selected in parallel, and for every record it has to start from 0. My solution is to increment the value from a DB procedure, but can I be sure that the first procedure manages to increment the value before another procedure reads the value to increment? I mean:
t0 Value is 10
t1 Procedure1 valueToInc = Value
t2 Procedure2 valueToInc = Value
t3 Procedure1 valueToInc ++
t4 Procedure2 valueToInc ++
t5 Value = 11
t6 Value = 11
The value written back from Procedure1 is 11, but from Procedure2 it is obviously also 11 (I need to guarantee 12 there).
I have also checked identity (property) and sequence (Transact-SQL) but nothing seems to be suitable for me.
Edit
What I'm trying to solve: I have a console application - a TCP server - and an MSSQL database with a User table. Each time a user logs in, I have to increment that user's LoginCount field. Any parallel access here should not be possible, or is manageable from code, I know, but I was told that I have to handle parallel access in the database, not just use an update query. It's a job interview project...
I wanted to make it easier to understand with my first explanation, but it won't work.
You can just use:
UPDATE Users
SET LoginCount = ISNULL(LoginCount, 0) + 1
WHERE UserId = @UserId
This is entirely safe under conditions of concurrency.
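If the caller also needs the value it just produced, the same statement can return it atomically via an OUTPUT clause (a small sketch building on the update above):
UPDATE Users
SET LoginCount = ISNULL(LoginCount, 0) + 1
OUTPUT INSERTED.LoginCount   -- hands the new count back to the caller
WHERE UserId = @UserId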
Use a transaction with transaction isolation level equal to SERIALIZABLE.
SERIALIZABLE
Statements cannot read data that has been modified but not yet committed by other transactions.
No other transactions can modify data that has been read by the current transaction until the current transaction completes.
Other transactions cannot insert new rows with key values that would fall in the range of keys read by any statements in the current transaction until the current transaction completes.
Don't load the Value to increment it: increment it, then select it (within the transaction). This will lock the table/row (depending) from updates/selects of other transactions.
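A minimal sketch of that pattern, reusing the Users/LoginCount names from the other answer:
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;

BEGIN TRANSACTION;

    -- increment first, then read the value back inside the same transaction
    UPDATE Users SET LoginCount = ISNULL(LoginCount, 0) + 1 WHERE UserId = @UserId;
    SELECT LoginCount FROM Users WHERE UserId = @UserId;

COMMIT TRANSACTION;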

safe tsql Numbering int techniques

Hello StackOverflowers!
I was wondering if there's a way to safely get a series of numbers inside transactions, just like identity does. My only purpose is grouping rows in tables, and I don't mean ROW_NUMBER().
I've come up with this simple query; is this safe?
The table has its own identity key.
declare @mynextSecuenceNumber int

select @mynextSecuenceNumber = isnull(max(secuencenumber + 1), 1) from mytable

insert into mytable (productID, customer, secuencenumber) values (@someval, @anotherval, @mynextSecuenceNumber)
EDIT
THE BACKGROUND
The reason for doing this is the following:
First, I'm receiving auto parts for car services, and I generate a ticket for that reception (I can receive one, two, or three auto parts). Later on I can continue receiving auto parts for that specific car service, from the same auto-part provider or from a different one, but I want to be able to re-generate that event or ticket. Otherwise I'll end up querying the service and all the auto parts associated with it or with the provider, and I won't know which parts I received in that particular operation. On top of that, I need another specific id for all the auto parts associated with that car service.
By the way, I'm on SQL Server 2008.
Heads up
Using identity as the sequence number can be messy, because transactions still increment the value after rolling back, among other issues, so be aware of that. Thanks to the approach provided in my accepted answer I found another way that gets along with transactions; it's the first one to appear in the link.
Here's a scalable recommendation from Microsoft for when SQL Server 2012 or higher isn't an option, but you need to manage sequence numbers without identities in the target table. This creates a separate table to track the sequence numbers, lets identity do some of the heavy lifting, and keeps the table size minimal by cleaning up. If this load becomes too much, you can schedule the cleanup for an off-peak time.
-- Create table for tracking the sequence
create table <tablename> (
    SeqID int identity(1,1) primary key
)
GO

-- Create procedure to return next sequence value
create procedure GetNewSeqVal_<tablename>
    @NextValue int output
as
begin
    declare @NewSeqValue int

    set nocount on

    insert into <tablename> DEFAULT VALUES

    set @NewSeqValue = scope_identity()

    delete from <tablename> with (readpast)

    set @NextValue = @NewSeqValue
end
go

-- Get next sequence
declare @seqId int
exec GetNewSeqVal_<tablename> @NextValue = @seqId OUTPUT
For more info: http://blogs.msdn.com/b/sqlcat/archive/2006/04/10/sql-server-sequence-number.aspx
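For completeness: on SQL Server 2012 and later the same need is covered natively by a sequence object; a minimal sketch (the sequence name is an assumption) looks like this:
CREATE SEQUENCE dbo.TicketSeq START WITH 1 INCREMENT BY 1;
GO

-- each call returns the next number, independent of any table
DECLARE @seqId int = NEXT VALUE FOR dbo.TicketSeq;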

SQL Server - Neutralizing a trigger during SP execution

I have two tables, Orders and App.
App is a "helper" table which is populated according to Orders, and then passes the information on via web service to smart phones.
In order to populate App, we have created a parameterized stored procedure which runs at specific times, fluidly passing data from Orders to App.
But some updates to Orders are not caught by this stored procedure, so we were asked to create a trigger on Orders which executes this SP in these specific instances. This, too, works fine.
The problem starts when updates arrive from smart phones to the table App. The same parameterized SP runs "in reverse" to update the fields in Orders, and this works well - except that doing so can fire our supposedly selective trigger, resulting in redundant updates. To demonstrate:
New row in Orders > SP > Row is written in App > App updated by application > SP > Corresponding row in Orders is updated > Trigger catches this update, firing the SP again.
In this chain, only the last step is a problem.
I have tried using DISABLE TRIGGER and ENABLE TRIGGER within the SP to avoid this problem, but this is risky business and certainly cannot be the best possible way.
The solution I'm working on now uses a field which is updated during application updates to Orders, but is not updated at any other time. For instance:
UPDATE Orders
SET Orders.StartTime = getdate(),
    Orders.EndTime = CASE ... END,
    Orders.Unique_Field = X
WHERE Orders.ID = @APPID
In standard updates to Orders, the field Unique_Field is not included in any INSERT or UPDATE statements. However, in some updates from App, this field may remain NULL.
My question is: What is the proper and safe way to tell my trigger to ignore any updates that arrive from my SP?
At present, my trigger looks like this:
AFTER UPDATE, INSERT
NOT FOR REPLICATION
AS
BEGIN
    DECLARE @BUILDORDERCHECK AS DATETIME
    DECLARE @ORDERDATECHECK AS DATETIME
    DECLARE @ORDERNO AS INT
    DECLARE @CHECKER AS TINYINT

    SELECT @BUILDORDERCHECK = I.UpdateRecordDate,
           @ORDERDATECHECK = I.OrderDate,
           @ORDERNO = I.OrderNo,
           @CHECKER = CASE WHEN NOT EXISTS (SELECT Unique_Field FROM Inserted) THEN 1 ELSE 0 END
    FROM Inserted I

    IF @BUILDORDERCHECK IS NOT NULL
       AND @ORDERDATECHECK >= dateadd(day, -2, getdate())
       AND @CHECKER = 1
       -- Does not fire from BuildOrder
       -- Does not fire on tasks older than 2 days
    BEGIN
        EXECUTE [dbo].[Asp_Apper;1] 0, -- CallCode, DO NOT CHANGE
                                    1, -- Auto,
                                    1, -- AOK,
                                    0, -- CancelMsg,
                                    0, -- TrailerNo
                                    1  -- RejectMsg
    END
END
@BUILDORDERCHECK and @ORDERDATECHECK work fine and behave as expected, but I need to find the right way for my trigger to check whether Unique_Field was included in the update statement, without getting entangled by NULLs. As I said, Unique_Field can be updated by the SP to a value of NULL, so simply checking for NULL doesn't work.
Thanking you all in advance for any thoughts...
EDIT: It's already been pointed out that this trigger seems to ignore cases where more than one row is updated, which is accurate. Usually, we wouldn't build triggers like this; but in this case, updates to Orders are only ever row-by-row, and never in groups. The only time that this isn't the case is when the SP runs, which we want to ignore anyway.
I would use CONTEXT_INFO and SET CONTEXT_INFO, something like this:
In the trigger, add a check at the top that bails out if a particular context value is set:
IF ISNULL(CONTEXT_INFO(),0x0) = 0x49204C696B6520426967204275747473
RETURN
And then in the (parts of) the stored procedures where you want to take actions that are ignored, just set that same value:
SET CONTEXT_INFO 0x49204C696B6520426967204275747473;
--Code that shouldn't cause the trigger to fire
SET CONTEXT_INFO 0x0
Which keeps things nicely contained (unlike disabling the trigger, which has global effects).
Also, I know you've already stated in comments that this trigger only needs to work for single-row updates, but it would be an automatic failure in code review for me for any trigger that doesn't properly deal with multiple rows existing in inserted (or, at the very least, checks the number of rows and gives a clear error message if the requirement of single-row updates hasn't been fulfilled). A sketch of that guard is below.
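Something along these lines at the top of the trigger would do as that guard (a sketch only; the message text and severity are up to you):
-- bail out loudly if the single-row assumption is ever violated
IF (SELECT COUNT(*) FROM inserted) > 1
BEGIN
    RAISERROR('Trigger on Orders expects single-row updates only.', 16, 1);
    ROLLBACK TRANSACTION;
    RETURN;
END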

Stored Procedure - loop through results without cursor

Everywhere I look, I see that in order to loop through results you have to use a cursor, and in the same post someone saying cursors are bad, don't use them (which has always been my philosophy). But now I am stuck: I need to loop through a result set!
Here's the situation. I need to come up with a list of ProductIDs that have 2 different statuses set to a specific value. I start the stored procedure and run the query that finds the products that meet the criteria.
So now I have a list of ProductIDs that I need to run through my validation process:
16050
16052
41817
48255
Now, for each of those products (there may be 1, there may be 1000, I don't know), I need to check a whole list of conditions:
Is a specific field = 'SIMPLE'? If so, perform a bunch of other queries and make sure everything is good.
If it is not 'SIMPLE', then run a whole other set of queries and make sure that information is all good.
Is another field = 'YES'? If so, perform a bunch of other queries; if not, then do other queries.
Is a cursor what I need to use? Is there some other way to do what I need that I just am not seeing?
Thanks,
Leslie
I ended up using a WHILE loop so that I can pass each ProductID into a series of checks!
declare @counter int
declare @productKey varchar(20)

SET @counter = (select COUNT(*) from ##Magento)

while (1 = 1)
begin
    SET @productKey = (select top 1 ProductKey from ##Magento)

    print @productKey;

    delete from ##Magento where ProductKey = @productKey

    SET @counter -= 1;
    IF (@counter = 0) BREAK;
end
go
It's hard to say without knowing the specifics of your process, but one approach is to create a function that performs your logic and call that.
eg:
delete from yourtable
where productid in (select ProductID from FilteredProducts)
and dbo.ShouldBeDeletedFunction(ProductID) = 1
In general, cursors are bad, but there are always exceptions. Try to avoid them by thinking in terms of sets, rather than the attributes of an individual record.
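To make that concrete, here's a hedged sketch of the kind of scalar function the snippet above assumes; the table and column names (Products, ProductType) and the check inside are placeholders for the 'SIMPLE'/'YES' rules from the question:
CREATE FUNCTION dbo.ShouldBeDeletedFunction (@ProductID int)
RETURNS bit
AS
BEGIN
    DECLARE @result bit = 0;

    -- placeholder for the real validation rules
    IF EXISTS (SELECT 1 FROM Products
               WHERE ProductID = @ProductID
                 AND ProductType = 'SIMPLE')
        SET @result = 1;

    RETURN @result;
END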
