Most of my databases use IDENTITY columns as primary keys. I am writing a change log/audit trail in the database and want to use a BIGINT ID to keep track of the changes sequentially.
While BIGINT is pretty big, it will run out of numbers one day and my design will fail to function properly at that point. I have been aware of this problem with my other ID columns and intended to eventually convert to GUIDs/UUIDs as I have used on Postgres in the past.
GUIDs take 16 bytes and BIGINT takes 8. For my current task, I would like to stay with BIGINT for the space savings and the sequencing. Under Postgres, I created a custom sequence with the first two digits as the current year and a fixed number of digits as the sequence within the year. The sequence generator automatically reset the sequence when the year changed.
SQL Server 2008 has no sequence generator. My research has turned up some ideas most of which involve using a table to maintain the sequence number, updating that within a transaction, and then using that to assign to my data in a separate transaction.
I want to write an SP or function that will update the sequence and return me the new value when called from a trigger on the target table before a row is written. There are many ideas but all seem to talk about locking issues and isolation problems.
Does anyone have a suggestion on how to automate this ID assignment, protect the process from assigning duplicates in a concurrent write, and prevent lock latency issues?
The stored procedure is prone to issues like blocking and deadlocks. However, there are ways around that.
For now, why not start the ID off at the bottom of the negative range?
CREATE TABLE FOO
(
ID BIGINT IDENTITY(-9223372036854775808, 1)
)
That gives you a range from -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807.
Are you really going to eat up 2^63 + 2^63 numbers?
If you are still committed to the other solution, I can give you a piece of working code. However, application locks and Serializable isolation have to be used.
It is still prone to timeouts or blocking depending upon the timeout setting and the server load.
In short: SQL Server 2012 introduced sequences. That is basically what you want.
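For completeness, the table-backed approach mentioned above could be sketched roughly like this, using sp_getapplock to serialize callers. The table and procedure names here are made up, and this is only a sketch of the locking pattern, not production code:

```sql
-- Hypothetical sequence table: one row per named sequence
CREATE TABLE dbo.SequenceTable (SeqName sysname PRIMARY KEY, NextValue bigint NOT NULL);

CREATE PROCEDURE dbo.GetNextSequence
    @SeqName sysname,
    @NextValue bigint OUTPUT
AS
BEGIN
    SET NOCOUNT ON;
    BEGIN TRAN;
    -- Serialize callers on this sequence name so concurrent writers
    -- can never be handed the same value
    EXEC sp_getapplock @Resource = @SeqName, @LockMode = 'Exclusive',
                       @LockOwner = 'Transaction', @LockTimeout = 5000;
    -- Increment and capture the new value in a single statement
    UPDATE dbo.SequenceTable
       SET @NextValue = NextValue = NextValue + 1
     WHERE SeqName = @SeqName;
    COMMIT; -- releases the applock taken with @LockOwner = 'Transaction'
END
```

The @LockTimeout is what makes this prone to timeouts under load, as noted above: callers queue up behind the exclusive lock.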
Related
I'm teaching a Microsoft SQL Server course and a student asked if there was a practical reason why we would ever set the auto increment as something other than 1,1 (seed starting at 1, increment by 1). Is there a practical reason why we would ever set the seed to something higher or increment by a value other than 1? Shouldn't it not matter as long as that value is unique?
If there is no practical reason, why do we have the option to set those values for Identity in Microsoft SQL Server?
There are a lot of practical reasons for having a configurable start value:
You may want to insert a few predefined records with well-known IDs, e.g. Missing, Unknown, and Not Applicable records in a dimension or lookup table should probably have predefined IDs. New rows should get IDs outside the range of the predefined numbers.
After loading or replicating data with existing ID values, new records should get IDs that don't conflict with the imported data. The easiest way to do this is by setting the starting ID somewhere above the maximum imported ID.
TRUNCATE TABLE resets the IDENTITY value. To avoid generating duplicate IDs, you need to reseed the table with DBCC CHECKIDENT and set the current value to something other than 1.
There are certainly dozens of other reasons.
If you are using a signed integer and start at 1, you're only using half the available range. Default should really be to start at minimum for the data type, for example -2 billion for 32 bit int.
In some cases, you may want to combine data from multiple tables. In that case each table should keep a separate range of IDs. You could have each table start at a different number to prevent overlaps, for example start one at 1 and another at 1 billion. Or you could use odd numbers for one (seed 1, increment 2) and even numbers for the other (seed 2, increment 2).
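The interleaved odd/even trick in the last paragraph looks like this (table names are invented for illustration):

```sql
-- TableA generates 1, 3, 5, ... ; TableB generates 2, 4, 6, ...
-- so the two ranges never collide when combined
CREATE TABLE dbo.TableA (ID int IDENTITY(1, 2) PRIMARY KEY, Val varchar(10));
CREATE TABLE dbo.TableB (ID int IDENTITY(2, 2) PRIMARY KEY, Val varchar(10));
```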
I'm making a program that has two different tables (well, more, but those are the ones I have an issue with): one called SYNPaymentHistory, and the other called OTHERSPaymentHistory.
They have almost the same columns, except that SYNPaymentHistory includes an "ID" number for each Syndicate. The other table is for any random payment the company receives other than from the Syndicates.
I made a page in which a person fills out a payment application, and when all that is done, it should print out a receipt. Receipts have a SerialNb, which is a column found in both tables (it's an INT column with Identity Specification that increases by 1 on every insert).
My issue is that I want the SerialNb to be synchronized between both of them.
Ex: say I just filled out a payment application from a Syndicate, SerialNb on the top of the receipt should say 5001. If I want to fill a payment application from tickets the company earned due to a party, I'd want that receipt to have the SerialNb of 5002.
Is there some way to link 2 columns that are from 2 different tables? I think a WHILE loop can half-solve the issue: if one of them auto-increases by 1, the other could have a WHILE loop that, if i = SYNPaymentHistory.SerialNb, then i = i + 1 (i being OTHERSPaymentHistory's value). But it wouldn't work the other way around, because SYNPaymentHistory would end up not caring about OTHERSPaymentHistory's values.
Is it, in any way possible, related to diagrams? I couldn't properly understand the usage of diagrams so I'm hoping that's not the way to work it through.
If you need any additional information, I'd love to Edit in the info needed.
Programs used:
Visual Studio 2015,
SQL Server Management Studio 2014
SQL Server has a feature to address just this issue. It is called a Sequence. You create a sequence, and at any time you can ask SQL Server to give you the next value in the sequence. Each request is guaranteed to have a unique result in increasing order.
Create a sequence -- when you want to make a row in either table get the next value from the sequence and use that.
In addition to solving your problem, using a sequence also solves the problem of multiple instances of an application running at the same time. You don't have to worry: each instance of the application will get its own number.
https://msdn.microsoft.com/en-us/library/ff878091.aspx
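Assuming SQL Server 2012 or later, and that SerialNb is changed to a plain INT column (not IDENTITY, since a column can't use both), the shared counter might look roughly like this. The sequence and constraint names here are made up:

```sql
-- One sequence shared by both tables, starting where the receipts left off
CREATE SEQUENCE dbo.ReceiptSerial AS int START WITH 5001 INCREMENT BY 1;

-- Both tables draw SerialNb from the same sequence via a DEFAULT,
-- so inserts into either table continue the single numbering
ALTER TABLE dbo.SYNPaymentHistory
    ADD CONSTRAINT DF_SYN_SerialNb
    DEFAULT (NEXT VALUE FOR dbo.ReceiptSerial) FOR SerialNb;
ALTER TABLE dbo.OTHERSPaymentHistory
    ADD CONSTRAINT DF_OTHERS_SerialNb
    DEFAULT (NEXT VALUE FOR dbo.ReceiptSerial) FOR SerialNb;
```

With this in place, a Syndicate receipt gets 5001 and the next company receipt gets 5002, regardless of which table it lands in.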
Requirement:
To count the number of times a procedure has executed
From what I understand so far, sys.dm_exec_procedure_stats can be used for an approximate count, but that's only since the last service restart. I found this link on this website relevant, but I need the count to be precise, and it should not be flushed after a service restart.
Can I have some pointers on this, please?
Hack: The procedure I need to keep track of has a select statement, so it returns some rows that are stored in a permanent table called Results. The simplest solution I can think of is to create a column in the Results table to keep track of the procedure execution, select the maximum value from this column before the insert, and add one to it to increment the count. This solution seems quite stupid to me as well, but it's the best I could think of.
What I thought is you could create a sequence object, assuming you're on SQL Server 2012 or newer.
CREATE SEQUENCE ProcXXXCounter
AS int
START WITH 1
INCREMENT BY 1;
And then in the procedure fetch a value from it:
declare @CallCount int
select @CallCount = NEXT VALUE FOR ProcXXXCounter
There is of course a small overhead with this, but it doesn't cause the kind of blocking that can happen with a table-based approach, because sequence values are generated outside of transaction scope.
Sequence parameters: https://msdn.microsoft.com/en-us/library/ff878091.aspx
The only way I can think of to keep track of the number of executions even when the service has restarted is to have a table in your database and insert a row into that table inside your procedure every time it is executed.
Maybe add a datetime column as well to collect more info about the execution, and a column for the user who executed it, etc.
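A minimal sketch of that logging-table idea (all names here are illustrative):

```sql
-- One row per execution; defaults capture when and by whom
CREATE TABLE dbo.ProcExecutionLog (
    ExecutedAt datetime NOT NULL DEFAULT GETDATE(),
    ExecutedBy sysname  NOT NULL DEFAULT SUSER_SNAME()
);

-- Inside the procedure being tracked:
INSERT INTO dbo.ProcExecutionLog DEFAULT VALUES;

-- Lifetime execution count, surviving service restarts:
SELECT COUNT(*) AS ExecCount FROM dbo.ProcExecutionLog;
```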
This can be done, easily and without Enterprise Edition, by using extended events. The sqlserver.module_end event will fire, set your predicates correctly and use a histogram target.
http://sqlperformance.com/2014/06/extended-events/predicate-order-matters
https://technet.microsoft.com/en-us/library/ff878023(v=sql.110).aspx
To consume the value, query the histogram target (see the examples under "Reviewing Target Output" in the linked documentation).
In SQL Server, if a transaction involving the inserting of a new row gets rolled back, a number is skipped in the identity field.
For example, if the highest ID in the Foos table is 99, then we try to insert a new Foo record but roll back, then ID 100 gets 'used up' and the next Foo row will be numbered 101.
Is there any way this behaviour can be changed so that identity fields are guaranteed to be sequential?
What you are after will never work with identity columns.
They are designed to "give out" a value and forget it, by design, so that they don't cause waits or deadlocks. This property allows IDENTITY columns to be used as a sequence within a highly transactional system with no delay or bottlenecks.
Guaranteeing no gaps means there is NO WAY to implement a 100-inserts-per-second system, because there would be a very long queue of inserts waiting to find out whether the 1st insert was going to be rolled back.
For the same reason, you normally do not want this behaviour, nor such a number sequence for a high volume table. However, for very infrequent, single-process tables (such as invoice number by a single process monthly), it is acceptable to put a transaction around a MAX(number)+1 or similar query, e.g.
declare @next int
update sequence_for_tbl set @next = next = next + 1
.. use @next
SQL Identity (autonumber) is Incremented Even with a Transaction Rollback
In Oracle there is a mechanism to generate sequence numbers e.g.;
CREATE SEQUENCE supplier_seq
MINVALUE 1
MAXVALUE 999999999999999999999999999
START WITH 1
INCREMENT BY 1
CACHE 20;
And then execute the statement
supplier_seq.nextval
to retrieve the next sequence number.
How would you create the same functionality in MS SQL Server ?
Edit: I'm not looking for ways to automatically generate keys for table records. I need to generate a unique value that I can use as a (logical) ID for a process. So I need the exact functionality that Oracle provides.
There is no exact match.
The equivalent is IDENTITY, which you can set as a column property while creating a table. SQL Server will automatically create a running sequence number during insert.
The last inserted value can be obtained by calling SCOPE_IDENTITY() or by consulting the system variable ##IDENTITY (as pointed out by Frans)
If you need the exact equivalent, you would need to create a table and then write a procedure to return the next value and handle other operations. See Mark's response for pitfalls with this.
Edit:
SQL Server has implemented sequences, similar to Oracle's. Please refer to this question for more details.
How would you implement sequences in Microsoft SQL Server?
Identity is the best and most scalable solution, BUT, if you need a sequence that is not an incrementing int, like 00A, 00B, 00C, or some special sequence, there is a second-best method. If implemented correctly, it scales OK, but if implemented badly, it scales badly. I hesitate to recommend it, but what you do is:
You have to store the "next value" in a table. The table can be a simple, one row, one column table with just that value. If you have several sequences, they can share the table, but you might get less contention by having separate tables for each.
You need to write a single update statement that will increment that value by 1 interval. You can put the update in a stored proc to make it simple to use and prevent repeating it in code in different places.
Using the sequence correctly, so that it will scale reasonably (no, not as well as Identity :-), requires two things: a. the update statement has a special syntax made for this exact problem that will both increment and return the value in a single statement; b. you have to fetch the value from the custom sequence BEFORE the start of a transaction and outside the transaction scope. That is one reason Identity scales -- it returns a new value irrespective of transaction scope, for any attempted insert, but does not roll back on failure. That means it won't block, and also means you'll have gaps for failed transactions.
The special update syntax varies a little by version, but the gist is that you do an assignment to a variable and the update in the same statement. For 2008, Itzik Ben-Gan has this neat solution: http://www.sqlmag.com/Articles/ArticleID/101339/101339.html?Ad=1
The old-school 2000 and later method looks like this:
UPDATE SequenceTable SET @localVar = value = value + 5
-- change the tail end to your increment logic
This will both increment and return you the next value.
If you absolutely cannot have gaps (resist that requirement :-) then it is technically possible to put that update or proc inside the rest of your transaction, but you take a BIG concurrency hit, as every insert waits for the prior one to commit.
I can't take credit on this; I learned it all from Itzik.
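Putting those pieces together, a rough sketch of the pattern (the table and procedure names are invented; the increment logic is the simple +1 case):

```sql
-- One-row table holding the current sequence value
CREATE TABLE dbo.SequenceTable (value int NOT NULL);
INSERT INTO dbo.SequenceTable (value) VALUES (0);

CREATE PROCEDURE dbo.GetNextValue
    @next int OUTPUT
AS
BEGIN
    SET NOCOUNT ON;
    -- Increment-and-return in a single atomic statement,
    -- so no explicit transaction is needed around it
    UPDATE dbo.SequenceTable SET @next = value = value + 1;
END
```

Called outside of any enclosing transaction, as the answer recommends:

```sql
DECLARE @id int;
EXEC dbo.GetNextValue @next = @id OUTPUT;
```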
Make the field an Identity field. The field will get its value automatically. You can obtain the last inserted value by calling SCOPE_IDENTITY() or by consulting the system variable ##IDENTITY.
The SCOPE_IDENTITY() function is preferred.
As DHeer said, there is absolutely no exact match. If you try to build your own procedure to do this, you will invariably stop your application from scaling.
Oracle's sequences are highly scalable.
OK, I take it back slightly. If you're really willing to focus on concurrency, and you're willing to take numbers out of order as is possible with a sequence, you have a chance. But since you seem rather unfamiliar with T-SQL to begin with, I would start looking for other options while porting an Oracle app to MSSS (is that what you're doing?).
For instance, just generate a GUID in the "nextval" function. That would scale.
Oh, and DO NOT use a table for all the values, just to persist your max value in the cache. You'd have to lock it to ensure you give out unique values, and that is where you'll stop scaling. You'll have to figure out whether there's a way to cache values in memory and get programmatic access to some sort of lightweight locks (memory locks, not table locks).
I wish that SQL Server had this feature. It would make so many things easier.
Here is how I have gotten around this.
Create a table called tblIdentities. In this table put a row with your min and max values and how often the Sequence number should be reset. Also put the name of a new table (call it tblMySeqNum). Doing this makes adding more Sequence Number generators later fairly easy.
tblMySeqNum has two columns. ID (which is an int identity) and InsertDate (which is a date time column with a default value of GetDate()).
When you need a new seq num, call a sproc that inserts into this table and use SCOPE_IDENTITY() to get the identity created. Make sure you have not exceeded the max in tblIdentities. If you have, return an error; if not, return your sequence number.
Now, to reset and clean up. Have a job that runs as regularly as needed and checks all the tables listed in tblIdentities (just one for now) to see if they need to be reset. If they have hit the reset value or time, then call DBCC CHECKIDENT with RESEED on the name of the table listed in the row (tblMySeqNum in this example). This is also a good time to clear out the extra rows that you don't really need in that table.
DON'T do the cleanup or reseeding in your sproc that gets the identity. If you do then your sequence number generator will not scale well at all.
As I said, it would make so many things easier if this feature were in SQL Server, but I have found that this workaround functions fairly well.
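A rough sketch of that setup (simplified to one generator; column names are guesses based on the description above):

```sql
-- Control table: one row per generator, with its allowed range
CREATE TABLE dbo.tblIdentities (
    TableName sysname PRIMARY KEY,
    MinVal int NOT NULL,
    MaxVal int NOT NULL
);

-- The generator itself: just an identity plus an insert timestamp
CREATE TABLE dbo.tblMySeqNum (
    ID int IDENTITY(1, 1) PRIMARY KEY,
    InsertDate datetime NOT NULL DEFAULT GETDATE()
);

-- Getting a new sequence number (inside the sproc):
DECLARE @seq int;
INSERT INTO dbo.tblMySeqNum DEFAULT VALUES;
SET @seq = SCOPE_IDENTITY();
-- ...check @seq against MaxVal in tblIdentities here...

-- Periodic reseed, done by the maintenance job, NOT the sproc:
-- DBCC CHECKIDENT ('dbo.tblMySeqNum', RESEED, 0);
```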
Vaccano
If you are able to update to SQL Server 2012 you can use SEQUENCE objects. Even SQL Server 2012 Express has support for sequences.
CREATE SEQUENCE supplier_seq
AS DECIMAL(38)
MINVALUE 1
MAXVALUE 999999999999999999999999999
START WITH 1
INCREMENT BY 1
CACHE 20;
SELECT NEXT VALUE FOR supplier_seq
SELECT NEXT VALUE FOR supplier_seq
SELECT NEXT VALUE FOR supplier_seq
SELECT NEXT VALUE FOR supplier_seq
SELECT NEXT VALUE FOR supplier_seq
Results in:
---------------------------------------
1
(1 row(s) affected)
---------------------------------------
2
(1 row(s) affected)
---------------------------------------
3
(1 row(s) affected)
---------------------------------------
4
(1 row(s) affected)
---------------------------------------
5
(1 row(s) affected)
Just take care to specify the right data type. If I hadn't specified it, the MAXVALUE you've provided wouldn't have been accepted; that's why I've used DECIMAL with the highest precision possible.
More on SEQUENCES here: http://msdn.microsoft.com/en-us/library/ff878091.aspx
This might have already been answered a long time ago... but from SQL 2005 onwards you can use the ROW_NUMBER function... an example would be:
select ROW_NUMBER() OVER (ORDER BY productID) as DynamicRowNumber, xxxxxx,xxxxx
The OVER statement uses the ORDER BY for the unique primary key in my case...
Hope this helps... no more temp tables, or strange joins!!
Not really an answer, but it looks like sequences are coming to SQL Server in 2012.
http://www.sql-server-performance.com/2011/sequence-sql-server-2011/
Not an exact answer but addition to some existing answers
SCOPE_IDENTITY (Transact-SQL)
SCOPE_IDENTITY, IDENT_CURRENT, and ##IDENTITY are similar functions
because they return values that are inserted into identity columns.
IDENT_CURRENT is not limited by scope and session; it is limited to a
specified table. IDENT_CURRENT returns the value generated for a
specific table in any session and any scope. For more information, see
IDENT_CURRENT (Transact-SQL).
This means two different sessions can read the same value from IDENT_CURRENT, since it is not limited to the current scope or session. Exactly because of this
IDENT_CURRENT is not limited by scope and session; it is limited to a specified table.
we need to use SCOPE_IDENTITY(), because it gives us the value generated within our own session and scope, and uniqueness is provided by the identity itself.
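A quick illustration of SCOPE_IDENTITY() after an insert (the table name is made up):

```sql
CREATE TABLE dbo.Orders (ID int IDENTITY(1, 1) PRIMARY KEY, Amount money);

INSERT INTO dbo.Orders (Amount) VALUES (9.99);

-- Returns the identity value generated by THIS session and scope,
-- unaffected by concurrent inserts from other sessions
SELECT SCOPE_IDENTITY() AS NewOrderID;
```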