Best practice for writing with primary key constraint?


Before each UPDATE/INSERT statement, should I:
Use IF EXISTS to test for the primary key
Just let the statement fail if the primary key is already there (and rely on @@ROWCOUNT if I have some logic related to the primary key already being there)
Use TRY...CATCH to catch an error (raised by the UPDATE/INSERT statement itself, or by a trigger that tests the primary key and raises errors)
Other solutions?
How do you write with a primary key constraint?

My preferred method for single-row upsert is:
BEGIN TRANSACTION;
UPDATE dbo.t WITH (HOLDLOCK, SERIALIZABLE)
  SET ...
  WHERE [key] = @key;
IF @@ROWCOUNT = 0
BEGIN
  INSERT dbo.t ...
END
COMMIT TRANSACTION;
If you believe you will much more often be performing an insert, you can swap the logic around so you try that first:
BEGIN TRANSACTION;
INSERT dbo.t ...
  SELECT @key, ...
  WHERE NOT EXISTS
  (
    SELECT 1 FROM dbo.t WITH (UPDLOCK, SERIALIZABLE)
      WHERE [key] = @key
  );
IF @@ROWCOUNT = 0
BEGIN
  UPDATE dbo.t SET val = @val WHERE [key] = @key;
END
COMMIT TRANSACTION;
Some background:
Please stop using this UPSERT anti-pattern
Checking for potential constraint violations before entering TRY/CATCH
So, you want to use MERGE, eh?

What you describe is often called an "UPSERT" (in case you need a Google term for further research).
We use MERGE statements, since they allow us to specify both actions in one statement.
However, the syntax is a bit complex and there are some gotchas (don't forget to use HOLDLOCK, etc.), so we have abstracted away the actual SQL generation into an InsertOrUpdate(table, fieldsAndValuesToUpdate, keyFieldsAndValues) helper method in our source code. This also allows us to change the implementation later, if required.
When writing SQL code manually, I use IF...EXISTS (inside a transaction and also with HOLDLOCK), since it's easier to read and easier to write.
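For reference, a single-row MERGE upsert with HOLDLOCK might look like the sketch below. The table dbo.UpsertDemo, its columns and the variables @key and @val are hypothetical names chosen for illustration; this is not the InsertOrUpdate helper described above.

```sql
-- Hypothetical demo table; all names here are assumptions for illustration.
CREATE TABLE dbo.UpsertDemo
(
    [key] INT         NOT NULL PRIMARY KEY,
    val   VARCHAR(50) NOT NULL
);

DECLARE @key INT = 1, @val VARCHAR(50) = 'first';

-- HOLDLOCK is meant to keep the key range locked while the statement runs,
-- so another session cannot insert the same key between the match check
-- and the insert.
MERGE dbo.UpsertDemo WITH (HOLDLOCK) AS tgt
USING (SELECT @key AS [key], @val AS val) AS src
    ON tgt.[key] = src.[key]
WHEN MATCHED THEN
    UPDATE SET val = src.val
WHEN NOT MATCHED THEN
    INSERT ([key], val) VALUES (src.[key], src.val);
```

Running the MERGE a second time with a different @val would take the WHEN MATCHED branch and update the existing row instead of inserting.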

That depends on the situation.
Suppose you write an INSERT with a WHERE NOT EXISTS clause, and when @@ROWCOUNT = 0 you do an UPDATE, because the row evidently already exists.
Whether this is the most performant way to do it depends on your data.
If you know that, for example, the insert succeeds in 80% of cases, then this approach will perform very well.
If an update is needed most of the time, you can turn the code around: do the update first and then check @@ROWCOUNT.
Of course, this only works if you can determine beforehand whether you will have mostly updates or mostly inserts.
The advantage of this method (certainly when you do the update first) is that you do not need to check each row with IF EXISTS beforehand; you just do your insert/update and find out afterwards whether it worked. And because you know in advance that the first statement will succeed most of the time, you gain performance.

Related

Number of SQL operation in IF NOT EXISTS statement vs SQL server engine Primary key check in SQL Server [duplicate]

Consider a table A with the columns column1 and column2.
Let column1 be the primary key, of type uniqueidentifier.
If you want to avoid duplicate inserts into the table using a stored procedure, which would be the better way?
Case #1:
IF NOT EXISTS (SELECT 1 FROM A WHERE column1 = @insertValue)
BEGIN
    INSERT INTO A
    ....
END
or case #2:
BEGIN TRY
    INSERT INTO A
    ....
END TRY
BEGIN CATCH
    IF ERROR_NUMBER() = 2627 -- error number of a primary key violation
    BEGIN
        -- deal with the error
    END
END CATCH
I feel case #2 is better, because the existence check is done only once, whereas in case #1 it happens twice.
I understand that when a primary key is defined on a table, SQL Server checks for duplicates before the insert. Isn't it better to rely on this built-in duplicate check instead of using IF NOT EXISTS?
Please explain your answer in depth, considering all the parameters.
The objective here is to understand, in terms of the operations SQL Server performs on the table, the internal mechanism of the primary key violation check.

You can use the following block when data modifications are applied:
SET XACT_ABORT, NOCOUNT ON;
BEGIN TRY
    BEGIN TRANSACTION;
    -- CODE BLOCK GOES HERE
    COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    IF @@TRANCOUNT > 0
    BEGIN
        ROLLBACK TRANSACTION
    END;
    -- GET ERRORS DETAILS OR THROW ERROR
END CATCH;
SET XACT_ABORT, NOCOUNT OFF;
In addition, if your primary key is a UNIQUEIDENTIFIER and you generate the new values with the NEWID() function, you can relax: a collision is so unlikely that, in practice, the CATCH block will not be executed for a duplicate key.
However, using such a value as the primary key is not ideal, because:
it takes more space than integer types (smallint, int, bigint)
when NEWID() is used to generate the values, rows are inserted into random pages of the index, which leads to fragmentation, whereas with an IDENTITY column new rows are inserted into the last page; you can, however, mitigate this by using NEWSEQUENTIALID()
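To connect this template with error 2627 from the question, here is a minimal sketch of a CATCH block that treats a duplicate key as an expected outcome and re-raises everything else. The table dbo.DuplicateDemo is a hypothetical stand-in for table A, and the PRINT stands in for whatever handling is actually needed.

```sql
SET XACT_ABORT, NOCOUNT ON;

-- Hypothetical table mirroring the question: column1 is the primary key.
CREATE TABLE dbo.DuplicateDemo
(
    column1 UNIQUEIDENTIFIER NOT NULL PRIMARY KEY,
    column2 INT NULL
);

DECLARE @insertValue UNIQUEIDENTIFIER = NEWID();
INSERT dbo.DuplicateDemo (column1, column2) VALUES (@insertValue, 1);

BEGIN TRY
    BEGIN TRANSACTION;
    -- Same key again: this raises a primary key violation.
    INSERT dbo.DuplicateDemo (column1, column2) VALUES (@insertValue, 2);
    COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    IF @@TRANCOUNT > 0
    BEGIN
        ROLLBACK TRANSACTION;
    END;

    -- 2627: PRIMARY KEY / UNIQUE constraint violation.
    -- 2601: duplicate key in a unique index.
    IF ERROR_NUMBER() IN (2627, 2601)
    BEGIN
        PRINT 'Duplicate key; row was not inserted.';
    END
    ELSE
    BEGIN
        THROW; -- anything else is re-raised to the caller
    END;
END CATCH;
```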

Lock database table for just a couple of sentences

Suppose a table in SQL Server with this structure:
CREATE TABLE t (Id INT PRIMARY KEY)
Then I have a stored procedure, called constantly, that inserts data into this table among other things:
BEGIN TRAN
DECLARE @Id INT = (SELECT MAX(Id) + 1 FROM t)
INSERT t VALUES (@Id)
...
-- Stuff that takes a long time to complete
...
COMMIT
The problem with this approach is that I sometimes get a primary key violation, because two or more procedure calls read the same Id and try to insert it into the table.
I have been able to solve this problem by adding a TABLOCK hint to the SELECT statement:
DECLARE @Id INT = (SELECT MAX(Id) + 1 FROM t WITH (TABLOCK))
The problem now is that successive calls to the procedure must wait for the currently executing transaction to complete before starting their work, so only one procedure runs at a time.
Is there any advice or trick to hold the lock only while the SELECT and INSERT statements execute?
Thanks.

TABLOCK is a terrible idea, since you're serialising all the calls (no concurrency).
Note that within a transaction, the locks taken by your modifications are held until the transaction completes, which in your procedure means almost the whole run.
So you want to minimise locks except for where you really need them.
Unless you have a special case, use an internally generated id:
CREATE TABLE t (Id INT IDENTITY PRIMARY KEY)
This improves performance and concurrency, since you are not dependent on external tables to manage the id.
If you have existing data, you can (re)set the start value using DBCC:
DBCC CHECKIDENT ('t', RESEED, 100)
If you need to inject rows with a value preassigned, use:
SET IDENTITY_INSERT t ON
(and OFF again afterwards, resetting the seed as required).
[Consider whether you want this value to be the primary key, or simply unique.
In many cases where you need to reference a table's PK as an FK, you'll want it as the PK for simplicity of joins, but having a business-readable value (e.g. an Accounting Code, or OrderNo+OrderLine) as the key is completely valid: that's just modelling.]
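As a rough sketch of how the procedure body could look with an IDENTITY column, the generated Id can be read back with SCOPE_IDENTITY() instead of being computed from MAX(Id). The table name dbo.IdentityDemo is hypothetical, and the long-running part is left as a placeholder.

```sql
-- Hypothetical table; IDENTITY lets SQL Server assign the Id.
CREATE TABLE dbo.IdentityDemo (Id INT IDENTITY(1,1) PRIMARY KEY);

BEGIN TRANSACTION;

INSERT dbo.IdentityDemo DEFAULT VALUES;

-- SCOPE_IDENTITY() returns the identity value generated by the
-- last insert in the current scope.
DECLARE @Id INT = SCOPE_IDENTITY();

-- ... long-running work that uses @Id ...

COMMIT TRANSACTION;
```

Because each session gets its own identity value at insert time, no table-level lock is needed to keep two calls from picking the same Id.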

How to THROW exception in inner transaction without ending SQL Server stored procedure execution?

My objective is to throw an exception back to the caller but continue execution of the SQL Server stored procedure. So, in essence, what I'm trying to accomplish is a try..catch..finally block, even though SQL Server has no concept of a try..catch..finally block, to my knowledge.
I have a sample stored procedure to illustrate. It's just an example I came up with off the top of my head, so please don't pay too much attention to the table schema. Hopefully, you understand the gist of what I'm trying to carry out here. Anyway, the stored proc contains an explicit transaction that throws an exception within the catch block. There's further execution past the try..catch block but it's never executed, if THROW is executed. From what I understand, at least in SQL Server, THROW cannot distinguish between inner and outer transactions or nested transactions.
In this stored procedure, I have two tables: Tbl1 and Tbl2. Tbl1 has a primary key on Tbl1.ID. Tbl2 has a foreign key on EmpFK that maps to Tbl1.ID. EmpID has a unique constraint. No duplicate records can be inserted into Tbl1. Both Tbl1 and Tbl2 have primary key on ID and employ identity increment for auto-insertion. The stored proc has three input parameters, one of which is employeeID.
Within the inner transaction, a record is inserted in Tbl1 -- a new employee ID is added. If it fails, the idea is the transaction should gracefully error out but the stored proc should still continue running until completion. Whether table insert succeeds or fails, EmpID will be employed later to fill in EmpFk.
After the try..catch block, I perform a lookup of Tbl1.ID, via the employeeID parameter that's passed into the stored proc. Then, I insert a record into TBl2; Tbl1.ID is the value for Tbl2.EmpFK.
(And you might be asking "why use such a schema? Why not combine into one table with such a small dataset?" Again, this is just an example. It doesn't have to be employees. You can pick anything. It's just a widget. Imagine Tbl1 may contain a very, very large data set. What's set in stone is there are two tables which have a primary key / foreign key relationship.)
Here's the sample data set:
Tbl1
ID  EmpID
1   AAA123
2   AAB123
3   AAC123
Tbl2
ID  Role        Location  EmpFK
1   Junior      NW        1
2   Senior      NW        2
3   Manager     NE        2
4   Sr Manager  SE        3
5   Director    SW        3
Here's the sample stored procedure:
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE PROCEDURE [dbo].[usp_TestProc]
    @employeeID VARCHAR(10)
   ,@role VARCHAR(50)
   ,@location VARCHAR(50)
AS
BEGIN
    SET NOCOUNT ON;
    DECLARE @employeeFK INT;
    BEGIN TRY
        BEGIN TRANSACTION MYTRAN;
        INSERT [Tbl1] (
            [EmpID]
        )
        VALUES (
            @employeeID
        );
        COMMIT TRANSACTION MYTRAN;
    END TRY
    BEGIN CATCH
        IF @@TRANCOUNT > 0
        BEGIN
            ROLLBACK TRANSACTION MYTRAN;
        END;
        THROW; -- Raises exception, exiting stored procedure
    END CATCH;
    SELECT
        @employeeFK = [ID]
    FROM
        [Tbl1]
    WHERE
        [EmpID] = @employeeID;
    INSERT [Tbl2] (
        [Role]
       ,[Location]
       ,[EmpFK]
    )
    VALUES (
        @role
       ,@location
       ,@employeeFK
    );
END;
So, again, I still want to return the error to the caller, e.g. so it can be logged, but I don't want it to stop stored procedure execution cold in its tracks. Execution should continue much as it would with a try..catch..finally block. Can this be accomplished with THROW, or must I use alternative means?
Maybe I'm mistaken, but isn't THROW an upgraded version of RAISERROR, which we should employ for handling exceptions going forward?
I've used RAISERROR in the past for these situations and it has suited me well. But THROW is a simpler, more elegant solution, imo, and may be better practice going forward. I'm not quite sure.
Thank you for your help in advance.

"What's set in stone is there are two tables which have a primary key / foreign key relationship."
Using THROW in an inner transaction is not the way to do what you want. Judging from your code, you want to insert a new employee, unless that employee already exists, and then, regardless of whether the employee already existed or not, you want to use that employee's PK/id in a second insert into a child table.
One way to do this is to split the logic. This is pseudocode for what I mean:
IF NOT EXISTS (select employee with @employeeID)
    INSERT the new employee
SELECT @employeeFK like you are doing.
INSERT into Tbl2 like you are doing.
If you still need to raise an error when an @employeeID that already exists is passed, you can put an ELSE after the IF and populate a string variable; at the end of the proc, if the variable was populated, throw/raise an error.
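The pseudocode above could be fleshed out roughly as follows. This is a sketch, not the asker's final procedure: the procedure name usp_TestProc_Sketch, the @warning variable and the error number 50000 are assumptions made for illustration, while Tbl1, Tbl2 and the parameters come from the question.

```sql
CREATE PROCEDURE [dbo].[usp_TestProc_Sketch] -- hypothetical name
    @employeeID VARCHAR(10)
   ,@role       VARCHAR(50)
   ,@location   VARCHAR(50)
AS
BEGIN
    SET NOCOUNT ON;

    DECLARE @employeeFK INT;
    DECLARE @warning NVARCHAR(200); -- stays NULL unless the employee already existed

    IF NOT EXISTS (SELECT 1 FROM [Tbl1] WHERE [EmpID] = @employeeID)
    BEGIN
        INSERT [Tbl1] ([EmpID]) VALUES (@employeeID);
    END
    ELSE
    BEGIN
        SET @warning = N'Employee ' + @employeeID + N' already exists.';
    END;

    SELECT @employeeFK = [ID] FROM [Tbl1] WHERE [EmpID] = @employeeID;

    INSERT [Tbl2] ([Role], [Location], [EmpFK])
    VALUES (@role, @location, @employeeFK);

    -- Report the condition only after all the work is done.
    IF @warning IS NOT NULL
    BEGIN
        THROW 50000, @warning, 1;
    END;
END;
```

The existence check and the insert are separate statements here, so two concurrent calls with the same new employee ID could still collide; the unique constraint on EmpID remains the final guard.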

How do I make Check Constraints run before statement

I am running into a problem where my check constraints are correctly stopping commands from executing but my Identity column value increases. I guess this is because the check occurs after the statement runs and the transaction gets rolled back due to the check failing. This leaves the identity value incremented by 1.
Is there a way to run the constraint check before the SQL statement gets executed?
CREATE TABLE TestTable
(
    Id INT IDENTITY(1,1) PRIMARY KEY,
    Name VARCHAR(100)
)
INSERT INTO TestTable VALUES ('Type-1'),('Type-2'),('Type-55'),('Type-009')
--Add a check constraint so nobody can edit this without doing serious work
ALTER TABLE TestTable WITH NOCHECK ADD CONSTRAINT [CHECK_TestTable_READONLY] CHECK(1=0)
--This fails with the constraint as expected
INSERT INTO TestTable VALUES('This will Fail')
INSERT INTO TestTable VALUES('This will again....')
--Check the Id, it was incremented...
SELECT (IDENT_CURRENT( 'TestTable' ) ) As CurrentIdentity

When I had to do the same thing in the past, I created a trigger that just threw an exception on insert, update and delete. This has several advantages. Most importantly, it prevents updates and deletes as well as inserts, and it lets you give a custom exception message explaining what you did and why. It is an extremely bad habit to just add illogical constraints and hope that three months from now people will understand what is going on and know they should ask you about it. A trigger can also keep the Id counter from being incremented, if that is important, but note that a FOR trigger like this one fires after the row has been inserted, so the identity value has already been consumed by then; an INSTEAD OF trigger is needed for that. If the counter really matters, I would also not use auto-increment at all and would set the ID manually, since even with these triggers you could always hit an accidental syntax error or some other error after disabling them and trying to add a value.
CREATE TRIGGER PreventChanges
ON TestTable
FOR INSERT, UPDATE, DELETE
AS
BEGIN
    THROW 51000, 'DO NOT change anything in that table unless you really have to! In order to do so, please talk to GER (or just disable and re-enable this trigger)', 1;
END
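If keeping the identity value from advancing is the goal, one variant worth trying is an INSTEAD OF trigger, because a FOR trigger runs only after the row modification has taken place. The sketch below assumes TestTable from the question exists; the trigger name and message text are hypothetical.

```sql
-- Sketch only: assumes TestTable from the question already exists.
CREATE TRIGGER PreventChanges_InsteadOf -- hypothetical name
ON TestTable
INSTEAD OF INSERT, UPDATE, DELETE
AS
BEGIN
    -- The original statement is never applied to the table;
    -- only this trigger body runs.
    THROW 51000, 'TestTable is read-only; disable this trigger to change it.', 1;
END;
```

Comparing IDENT_CURRENT('TestTable') before and after a rejected insert is a quick way to confirm the behaviour on your own server.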

It sounds like you're intending to use the identity column for something it's not meant for. But to answer your question, could you not just manually code up some SQL Server IF statements to test your data before the insert happens (perhaps in a stored procedure)? I wouldn't know how to make this dynamic to 'fit all constraints on any table', but the process would do what you want - prevent the INSERT from firing. Though, if your constraints change, then you would have to change the procedure too.
e.g.,
IF 1 = 0 -- or use any of your constraints here...
BEGIN
-- nest more IFs if you have multiple check-constraints...
INSERT INTO TestTable
VALUES ('This will not increase your identity number since 1 does not equal 0')
END

TSQL ID generation

I have a question regarding locking in T-SQL. Suppose I have the following table:
A(int id, varchar name)
where id is the primary key, but is NOT an identity column.
I want to use the following pseudocode to insert a value into this table:
lock (A)
uniqueID = GenerateUniqueID()
insert into A values (uniqueID, somename)
unlock(A)
How can this be accomplished in T-SQL? The next id should be computed while table A is locked, so that other sessions cannot perform the same operation at the same time and get the same id.

If you have custom logic that you want to apply in generating the ids, wrap it up in a user-defined function, and then use that function as the default for the column. This should reduce concurrency issues, much as the built-in id generators do, by deferring the generation to the point of insert and piggybacking on the insert's locking behavior.
create table ids (id int, somval varchar(20))
go
create function dbo.GenerateUniqueID()
returns int as
begin
    declare @ret int
    select @ret = max(isnull(id,1)) * 2 from ids
    if @ret is null set @ret = 2
    return @ret
end
go
alter table ids add constraint DF_IDS default (dbo.GenerateUniqueID()) for id

There are really only three ways to go about this.
1. Change the ID column to be an IDENTITY column, where it auto-increments by some value on each insert.
2. Change the ID column to be a GUID with a default constraint of NEWID() or NEWSEQUENTIALID(). Then you can insert your own value or let the table generate one for you on each insert.
3. On each insert, start a transaction. Then get the next available ID using something like select max(id)+1. Do this in a single SQL statement if possible, in order to limit the possibility of a collision.
On the whole, most people prefer option 1. It's fast, easy to implement, and most people understand it.
I tend to go with option 2 with the apps I work on simply because we tend to scale out (and up) our databases. This means we routinely have apps with a multi-master situation. Be aware that using GUIDs as primary keys can mean your indexes are routinely trashed.
I'd stay away from option 3 unless you just don't have a choice. In which case I'd look at how the datamodel is structured anyway because there's bound to be something wrong.
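For completeness, option 3 done "in a single SQL statement" might look like the sketch below. The table dbo.ManualIdDemo and the variable @name are hypothetical, and the locking hints are an assumption about how to keep two sessions from reading the same maximum; test under your own workload before relying on it.

```sql
-- Hypothetical table matching the question's shape:
-- id is the primary key but not an IDENTITY column.
CREATE TABLE dbo.ManualIdDemo
(
    id   INT         NOT NULL PRIMARY KEY,
    name VARCHAR(50) NOT NULL
);

DECLARE @name VARCHAR(50) = 'somename';

-- Read the current maximum and insert in one statement. UPDLOCK and HOLDLOCK
-- are intended to keep a second session from reading the same maximum
-- until this statement has finished.
INSERT dbo.ManualIdDemo (id, name)
SELECT ISNULL(MAX(id), 0) + 1, @name
FROM dbo.ManualIdDemo WITH (UPDLOCK, HOLDLOCK);
```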

If you use the NEWID() function, you do not need any locking mechanism.
If you declare the column as IDENTITY, you do not need any locking mechanism.
If you generate these IDs manually and there is a chance that parallel calls could generate the same IDs, then use something like this:
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
BEGIN TRANSACTION;
DECLARE @NextID INT = dbo.GenerateUniqueID();
WHILE EXISTS (SELECT ID FROM A WHERE ID = @NextID)
BEGIN
    SET @NextID = dbo.GenerateUniqueID();
END
INSERT INTO A (ID, [Text]) VALUES (@NextID, 'content');
COMMIT TRANSACTION;

@Markus, you should look at using either IDENTITY or NEWID(), as noted in the other answers. If you absolutely can't, here's an option for you...
DECLARE @NewID INT
BEGIN TRAN
SELECT @NewID = MAX(ID) + 1
FROM TableA WITH (TABLOCKX)
INSERT TableA
(ID, OtherFields)
VALUES (@NewID, OtherFields)
COMMIT TRAN

If you're using SQL Server 2005 or later, you can use the OUTPUT clause to do what you're asking, without taking any explicit lock. (The table test1 simulates the table you're inserting into; since OUTPUT ... INTO writes to a table rather than to a scalar variable, the temp table #result holds the results.)
create table test1 (test INT)
create table #result (LastValue INT)
insert into test1
output INSERTED.test into #result (LastValue)
select dbo.GenerateUniqueID()
select LastValue from #result

Just to update an old post: since SQL Server 2012 it has been possible to use a feature called a sequence. Sequences are created in much the same way as a function, and you can specify the range, the direction (ascending or descending) and the rollover point. You can then use NEXT VALUE FOR to generate the next value in the range.
See the following documentation from Microsoft.
http://technet.microsoft.com/en-us/library/ff878091.aspx
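A minimal sketch of the sequence approach, assuming SQL Server 2012 or later. The sequence dbo.DemoIdSequence and the table dbo.SequenceDemo are hypothetical names; the table mirrors the shape of table A from the question.

```sql
-- Hypothetical sequence and table names, for illustration only.
CREATE SEQUENCE dbo.DemoIdSequence
    AS INT
    START WITH 1
    INCREMENT BY 1;

CREATE TABLE dbo.SequenceDemo
(
    id   INT         NOT NULL PRIMARY KEY,
    name VARCHAR(50) NOT NULL
);

-- Each call to NEXT VALUE FOR hands out a new number,
-- so no table lock is needed to avoid duplicates.
DECLARE @NextID INT = NEXT VALUE FOR dbo.DemoIdSequence;

INSERT dbo.SequenceDemo (id, name) VALUES (@NextID, 'somename');
```

Unlike IDENTITY, the value is available before the insert, which matches the "generate, then insert" shape of the question's pseudocode.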
