Detailed error message for violation of Primary Key constraint in sql2008? - sql-server

I'm inserting a large amount of rows into an empty table with a primary key constraint on one column.
If there is a duplicate key error, is there any way to find out the value of the key (or row) that caused the error?
Validating the data prior to the insert is sadly not something I can do right now.
Using SQL 2008.
Thanks!
Doing the count(*) / group by thing is something I'm trying to avoid, this is an insert of hundreds of millions of rows from hundreds of different DB's (some of which are on remote servers)...I don't have the time or space to do the insert twice.
The data is supposed to be unique from the providers, but unfortunately their validation doesn't seem to work correctly 100% of the time and I'm trying to at least see where it's failing so I can help them troubleshoot.
Thank you!

There's not a way of doing it that won't slow your process down, but here's one way that will make it easier. You can add an instead-of trigger on that table for inserts and updates. The trigger will check each record before inserting it and make sure it won't cause a primary key violation. You can even create a second table to catch violations, and have a different primary key (like an identity field) on that one, and the trigger will insert the rows into your error-catching table.
Here's an example of how the trigger can work:
CREATE TRIGGER mytrigger ON sometable
INSTEAD OF INSERT
AS BEGIN
-- Rows that pass the check go into the real table
INSERT INTO sometable SELECT * FROM inserted WHERE ISNUMERIC(somefield) = 1;
-- Rows that fail the check go into the reject table
INSERT INTO sometableRejects SELECT * FROM inserted WHERE ISNUMERIC(somefield) = 0;
END
In that example, I'm checking a field to make sure it's numeric before I insert the data into the table. You'll need to modify that code to check for primary key violations instead - for example, you might join the INSERTED table to your own existing table and only insert rows where you don't find a match.
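As a rough sketch of the primary key variant described above, the trigger below sends duplicates to the reject table instead of letting the insert fail. The key column `id`, the `payload` column, and the trigger name are hypothetical, and both tables are assumed to expose those two columns (the reject table having its own identity key on top). Keep in mind that bulk loads (BULK INSERT, bcp) skip triggers unless FIRE_TRIGGERS is specified.

```sql
-- Sketch only: id, payload and the trigger name are assumptions.
CREATE TRIGGER trg_sometable_dedupe ON sometable
INSTEAD OF INSERT
AS
BEGIN
    SET NOCOUNT ON;

    -- Reject rows whose key already exists in the table,
    -- plus every repeat of a key within this batch.
    WITH ranked AS (
        SELECT i.id, i.payload,
               ROW_NUMBER() OVER (PARTITION BY i.id ORDER BY (SELECT NULL)) AS rn
        FROM inserted AS i
    )
    INSERT INTO sometableRejects (id, payload)
    SELECT r.id, r.payload
    FROM ranked AS r
    WHERE r.rn > 1
       OR EXISTS (SELECT 1 FROM sometable AS t WHERE t.id = r.id);

    -- Insert the first occurrence of each key that is not in the table yet.
    WITH ranked AS (
        SELECT i.id, i.payload,
               ROW_NUMBER() OVER (PARTITION BY i.id ORDER BY (SELECT NULL)) AS rn
        FROM inserted AS i
    )
    INSERT INTO sometable (id, payload)
    SELECT r.id, r.payload
    FROM ranked AS r
    WHERE r.rn = 1
      AND NOT EXISTS (SELECT 1 FROM sometable AS t WHERE t.id = r.id);
END
```

Which of several in-batch duplicates counts as the "first" is arbitrary here; order by a real column if that matters.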

The solution would depend on how often this happens. If it's <10% of the time then I would do the following:
Insert the data
If error then do Bravax's revised solution (remove constraint, insert, find dup, report and kill dup, enable constraint).
This means it's only costing you on the few times an error occurs.
If this is happening more often then I'd look at sending the boys over to see the providers :-)

Revised:
Since you don't want to insert twice, could you:
Drop the primary key constraint.
Insert all data into the table
Find any duplicates, and remove them
Then re-add the primary key constraint
Previous reply:
Insert the data into a duplicate of the table without the primary key constraint.
Then run a query on it to find the rows that have duplicate values for the primary key column.
select count(*), <Primary Key>
from table
group by <Primary Key>
having count(*) > 1
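The revised steps could look roughly like this in T-SQL. The names `dbo.target`, `id`, `PK_target`, and `dbo.target_dups` are hypothetical stand-ins, and the key column is assumed to be declared NOT NULL.

```sql
-- 1. Drop the primary key so the load cannot fail on duplicates.
ALTER TABLE dbo.target DROP CONSTRAINT PK_target;

-- 2. Load all the data here (INSERT ... SELECT, BULK INSERT, etc.).

-- 3a. Save every row whose key occurs more than once,
--     so it can be reported back to the providers.
SELECT t.*
INTO dbo.target_dups
FROM dbo.target AS t
WHERE t.id IN (SELECT id FROM dbo.target GROUP BY id HAVING COUNT(*) > 1);

-- 3b. Keep one arbitrary row per key and delete the rest.
WITH ranked AS (
    SELECT ROW_NUMBER() OVER (PARTITION BY id ORDER BY (SELECT NULL)) AS rn
    FROM dbo.target
)
DELETE FROM ranked WHERE rn > 1;

-- 4. Re-add the primary key.
ALTER TABLE dbo.target ADD CONSTRAINT PK_target PRIMARY KEY (id);
```

Rebuilding the key on hundreds of millions of rows is not free, so this trades load-time failures for a single index build at the end.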

Use SSIS to import the data and have it check for this as part of the data flow. That is the best way to handle it. SSIS can send the bad records to a table (which you can later send to the vendor to help them clean up their act) and process the good ones.

I can't believe that SSIS does not easily address this "reality", because, let's face it, oftentimes you need and want to be able to:
See if a record exists with a certain unique or primary key
If it does not, insert it
If it does, either ignore it or update it.
I don't understand how they would let a product out the door without this capability built-in in an easy-to-use manner. Like, say, set an attribute of a component to automatically check this.

Related

Insert records from temp table where record is not already preset fails

I am trying to insert from a temporary table into a regular one, but since the temp table contains rows sharing the same primary key values as the table I am inserting into, it fails with a primary key constraint violation. That is expected, so I am working around it by inserting only the rows whose primary key is not already present in the target table.
I tried both the EXISTS and the NOT IN approach. I checked examples showcasing both and confirmed that both work in SQL Server 2014 in general, yet I am still getting the following error:
Violation of PRIMARY KEY constraint 'PK_dbo.InsuranceObjects'. Cannot
insert duplicate key in object 'dbo.InsuranceObjects'. The duplicate
key value is (3835fd7c-53b7-4127-b013-59323ea35375).
Here is the SQL in NOT IN variance I tried:
print 'insert into InsuranceObjects'
INSERT INTO $(destinDB).InsuranceObjects
(
Id, Value, DefInsuranceObjectId
)
SELECT Id, InsuranceObjectsValue, DefInsuranceObjectId
FROM #vehicle v
WHERE v.Id NOT IN (SELECT Id FROM $(destinDB).InsuranceObjects) -- prevent error when running script multiple times over
GO
If not apparent:
Id is the primary key in question.
$(destinDB) is a command-line variable, which is different from a T-SQL variable. It lets me define the target database and instance at the script level, or even across multiple scripts. It's used in multiple variations throughout the code and has so far performed perfectly. The only downside is that you have to run in CMD mode.
When creating all temp tables, USE $(some database) is also used, so that's not an issue.
I must be missing something completely obvious but it's driving me nuts that such a simple query fails. What is worse, when I try running select without insert part, it returns ALL the records from temp table despite me having confirmed there are duplicates that should fail the NOT IN part in where clause.
I suspect the issue is that you have duplicate ID values in your temp table. Please check the values there as it would cause the issue you are seeing.
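A minimal sketch of how to check that suspicion and work around it, reusing the table and column names from the question. The NOT IN filter only compares against rows already in the target, so two rows with the same Id inside #vehicle both pass it and then collide during the insert.

```sql
-- Check: which Ids occur more than once in the temp table?
SELECT v.Id, COUNT(*) AS copies
FROM #vehicle AS v
GROUP BY v.Id
HAVING COUNT(*) > 1;

-- Workaround: take one arbitrary row per Id, then apply the existence check.
WITH ranked AS (
    SELECT v.Id, v.InsuranceObjectsValue, v.DefInsuranceObjectId,
           ROW_NUMBER() OVER (PARTITION BY v.Id ORDER BY (SELECT NULL)) AS rn
    FROM #vehicle AS v
)
INSERT INTO $(destinDB).InsuranceObjects (Id, Value, DefInsuranceObjectId)
SELECT r.Id, r.InsuranceObjectsValue, r.DefInsuranceObjectId
FROM ranked AS r
WHERE r.rn = 1
  AND NOT EXISTS (SELECT 1
                  FROM $(destinDB).InsuranceObjects AS io
                  WHERE io.Id = r.Id);
```

If the duplicate rows differ in their other columns, replace the arbitrary ordering with one that picks the row you actually want to keep.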

Resetting the primary key to 1

I have a script for a Microsoft SQL Server database that has hundreds of tables, and the tables contain data as well. This is the database of a web application. What I want to do is delete the previous records and reset the primary key to 1 or 0.
I have tried
`DBCC CHECKIDENT ('dbo.tbl',RESEED,0); `
but it does not work for me as in most of the tables the primary key is not identity.
I can not truncate the table as its primary key is being used as FK in many other tables.
I have also tried to add the identity specification in the primary key of the table and run the checkident query and then changing it back to non-identity spec, but after adding the record again it starts from where it left.
Making changes in the code is not an option for me.
please help.
From your question I'm not sure what the main objective is. If you need to truncate a lot of tables and change their structure to add an identity property, why can't you disable the foreign keys? In the past I have used a standard process for rebuilding a table and migrating all its data, which comes down to a group of steps. I'll try to help, but you should follow these steps.
Steps:
1) Disable the foreign keys so you can alter the structure of your tables. You can find a solution for this task at the following link:
Temporarily disable all foreign key constraints
2) Alter the table to add the identity property; this is a classic ALTER TABLE xxxxxx process.
3) Execute the statement you previously posted:
DBCC CHECKIDENT ('dbo.tbl',RESEED,0);
Try following this path, and if you run into any problem just ask us.
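For a table that already has an identity column, steps 1 and 3 could be sketched as follows. The child table `dbo.child` and the constraint name `FK_child_tbl` are hypothetical; `dbo.tbl` is the table from the question. TRUNCATE is still refused while a foreign key references the table, even a disabled one, so the sketch uses DELETE.

```sql
-- 1) Stop enforcing the foreign key that references dbo.tbl.
ALTER TABLE dbo.child NOCHECK CONSTRAINT FK_child_tbl;

-- Empty both tables; the order no longer matters while the FK is disabled.
DELETE FROM dbo.tbl;
DELETE FROM dbo.child;

-- 3) Reset the identity counter on the emptied table.
DBCC CHECKIDENT ('dbo.tbl', RESEED, 0);

-- Re-enable the foreign key and re-validate existing rows.
ALTER TABLE dbo.child WITH CHECK CHECK CONSTRAINT FK_child_tbl;
```

Re-enabling with WITH CHECK makes SQL Server verify the remaining rows, so the constraint is trusted again afterwards.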
You cannot truncate a table that is referenced by a relationship. You should remove the relationship first.
My understanding of this question:
You have a database with tables that you want to empty and next have them use primary key values starting at 0 or 1.
Some of these tables use an identity value and you already have a solution for those (you know you can find out which columns have an identity by using the sys.columns view? Look for the is_identity column).
Some tables do not use an identity but get their pk values from an unknown source, which we can't modify.
The only solution I see is to create (or modify) an AFTER INSERT trigger on those tables that subtracts a fixed offset from the new pk value.
E.g.: your "hidden generator" will generate a next value 5254, but you want the next pk value to become one:
CREATE TRIGGER trg_sometable_ai
ON sometable
AFTER INSERT
AS
BEGIN
UPDATE st
SET st.pk_col = st.pk_col - 5253
FROM sometable AS st
INNER JOIN INSERTED AS i
ON i.pk_col = st.pk_col
END
You'll have to determine the next value and thus the "subtract value" for each table.
If the code also inserts child records into tables with a foreign key to this table, and uses the previously generated value, you have to modify those triggers as well...
This is a "last resort" solution and something I would recommend against in any scenario that has other options. Manipulating primary key values is generally not a good idea.

SQL Server - simple discard of duplicate keys/ rows when inserting

I'm feeding data into SQL Server database and 1 out of every 1000 records is a duplicate due to matters outside my control. It's an exact duplicate - the entire record, the unique identifier -- everything.
I know this can be solved with an 'update' rather than an insert step ... or 'on error, update' instead of insert, perhaps.
But is there a quick and easy way to make SQL Server ignore these duplicates? I haven't made an index/unique constraint yet, but if I did, I don't want a duplicate key value breaking or interrupting the ETL/data flow process. I just want SQL Server to keep executing the insert query. Is there a way to do this?
Just add a WHERE NOT EXISTS to the statement you're executing -
INSERT INTO table SELECT '123', 'blah' WHERE NOT EXISTS (SELECT 1 FROM table WHERE unique_identifier_column = '123')
Just to be clear for anyone else hitting this issue: for the best performance, at the cost of a slight chance of losing an insert, define a primary key on the table and use IGNORE_DUP_KEY = ON.
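A minimal sketch of the IGNORE_DUP_KEY option, assuming a hypothetical two-column table (`dbo.DestinationTable`, `payload`, and the constraint name are made up for illustration):

```sql
-- Hypothetical table; the option is set on the primary key's index.
CREATE TABLE dbo.DestinationTable (
    unique_identifier_column varchar(20)  NOT NULL,
    payload                  varchar(100) NULL,
    CONSTRAINT PK_DestinationTable
        PRIMARY KEY (unique_identifier_column)
        WITH (IGNORE_DUP_KEY = ON)
);

INSERT INTO dbo.DestinationTable VALUES ('123', 'blah');

-- The second insert does not raise an error. The row is discarded
-- with the warning "Duplicate key was ignored." and the batch continues.
INSERT INTO dbo.DestinationTable VALUES ('123', 'blah');
```

In a multi-row insert only the offending rows are dropped; the rest go in. The flip side is that a row with the same key but different data is silently discarded too.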
If you're looking for a duplicate record on every field just use the distinct clause in your select:
Insert into DestinationTable
Select Distinct *
From SourceTable
EDIT:
I misinterpreted your question. You're trying to find a low impact way to prevent adding a record that already exists in your DestinationTable.
If you want your inserts to remain fast, one way to do it is to add an identity column to your table as the primary key. Let your duplicate records get added, but then run a maintenance routine on down or slow time that checks all records added since the last check and deletes any added duplicates. Otherwise, there is no easy way... you will have to check on every insert.

How to emulate a BEFORE INSERT trigger in T-SQL / SQL Server for super/subtype (Inheritance) entities? [duplicate]

This is on Azure.
I have a supertype entity and several subtype entities, the latter of which needs to obtain their foreign keys from the primary key of the super type entity on each insert. In Oracle, I use a BEFORE INSERT trigger to accomplish this. How would one accomplish this in SQL Server / T-SQL?
DDL
CREATE TABLE super (
super_id int IDENTITY(1,1)
,subtype_discriminator char(4) CHECK (subtype_discriminator IN ('SUB1', 'SUB2'))
,CONSTRAINT super_id_pk PRIMARY KEY (super_id)
);
CREATE TABLE sub1 (
sub_id int IDENTITY(1,1)
,super_id int NOT NULL
,CONSTRAINT sub_id_pk PRIMARY KEY (sub_id)
,CONSTRAINT sub_super_id_fk FOREIGN KEY (super_id) REFERENCES super (super_id)
);
I wish for an insert into sub1 to fire a trigger that actually inserts a value into super and uses the super_id generated to put into sub1.
In Oracle, this would be accomplished by the following:
CREATE TRIGGER sub_trg
BEFORE INSERT ON sub1
FOR EACH ROW
DECLARE
v_super_id int; -- Ignore the fact that I could have used super_id_seq.CURRVAL
BEGIN
INSERT INTO super (super_id, subtype_discriminator)
VALUES (super_id_seq.NEXTVAL, 'SUB1')
RETURNING super_id INTO v_super_id;
:NEW.super_id := v_super_id;
END;
Please advise on how I would simulate this in T-SQL, given that T-SQL lacks the BEFORE INSERT capability?
Sometimes a BEFORE trigger can be replaced with an AFTER one, but this doesn't appear to be the case in your situation, for you clearly need to provide a value before the insert takes place. So, for that purpose, the closest functionality would seem to be the INSTEAD OF trigger, as @marc_s has suggested in his comment.
Note, however, that, as the names of these two trigger types suggest, there's a fundamental difference between a BEFORE trigger and an INSTEAD OF one. While in both cases the trigger is executed at the time when the action determined by the statement that's invoked the trigger hasn't taken place, in case of the INSTEAD OF trigger the action is never supposed to take place at all. The real action that you need to be done must be done by the trigger itself. This is very unlike the BEFORE trigger functionality, where the statement is always due to execute, unless, of course, you explicitly roll it back.
But there's one other issue to address actually. As your Oracle script reveals, the trigger you need to convert uses another feature unsupported by SQL Server, which is that of FOR EACH ROW. There are no per-row triggers in SQL Server either, only per-statement ones. That means that you need to always keep in mind that the inserted data are a row set, not just a single row. That adds more complexity, although that'll probably conclude the list of things you need to account for.
So, it's really two things to solve then:
replace the BEFORE functionality;
replace the FOR EACH ROW functionality.
My attempt at solving these is below:
CREATE TRIGGER sub_trg
ON sub1
INSTEAD OF INSERT
AS
BEGIN
DECLARE @new_super TABLE (
super_id int
);
INSERT INTO super (subtype_discriminator)
OUTPUT INSERTED.super_id INTO @new_super (super_id)
SELECT 'SUB1' FROM INSERTED;
INSERT INTO sub1 (super_id)
SELECT super_id FROM @new_super;
END;
This is how the above works:
The same number of rows as being inserted into sub1 is first added to super. The generated super_id values are stored in temporary storage (a table variable called @new_super).
The newly inserted super_ids are now inserted into sub1.
Nothing too difficult really, but the above will only work if you have no other columns in sub1 than those you've specified in your question. If there are other columns, the above trigger will need to be a bit more complex.
The problem is to assign the new super_ids to every inserted row individually. One way to implement the mapping could be like below:
CREATE TRIGGER sub_trg
ON sub1
INSTEAD OF INSERT
AS
BEGIN
DECLARE @new_super TABLE (
rownum int IDENTITY (1, 1),
super_id int
);
INSERT INTO super (subtype_discriminator)
OUTPUT INSERTED.super_id INTO @new_super (super_id)
SELECT 'SUB1' FROM INSERTED;
WITH enumerated AS (
SELECT *, ROW_NUMBER() OVER (ORDER BY (SELECT 1)) AS rownum
FROM inserted
)
INSERT INTO sub1 (super_id, other columns)
SELECT n.super_id, i.other columns
FROM enumerated AS i
INNER JOIN @new_super AS n
ON i.rownum = n.rownum;
END;
As you can see, an IDENTITY(1,1) column is added to @new_super, so the temporarily inserted super_id values will additionally be enumerated starting from 1. To provide the mapping between the new super_ids and the new data rows, the ROW_NUMBER function is used to enumerate the INSERTED rows as well. As a result, every row in the INSERTED set can now be linked to a single super_id and thus complemented to a full data row to be inserted into sub1.
Note that the order in which the new super_ids are inserted may not match the order in which they are assigned. I considered that a no-issue. All the new super rows generated are identical save for the IDs. So, all you need here is just to take one new super_id per new sub1 row.
If, however, the logic of inserting into super is more complex and for some reason you need to remember precisely which new super_id has been generated for which new sub row, you'll probably want to consider the mapping method discussed in this Stack Overflow question:
Using merge..output to get mapping between source.id and target.id
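For reference, a hedged and untested sketch of that MERGE...OUTPUT idea applied to this trigger. It relies on MERGE being able to reference source columns in its OUTPUT clause, so each generated super_id is captured next to the row that caused it. The `payload` column is a hypothetical stand-in for whatever extra columns sub1 has.

```sql
-- Sketch only: "payload" is a hypothetical extra column on sub1.
CREATE TRIGGER sub_trg
ON sub1
INSTEAD OF INSERT
AS
BEGIN
    SET NOCOUNT ON;

    DECLARE @map TABLE (
        super_id int NOT NULL,
        payload  varchar(50) NULL
    );

    -- ON 1 = 0 never matches, so every source row is inserted into super.
    MERGE INTO super AS tgt
    USING inserted AS src
        ON 1 = 0
    WHEN NOT MATCHED THEN
        INSERT (subtype_discriminator) VALUES ('SUB1')
    OUTPUT INSERTED.super_id, src.payload
        INTO @map (super_id, payload);

    -- Each row now carries the super_id generated specifically for it.
    INSERT INTO sub1 (super_id, payload)
    SELECT super_id, payload FROM @map;
END;
```

Because the source columns travel through the OUTPUT clause, no second enumeration of the inserted rows is needed.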
While Andriy's proposal will work well for INSERTs of a small number of records, full table scans will be done on the final join, as both 'enumerated' and '@new_super' are not indexed, resulting in poor performance for large inserts.
This can be resolved by specifying a primary key on the @new_super table, as follows:
DECLARE @new_super TABLE (
rownum INT IDENTITY(1,1) PRIMARY KEY CLUSTERED,
super_id int
);
This will result in the SQL optimizer scanning through the 'enumerated' table but doing an indexed join on @new_super to get the new key.

SQL server trigger question

I am by no means a sql programmer and I am trying to accomplish something that I am pretty sure has been done a million times before.
I am trying to auto generate a customer number in sql every time a new customer is inserted, but the trigger (or sp?) will only work if at least the first name, last name and another value called case number is entered. If any of these fields are missing, the system generates an error. If the criteria is met, the system generates and assigns a unique id to that customer that begins with letters GL- and then uses 5 digit number so a customer John Doe would be GL-00001 and Jane Doe would be GL-00002.
I am sorry if I am asking too much but I am basically a select insert update guy and nothing more so thanks in advance for any help.
If I were in this situation, I would:
--Alter the table(s) so that first name, last name and case number are required (NOT NULL) columns. Handle your checks for required fields on the application side before submitting the record to the database.
--If it doesn't already exist, add an identity column to the customer table.
--Add a persisted computed column to the customer table that will format the identity column into the desired GL-00000 format.
/* Demo computed column for customer number */
create table #test (
id int identity,
customer_number as 'GL-' + left('00000', 5-len(cast(id as varchar(5)))) + cast(id as varchar(5)) persisted,
name char(20)
)
insert into #test (name) values ('Joe')
insert into #test (name) values ('BobbyS')
select * from #test
drop table #test
This should satisfy your requirements without the need to introduce the overhead of a trigger.
So what do you want to do? Generate a customer number even when these fields aren't populated?
Have you looked at the SQL for the trigger? You can do this in SSMS (SQL Server Management Studio) by going to the table in question in the Object Explorer, expanding the table and then expanding Triggers.
If you open up the trigger you'll see what it does to generate the customer number. If you are unsure how this code works, then post the code for the trigger.
If you are making changes to an existing system, I'd advise you to find out the implications of changing the way data is entered.
For example, other parts of the application may depend on all of the initial values being populated, so after changing the trigger to allow incomplete data to be added, you may in turn break something else.
You probably have a unique constraint and/or NOT NULL constraints set on the table.
Remove or disable these (for example with the SQL Server Management Console in Design Mode) and then try again to insert the data. Keep in mind that you will probably not be able to enable the constraints after your insert, since the inserted data will violate them. Only disable or remove the constraints if you are absolutely sure that they are unnecessary.
Here's example syntax (you need to know the constraint names; note that NOCHECK only applies to FOREIGN KEY and CHECK constraints):
--disable
ALTER TABLE customer NOCHECK CONSTRAINT your_constraint_name
--enable
ALTER TABLE customer CHECK CONSTRAINT your_constraint_name
Caution: If I were you, I'd rather try to insert dummy values for the not null columns like this:
insert into customers select afield , 1 as dummyvalue, 2 as dummyvalue from your datasource
A very easy way to do this would be to create a table of this sort of structure:
CustomerID of type int that is a primary key and set as identity
CustomerIDPrefix of type varchar(3) which stores GL- as a default value.
Then add your other fields and set them to NOT NULL.
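That structure might be sketched like this. The table name, the column sizes, and the names FirstName, LastName, and CaseNumber are assumptions based on the question, not a known schema.

```sql
-- Hypothetical table following the structure described above.
CREATE TABLE dbo.Customer (
    CustomerID       int IDENTITY(1,1) NOT NULL PRIMARY KEY,
    CustomerIDPrefix varchar(3) NOT NULL
        CONSTRAINT DF_Customer_Prefix DEFAULT ('GL-'),
    FirstName        varchar(50) NOT NULL,
    LastName         varchar(50) NOT NULL,
    CaseNumber       varchar(20) NOT NULL
);

-- An insert that omits a required field fails with a NOT NULL error.
INSERT INTO dbo.Customer (FirstName, LastName, CaseNumber)
VALUES ('John', 'Doe', 'C-100');

-- Build the display number by zero-padding the identity to 5 digits.
SELECT CustomerIDPrefix + RIGHT('00000' + CAST(CustomerID AS varchar(10)), 5)
           AS CustomerNumber,
       FirstName, LastName
FROM dbo.Customer;
```

The padding truncates once the identity passes 99999, so the width would need revisiting for larger customer counts.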
If that way is not acceptable and you do need to write a trigger check out these two articles:
http://msdn.microsoft.com/en-us/library/aa258254(SQL.80).aspx
http://www.kodyaz.com/articles/sql-trigger-example-in-sql-server-2008.aspx
Basically it is all about getting the logic right to check whether the fields are blank. Experiment with a test database on your local machine. This will help you get it right.
