SSIS Error: Violation of PRIMARY KEY constraint. Cannot insert duplicate key in object - sql-server

I'm working with a team to resolve an SSIS package failure. The package contains four Sequence containers, and each container has a set of SQL tasks which truncate a target table and insert data from a source into the target. At times the package fails with the error: Violation of primary key constraint. Cannot insert duplicate key in object, even though there is no violation, as the table is empty when we start the load. Please provide suggestions on how to troubleshoot the issue.
Note: Source and destination have some differences in structure. Source tables contain a PK on only one int column. The destination table contains one additional PK column which holds a default value. I don't understand why we need a constraint on a column with a default value.

It sounds like even though the target table is empty when you run the SSIS package, the rows you're inserting themselves contain duplicate data (i.e. if your PK is called [ID], you're trying to insert more than one row with the same [ID] into the table).
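A quick way to confirm this is to run a GROUP BY ... HAVING check against the source data before the load. A minimal sketch, using an in-memory SQLite database and made-up table and column names to stand in for the real source (the same query pattern works unchanged in SQL Server):

```python
import sqlite3

# Hypothetical source data: note the repeated id value 2.
rows = [(1, "alpha"), (2, "beta"), (2, "gamma"), (3, "delta")]

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE source_stage (id INTEGER, val TEXT)")  # no PK, so duplicates load fine
con.executemany("INSERT INTO source_stage VALUES (?, ?)", rows)

# Any id listed here would fail a PK on the destination.
dupes = con.execute(
    "SELECT id, COUNT(*) AS n FROM source_stage GROUP BY id HAVING COUNT(*) > 1"
).fetchall()
print(dupes)  # -> [(2, 2)]
```

If this returns any rows for your real source query, the package failure is explained by the source data, not by the (empty) destination.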

Related

Using pandas to_sql to append data frame to an existing table in sql server gives IntegrityError

I tried to append my pandas dataframe to an existing table in SQL Server like below. All my column names in the dataframe are identical to those in the database table.
df.to_sql(table_name,engine,schema_name,index=False,method='multi',if_exists='append',chunksize=100)
But it failed and I got an error like below:
IntegrityError: ('23000', "[23000] [Microsoft][ODBC Driver 17 for SQL Server]
[SQL Server]Cannot insert explicit value for identity column in table 'table_name'
when IDENTITY_INSERT is set to OFF. (544) (SQLParamData)")
I have no clue what that means and what I should do to make it work. It looks like the issue is that IDENTITY_INSERT is set to OFF? Appreciate it if anyone can help me understand why, and what I can potentially do. Thanks.
In layman's terms, the dataframe contains values for the primary key column, and this insert is not allowed by the database because IDENTITY_INSERT is set to OFF. This means that the primary key is supposed to be generated by the database itself. Another possibility is that primary keys are repeated between the dataframe and the database, and you cannot add duplicate primary keys to the table.
You have two options:
First: Check in the database which column is your primary key or identity column; once identified, remove that column from your dataframe and then try to save it to the database.
Second: Turn on identity inserts with SET IDENTITY_INSERT Table1 ON and try again.
If your dataframe doesn't consist of unique primary keys, you might still get another error.
If you get an error after trying both options, kindly update your question with the table schema and the dataframe values using df.head(5).
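The first option can be sketched as follows. This uses an in-memory SQLite table whose AUTOINCREMENT column plays the role of the SQL Server IDENTITY column; the table and column names are made up for illustration:

```python
import sqlite3
import pandas as pd

con = sqlite3.connect(":memory:")
# "id" stands in for the IDENTITY / primary key column on the real table.
con.execute("CREATE TABLE target (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT)")

df = pd.DataFrame({"id": [1, 2], "name": ["ann", "bob"]})

# Drop the identity column and let the database generate the key values.
df.drop(columns=["id"]).to_sql("target", con, if_exists="append", index=False)

print(con.execute("SELECT id, name FROM target ORDER BY id").fetchall())
# -> [(1, 'ann'), (2, 'bob')]
```

Because the identity column never appears in the INSERT, the database assigns the keys itself and the IDENTITY_INSERT error cannot occur.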

Insert records from temp table where record is not already present fails

I am trying to insert from a temporary table into a regular one, but since the temp table contains rows sharing the same values for the primary key of the table I am inserting into, it fails with a primary key constraint violation. That is expected, so I am working around it by inserting only the rows whose primary key is not already present in the target table.
I tried both the EXISTS and NOT IN approaches; I checked examples showcasing both and confirmed both work in SQL Server 2014 in general, yet I am still getting the following error:
Violation of PRIMARY KEY constraint 'PK_dbo.InsuranceObjects'. Cannot
insert duplicate key in object 'dbo.InsuranceObjects'. The duplicate
key value is (3835fd7c-53b7-4127-b013-59323ea35375).
Here is the SQL in the NOT IN variant I tried:
print 'insert into InsuranceObjects'
INSERT INTO $(destinDB).InsuranceObjects
(
Id, Value, DefInsuranceObjectId
)
SELECT Id, InsuranceObjectsValue, DefInsuranceObjectId
FROM #vehicle v
WHERE v.Id NOT IN (SELECT Id FROM $(destinDB).InsuranceObjects) -- prevent error when running the script multiple times over
GO
If not apparent:
Id is the primary key in question.
$(destinDB) is a command line variable (different from a T-SQL variable). It allows me to define the target database and instance at a convenient script level, or even across multiple scripts. It's used in multiple variations throughout the code and has so far performed perfectly. The only downside is that you have to run in SQLCMD mode.
When creating all the temp tables, USE $(some database) is also used, so that's not an issue.
I must be missing something completely obvious, but it's driving me nuts that such a simple query fails. What is worse, when I try running the SELECT without the INSERT part, it returns ALL the records from the temp table, despite my having confirmed there are duplicates that should be filtered out by the NOT IN clause.
I suspect the issue is that you have duplicate Id values inside the temp table itself. NOT IN only filters out rows whose Id already exists in the target table; it does nothing about two #vehicle rows that share the same Id, and an empty target is exactly why the bare SELECT returns all the records. Check for duplicates within the temp table, as that would cause the issue you are seeing.
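The failure mode and one way out can be sketched with an in-memory SQLite database (an ordinary table stands in for #vehicle; the GUIDs are shortened for readability):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE InsuranceObjects (Id TEXT PRIMARY KEY, Value TEXT)")
con.execute("CREATE TABLE vehicle (Id TEXT, Value TEXT)")  # stand-in for #vehicle

# Two rows share the same Id. The target is empty, so NOT IN filters nothing.
con.executemany("INSERT INTO vehicle VALUES (?, ?)",
                [("3835fd7c", "a"), ("3835fd7c", "b"), ("0001", "c")])

try:
    con.execute("""
        INSERT INTO InsuranceObjects (Id, Value)
        SELECT Id, Value FROM vehicle v
        WHERE v.Id NOT IN (SELECT Id FROM InsuranceObjects)""")
except sqlite3.IntegrityError as e:
    print("failed:", e)  # the PK violation comes from dupes inside the temp table

# One fix: also collapse duplicates inside the source, e.g. keep one row per Id.
con.execute("""
    INSERT INTO InsuranceObjects (Id, Value)
    SELECT Id, MIN(Value) FROM vehicle v
    WHERE v.Id NOT IN (SELECT Id FROM InsuranceObjects)
    GROUP BY Id""")
print(con.execute("SELECT COUNT(*) FROM InsuranceObjects").fetchone())  # -> (2,)
```

In SQL Server the same idea applies: the NOT IN guard must be paired with a de-duplication of the temp table (GROUP BY, DISTINCT, or ROW_NUMBER) before the insert can succeed.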

Can one alter a PostgreSQL table to have an auto-generated key after the table has values?

Is it possible to alter a table to make an existing column a serial auto-generated key, without adding a new column? Sorry if this question is a bit newbie-ish for PostgreSQL; I'm more a SQL Server person but am moving to PostgreSQL.
In a nutshell, the program will copy an existing SQL Server database into PostgreSQL, with the goal of having a mirrored DB in PostgreSQL matching the SQL Server source, with the only caveat that one may selectively include/exclude any table or column as desired, or copy everything.
Given that the process copies all values, I thought one should be able to create the keys after the copy has finished, just as one may do in SQL Server. I thought PostgreSQL would have a method comparable to SQL Server's SET IDENTITY_INSERT [ON|OFF] so one may override the auto-generated key with a desired value, but I'm not seeing an equivalent in PostgreSQL. So my fallback is to create the mirrored records in Postgres without any keys and then alter the tables. But it seems that to fix up the table as desired one has to create a new column, and doing this breaks, or causes a headache fixing up, the referential integrity for the PK/FK relationships.
Any suggestions? Thanks in advance.
In PostgreSQL, the auto-generated key is always overridden if you insert an explicit value for it. If you don't specify a value (omit the column), or specify the keyword DEFAULT, a generated key is used.
Given table
CREATE TABLE t1 (id serial primary key, dat text);
then both these will get a generated key from sequence t1_id_seq:
INSERT INTO t1 (dat) VALUES ('fred');
INSERT INTO t1 (id, dat) VALUES (DEFAULT, 'bob');
This will instead provide its own value:
INSERT INTO t1 (id, dat) VALUES (42, 'joe');
You are responsible for ensuring that the provided value doesn't conflict with existing data, or with future values the identity sequence will generate. PostgreSQL will not notice that you manually inserted a row with id 42, and will not skip it when its own sequence counter reaches that point.
Usually what you do is load with provided values, then reset the sequence to the max of all keys already in the table, so it keeps counting from there for new local inserts.
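The collision and the reset can be simulated in a few lines of plain Python. Here `itertools.count` plays the role of the Postgres sequence, which hands out values without ever looking at the table (in real Postgres the reset is `SELECT setval('t1_id_seq', (SELECT max(id) FROM t1));`):

```python
import itertools

# Toy stand-in for a Postgres sequence: hands out values, never inspects the table.
seq = itertools.count(1)

table = {}  # id -> row

# Bulk-load with provided keys, as during the migration.
for provided_id in (1, 2, 3, 42):
    table[provided_id] = f"row {provided_id}"

# A "local" insert now would draw 1 from the sequence and collide with key 1,
# so reset the sequence past the max existing key before resuming normal inserts.
seq = itertools.count(max(table) + 1)

new_id = next(seq)
table[new_id] = "row 43"
print(new_id)  # -> 43
```

After the reset, generated keys start above every migrated key, so DEFAULT inserts and the loaded data can no longer conflict.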

Moving history data from a database to another through SSIS package

I have two databases. I want to move some history data from a fact table to another database; the destination table is exactly the same as the source table, including all the constraints.
I use an SSIS package to transfer the data as below:
First, use an OLE DB Source to select the data from the source for the required period.
Load it into a temp table in the second database using an OLE DB Destination.
Then load it into the final table using an Execute SQL Task.
But I get the below error:
Error: Violation of PRIMARY KEY constraint
'PK__Financia__362B520524BEA57A'. Cannot insert duplicate key in
object 'Fact.FinancialTransactions'. The duplicate key value is
(100001 , 2010012, Dec 31 2010 12:00AM, 65, 88).
How do I get around this issue? I want to keep the constraints in the destination table.
You may want to add one more Execute SQL Task which disables the constraints for some time period. But the problem is that if you disable the primary key, you can't do any operations on the table; dropping it is also an option, but recreating it again will take time. So the best option would be to fix the error, or rebuild your index as below:
ALTER TABLE t1 REBUILD WITH (IGNORE_DUP_KEY = ON)
This will cause duplicate keys to be silently discarded instead of failing the load. More info here: Can I set ignore_dup_key on for a primary key?
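The effect of IGNORE_DUP_KEY can be sketched with SQLite's INSERT OR IGNORE, which behaves the same way for this purpose (duplicate keys are skipped rather than aborting the statement); table and column names here are made up:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE fact (k INTEGER PRIMARY KEY, v TEXT)")
con.execute("INSERT INTO fact VALUES (1, 'old')")

# Like IGNORE_DUP_KEY = ON: the duplicate key (1, 'dup') is silently
# discarded, the non-conflicting row (2, 'new') is inserted, and the
# statement as a whole succeeds.
con.executemany("INSERT OR IGNORE INTO fact VALUES (?, ?)",
                [(1, "dup"), (2, "new")])

print(con.execute("SELECT k, v FROM fact ORDER BY k").fetchall())
# -> [(1, 'old'), (2, 'new')]
```

Note the trade-off: duplicates are dropped, not merged, so the first row loaded for a given key wins; if the history rows may differ from existing rows with the same key, fixing the source query is safer than ignoring the conflicts.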

Is it possible to determine what caused duplicate key values?

I have a job which performs replication between two databases. After I insert a new value into the first table, tbl_company_market_service_phone_number, I get this error from the job, and the value does not exist in the table which holds the replicated data (the second table).
Error:Cannot insert duplicate key row in object 'tbl_company_market_service_phone_number' with
unique index 'IX_tbl_company_market_service_phone_number_fld_uk'. The duplicate key value is
(65b763ac-6f8f-4fe6-b76c-02a75b71dbe1).
How can I find which row has the key value (65b763ac-6f8f-4fe6-b76c-02a75b71dbe1)? Or otherwise find the cause of the job failure?
So it is not in the replicated data. Did you check the source table? The error is telling you the table name, tbl_company_market_service_phone_number, and that is the name of the source table.
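Since the error message already names the table, the unique index, and the key value, the row can be looked up directly on both sides of the replication. A minimal sketch with SQLite and a guessed column shape (on the real servers, the same SELECT with the column behind index IX_tbl_company_market_service_phone_number_fld_uk applies):

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Hypothetical shape: fld_uk is the column behind the unique index in the error.
con.execute("CREATE TABLE phone_numbers (fld_uk TEXT, phone TEXT)")
con.execute("INSERT INTO phone_numbers VALUES "
            "('65b763ac-6f8f-4fe6-b76c-02a75b71dbe1', '555-0100')")

# Run the same lookup on the source AND on the subscriber: whichever side
# already holds the value explains why replication hit a duplicate.
hits = con.execute(
    "SELECT * FROM phone_numbers WHERE fld_uk = ?",
    ("65b763ac-6f8f-4fe6-b76c-02a75b71dbe1",)
).fetchall()
print(hits)
```

If the lookup matches a row on the source but not on the subscriber, the conflict was introduced at the publisher; if it matches on the subscriber, something wrote to the replicated table outside the replication job.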
