Moving history data from a database to another through SSIS package - sql-server

I have two databases and I want to move some history data from a fact table in one to the other. The destination table is exactly the same as the source table, including all the constraints.
I use an SSIS package to transfer the data as follows:
First, an OLE DB Source selects the data from the source for the required period.
An OLE DB Destination then loads it into a temp table in the second database.
Finally, an Execute SQL Task loads it into the final table.
But I get the error below:
Error: Violation of PRIMARY KEY constraint
'PK__Financia__362B520524BEA57A'. Cannot insert duplicate key in
object 'Fact.FinancialTransactions'. The duplicate key value is
(100001 , 2010012, Dec 31 2010 12:00AM, 65, 88).
How do I get around this issue? I want to keep the constraints on the destination table.

You could add one more Execute SQL Task that disables the constraints for a period of time, but the problem is that if you disable the primary key, you can't do any operations on the table. Dropping the constraint is also an option, but recreating it will take time. So the best option would be either to fix the error or to rebuild your index as below:
ALTER TABLE t1 REBUILD WITH (IGNORE_DUP_KEY = ON)
This will allow duplicates to be ignored. More info here: Can I set ignore_dup_key on for a primary key?
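Another option that keeps the primary key fully enforced is to filter out already-existing keys when loading from the temp table into the final table. A sketch, assuming a staging table named Staging.FinancialTransactions and a single-column key TransactionID (the names and columns here are placeholders; adjust to your actual composite key):

```sql
-- Insert only rows whose key is not already present in the destination.
INSERT INTO Fact.FinancialTransactions (TransactionID, Amount, TransactionDate)
SELECT s.TransactionID, s.Amount, s.TransactionDate
FROM Staging.FinancialTransactions AS s
WHERE NOT EXISTS (
    SELECT 1
    FROM Fact.FinancialTransactions AS f
    WHERE f.TransactionID = s.TransactionID
);
```

This would go in the Execute SQL Task, so duplicate history rows are skipped rather than causing the load to fail.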

Related

Using pandas to_sql to append data frame to an existing table in sql server gives IntegrityError

I tried to append my pandas dataframe to an existing table in SQL Server as below. All the column names in my data are identical to those in the database table.
df.to_sql(table_name,engine,schema_name,index=False,method='multi',if_exists='append',chunksize=100)
But it failed and I got error like below:
IntegrityError: ('23000', "[23000] [Microsoft][ODBC Driver 17 for SQL Server]
[SQL Server]Cannot insert explicit value for identity column in table 'table_name'
when IDENTITY_INSERT is set to OFF. (544) (SQLParamData)")
I have no clue what that means or what I should do to make it work. It looks like the issue is that IDENTITY_INSERT is set to OFF? I'd appreciate it if anyone can help me understand why, and what I can do about it. Thanks.
In layman's terms, the dataframe contains primary key values, and this insert is not allowed by the database because IDENTITY_INSERT is set to OFF. This means the primary key will be generated by the database itself. Another possibility is that primary keys are repeated between the dataframe and the database, and you cannot add duplicate primary keys to the table.
You have two options:
First: check in the database which column is your primary key or identity column. Once identified, remove that column from your dataframe and then try to save it to the database.
Second: turn on identity insert with SET IDENTITY_INSERT Table1 ON and try again.
If your dataframe doesn't consist of unique primary keys, you might still get another error.
If you still get an error after trying both options, kindly update your question with the table schema and the dataframe values from df.head(5).
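If you take the second option, note that SET IDENTITY_INSERT applies only within the current session and the INSERT must use an explicit column list, so it is usually easier to run as raw SQL than through to_sql. A sketch, with placeholder table and column names:

```sql
SET IDENTITY_INSERT dbo.Table1 ON;

-- An explicit column list is required while IDENTITY_INSERT is ON.
INSERT INTO dbo.Table1 (Id, Name)
VALUES (101, 'example');

SET IDENTITY_INSERT dbo.Table1 OFF;
```

Only one table per session can have IDENTITY_INSERT set to ON at a time, so turn it back off as soon as the load finishes.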

I am using SSIS and I am getting this error: "Violation of PRIMARY KEY constraint. Cannot insert duplicate key in object. The duplicate key value is..."

I am new to SSIS and I am getting an error message. Can anyone help me? There are no duplicates in my data.
The error message is
An OLE DB record is available. Source: "Microsoft SQL Server Native Client 11.0" Hresult: 0x80040E2F Description: "Violation of PRIMARY KEY constraint 'PK_DimCourse'. Cannot insert duplicate key in object 'dbo.DimCourse'. The duplicate key value is (CS1301).".
My current table looks like this:
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[DimCourse](
[CourseCode] [nvarchar](10) NOT NULL,
[SubjectCode] [nvarchar](10) NOT NULL,
[CourseNumber] [nvarchar](10) NOT NULL,
[CourseTitle] [nvarchar](50) NOT NULL,
[Level1] [nvarchar](20) NOT NULL,
[Level2] [nvarchar](20) NOT NULL,
CONSTRAINT [PK_DimCourse] PRIMARY KEY CLUSTERED
(
[CourseCode] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
I know I am late to answer, but I was getting a similar error message, found out what the problem was, and solved it. I hope it helps future readers.
Scenario:
I have a source table dbo.Codes with a composite primary key (CodeID, CodeName).
To create my destination table, I used the SQL Server --> Database --> Tasks --> Generate Scripts option.
Then, in the SSIS package, I simply used the OLE DB Source and OLE DB Destination elements. It returned the error: "Violation of PRIMARY KEY constraint 'pkCodes'. Cannot insert duplicate key in object 'dbo.Codes'. The duplicate key value is (106, da7A)."
What I have tried to solve:
I tried using a SQL command as the source: SELECT CodeID, CodeName FROM dbo.Codes GROUP BY CodeID, CodeName. It still returned the error, which confused me.
After searching online, I found a tip to add a Sort element (SSIS Toolbox --> Common --> Sort) between the OLE DB Source and OLE DB Destination, with the option "Remove rows with duplicate sort values" checked. I was still getting the error.
Then I enabled the data viewer between the OLE DB Source and the Sort element, and I could see two rows in the source table: (106, da7a) and (106, da7A).
What is the real problem?
My source table column CodeName is case sensitive, but the CodeName column in my destination table is not. This happened because, in SQL Server's Generate Scripts option, Script Collation is set to False by default.
Solution that worked for me:
I recreated my destination table with the Script Collation option set to True, which made my destination column case sensitive, and that solved my problem.
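For reference, the fix amounts to giving the destination key column the same case-sensitive collation as the source, so that 'da7a' and 'da7A' count as distinct key values. A sketch (the collation name below is an example; match whatever collation your source column actually uses):

```sql
CREATE TABLE dbo.Codes (
    CodeID   int NOT NULL,
    -- CS = case-sensitive: under this collation 'da7a' <> 'da7A',
    -- so both rows fit under the composite primary key.
    CodeName nvarchar(10) COLLATE SQL_Latin1_General_CP1_CS_AS NOT NULL,
    CONSTRAINT pkCodes PRIMARY KEY CLUSTERED (CodeID, CodeName)
);
```

Under a case-insensitive (CI) collation the two values compare equal, which is exactly why the duplicate-key error fires even though the source rows look different.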
A record with CourseCode = 'CS1301' already exists in the target and the same CourseCode will be inserted by SSIS from the source which leads to a duplicate.
Either that, or your target does not contain CourseCode = 'CS1301' yet, but your source data contains two rows with the same CourseCode; inserting both into the target also produces a duplicate.
I would suggest querying your source data for CourseCode = 'CS1301' to see if you find two rows. If there is only one row, query your target data for CourseCode = 'CS1301'. If there is also one row, compare and depending on the situation you probably have to delete one of them.
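One way to run that check is a GROUP BY ... HAVING query against the source (the table name here is a placeholder for your actual source table):

```sql
-- List any CourseCode values that appear more than once in the source.
SELECT CourseCode, COUNT(*) AS occurrences
FROM dbo.SourceCourse
GROUP BY CourseCode
HAVING COUNT(*) > 1;
```

If this returns rows, the duplication is in the source feed; if it returns nothing, compare against the target with the same query.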
On the production server, a "perfectly good" package can fail days, weeks, or months later with a false duplicate primary key error. In my experience this happens when a package keeps the development default protection level, which is something like EncryptWithUserKey. It can be avoided by following Microsoft's standard practice of encrypting the package (or perhaps setting it to not save sensitive data, which I didn't bother trying). It is also good practice to parameterize connections (configure the connection manager dynamically).

SSIS Error: Violation of primary key constraint. cannot insert duplicate key in object

I'm working with a team to resolve an SSIS package failure. The package contains four Sequence containers, and each container has a set of SQL tasks that truncate a target table and insert data from a source into the target. At times the package fails with the error "Violation of primary key constraint. Cannot insert duplicate key in object" even though there is no violation, as the table is empty when we start the load. Please provide suggestions on how to troubleshoot the issue.
Note: the source and destination differ somewhat in structure. The source tables contain a PK on only one int column. The destination table contains one additional PK column that takes a default value. I don't understand why we need a constraint on a column with a default value.
It sounds like even though the target table is empty when you run the SSIS package, the rows you're inserting themselves contain duplicates (i.e. if your PK is called [ID], you're trying to insert more than one row with the same [ID] into the table).
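If the incoming batch itself does contain duplicates and you want to load it anyway, one option is to keep a single row per key using ROW_NUMBER. A sketch (the staging/target table and column names are placeholders):

```sql
-- Keep one arbitrary row per [ID] from the staging data, drop the rest.
WITH Ranked AS (
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY ID ORDER BY (SELECT NULL)) AS rn
    FROM dbo.Staging
)
INSERT INTO dbo.Target (ID, SomeColumn)
SELECT ID, SomeColumn
FROM Ranked
WHERE rn = 1;
```

If which duplicate survives matters, replace ORDER BY (SELECT NULL) with a deterministic ordering such as a timestamp column.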

How to import to a SQL Server table and start primary key where left off

I can't seem to find an answer to this. I am very new to SQL Server. I have been trying to set up a database to be updated daily for a website.
There is a .CSV file produced daily. I have set up a script to copy the file, edit the text and import the file into a table in SQL Server 2012.
There are 16 fields in the .CSV file. I have a 17th field in the table I import it into.
The 17th field is the Primary Key which I have set to autoincrement.
My problem is this:
I'm implementing this as a new process. It is already set up and in operation on an older server, which was running MySQL. The primary key there left off at 81,720,024.
I have set the Primary Key field to autoincrement with a seed of 81720024.
Every time I update the table I truncate it first and then import from a staging table. The primary key always starts over at 81720024. I need it to increment from the last entry it had. Please help!
Try deleting from the table instead of truncating.
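This works because TRUNCATE TABLE resets the identity counter back to its original seed, while DELETE preserves the last value used. If you prefer to keep TRUNCATE for performance, you can instead reseed explicitly after it (the table name is a placeholder, and the reseed value should be the last identity value actually used, captured before the truncate):

```sql
-- Option 1: DELETE empties the table but keeps the identity counter.
DELETE FROM dbo.MyTable;

-- Option 2: keep TRUNCATE, then restore the counter manually.
TRUNCATE TABLE dbo.MyTable;
DBCC CHECKIDENT ('dbo.MyTable', RESEED, 81720024);
```

After the reseed, the next inserted row receives 81720025.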

Changing column constraint null/not null = rowguid replication error

I have a database running under SQL Server 2005 with merge replication. I want to change some of the FK columns to be 'not null', as they should always have a value. SQL Server won't let me do that, though; this is what it says:
Unable to modify table. It is invalid to drop the default constraint on the rowguid column that is used by merge replication. The schema change failed during execution of an internal replication procedure. For corrective action, see the other error messages that accompany this error message. The transaction ended in the trigger. The batch has been aborted.
I am not trying to change the constraints on the rowguid column at all, only on another column that is acting as an FK. Other columns I want to set to not null because the record doesn't make sense without that information (e.g. the customer name on a customer record).
Questions:
Is there a way to update columns to be 'not null' without turning off replication then turning it back on again?
Is this even the best way to do this - should I be using a constraint instead?
Apparently SSMS makes changes to tables by dropping and recreating them, so the change just needed to be made with a T-SQL statement:
ALTER TABLE dbo.MyTable ALTER COLUMN MyColumn nvarchar(50) NOT NULL
You need to script out your change in T-SQL statements as SQL Server Management Studio will look to drop and re-create the table, as opposed to simply adding the additional column.
You will also need to add the new column to your Publications.
Please note that changing a column in this manner can be detrimental to the performance of replication. Depending on the size of the table you are altering, it can lead to a lot of data being replicated. Consider that although your table modification can be performed in a single statement, if 1 million rows are affected then 1 million updates will be generated at the Subscriber, NOT a single update statement as is commonly thought.
The hands-on, improved-performance approach:
To perform this exercise you need to:
Backup your Replication environment by scripting out your entire configuration.
Remove the table from Replication at both Publishers/Subscribers.
Add the column at each Publisher/Subscriber.
Apply the update locally at each Publisher/Subscriber.
Add the table back into Replication.
Validate that transactions are being replicated.