Table lock in SQL Server

I have a table that is modified by an SSIS package. This package runs 20+ child packages in parallel, and all of them insert values into the same table via a stored procedure; the table should contain only distinct records. Because the packages run in parallel, a few records are getting duplicated.
My question: if I put a lock on this table and two processes/packages try to insert into it at the same time, which one gets priority? And if process 1 gets the lock first, will process 2 get an error message, or will it wait until process 1 releases the lock?
Implementing a lock on the table would have a performance impact (my initial thought; I am happy to be challenged on it).
Can anyone suggest a solution to the problem of getting duplicate records from multiple processes?
Thanks

Assuming your duplication issue really is caused by not locking the table, here are a few ideas that might help:
Change TransactionOption for all 20+ tasks (assuming you are using 20+ 'Execute Package Task' items in one package to run the other 20+ packages) from 'Supported' to 'Required';
Instead of running all tasks in parallel, put them in sequence. I would suggest running them in sequence anyway to determine the real cause of the duplicates.
Edit:
- If I were you, I would try to load the 20+ packages into 20+ separate staging tables, and when all packages are finished, load all the staging tables into your destination table in sequence. Verifying duplication then becomes very easy, since each staging table is local to one package.
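For illustration, here is a minimal T-SQL sketch of an insert procedure that cannot produce duplicates; dbo.TargetTable, BusinessKey, and usp_InsertDistinct are hypothetical names. It also answers the priority question: the UPDLOCK/HOLDLOCK hints make the existence check and the insert atomic, so the second caller simply waits for the first caller's lock to be released rather than getting an error (unless it waits long enough to hit a lock timeout or a deadlock).

    -- Option 1: let the engine enforce uniqueness, so parallel inserts cannot create duplicates.
    ALTER TABLE dbo.TargetTable
        ADD CONSTRAINT UQ_TargetTable_BusinessKey UNIQUE (BusinessKey);
    GO

    -- Option 2: make the check-and-insert atomic inside the procedure.
    -- UPDLOCK + HOLDLOCK keeps the key range locked between the existence
    -- check and the insert, so a concurrent caller blocks until COMMIT.
    CREATE PROCEDURE dbo.usp_InsertDistinct
        @BusinessKey INT,
        @Payload     NVARCHAR(200)
    AS
    BEGIN
        SET NOCOUNT ON;
        BEGIN TRANSACTION;

        IF NOT EXISTS (
            SELECT 1
            FROM dbo.TargetTable WITH (UPDLOCK, HOLDLOCK)
            WHERE BusinessKey = @BusinessKey
        )
        BEGIN
            INSERT INTO dbo.TargetTable (BusinessKey, Payload)
            VALUES (@BusinessKey, @Payload);
        END

        COMMIT TRANSACTION;
    END;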

Related

SSIS stuck in validation when running package (SQL Server 2016)

I've got a very basic SSIS package with a task in it that contains a simple Data Flow. The source is a table in a database on one SQL Server instance, and the destination is an identical table in a database on a different instance. I've done this kind of thing tons of times, but this time I'm getting a weird occurrence.
When I run the package, it freezes on the validation of the destination after 50% has been reached. I've turned off referential integrity for the table, and I'm doing a fast load keeping identity fields and nulls, with table lock and check constraints ticked. The table only contains about 200,000 records and should be fast to copy.
Anyone know what could be causing this?
As an extra bit of information, if I execute the task directly it runs fine.
Well, that was very odd. It turns out that the previous task (an Execute SQL Task) was messing things up. I was turning off referential integrity for the table, which should not have adversely affected the data flow. Removing this sorted out the problem. To be honest, I don't need it in this case; it was just copied from a previous package that did need it.
Weird.

SSIS doesn't finish the insert when there are 34,000 rows; could my DB have some kind of limitation?

I created a package in SSIS that performs an insert into different databases; the table is the same.
But my package doesn't finish when I have a lot of rows; it keeps running forever.
I have noticed that this problem doesn't occur for a database located on another instance. I don't know, maybe my database doesn't allow a huge insert?
I did some tests inserting fewer rows and my package worked well. I think my problem may be some role in my DB...
[Image: Flow of integration]
Please let me know what the source of the data is.
Thanks

SSIS locking table while updating it

I have an SSIS package which, when run, updates a table. It uses a staging table and subsequently a Slowly Changing Dimension to load the data into the warehouse. We have set it up as a SQL Agent job and it runs every two hours.
The isolation level of the package is serializable. The database isolation level is read committed.
The issue is that when this job runs, it blocks the table, and therefore clients cannot run any reports; the data is blanked out for them.
So what would be the best option for me to avoid this? Clients need to see the data, but we also need to update the table every two hours.
Using Microsoft SQL Server 2012 (SP3-GDR) (KB4019092) - 11.0.6251.0 (X64)
Thanks.
You're getting "lock escalation". It's a feature, not a bug. 8-)
SQL Server combines large numbers of smaller locks into a table lock to improve performance.
If INSERT performance isn't an issue, you can do your data load in smaller chunks inside of transactions and commit after each chunk.
https://support.microsoft.com/en-us/help/323630/how-to-resolve-blocking-problems-that-are-caused-by-lock-escalation-in
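As a rough illustration of the chunked-commit idea (table and column names here are made up), each batch stays below the roughly 5,000-lock threshold at which SQL Server considers escalating to a table lock, and committing after every chunk releases the locks before the next one starts:

    DECLARE @BatchSize INT = 4000,
            @Rows      INT = 1;

    WHILE @Rows > 0
    BEGIN
        BEGIN TRANSACTION;

        -- Copy the next chunk of not-yet-loaded rows.
        INSERT INTO dbo.Warehouse (Id, Col1, Col2)
        SELECT TOP (@BatchSize) s.Id, s.Col1, s.Col2
        FROM dbo.Staging AS s
        WHERE NOT EXISTS (SELECT 1 FROM dbo.Warehouse AS w WHERE w.Id = s.Id);

        SET @Rows = @@ROWCOUNT;   -- captured before COMMIT resets it

        COMMIT TRANSACTION;       -- locks taken by this chunk are released here
    END;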
Another option is to give your clients/reports access to a clone of your warehouse table.
Do your ETL into a table that no one else can read from, and when it is finished, switch the table with the clone.
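A minimal sketch of that swap, assuming a live table dbo.Warehouse that reports read from and an identically structured clone dbo.Warehouse_Load that only the ETL writes to (both names hypothetical). The renames are metadata-only, so readers are blocked only for the instant of the swap:

    TRUNCATE TABLE dbo.Warehouse_Load;

    -- ... the ETL loads dbo.Warehouse_Load here, taking as long as it needs ...

    BEGIN TRANSACTION;
        EXEC sp_rename 'dbo.Warehouse',      'Warehouse_Old';
        EXEC sp_rename 'dbo.Warehouse_Load', 'Warehouse';
        EXEC sp_rename 'dbo.Warehouse_Old',  'Warehouse_Load';
    COMMIT TRANSACTION;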

SQL Server: Archiving old data

I have a database that is getting pretty big, but the client is only interested in the last 2 years of data. However, they would like to keep the older data "just in case".
Now we would like to archive the data to a different server over a WAN.
My plan is to create a stored proc to:
Copy all data from lookup tables, tables containing master data and foreign key tables over to the archive server.
Copy data from transactional tables over to the archive DB.
Delete transactional data from master db that's older than 2 years.
Although the approach will theoretically meet our needs, the 2 main problems are:
Performance: I'm copying the data over via SQL linked servers. Some of the big tables are really slow, as the process needs to compare which records exist and update them, and create the records that don't exist. It looks like it will run for 3-4 hours.
We need to copy the tables in the correct sequence to prevent foreign key violations. Also, tables that have a relationship to themselves (e.g. a Customers table with a ParentCustomer field) need to be transferred without the ParentCustomer, and then the ParentCustomer needs to be updated afterwards to prevent FK violations (sketched below). This makes it difficult to auto-generate my INSERT and UPDATE statements (and I would like to auto-generate them as far as possible).
I just feel there might be a better way of archiving data that I don't yet know about. SSIS might be an option, but I'm not sure whether it will solve my existing challenges. I don't know much about SSIS, so I might need to find some material to study if that's the way to go.
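For what it's worth, a minimal sketch of the two-pass copy for the self-referencing case mentioned in point 2; every object name here, including the linked server ArchiveServer, is hypothetical:

    -- Pass 1: copy rows without the self-reference so no FK can be violated.
    INSERT INTO ArchiveServer.ArchiveDb.dbo.Customers (CustomerId, Name, ParentCustomer)
    SELECT c.CustomerId, c.Name, NULL
    FROM dbo.Customers AS c
    WHERE c.CreatedDate < DATEADD(YEAR, -2, GETDATE());

    -- Pass 2: restore the parent links once every parent row exists on the archive side.
    UPDATE a
    SET a.ParentCustomer = c.ParentCustomer
    FROM ArchiveServer.ArchiveDb.dbo.Customers AS a
    JOIN dbo.Customers AS c ON c.CustomerId = a.CustomerId
    WHERE c.ParentCustomer IS NOT NULL;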
I believe you need a batch process that will run as a scheduled task; perhaps every night. There are two options, which you have already discussed:
1) SQL Agent Job, which executes a Stored Procedure. The stored procedure will use Linked Servers.
2) SQL Agent Job, which will execute an SSIS package.
I believe you could benefit from a combination of both approaches, which would avoid linked servers. Here are the steps:
1) An SQL Agent Job executes an SSIS package, which transfers the data to be archived from the live database to the copy database. This should be done in a specific sequence to avoid foreign key violations.
2) Once the SSIS package has executed the transfer, then it executes a stored procedure on the live database deleting the information that is over two years old. The stored procedure will not require any linked servers.
You will have to use transactions to make sure duplicate data is not archived. For example, if the SSIS package fails then the transaction should be rolled back and the Stored Procedure should not be executed.
You can use table partitions to create separate partitions for relevant date ranges.
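A hedged sketch of what such partitioning could look like, with hypothetical object names; with the transactional table partitioned by year, data older than 2 years sits in its own partitions, which can be switched out to an archive table as a metadata operation instead of being deleted row by row:

    CREATE PARTITION FUNCTION pfOrderYear (DATE)
    AS RANGE RIGHT FOR VALUES ('2016-01-01', '2017-01-01', '2018-01-01');

    CREATE PARTITION SCHEME psOrderYear
    AS PARTITION pfOrderYear ALL TO ([PRIMARY]);

    CREATE TABLE dbo.OrdersPartitioned
    (
        OrderId   BIGINT        NOT NULL,
        OrderDate DATE          NOT NULL,
        Amount    DECIMAL(18,2) NOT NULL,
        CONSTRAINT PK_OrdersPartitioned PRIMARY KEY (OrderId, OrderDate)
    ) ON psOrderYear (OrderDate);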

Using temporary tables in SSIS flow fails

I have an ETL process which extracts ~40 tables from a source database (Oracle 10g) into a SQL Server (2014 Developer Edition) staging environment. My extraction process is:
Determine newest row in staging
Select all newer rows from source
Insert results into #TEMPTABLE
Merge results from #TEMPTABLE to Staging
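A minimal T-SQL sketch of those four steps, with hypothetical table and column names (in the real package the source extract comes from Oracle via a Data Flow, and the #temp table is only visible to later tasks because they reuse the same connection):

    -- 1. Determine the newest row already in staging.
    DECLARE @HighWaterMark DATETIME2 =
        (SELECT ISNULL(MAX(ModifiedDate), '19000101') FROM stg.Orders);

    -- 2 + 3. Land only the newer rows in a temp table.
    CREATE TABLE #TEMPTABLE
    (
        OrderId      BIGINT        NOT NULL,
        Amount       DECIMAL(18,2) NOT NULL,
        ModifiedDate DATETIME2     NOT NULL
    );

    INSERT INTO #TEMPTABLE (OrderId, Amount, ModifiedDate)
    SELECT s.OrderId, s.Amount, s.ModifiedDate
    FROM src.Orders AS s              -- stand-in for the Oracle source query
    WHERE s.ModifiedDate > @HighWaterMark;

    -- 4. Merge the delta into staging.
    MERGE stg.Orders AS tgt
    USING #TEMPTABLE AS delta
        ON tgt.OrderId = delta.OrderId
    WHEN MATCHED THEN
        UPDATE SET tgt.Amount = delta.Amount, tgt.ModifiedDate = delta.ModifiedDate
    WHEN NOT MATCHED BY TARGET THEN
        INSERT (OrderId, Amount, ModifiedDate)
        VALUES (delta.OrderId, delta.Amount, delta.ModifiedDate);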
This works on a package-by-package basis, both from Visual Studio locally and when executed from SSISDB on the SQL Server.
However, I am grouping my extract jobs into one master package for ease of execution and flow into the transform stage. Only about 5 of my packages use temporary tables; the others are all truncate-and-load, but I wanted to move more of them to this method. When I run the master package, anything using a temporary table fails. Because of the pretty large log files it's hard to pinpoint the actual error, but so far all it tells me is that the #TEMPTABLE can't be found and/or the status is VS_ISBROKEN.
Things I have tried:
Set all relevant components to delay validation = false
Master package has ExecuteOutOfProcess = true
Increased my tempdb capacity far exceeding my needs
One thought I had was the RetainSameConnection = true setting on my staging database connection - could this be the cause? I would try to create separate connections for each, but assumed that ExecuteOutOfProcess would take care of this for me.
EDIT
I created the following scenario:
Package A (Master package containing Execute Package Task references only)
Package B (Uses temp tables)
Package C (No temp tables)
Executing Package B on its own completes successfully. All temp table usage is contained within this package - there is no requirement for Package C to see the temp table created by Package B.
Executing Package C completes successfully.
Executing Package A, C completes successfully, B fails.
UPDATE
The workaround was to create a package-level connection for each package that uses temporary tables, thus ensuring that each package holds its own connection. I have raised a Connect issue with Microsoft, as I believe that since the parent package opens the connection, the child packages should inherit and retain it.
Several suggestions for your case.
Set RetainSameConnection=true. This will allow you to work safely with temp tables in SSIS packages.
I would not use ExecuteOutOfProcess; it increases your RAM footprint, since every child package starts in its own process, and decreases performance by adding process start-up lag. It was used in 32-bit environments to overcome the 2 GB limit, but on x64 it is no longer necessary.
Child package execution does not inherit connection object instances from its Parent, so the same connection will not be spanned across all of your Child packages.
SSIS Packages with Temp table operations are more difficult to debug (less obvious), so pay attention to testing.
