I have an SSIS package that seems to be having concurrency issues.
I have set it up in a way that each of the 6 containers can be run all at the same time, or individually. They all run a SQL stored procedure and store data into a table object.
The first task is a SQL task that gets a list of clients from the database. The others are all foreach loops (one iteration per client). When I run the package for ALL containers, it seems to fail after 1 loop of the 2nd and 3rd containers. There is nothing in the output/debug, other than "Package has started". The first loop completes quite quickly for every client (< 10 seconds), whereas the others take about 2-3 minutes to run for each client (a lot more data).
If I run the package for a single foreach loop, it completes without issue (it iterates 7 times). It only fails after 1 loop if the other containers are also running. The first task that gets the client IDs stores them into a table; however, there are only 2 variables that each foreach loop stores the current row in (Client ID and Client Name). My thinking is that once the first container is done (i.e. finished all 7 clients), the values in the variables have changed and the other loops fail.
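To make the suspicion about the shared variables concrete, here is a minimal simulation in Python rather than SSIS. Everything in it is hypothetical: the client rows, the loop names, and the dictionaries standing in for the package variables. It only sketches the general hazard of two loops writing to the same variable, and does not confirm that this is what makes the package fail.

```python
from itertools import zip_longest

# Hypothetical client rows standing in for the table object.
CLIENTS = [(1, "Acme"), (2, "Globex"), (3, "Initech")]

def loop(name, clients, variables, steps_per_client):
    """Simulate a foreach loop; yield (loop, intended client, client seen) per unit of work."""
    for client_id, _client_name in clients:
        variables["ClientID"] = client_id  # the loop maps the current row to a variable
        for _ in range(steps_per_client):
            yield (name, client_id, variables["ClientID"])

def run_interleaved(*loops):
    """Round-robin the loops to mimic containers running at the same time."""
    steps = []
    for batch in zip_longest(*loops):
        steps.extend(step for step in batch if step is not None)
    return steps

def mismatches(steps):
    """Steps where the work saw a different client than the one it was started for."""
    return [step for step in steps if step[1] != step[2]]

# Case 1: both loops write to the same variables (like package-scoped variables).
shared = {}
shared_run = run_interleaved(
    loop("fast", CLIENTS, shared, steps_per_client=1),
    loop("slow", CLIENTS, shared, steps_per_client=3),
)

# Case 2: each loop has its own copy (like variables scoped to each container).
fast_vars, slow_vars = {}, {}
scoped_run = run_interleaved(
    loop("fast", CLIENTS, fast_vars, steps_per_client=1),
    loop("slow", CLIENTS, slow_vars, steps_per_client=3),
)

print("shared mismatches:", mismatches(shared_run))
print("scoped mismatches:", mismatches(scoped_run))
```

In the shared case the slow loop ends up reading a client ID written by the fast loop, while the scoped case has no mismatches. If this is the cause, giving each container its own copy of the Client ID and Client Name variables would be the thing to try.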
http://i.imgur.com/AJwrLNF.png
I cannot read the SSIS tasks. The reason one does things in parallel in SSIS is to gain performance by using the built-in parallelizing features of both SSIS and SQL Server. I once had to process millions of rows, and by doing it in parallel I got it done in the window available to me. If you are processing 7 clients (and the tasks seem to be very different), you can be sure all kinds of locking (and probably deadlocks) are taking place. Just do it sequentially.
I have about 7 projects deployed on a SQL Server. Each one contains a MasterPackage which runs all the child packages of that project. The issue is that I want all 7 projects to run in parallel, starting at the same time, but as it is right now, they get queued up and start one after another. Can I make all the projects start at the same time?
You can always schedule package executions by means of SQL Server Agent jobs. You will probably have to create a separate job for each project, but after that, whatever schedule you pick for them should be followed.
Just keep in mind that, if the packages push a lot of data through, the server might not cope with the total workload, so parallel execution might be slower than serialised execution.
I have a package that is set up to concatenate a bunch of individual values from a table, pass those values into an IN clause to form a SQL string, then pass that string into a data flow task that pulls data from another server. The server I am querying only allows me to pull 1k records at a time, so this process usually has to loop anywhere from 10 to 100 times depending on what our volume is like for the day.
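The batching described above can be sketched roughly as follows, in Python rather than SSIS. The table name `dbo.SourceTable`, the column name `Id`, and the assumption that each value returns one row (so 1,000 values per IN list stays within the limit) are all hypothetical, since the actual query is not shown.

```python
def chunked(values, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(values), size):
        yield values[start:start + size]

def build_queries(ids, batch_size=1000, table="dbo.SourceTable", column="Id"):
    """Build one SELECT per batch, each with an IN list of at most batch_size values.

    The table and column names are placeholders, not taken from the package.
    """
    queries = []
    for batch in chunked(ids, batch_size):
        # int() guards against anything non-numeric ending up in the SQL string.
        in_list = ", ".join(str(int(value)) for value in batch)
        queries.append(f"SELECT * FROM {table} WHERE {column} IN ({in_list})")
    return queries

queries = build_queries(list(range(1, 2501)))
print(len(queries), "queries;", "last one starts:", queries[-1][:60])
```

Each loop iteration would then run one of these strings. Building them one at a time inside the loop, rather than all up front, keeps only a single batch in memory at once.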
Whenever I initially run the package, it performs fine. But over time DtsDebugHost.exe *32 gradually accumulates more and more memory until the package crashes. This happens both in Visual Studio and at the command prompt when I execute it with dtexec. How can I stop DtsDebugHost.exe *32 from hogging all my memory and crashing?
Go to Task Manager and kill DtsDebugHost.exe; that will solve the issue.
I have a SQL Server Agent job running every 5 minutes that executes an SSIS package from the SSIS Catalog. That package does the following:
DELETE all existing data ON OLTP_DB
Extract data from Production DB
DELETE all existing data on OLAP_DB and then
Extract data transformed from OLTP_DB into OLAP_DB ...
PROBLEM
The job I mentioned above hangs randomly for some reason that I don't know.
I just realized, using the Activity Monitor, that every time it hangs it shows something like:
and if I try to run any query against that database it does not respond; it just says "executing..." and nothing happens until I stop the job.
The average running time for that job is 5 or 6 minutes, but when it hangs it can stay running for days if I don't stop it. :(
WHAT I HAVE DONE
Set delayValidation : True
Improved my queries
No transactions running
No locking or blocking (I guess)
Rebuild and organize index
Ran DBCC FREEPROCCACHE
Ran DBCC FREESESSIONCACHE
ETC.....
My settings:
Recovery Mode Simple
SSIS OLE DB Destination
1-Keep Identity (checked)
2-Keep Nulls (checked)
3-Table lock (checked)
4-Check constraints (unchecked)
rows per batch (blank)
Maximum insert commit size(2147483647)
Note:
I have another job running a (small) SSIS package on the same instance but against different databases, and when the main ETL mentioned above hangs, this small one sometimes hangs too. That is why I think the problem is with the instance (I guess).
I'm open to providing more information as needed.
Any assistance or help would be really appreciated!
As Jeroen Mostert said, it's showing CXPACKET, which means that it's executing some work in parallel.
It's also showing ASYNC_NETWORK_IO, which means it's also transferring data over the network.
There could be many reasons. Just a few more hints:
- Have you checked if the network connection is slow?
- What is the size of the data being transferred vs the speed of the network?
- Is there an antivirus running that could slow the data transfer?
My guess is that there is lots of data to transfer and that it's taking a long time. I would think either I/O or network, but since ASYNC_NETWORK_IO takes most of the cumulative wait time, I would go for network.
As @Jeroen Mostert and @Danielle Paquette-Harvey said, by right-clicking in the Activity Monitor I could figure out that I had an object that was executing in parallel (for some reason left over from the past). To fix the problem I removed the parallel structure and put everything to run in one batch.
Now it is working like a charm!!
Before:
After:
I work with an environment that uses Merge Replication to publish a dozen publications to half a dozen subscribers every 10 minutes. When certain jobs are running simultaneously, deadlocks and blocking are encountered and the replication process is not efficient.
I want to create a SQL Server Agent Job that runs a group of Merge Replication Jobs in a particular order waiting for one to finish before the next starts.
I created an SSIS package that starts the jobs in sequence, but it uses sp_start_job, and when run it immediately starts all the jobs, so they are running together again.
A side purpose is to be able to disable replication to a particular server in one place, instead of individually disabling a dozen jobs, or to temporarily disable replication completely without 70+ individual disablings.
Right now, if I disable a Merge Replication job, the SSIS package will still start and run it anyway.
I have now tried creating an SSIS package for each Replication Job and then creating a SQL Server Agent job that calls these packages in sequence. That job takes 8 seconds to finish, while each of the individual packages it is calling (starting a replication job) takes at least a minute to finish. In other words, that doesn't work either.
The SQL Server Agent knows when a Replication job finishes! Why doesn't an SSIS package or job step know? What is the point of having a control flow if it doesn't work?
Inserting waits is useless. The individual jobs can take anywhere from 1 second to an hour depending on what needs replicating.
Maybe I don't see the real problem, but naturally you need a synchronization point, and there are many ways to create one.
For example, you could still start the jobs simultaneously but let the first job lock a resource that the second one needs, so the second waits until the resource is unlocked. Or the second job can poll a log table in a loop (waiting "a minute" between checks and cancelling itself after "an hour")...
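The polling idea could be sketched like this, in Python rather than T-SQL or SSIS. `start_job` and `is_running` are hypothetical callables supplied by the caller (for instance, wrappers around sp_start_job and a lookup of the job's status or a log table); nothing here is an existing SQL Server Agent API.

```python
import time

def wait_until_finished(is_running, poll_seconds=60, timeout_seconds=3600,
                        sleep=time.sleep, clock=time.monotonic):
    """Poll is_running() until it returns False; return False if the timeout passes first."""
    deadline = clock() + timeout_seconds
    while is_running():
        if clock() >= deadline:
            return False  # give up ("self cancel") after the timeout
        sleep(poll_seconds)
    return True

def run_in_sequence(jobs, start_job, is_running, **wait_options):
    """Start each job only after the previous one has finished.

    Stops at the first job that times out and returns the jobs that completed.
    """
    completed = []
    for job in jobs:
        start_job(job)
        if not wait_until_finished(lambda: is_running(job), **wait_options):
            break
        completed.append(job)
    return completed
```

Because the wait is driven by the job's actual state rather than a fixed delay, it copes with jobs that take anywhere from a second to an hour. A disabled job could be skipped by filtering `jobs` before the loop.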
I'm working with an SSIS package that itself calls multiple SSIS packages and hangs periodically during execution.
This is a once-a-day package that runs every evening and collects new and changed records from our census databases and migrates them into the staging tables of our data warehouse. Each dimension has its own package that we call through this package.
So, the package looks like
Get current change version
Load last change version
Identify changed values
a-z - Move changed records to staging tables (Separate packages)
Save change version for future use
All of those are Execute SQL tasks except for the moving-records step, which consists of twenty-some Execute Package tasks (data move tasks) that are executed somewhat in parallel (max four at a time).
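The "max four at a time" pattern can be illustrated with a small Python sketch. This is not SSIS: the package names and the sleep standing in for a data move are made up, and the peak counter is only there to show that the cap holds.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

def run_bounded(tasks, max_parallel=4):
    """Run callables with at most max_parallel in flight; return (results, peak concurrency)."""
    lock = threading.Lock()
    state = {"running": 0, "peak": 0}

    def tracked(task):
        # Count how many tasks are running at once so the cap can be observed.
        with lock:
            state["running"] += 1
            state["peak"] = max(state["peak"], state["running"])
        try:
            return task()
        finally:
            with lock:
                state["running"] -= 1

    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        results = list(pool.map(tracked, tasks))  # results keep the input order
    return results, state["peak"]

def make_move_task(name):
    """Hypothetical stand-in for one 'move changed records' child package."""
    def task():
        time.sleep(0.01)  # pretend to move data
        return f"{name}: moved"
    return task

packages = [f"Dim{letter}" for letter in "ABCDEFGHIJKLMNOPQRSTUVWXYZ"]
results, peak = run_bounded([make_move_task(p) for p in packages], max_parallel=4)
print(len(results), "packages finished; peak concurrency:", peak)
```

Calling `run_bounded` with `max_parallel=1` gives the serial run, which is a cheap way to compare the two modes before committing to either.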
The strange part is that it almost always fails when executed by the SQL Server Agent (using a proxy user) or dtexec, but never fails when I run the package through Visual Studio. I've added logging so that I can see where it stops, but the stopping point is inconsistent.
We didn't see any of this while working in our development / training environments, but the volume of data is considerably smaller. I wonder if we're just doing too much at once.
I may, as a test, execute the tasks serially through the SQL Server Agent to see if the problem is a package calling a package, but I'd rather not do this because we have a relatively short window in the evening to do this for seven database servers.
I'm slightly new to SSIS, so any advice would be appreciated.
Justin