In a package I have two loop containers that run fine one after the other. Each has its own variable to iterate over, and each loads a different set of Excel files into the same table. As far as I can tell there is no overlap between the two containers, so I thought I would speed things up by running them in parallel.
When starting the package (manually in SSIS), however, the containers look like they start executing, but after a few seconds the entire package shows as complete without any errors, and neither loop container nor any of the subsequent tasks did anything.
The package log only shows validation completed for each of the loop containers.
Is there some switch somewhere to make two loop containers play nicely?
Here is what it looks like:
Place the two loops and their corresponding script tasks (via precedence constraints) in a sequence container. Connect the Create Table script task to the sequence container. Then connect the sequence container to D Product Family data flow.
Note: disabling a task won't affect operation as SSIS will just skip over the disabled task(s) and go to the next one until all tasks have been completed.
I am trying to run parallel processes to read Excel files into an OLE DB Destination. However, at runtime SSIS doesn't show any errors; it simply stops and states:
"Package Execution completed. Click here to switch to design mode, or select Stop Debugging from the Debug Menu".
No rows have been inserted with the parallel processes and I can't find the root cause of this 'completion' in the messages list. I've provided a screenshot as an example:
The MaxConcurrentExecutables property is set to 5, the Run64BitRuntime property is set to True (False didn't change anything), and the EngineThreads property is set to 1.
Could anyone help on this problem?
SSIS cannot read the same file from two data flows simultaneously, so yes, you are running into a locking issue.
The solution is to use one data connection and one data flow. In the data flow, read from the file, then add a Multicast transformation, which lets you duplicate the data stream as many times as you want. From there, merge the work that currently happens in the two data flows into that single one.
The net effect is that you will have one data flow: one data source, one Multicast, two data pipelines where you can do your transformations, and two data destinations.
I'm not 100% sure if this is true, but I think I know the reason why it fails.
The reason why it suddenly 'stops' executing could be that once SSIS reads from an Excel file to import data, it locks the file. The second Data Flow Task then cannot open or access the file, since it is already opened by the first Data Flow Task. See image below.
If someone could confirm this, it would be greatly appreciated!
I have to load data from multiple files into a table through a Foreach Loop Container in SSIS. If any one of the files errors out, the package stops execution.
Now I have to move the errored file to a different path and continue processing the remaining files.
Any suggestions?
Look at the properties of the containers and tasks. There are settings that determine how you want to handle errors: they can be ignored, or they can stop execution of the package.
You can also look at constraints to use different paths depending on success or failure.
Lastly you can look at error handling via events.
Between those three topics you should be able to do whatever you want. There are plenty of blogs, FAQs, and examples available online.
When you connect tasks with precedence constraint arrows, you can right-click the arrow you dragged and it will give you different options. One of the options on the arrow lets the flow continue even on error.
I'm looking for ideas on how to automatically track the job that calls the package.
We have some generic packages that are called from different jobs; each job passes in a different file path as a parameter and therefore processes very different file sizes depending on the path.
In the package I have some custom auditing setup which basically tracks the package start time and end time, and therefore the duration of execution. I want to be able to also track the job that called the package so if the package is running long, I can determine which job called it.
Also note that I would prefer this to be automatic, possibly using some sort of system variable, so that human error is not an issue. I also want these auditing tasks built into all of our packages as a template, so I would prefer not to use a user variable either, as different packages may use different variables.
Just looking for some ideas - appreciate any input
We use parent and child packages instead of different jobs calling the same package. You could send the information about which parent called it to the child package and then have the child package record that data to a table along with the start and end dates.
Our solution has a whole metadata database that records all the details through logging of each step. The parent tells the child which configuration to use and logs details against that configuration. The jobs call the parent package, never the child package (which doesn't have a configuration in the config table, as it is always configured through variables sent in by the parent package). No human intervention is needed, except for initial development or research when a failure occurs.
Edit for existing jobs.
Consider that jobs can have multiple steps. Make the first step a SQL script that inserts the auditing information into a table, including the start time of the package, the name of the job that called it, and the name of the SSIS package being called. Then the second step calls the SSIS package, and make the last step a SQL script that inserts the same data, only with the end datetime.
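As a rough sketch of that idea (the audit table, job name, and package name below are illustrative assumptions, not an existing schema), the bracketing job steps could look like this:

```sql
-- Hypothetical audit table; adjust names and columns to your own metadata schema.
CREATE TABLE dbo.PackageAudit (
    AuditId     INT IDENTITY(1,1) PRIMARY KEY,
    JobName     SYSNAME       NOT NULL,
    PackageName NVARCHAR(260) NOT NULL,
    StartTime   DATETIME2     NULL,
    EndTime     DATETIME2     NULL
);

-- Job step 1 (T-SQL): record the start. Each job hard-codes its own name here,
-- so the audit row tells you exactly which job invoked the shared package.
INSERT INTO dbo.PackageAudit (JobName, PackageName, StartTime)
VALUES (N'Load EMEA Files', N'LoadExcelFiles.dtsx', SYSDATETIME());

-- Job step 2: run the SSIS package.

-- Final job step (T-SQL): close out the open audit row with the end time.
UPDATE dbo.PackageAudit
SET EndTime = SYSDATETIME()
WHERE JobName = N'Load EMEA Files'
  AND PackageName = N'LoadExcelFiles.dtsx'
  AND EndTime IS NULL;
```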
A simple way to do this is to set up a string variable on your SSIS package. Set the value of the variable to @[System::ParentContainerGUID] using an expression when the package starts. SQL Agent won't set the value, so when the package is run as an individual job it will be an empty string; but if it is called by another package it will contain the GUID of the calling package. You can test for that value and use a precedence constraint to control the program logic.
We have packages that run as a part of a big program but sometimes we need to run them individually. Each package has an email on failure task but we only want that to execute when the package is run individually. When it is part of the big run we collect the names of all packages that error and send them as one email from the master package. We don't want individual emails and a summary email going out on the same run.
I have one SSIS Package that must run as Proxy A and another that must run as Proxy B. I would love to have the first package run, and, as one of its tasks, execute the second package. Is this possible?
Thanks a lot!
You could have the first package use sp_start_job to kick off a job that is set up to run the second package. If this is "fire-and-forget", that's all you need to do. If you need to wait until it's completed, things get messier - you'd have to loop around calling (and parsing the output of) sp_help_jobactivity and use WAITFOR DELAY until the run completes.
This is also more complex if you need to determine the actual outcome of running the second package.
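If you do need to wait, a minimal sketch of the polling loop might look like the following. It queries msdb.dbo.sysjobactivity directly rather than parsing sp_help_jobactivity output, and the job name 'Run Package B' is just a placeholder:

```sql
-- Start the Agent job that runs the second package (sp_start_job returns immediately).
EXEC msdb.dbo.sp_start_job @job_name = N'Run Package B';

-- Poll until the job's activity row for the current Agent session reports a stop time.
WHILE EXISTS (
    SELECT 1
    FROM msdb.dbo.sysjobactivity AS ja
    JOIN msdb.dbo.sysjobs AS j
        ON j.job_id = ja.job_id
    WHERE j.name = N'Run Package B'
      AND ja.session_id = (SELECT MAX(session_id) FROM msdb.dbo.syssessions)
      AND ja.start_execution_date IS NOT NULL
      AND ja.stop_execution_date IS NULL    -- NULL stop time means it is still running
)
BEGIN
    WAITFOR DELAY '00:00:10';               -- check again in ten seconds
END
```

Determining whether the second package actually succeeded would then mean checking msdb.dbo.sysjobhistory for that run, which is the messier part mentioned above.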
I have a polling service that checks a directory for new files; if there is a new file, I call SSIS.
There are instances where I can't have SSIS run if another instance of SSIS is already processing another file.
How can I make SSIS run sequentially during these situations?
Note: SSIS packages running in parallel is fine in some circumstances, but in others it is not. How can I achieve both?
Note: I don't want to go into WHEN/WHY it can't run in parallel at times; just assume sometimes it can and sometimes it can't. The main idea is: how can I prevent an SSIS call IF it has to run in sequence?
If you want to control the flow sequentially, think of a design where you can enqueue requests (for invoking SSIS) into a queue data structure. At any time, only the request at the head of the queue is processed; as soon as that request completes, the next request can be dequeued.
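A minimal sketch of that idea in SQL Server, assuming a hypothetical dbo.SsisRequestQueue table and a worker process that launches the package (all names here are illustrative):

```sql
-- Hypothetical queue table the polling service writes to instead of calling SSIS directly.
CREATE TABLE dbo.SsisRequestQueue (
    RequestId   INT IDENTITY(1,1) PRIMARY KEY,
    FilePath    NVARCHAR(400) NOT NULL,
    EnqueuedAt  DATETIME2     NOT NULL DEFAULT SYSUTCDATETIME(),
    StartedAt   DATETIME2     NULL,
    CompletedAt DATETIME2     NULL
);

-- Enqueue: the polling service inserts one row per detected file.
INSERT INTO dbo.SsisRequestQueue (FilePath)
VALUES (N'\\share\incoming\newfile.xlsx');

-- Dequeue: a worker claims the oldest unstarted request, one at a time.
-- UPDLOCK/READPAST keep concurrent workers from grabbing the same row.
WITH next_request AS (
    SELECT TOP (1) RequestId, FilePath, StartedAt
    FROM dbo.SsisRequestQueue WITH (UPDLOCK, READPAST)
    WHERE StartedAt IS NULL
    ORDER BY RequestId
)
UPDATE next_request
SET StartedAt = SYSUTCDATETIME()
OUTPUT inserted.RequestId, inserted.FilePath;

-- The worker then runs the package for the returned FilePath (for example through
-- an Agent job or the SSISDB catalog procedures) and stamps CompletedAt when done.
```

When parallel runs are acceptable, several workers can dequeue at once; when they are not, keep a single worker so only one package runs at a time.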