SSIS - Best Practices When SSIS Hangs - sql-server

Last night my SSIS hung. I'm not really sure why. This morning I identified which package hung based on the output. I'm looking at sp_who2, but I can't see any processes that are running under the user that runs the jobs.
I'm wondering what I should be doing when my SSIS just hangs. It's still currently running, but doesn't seem to be running anything.

Start by deploying the package to SSISDB and running it from there. If you haven't already created the SSISDB catalog, you'll need to do that first. Then enable logging in the package and review the results, specifically the phases in which the package is hanging. Look for the PipelineComponentTime event, which specifies how long each component took in a particular phase. Two phases that may be of interest are ProcessInput, where incoming records are processed, and PrimeOutput, where the data is placed into buffers and sent further down the data flow. An overview of enabling logging is as follows.
Right Click anywhere on the Control Flow and press Logging...
Check the check-box next to the package in the Containers field to enable logging.
Choose where you want the logging records to be stored using the Provider Type field. If you use the SSIS log provider for SQL Server, the sysssislog table will be created in the database that's set as the Initial Catalog in the OLE DB Connection Manager that is used. On the Details tab, select the events that you will log. After selecting these, click the Advanced>> button to set the fields that will be logged.
Next, check whichever components you want to enable logging for. You'll want to do this for any components that you either suspect or have confirmed are encountering delays. If any Data Flow Tasks have logging enabled, the PipelineComponentTime event mentioned earlier will be available on the Details tab for these.
To monitor the package from the SSIS catalog, use the SSISDB views. When doing this, make sure that the Logging Level is set to at least Basic when the package is executed. There are several ways to do this, with the easiest probably being the GUI. Before executing the package from the Execute Package window in SSISDB, you can find the Logging Level field on the Advanced tab.
If the package is deployed to SSISDB and run as a job in SQL Agent, the logging level can be set from the job step. Open the Job Step Properties window, go to the Configuration tab and then the Advanced tab, where you'll see the Logging Level field.
There are many views in SSISDB that hold details regarding package configuration and execution. catalog.event_messages, catalog.executable_statistics, and catalog.operation_messages are a few that will be helpful. For more insight into the components and where the delays are occurring I'd recommend catalog.execution_component_phases, which requires a logging level of either Performance or Verbose.
If performance varies depending on which parameters are used within the package, take the execution_id of the slower executions and query the catalog.execution_parameter_values view to see what the parameters were set to in those executions.
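As a rough illustration of how to read the phase data, here is a minimal Python sketch that totals the time spent per component and phase and ranks the slowest first. The sample rows and component names are made up for the example. On a real server the rows would come from catalog.execution_component_phases for a single execution_id, which this sketch does not connect to.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical rows shaped like (subcomponent_name, phase, start_time, end_time).
# These are illustrative values, not real log output.
rows = [
    ("OLE DB Source", "PrimeOutput",
     datetime(2024, 1, 1, 1, 0, 0), datetime(2024, 1, 1, 1, 0, 40)),
    ("Lookup", "ProcessInput",
     datetime(2024, 1, 1, 1, 0, 5), datetime(2024, 1, 1, 1, 0, 25)),
    ("Lookup", "ProcessInput",
     datetime(2024, 1, 1, 1, 0, 25), datetime(2024, 1, 1, 1, 0, 50)),
    ("OLE DB Destination", "ProcessInput",
     datetime(2024, 1, 1, 1, 0, 10), datetime(2024, 1, 1, 1, 5, 10)),
]


def slowest_phases(phase_rows):
    """Sum seconds per (component, phase) and sort by total, slowest first."""
    totals = defaultdict(float)
    for component, phase, start, end in phase_rows:
        # A component can log the same phase many times, so accumulate.
        totals[(component, phase)] += (end - start).total_seconds()
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


ranking = slowest_phases(rows)
for (component, phase), seconds in ranking:
    print(f"{component:20} {phase:13} {seconds:8.1f} s")
```

The component at the top of the ranking is the first place to look when a data flow appears to hang.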

Related

How will I know specifically when the tasks in my packages have run successfully in SSIS, and how can I show that in a log table in SQL?

Let's say there is a master package and several tasks run in it on a daily basis. I want to determine specifically when those tasks have finished, such as table load completed and cube load completed. These steps run daily, but I have to show in a SQL table that on this particular day the table load started at this time and ended at this time, etc.
SSIS event handlers are the simplest means of turning an SSIS script
into a reliable system that is auditable, reacts appropriately to
error conditions, reports progress and allows instrumentation and
monitoring your SSIS packages. They are easy to implement, and provide
a great deal of flexibility. Rob Sheldon once again provides the easy,
clear introduction.
You can use an OnPostExecute event handler to record when a task finishes running.
Or you can use containers and then use precedence constraints with success and failure.

SSIS - what is being processed right now?

For a report, I need to know which sub-components (parts of the control flow) of an SSIS package are being processed right now. I know about the catalog view catalog.executable_statistics, but it seems a record is added there only after the execution of a given execution path is finished.
Is there a way to check what execution paths already entered pre-execute phase and not yet entered the post-execute phase? In other words, what is the package working on right now?
We use SQL Server 2016 Enterprise edition.
Edit:
I prefer a solution that would work with the default logging level.
One option is querying catalog.execution_component_phases, which will display the most recently run execution phase of each sub-component within a Data Flow Task while the package is executing. This will let you see which component has started a phase such as PreExecute but hasn't yet begun a subsequent one like PostExecute or ReleaseConnections. To use this, you'll need to set the logging level to either Performance or Verbose.
As far as I know there isn't any logging out of the box that will tell you the exact state of all subcomponents in the package when executing it from SQL Server.
I simply use an Execute SQL Task at the start of some steps that inserts a row and, when the step is done, updates it with specifics like start/end datetime, package name and number of rows processed. You could add a column that specifies the subcomponents affected in this way.
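The insert-then-update pattern can be sketched roughly as follows. This is only an illustration in Python, using an in-memory SQLite table as a stand-in for the audit table so it runs anywhere. The table name step_audit, its columns, and the package and step names are all hypothetical. In the package itself, the two functions would correspond to Execute SQL Tasks at the start and end of a step.

```python
import sqlite3
from datetime import datetime

conn = sqlite3.connect(":memory:")
# Hypothetical audit table; the name and columns are illustrative only.
conn.execute(
    """CREATE TABLE step_audit (
           audit_id       INTEGER PRIMARY KEY,
           package_name   TEXT NOT NULL,
           step_name      TEXT NOT NULL,
           start_time     TEXT NOT NULL,
           end_time       TEXT,
           rows_processed INTEGER
       )"""
)


def start_step(package_name, step_name, now):
    """Insert a row when a step begins and return its id."""
    cur = conn.execute(
        "INSERT INTO step_audit (package_name, step_name, start_time) "
        "VALUES (?, ?, ?)",
        (package_name, step_name, now.isoformat(sep=" ")),
    )
    return cur.lastrowid


def end_step(audit_id, rows_processed, now):
    """Update the same row when the step has finished."""
    conn.execute(
        "UPDATE step_audit SET end_time = ?, rows_processed = ? "
        "WHERE audit_id = ?",
        (now.isoformat(sep=" "), rows_processed, audit_id),
    )


def running_steps():
    """Steps that have started but not yet finished."""
    query = ("SELECT step_name FROM step_audit "
             "WHERE end_time IS NULL ORDER BY audit_id")
    return [row[0] for row in conn.execute(query)]


table_load = start_step("Master.dtsx", "Table load", datetime(2024, 1, 1, 1, 0))
end_step(table_load, 1200, datetime(2024, 1, 1, 1, 30))
cube_load = start_step("Master.dtsx", "Cube load", datetime(2024, 1, 1, 1, 30))
in_progress = running_steps()
print(in_progress)
```

Because a row exists from the moment a step starts, filtering on a missing end time shows what the package is working on right now, independent of the catalog logging level.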

Finding bottlenecks of ETL and Cube processing

I have ETL and cube solutions, which I process one after another in a SQL Agent job.
In the ETL I run one package, which in turn runs all the other packages one by one.
The whole process takes 10 hours.
For ETL:
How can I find out how long each package takes to run within that one parent package, other than opening the solution and recording the times?
For cube:
Here the dimensions process fast. What do I measure in order to find which part takes so long? Maybe measures? How do I track the processing time of a particular measure?
Maybe SQL Profiler will help? If so, is there a good article that describes which metrics I should pay attention to?
To gather statistics about SSIS execution times, you can enable logging:
For the package deployment model, you'll have to turn on logging in each package: go to SSIS > Logging. In the dialog, choose the OnPreExecute and OnPostExecute events. Use the SQL Server log provider, which logs to a system table called dbo.sysssislog. You'll need to join the pre- and post-execute events on execution id.
For the project deployment model, logging is probably already on. This can be configured in SSMS: under Integration Services > SSISDB, right-click and choose Properties. Once you've executed the package, you can see the results in the standard reports. Right-click the master package and choose Reports > Standard Reports > All Executions.
Lots more details on SSIS logging here: https://learn.microsoft.com/en-us/sql/integration-services/performance/integration-services-ssis-logging
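The join of pre- and post-execute events mentioned for the package deployment model could look something like the sketch below. It is written in Python against an in-memory SQLite table standing in for dbo.sysssislog, with only the columns it needs, so treat it as an illustration of the join rather than a query to paste into SQL Server. The sample rows and task names are invented.

```python
import sqlite3
from datetime import datetime

# In-memory SQLite stand-in for dbo.sysssislog (subset of its columns).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sysssislog "
    "(event TEXT, source TEXT, sourceid TEXT, executionid TEXT, starttime TEXT)"
)
# Hypothetical sample rows, not real log output.
conn.executemany(
    "INSERT INTO sysssislog VALUES (?, ?, ?, ?, ?)",
    [
        ("OnPreExecute", "Load Customers", "task-1", "exec-A", "2024-01-01 01:00:00"),
        ("OnPostExecute", "Load Customers", "task-1", "exec-A", "2024-01-01 01:12:30"),
        ("OnPreExecute", "Load Orders", "task-2", "exec-A", "2024-01-01 01:12:31"),
        ("OnPostExecute", "Load Orders", "task-2", "exec-A", "2024-01-01 03:40:31"),
    ],
)

# Pair each OnPreExecute row with the OnPostExecute row of the same task
# (sourceid) within the same execution (executionid).
PAIR_SQL = """
SELECT pre.source, pre.starttime, post.starttime
FROM sysssislog AS pre
JOIN sysssislog AS post
  ON post.executionid = pre.executionid
 AND post.sourceid = pre.sourceid
 AND post.event = 'OnPostExecute'
WHERE pre.event = 'OnPreExecute'
ORDER BY pre.starttime
"""


def task_durations(connection):
    """Return (task name, seconds between pre- and post-execute) pairs."""
    fmt = "%Y-%m-%d %H:%M:%S"
    result = []
    for source, started, finished in connection.execute(PAIR_SQL):
        elapsed = datetime.strptime(finished, fmt) - datetime.strptime(started, fmt)
        result.append((source, elapsed.total_seconds()))
    return result


durations = task_durations(conn)
for name, seconds in durations:
    print(f"{name:15} {seconds / 60:7.1f} min")
```

One caveat with this simple join: a task that runs more than once in the same execution (inside a loop, for example) logs several pre/post pairs with the same ids, so the rows would need an extra ordering step to be matched up correctly.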
For SSAS, I always tested this manually. Connect in SSMS, right click on each Measure group and do a process full (this assumes the dimensions have just been freshly processed.) The measures are more likely to be the cause of an issue because of the amount of data.
Once you understand which measure is slow, you can look at tuning the source query, if it has any complexity to it, or partitioning the measure group and doing incremental loading. Full processing could be scheduled periodically.

What is the structural behaviour of SSIS in its Pre-execute phase of package execution? Why do heavy packages hang in the Pre-Execute phase?

The poor production database schema forces me to use multiple left joins, left joins with a range condition in the ON clause, and joins on character columns.
Even though my package is free from errors, it hangs in the Pre-execute phase.
I have read many articles online that explain how to prevent this, with settings such as delay validation, external metadata, etc.
So please help me understand what the data flow, the control flow and the SQL Server engine each do in the Pre-execute phase.
It's always a good idea to enable logging in your SSIS packages. Without logging, it can be hard to determine exactly what SSIS did, especially if your packages are executed overnight by a task scheduler!
To enable it, right-click on the control flow and select Logging... from the menu. A dialog will open. You can use this screen to configure where the data is logged (Windows Event Log, SQL Server, text file, etc.) and what is logged.
I would recommend you log everything. The output can be quite verbose. I'm afraid reading the SSIS logs is an acquired skill. There is a lot of detail, which you need, but that can make it hard to find the exact row(s) you are interested in. There is no shortcut here, you will need to roll your sleeves up and get stuck in.
Packages that contain a lot of connections can take a while to get through pre-execution. I've noticed packages with lots of connections to the file system are especially slow.
EDIT
I've just noticed I didn't actually answer the OP's question. So here goes...
Although SSIS appears to be frozen/hung it probably isn't. Log everything and review to fix. You can view the log in-flight, which helps.

Implications of using waitfor delay task in ssis package on scheduled server

I have a question regarding the implications of using WAITFOR DELAY in an Execute SQL Task in an SSIS package. Here's what's going on: I have source data tables that, due to the amount of data, the linked server connection, yada yada, are dropped and created every night. Before the package that utilizes this data runs, I have a For Loop container. In this container I have an Execute SQL Task that checks whether my source tables exist. If they do not, it sends me an email via an email task, then goes to an Execute SQL Task that has a WAITFOR DELAY of 30 minutes (before looping and checking for the source tables again). Now, I thought I was pretty slick with this design, but others on my team are concerned because they do not know enough about this WAITFOR task. They are concerned that my package could possibly interfere with theirs, slow down the server, use resources, etc.
From my google searches I didn't see anything that actually seemed like it would cause issues. Can anyone here speak to the implications of using this task?
SQL WAITFOR is ideal for this requirement IMO. I've been using it in production SSIS packages for years with no issues. You can monitor it via SSMS Activity Monitor and see that it doesn't consume any resources.
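For anyone on the team unsure what the loop in the question actually does, its control flow amounts to a plain poll-and-wait, sketched here in Python. The function and parameter names are hypothetical, and max_checks is an addition of mine that the original design does not have. The check, the email and the delay are passed in as stand-ins for the Execute SQL Task, the email task and the WAITFOR DELAY step.

```python
import time


def wait_for_tables(tables_exist, notify, delay_seconds=30 * 60,
                    max_checks=None, sleep=time.sleep):
    """Poll until tables_exist() returns True.

    Returns True once the tables are found, or False after max_checks
    failed checks (max_checks=None means keep trying indefinitely).
    """
    failed_checks = 0
    while not tables_exist():
        failed_checks += 1
        notify(f"Source tables missing (check {failed_checks})")
        if max_checks is not None and failed_checks >= max_checks:
            return False
        # Idle wait between checks, playing the role of WAITFOR DELAY '00:30:00'.
        sleep(delay_seconds)
    return True


# Demo with stand-ins so nothing really sleeps or sends email:
# the tables are "missing" twice and then appear.
answers = iter([False, False, True])
emails, sleeps = [], []
found = wait_for_tables(lambda: next(answers), emails.append, sleep=sleeps.append)
print(found, len(emails), sleeps)
```

Between checks the loop does nothing but wait, which is why the design is cheap. A cap on the number of checks is worth considering so that a night when the tables never arrive does not leave the package waiting forever.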
