SSIS - what is being processed right now? - sql-server

For a report, I need to know what sub-components (parts of the control flow) of an SSIS package are being processed right now. I know about the catalog view catalog.executable_statistics, but it seems a record is only added there after the execution of a given execution path has finished.
Is there a way to check which execution paths have already entered the pre-execute phase but not yet entered the post-execute phase? In other words, what is the package working on right now?
We use SQL Server 2016 Enterprise edition.
Edit:
I prefer a solution that would work with the default logging level.

One option is querying catalog.execution_component_phases, which displays the most recently run execution phase of each sub-component within a Data Flow Task while the package is executing. This lets you see which components have started a phase such as PreExecute but haven't yet begun a subsequent one like PostExecute or ReleaseConnections. Note that to use this, the logging level must be set to either Performance or Verbose, so it won't work with the default (Basic) level.
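As a rough sketch (assuming you have the execution_id of the running instance from catalog.executions), a query like the following would show the phases that have started but not yet finished, i.e. what the data flow components are working on right now:

    -- Sketch: phases with a start_time but no end_time yet.
    -- @execution_id is a placeholder for the running instance's ID.
    DECLARE @execution_id BIGINT = 12345;

    SELECT package_name,
           task_name,
           subcomponent_name,
           phase,
           start_time
    FROM   SSISDB.catalog.execution_component_phases
    WHERE  execution_id = @execution_id
           AND end_time IS NULL
    ORDER  BY start_time;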

As far as I know, there isn't any out-of-the-box logging that will tell you the exact state of all sub-components in the package when executing it from SQL Server.
I simply use a SQL task at the start of some steps that inserts a row, and when done, updates it with specifics like start/end datetime, package name and number of rows processed. You could add a column that specifies the sub-components affected in the same way.
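A minimal sketch of that pattern (the table and column names here are placeholders, not anything SSIS provides out of the box):

    -- Hypothetical audit table.
    CREATE TABLE dbo.PackageStepLog (
        LogId         INT IDENTITY(1,1) PRIMARY KEY,
        PackageName   NVARCHAR(200) NOT NULL,
        StepName      NVARCHAR(200) NOT NULL,   -- sub-component affected
        StartDateTime DATETIME2 NOT NULL DEFAULT SYSDATETIME(),
        EndDateTime   DATETIME2 NULL,
        RowsProcessed INT NULL
    );

    -- Execute SQL Task at the start of a step ('?' parameters are
    -- mapped from SSIS variables on the task's Parameter Mapping page):
    INSERT INTO dbo.PackageStepLog (PackageName, StepName)
    VALUES (?, ?);

    -- Execute SQL Task when the step is done (LogId captured from the
    -- insert, e.g. via SCOPE_IDENTITY() into an SSIS variable):
    UPDATE dbo.PackageStepLog
    SET    EndDateTime   = SYSDATETIME(),
           RowsProcessed = ?
    WHERE  LogId = ?;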

Related

How will I know specifically when the tasks in my packages have run successfully in SSIS, and how can I show that in a log table in SQL?

Let's say there is a master package and several tasks run in it on a daily basis. I want to determine specifically when those tasks have finished, e.g. when the table load completed and the cube load completed. These steps run daily, but I have to show in a SQL table that on this particular day the table load started at this time and ended at that time, and so on.
SSIS event handlers are the simplest means of turning an SSIS script into a reliable system that is auditable, reacts appropriately to error conditions, reports progress and allows instrumentation and monitoring of your SSIS packages. They are easy to implement, and provide a great deal of flexibility. Rob Sheldon once again provides the easy, clear introduction.
You can use the OnPostExecute event handler, which fires when a task has finished running:
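For example, an Execute SQL Task placed inside the OnPostExecute event handler could write an audit row like this (dbo.TaskCompletionLog is a hypothetical table; the System:: variables are standard SSIS system variables mapped on the task's Parameter Mapping page):

    -- Statement for an Execute SQL Task in the OnPostExecute handler.
    -- Suggested parameter mapping (OLE DB '?' style):
    --   Parameter 0 -> @[System::PackageName]  (the package)
    --   Parameter 1 -> @[System::SourceName]   (the task that just ran)
    INSERT INTO dbo.TaskCompletionLog (PackageName, TaskName, CompletedAt)
    VALUES (?, ?, GETDATE());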
Or you can use containers and then precedence constraints with success and failure outcomes.

SSIS - Best Practices When SSIS Hangs

Last night my SSIS package hung. I'm not really sure why. This morning I identified which package hung based on the output. I'm looking at sp_who2, but I can't see any processes running under the user that runs the jobs.
I'm wondering what I should be doing when my SSIS package just hangs. It's still currently running, but doesn't seem to be doing anything.
Start off by deploying the package to SSISDB and running it from there. If you haven't already installed the SSISDB catalog, more information on this can be found here. After this, enable logging in the package and review the results, specifically the phases in which the package is hanging. When doing this, look for the PipelineComponentTime event, which specifies how long each component took in a particular phase. A couple of phases that may be of interest are the ProcessInput phase, which is where incoming records are processed, as well as the PrimeOutput phase, which is where the data is placed into buffers and sent further down the data flow. An overview of enabling logging is as follows.
Right-click anywhere on the Control Flow and select Logging...
Check the check-box next to the package in the Containers field to enable logging.
Choose where you want the logging records to be stored using the Provider Type field. If you use the SSIS log provider for SQL Server, the dbo.sysssislog table will be created in the database that's set as the Initial Catalog in the OLE DB Connection Manager that is used. In the Details pane, select the events that you want to log. After selecting these, click the Advanced >> button to set the fields that will be logged.
Next, check whichever components you want to enable logging for. You'll want to do this for any components that you either suspect or have confirmed are encountering delays. If any Data Flow Tasks have logging enabled, the PipelineComponentTime event mentioned earlier will be available in the Details window for these.
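Once the SQL Server log provider is in place, a query along these lines (a sketch against the dbo.sysssislog table mentioned above) pulls out the PipelineComponentTime entries so you can see where the time is going:

    -- Each PipelineComponentTime message reports how long a component
    -- spent in a phase such as Validate, PreExecute or ProcessInput.
    SELECT starttime,
           source,    -- the task that logged the event
           message    -- e.g. component "X" spent nnnn ms in ProcessInput
    FROM   dbo.sysssislog
    WHERE  event = 'PipelineComponentTime'
    ORDER  BY starttime;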
For monitoring the package from the SSIS catalog, use the SSISDB DMVs. When doing this, make sure that the Logging Level is set to at least Basic when the package is executed. There are several ways to do this, the easiest probably being from the GUI: before executing the package, on the Execute Package window in SSISDB, the Logging Level field can be found on the Advanced tab.
If the package is deployed to SSISDB and run as a job in SQL Agent, the logging level can be set from the job step. Open the Job Step Properties window, go to the Configuration tab and then the Advanced tab, where you'll see the Logging Level field.
There are many DMVs in SSISDB that hold details regarding package configuration and execution. catalog.event_messages, catalog.executable_statistics, and catalog.operation_messages are a few that will be helpful. For more insight into the components and where the delays are occurring, I'd recommend catalog.execution_component_phases, which requires a logging level of either Performance or Verbose.
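As an illustration, a sketch that ranks the executables of one execution by duration using catalog.executable_statistics (execution_duration is reported in milliseconds; the @execution_id value is a placeholder):

    DECLARE @execution_id BIGINT = 12345;

    SELECT   execution_path,
             start_time,
             end_time,
             execution_duration,   -- milliseconds
             execution_result      -- 0 = success, 1 = failure
    FROM     SSISDB.catalog.executable_statistics
    WHERE    execution_id = @execution_id
    ORDER BY execution_duration DESC;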
If the performance varies depending on what parameters are used within the package, use the execution_id from the instances of the slower executions to query the catalog.execution_parameter_values DMV and see what the parameters were set to in those executions.
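A sketch of that lookup (again, @execution_id is a placeholder for the ID of a slow run):

    DECLARE @execution_id BIGINT = 12345;

    SELECT parameter_name,
           parameter_value
    FROM   SSISDB.catalog.execution_parameter_values
    WHERE  execution_id = @execution_id
    ORDER  BY parameter_name;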

SSIS - Records not being picked up

I'm working with an SSIS package that I inherited. There's a Data Flow that finds all records with status 0 and inserts them into a separate table. The data flow uses a static query to pick up the new records.
I'm running into an issue where, when the package runs, it does not insert certain records into the destination table. It does, however, pick up many other records from the origin table.
To make things weirder, if I run this process from a job, there are always a few records (always the same ones) that don't get picked up. However, if I run the job manually, they do get picked up.
I've checked, and none of the records that aren't being picked up have nulls in any of the candidate keys. The error handling isn't called, so no error occurs. I can insert the records into the destination table manually, so it isn't a PK issue.
From the looks of it, these records are not seen by the SSIS package when it's run from a job, but they are when I run it manually. Has anyone seen this issue before?
You should check whether the Source task in the Data Flow is using a connection with an Expression. The value of the Expression changes your source, and it can be different when you run the package in debugging mode vs. when the package is run from the job. You can configure the expression to use parameters that can be set by the job.
You can easily check whether the Connection is using an Expression, because an fx marker appears before the connection name, as with the connection DBSource in the Connection Managers pane.
So we finally found the solution to the issue by testing on prod (kids, don't try this at home). It looks like the solution is simply to delete the SSIS package and redeploy it.
Has anyone else seen this issue where SSIS just runs funny and needs to be redeployed?

Finding bottlenecks of ETL and Cube processing

I have ETL and cube solutions, which I process one after another in a SQL Agent job.
In the ETL, I run one package, which in turn runs all the other packages one by one.
Whole processing takes 10 hours.
For ETL:
How can I find out how much time each package takes to run within that one parent package, other than opening the solution and recording the times myself?
For cube:
Here the dimensions process fast. What do I measure in order to find which part takes so long? Maybe the measures? How can I track the processing times of a particular measure?
Maybe SQL Profiler will help? If so, is there a good article that describes which metrics I should pay attention to?
To gather statistics about SSIS execution times, you can enable logging:
For the package deployment model, you'll have to turn on logging in each package: go to SSIS > Logging. In the dialogue, choose the Pre and Post Execute events. Use a SQL logging provider, which will log to a system table called dbo.sysssislog. You'll need to join the pre and post events on execution id.
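A sketch of that join against the default dbo.sysssislog table, pairing each OnPreExecute event with its OnPostExecute on executionid and sourceid to get a duration per task:

    SELECT   pre.source                                    AS task_name,
             pre.starttime                                 AS started,
             post.endtime                                  AS ended,
             DATEDIFF(SECOND, pre.starttime, post.endtime) AS duration_s
    FROM     dbo.sysssislog AS pre
    JOIN     dbo.sysssislog AS post
             ON  post.executionid = pre.executionid
             AND post.sourceid    = pre.sourceid
             AND post.event       = 'OnPostExecute'
    WHERE    pre.event = 'OnPreExecute'
    ORDER BY duration_s DESC;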
For the project deployment model, it's probably already on. This can be configured in SSMS: under Integration Services > SSISDB, right-click and choose Properties. Once you've executed the package, you can see the results in the standard reports. Right-click the master package and choose Reports > Standard Reports > All Executions.
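If you'd rather query than click through the reports, the same information sits in the SSISDB catalog views; a sketch for package-level durations:

    SELECT   execution_id,
             folder_name,
             project_name,
             package_name,
             start_time,
             end_time,
             DATEDIFF(MINUTE, start_time, end_time) AS duration_min
    FROM     SSISDB.catalog.executions
    ORDER BY start_time DESC;

Note that child packages run through an Execute Package Task don't get their own rows here; they appear under the parent's execution_id, so for per-child timings catalog.executable_statistics is the view to drill into.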
Lots more details on SSIS logging here: https://learn.microsoft.com/en-us/sql/integration-services/performance/integration-services-ssis-logging
For SSAS, I always tested this manually. Connect in SSMS, right-click on each measure group and do a Process Full (this assumes the dimensions have just been freshly processed). The measures are more likely to be the cause of an issue because of the amount of data.
Once you understand which measure is slow, you can look at tuning the source query, if it has any complexity to it, or partitioning the measure group and doing incremental loading. Full processing could be scheduled periodically.

Copy of database catalog from one instance to another using SSIS

I've been searching all over for an example of doing this, but I'm not finding one. I know it is possible because we had it working at one point, but the resource who developed the process isn't currently available to fix it, and the process is now corrupted beyond repair. In fact, it's corrupted so badly that we can't even get in to take a look at what was there in order to build the process over again.
What we have is a 'Production_DB' and a 'Test_DB', and the two are essentially the same. What was taking place is that an SSIS task fired at the end of each work day and refreshed 'Test_DB' with the data in 'Production_DB'. This way, testing can take place and changes can be made to the test bed without any concern that it will get too far afield of the live data, because each evening the data is brought back to exactly what is in production. Meanwhile, for testing purposes, everything is being measured against actual real-life data, so when processes are pointed at the production data set there is less chance of issues.
The problem is that several months back, without us realizing it, the SSIS package and source files became corrupted beyond readability. So now we are looking for a way to rebuild the package and restore the process, but as of yet I have not been able to find an example that I can build from.
We are on SQL Server 2008 R2.
If anyone has some references they can point me to it would be greatly appreciated!
Depending on the number of tables and the SQL Server version, you can use the Import/Export Wizard, identifying prod as the source and test as the destination. Use the wizard to create the task and save the resulting task (it should save as an SSIS package, I believe). This gives you a quick way of creating an SSIS package to copy the data over, and you can even overwrite the destination data if you would like.
Right-click the database > Tasks > Import Data