SSIS Package last run date in a variable - sql-server

My SSIS package has an execute SQL task which has a query that needs a datetime filter at runtime.
The value of this filter is supposed to be the last datetime in which the package ran successfully.
What is the standard/optimal methodology to retrieve, persist and use this lastrun datetime?

For that kind of thing, I have a "config" table in the database to store the value. Then this can be read and updated each time the package runs. You could also use a text file, but that is not as secure.
Edit:
I achieve this by invoking a SQL Task at the end of the Package that calls a stored procedure. This SP accepts a bit parameter indicating success (1) or failure (0). The SP uses GetDate() to record the time that the Proc ran (which is when the Package finishes).
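A minimal T-SQL sketch of this approach. The table dbo.PackageRunLog, the procedure dbo.usp_LogPackageRun and the package name are hypothetical names chosen for illustration; they are not from the answer above. The last Execute SQL Task in the package would call the procedure, and the first one would read the latest successful run into a variable.

```sql
-- Hypothetical log table: one row per package run.
CREATE TABLE dbo.PackageRunLog (
    PackageName nvarchar(255) NOT NULL,
    RunTime     datetime      NOT NULL,
    Succeeded   bit           NOT NULL
);
GO  -- batch separator for SSMS/sqlcmd

-- Called by the final Execute SQL Task; @Succeeded is 1 for success, 0 for failure.
CREATE PROCEDURE dbo.usp_LogPackageRun
    @PackageName nvarchar(255),
    @Succeeded   bit
AS
BEGIN
    SET NOCOUNT ON;
    -- GETDATE() records when the procedure ran, i.e. when the package finished.
    INSERT INTO dbo.PackageRunLog (PackageName, RunTime, Succeeded)
    VALUES (@PackageName, GETDATE(), @Succeeded);
END;
GO

-- Read at the start of the next run: the last successful run time (NULL if none yet).
SELECT MAX(RunTime) AS LastSuccessfulRun
FROM dbo.PackageRunLog
WHERE PackageName = N'MyPackage'
  AND Succeeded = 1;
```

Keeping one row per run rather than a single overwritten value also leaves a history of failures, which may or may not be wanted.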

As DeanOC posted, I always have a step in my package that stores this kind of stuff. It can be as simple as an insert-select of the current timestamp, or it may be the max of a timestamp column in the data I'm processing, so that on the next run I can filter by ...> StoredMaxTimestamp.
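The "max of a timestamp column" variant could look roughly like the sketch below. The tables dbo.SourceData and dbo.EtlWatermark and the column ModifiedAt are assumptions made up for the example. The upper bound is captured before the rows are read so that rows arriving during the run are picked up next time instead of being skipped.

```sql
-- Hypothetical source and watermark tables, for illustration only.
CREATE TABLE dbo.SourceData (Id int NOT NULL, ModifiedAt datetime NOT NULL);
CREATE TABLE dbo.EtlWatermark (
    PackageName        nvarchar(255) NOT NULL PRIMARY KEY,
    StoredMaxTimestamp datetime      NOT NULL
);
INSERT INTO dbo.EtlWatermark (PackageName, StoredMaxTimestamp)
VALUES (N'MyPackage', '1900-01-01');

-- Start of the run: read the stored watermark and fix the new upper bound.
DECLARE @StoredMaxTimestamp datetime =
    (SELECT StoredMaxTimestamp FROM dbo.EtlWatermark WHERE PackageName = N'MyPackage');
DECLARE @NewMaxTimestamp datetime =
    COALESCE((SELECT MAX(ModifiedAt) FROM dbo.SourceData), @StoredMaxTimestamp);

-- Rows to process in this run.
SELECT Id, ModifiedAt
FROM dbo.SourceData
WHERE ModifiedAt >  @StoredMaxTimestamp
  AND ModifiedAt <= @NewMaxTimestamp;

-- End of a successful run: advance the watermark.
UPDATE dbo.EtlWatermark
SET StoredMaxTimestamp = @NewMaxTimestamp
WHERE PackageName = N'MyPackage';
```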

Related

SSIS User Defined Date or Default

I'm fairly new to SSIS and don't know all its features or which tasks I can use to do the things I want. I have found many Google and stackoverflow.com results that helped me get to know variables and parameters, how to set them, etc.
BLUF (Bottom Line Up Front)
I have a view with data which I want to export to a file through a job that runs the package.
The data will be filtered by its LastUpdatedDate field, which has a datatype of datetimeoffset(7). The package should allow a user to run it with a specified date or use a value from another table (SSISJobRun).
Structure
/*Employee Table*/
Id int
Name varchar(255)
LastUpdatedDate datetimeoffset(7)
/*SSISJobRun Table*/
Id int
JobTypeId int
RunDate datetimeoffset(7)
What I have Done
Currently, I'm using SSDT for VS 2015 to create my SSIS packages and deploy them to my SSIS Catalog.
I have a project with 1 package. The package contains:
a Data Flow Task named EmployeeExport; this task contains an OLE DB Source and a Flat File Destination
a package level parameter named Filename_Export (this is so that the file path can be changed when it's run by a user; the parameter has a default value configured within the Job that runs it daily)
All this runs perfectly fine.
Problem
I have also set another package level parameter named LastUpdatedDate. The intent is to let whoever or whatever runs the package define a date. However, if the date is null (if I decide to use a string) or if the date is the default value 1899-12-30 00:00:00 (if I decide to use a date), I want to determine what date to use.
Specifically, if there is no real date supplied by the user, then I want the date to be the latest RunDate. For that case I use the following code:
SELECT TOP 1 RunDate
FROM SSISJobRun
WHERE JobTypeId = 1
ORDER BY RunDate DESC
I've tried many different ways. It works when I supply a date, but I couldn't get it to work when the date was blank (using a string parameter) or left at the default (using a date parameter).
Here are a few sources I've been looking through to figure out my issue:
How to pass SSIS variables in ODBC SQLCommand expression?
How do I pass system variable value to the SQL statement in Execute SQL task?
http://mindmajix.com/ssis/how-to-execute-stored-procedure-in-ssis-execute-sql-task
.. and many more.
One last note: this date will be used to run two tasks, so if there is a way to keep it global that would be great.
Lastly, I need the package to insert a row into the SSISJobRun table specifying when the task was run.
Thank you.
Use an Execute SQL Task and paste the following into the SQL statement:
SELECT TOP 1 RunDate
FROM SSISJobRun
WHERE JobTypeId = 1
ORDER BY RunDate DESC
Set the ResultSet property to Single row. Then, on the Result Set page, choose the variable you created and change the result name to 0.
The variable keeps this value for the rest of the execution, unless the same task runs a second time (for example inside a Foreach or For loop) or the variable is written to somewhere else in the package.
If you need to check, right-click the Execute SQL Task, choose Edit Breakpoints and set a breakpoint on the post-execute event. Then run the package, open a Watch window from the Debug menu, and drag and drop the variable into it; you should see the value.
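The fallback the question asks for (use the supplied date unless it is null or the 1899-12-30 default, otherwise use the latest RunDate) can be pushed into the query itself. The sketch below is plain T-SQL that runs on its own; @UserDate is a stand-in for the package parameter, and how it is mapped into the Execute SQL Task depends on the connection type. It also assumes that a user-supplied date is in UTC and that SSISJobRun.Id is an IDENTITY column; neither is stated in the question.

```sql
-- Stand-in for the package parameter (assumption: passed in as a datetime).
DECLARE @UserDate datetime = '1899-12-30T00:00:00';  -- the "no date supplied" default

DECLARE @EffectiveDate datetimeoffset(7);

SELECT @EffectiveDate =
    CASE
        -- No real date supplied: fall back to the latest recorded run.
        WHEN @UserDate IS NULL OR @UserDate = '1899-12-30T00:00:00'
            THEN (SELECT MAX(RunDate) FROM SSISJobRun WHERE JobTypeId = 1)
        -- Real date supplied: attach an offset (UTC is an assumption).
        ELSE TODATETIMEOFFSET(@UserDate, '+00:00')
    END;

-- Single-row result to map to the package variable used by both tasks.
SELECT @EffectiveDate AS EffectiveDate;

-- At the end of the package: record this run (assumes Id is an IDENTITY column).
INSERT INTO SSISJobRun (JobTypeId, RunDate)
VALUES (1, SYSDATETIMEOFFSET());
```

Storing the result in a package-scoped variable makes it available to every task in the package, which covers the "keep it global" requirement.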

Call sproc with temp tables from SSIS in visual studio 2012

I am new to SSIS and would appreciate any help on my issue below -
I have a stored procedure which does not take any input and returns a temp table. It is used for data validation and will be run every day, so I need to create an SSIS package for it (that is the requirement). I created an Execute SQL Task where I store the result set in a variable of type Object, but I want to add a condition that checks rowcount >1 and, if so, writes to an Excel file.
Thanks,
Hiral
Answer depends on whether you are using SSIS 2012+ or 2008.
Assuming you are with SSIS 2012+:
Approach: declare a variable of type int for the row count. Process the Object variable you received in a Foreach Loop (more details and screenshots are available in this article) and increment the variable inside the loop. Then you can make execution conditional on the row count variable being > 0.
Alternatives - valid for SSIS 2008 as well:
The Object holding the result set is nothing more than a .NET dataset. Create a Script Task which counts the rows in the first table of the dataset and stores the count in a variable. Then use this variable in the package as described above.
Instead of using an Execute SQL Task to run the SP, use a Data Flow with an OLE DB Source and specify the stored procedure as the data source. Then you can count the resulting rows, store the value in a variable, etc.
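If only the row count is needed for the condition, one further option (not from the answers above) is to let T-SQL do the counting and return a single int that maps directly to an SSIS variable. The procedure dbo.usp_ValidateData and its two columns are hypothetical; the table variable must match the columns your procedure actually returns.

```sql
-- Hypothetical validation procedure that builds its output in a temp table.
CREATE PROCEDURE dbo.usp_ValidateData
AS
BEGIN
    SET NOCOUNT ON;
    CREATE TABLE #Issues (Id int, Message varchar(255));
    INSERT INTO #Issues (Id, Message) VALUES (1, 'Missing name');
    SELECT Id, Message FROM #Issues;
END;
GO  -- batch separator for SSMS/sqlcmd

-- Capture the procedure's result set, then return only the row count.
DECLARE @Results TABLE (Id int, Message varchar(255));

INSERT INTO @Results (Id, Message)
EXEC dbo.usp_ValidateData;

-- Single-row result: map ValidationRowCount to an int package variable.
SELECT COUNT(*) AS ValidationRowCount FROM @Results;
```

A precedence constraint expression on that variable can then decide whether the Excel export runs.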
Set the RetainSameConnection property to True on the connection manager.
That way you can access a temp table created in a previous step.
You could use a global temp table instead if you want the creation done in a separate step. But this comes with its own issues like being accessible to everyone/every process.

Custom Logging to SQL Server table in SSIS

I have a SSIS package that has three DataFlowTasks.
1st dataflow loads data to destination table1
2nd dataflow loads data to destination table2
3rd dataflow loads data to destination table3
I configured the default logging to a SQL Server table (ssiserrorlog) on error,
but this only has startdate and enddate details; I want to log the details to a custom SQL Server error log table like the one below.
How can I do this? I am new to SSIS.
You can use the Row Count component in each data flow to get the number of rows loaded.
"Duration" is just the DateDiff between Start and End Date. You could even make it a computed column in your log table, if you're not content to just calculate it at query time.
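As a rough illustration of the computed-column idea, the sketch below defines a custom log table whose duration is derived from the start and end times. The table name, column names and sample values are invented for the example; in the package, the row count would come from the Row Count component's variable.

```sql
-- Hypothetical custom log table; DurationSeconds is computed, never inserted.
CREATE TABLE dbo.SSISCustomLog (
    LogId           int IDENTITY(1,1) PRIMARY KEY,
    PackageName     nvarchar(255) NOT NULL,
    TaskName        nvarchar(255) NOT NULL,
    RowsLoaded      int           NULL,
    StartTime       datetime2(3)  NOT NULL,
    EndTime         datetime2(3)  NOT NULL,
    DurationSeconds AS DATEDIFF(SECOND, StartTime, EndTime)
);

-- One row per data flow; the values here are sample data only.
INSERT INTO dbo.SSISCustomLog (PackageName, TaskName, RowsLoaded, StartTime, EndTime)
VALUES (N'LoadTables', N'DataFlow1', 1250, '2024-01-01T02:00:00', '2024-01-01T02:03:30');

-- DurationSeconds is 210 for the sample row (3 minutes 30 seconds).
SELECT TaskName, RowsLoaded, DurationSeconds
FROM dbo.SSISCustomLog;
```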
Use a Row Count transformation in each of the data flow tasks before loading into the destination, and then use this value in the SSIS logging table.
As far as duration goes, since you know the start time and end time, use the DateDiff function.
In SSIS you have system variables such as System::StartTime. Use them to capture the start time, and record the end time when the log row is written.
A correct solution for the row count and duration has been suggested. However, there are known datatype issues when using the System::StartTime variable to write a package or event start time from SSIS to a custom log table in SQL Server.
You will have to create a user variable (e.g. User::StartTime) and likely an expression (depending on what is being used for the end time) in order to solve this part of the problem.
https://sqljunkieshare.com/2011/12/09/ssis-package-logging-custom-logging-2008-r2-and-2012/

How can I minimize validation intervals when changing the SQL in ADO NET Source Tasks

Part of an SSIS package is the data import from an external database via a SQL command embedded into an ADO.NET Source Data Flow Source. Whenever I make even the slightest adjustment to the query (such as changing a column name) it takes ages (in that case 1-2 hours) until the program has finished validation. The query itself returns around 30,000 rows with 20 columns each.
Is there any way to cut these long intervals or is this something I have to live with?
I usually store the source queries in a table. The first part of my package executes a select and stores the query returned from the table in a package variable, which is then used by the ADO.NET Source in the Data Flow. For the default value of the variable in the package, I usually use the same query that is stored in the database with a "where 1=2" added at the end. That way the query is still executed at design time, but it only returns the column metadata. Let me know if you have any questions.
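A small sketch of that setup, assuming a lookup table called dbo.SourceQuery and a source table dbo.Customer (both hypothetical):

```sql
-- Hypothetical table holding the source queries used by the package.
CREATE TABLE dbo.SourceQuery (
    QueryName nvarchar(100) NOT NULL PRIMARY KEY,
    QueryText nvarchar(max) NOT NULL
);

INSERT INTO dbo.SourceQuery (QueryName, QueryText)
VALUES (N'ExternalImport', N'SELECT CustomerId, CustomerName FROM dbo.Customer');

-- Run by the first task in the package; the result goes into a package variable.
SELECT QueryText
FROM dbo.SourceQuery
WHERE QueryName = N'ExternalImport';

-- Design-time default value of that variable: same columns, zero rows, so
-- validation only has to fetch metadata.
--   SELECT CustomerId, CustomerName FROM dbo.Customer WHERE 1 = 2
```

The default value and the stored query have to return the same columns and types, otherwise the data flow metadata will not match at run time.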

error invoking procedure with multiple select commands

I have an SSIS package with a data flow task. The OLE DB source has an execute proc statement. It fails while saving, with the error message below.
an OLEDB record is available... The metadata could not be determined because the statement 'select appname....' in procedure is not compatible with the statement 'select appid....' in procedure
This proc has several select statements and returns the appropriate result set as per parameters passed. Any pointers to bypass this error?
So you're saying that the SP will return different metadata depending on the parameter passed? SSIS doesn't like this; it can't update the metadata dynamically at run time. For example, if you create a package that splits or sorts on a certain column, and then the SP doesn't return that column, or returns it with a different data type, what should SSIS do? It can't work it out automatically.
I suggest you create a data source for each possible result set and conditionally execute each one as required.
In short, SPs that return different datasets depending on their parameters are often not a good idea, and definitely not from an ETL perspective.
Here is some code that shows how to create dynamically built output (you could use the same method with just one output), but you'll still face the same problems downstream.
http://www.codeproject.com/Articles/32151/How-to-Use-a-Multi-Result-Set-Stored-Procedure-in
I ran into this issue as well. In my case, the result returned looked identical no matter which branch was executed, the difference was just in how that result was obtained (including different source tables). I simply executed all the cases with a union, and each "where" clause included the conditions for its execution instead of using "if" logic to choose a query.
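The union approach described above could be sketched as follows. The procedure name, the @Source parameter and the two source tables are hypothetical; the point is that every branch returns the same columns and the parameter is tested in each WHERE clause, so the procedure exposes one fixed result set shape.

```sql
-- Hypothetical procedure: one statement, one result set shape.
CREATE PROCEDURE dbo.usp_GetApps
    @Source tinyint   -- 1 = current apps, 2 = archived apps (assumed meaning)
AS
BEGIN
    SET NOCOUNT ON;

    -- Each branch returns rows only when its condition on the parameter holds.
    SELECT AppId, AppName
    FROM dbo.AppsCurrent
    WHERE @Source = 1

    UNION ALL

    SELECT AppId, AppName
    FROM dbo.AppsArchive
    WHERE @Source = 2;
END;
```

This only works when all branches can be made to return identical column names and compatible data types.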
