SSIS: Can I edit my recordset as I loop through it using the foreach statement?

I am using SSIS to make a series of complicated data exports, and I've hit a roadblock. I've populated a package variable with a recordset. I'd like to use a foreach loop to iterate through each row in the recordset. I'd like to update one of the columns in each row based on some calculations which I've done inside a script task.
Is this possible? I know that in C# you can't modify a collection while iterating over it with foreach, but I don't know if SSIS works the same way.
Unfortunately, I haven't found any good examples of using the for loop construct instead, which might be a potential solution.

When you put data into a recordset, it's stored in an object variable. You can use the Foreach Loop Container with the Foreach ADO enumerator to loop over that object variable. You then create a variable to hold each column of the current row, and you have row-by-row control to do whatever you please, be it a data flow task, a SQL statement, a Script Task (C#), or anything else.
See http://www.sqlis.com/post/Shredding-a-Recordset.aspx for an illustrated example of how to do this and send an email for every row.
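One caveat worth noting: the ADO enumerator copies each column's value into the mapped variables, so changing a variable inside the loop does not write back to the recordset itself. A minimal sketch of the per-row Script Task, assuming the loop's Variable Mappings page populates two hypothetical variables, User::RowId and User::Amount:

```csharp
// Script Task body inside the Foreach Loop (ADO enumerator).
// User::RowId and User::Amount are illustrative variable names that
// the loop's Variable Mappings page fills in for each row.
public void Main()
{
    int rowId = (int)Dts.Variables["User::RowId"].Value;
    decimal amount = (decimal)Dts.Variables["User::Amount"].Value;

    // Per-row calculation; the result goes back into a variable so a
    // later task can persist it.
    Dts.Variables["User::Amount"].Value = amount * 1.1m;

    Dts.TaskResult = (int)ScriptResults.Success;
}
```

Because the recordset itself is not updated, the usual pattern is to follow the Script Task with an Execute SQL Task that runs an UPDATE using those variables as parameters.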

Related

SSIS - importing identical data from multiple databases

I want to copy and merge data from tables with identical structure (in a number of different source databases) to a single table of similar structure in a destination database. From time to time I need to add or remove a source database.
This is currently achieved using a Data Flow Task containing an OLEDB source with a SQL query within which there is a UNION for each of the databases I am extracting from. There is quite a lot of SQL within each UNION so, if I need to add fields, I need to add the same additional SQL to each UNION. Similarly, when I add or remove a source database I need to add or remove a UNION.
I was hoping that, rather than using a UNION with a lot of duplicated code, I could instead use a Foreach Loop Container that executes SQL held in a variable, using parameters to substitute the database name and other database-dependent items on each iteration. However, I hit problems with that: I assume the Data Flow Task within the loop could not interpret the incoming fields because of the use of what is effectively dynamic SQL.
Any suggestions as to how I might best achieve this without duplicating a lot of SQL?
It sounds like you have your loop figured out for moving from database to database. As long as the table schemas are identical (other than names as noted) from database to database, this should work for you.
Inside the For Each Loop container, create either a Script Task or an Execute SQL Task, whichever you're more comfortable working with.
Use that task to dynamically generate the SQL of your OLE DB Source query, changing the Customer Code prefix for each iteration. Assign the SQL text to a variable, either directly in a Script Task, or by assigning the Result Set of your Execute SQL Task (the result set being the query text) to a variable.
Inside your Data Flow Task, in the OLE DB Source, under Data Access Mode select "SQL Command from variable". Select the variable that you populated with your query in the last task.
You'll also need to handle changing the connection string between iterations, but, again, it sounds like you have a handle on that part already.
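As a sketch of the query-building step, the Script Task variant might look like this, assuming the loop populates a hypothetical User::CustomerCode variable and the OLE DB Source reads a hypothetical User::SourceQuery variable:

```csharp
// Builds the source query for the current iteration. Only the
// database name/prefix changes; the column list stays identical so
// the Data Flow metadata remains valid across iterations.
public void Main()
{
    string code = (string)Dts.Variables["User::CustomerCode"].Value;

    Dts.Variables["User::SourceQuery"].Value =
        "SELECT OrderId, OrderDate, Amount " +
        "FROM [" + code + "].dbo.Orders";

    Dts.TaskResult = (int)ScriptResults.Success;
}
```

The key point is that every iteration must produce a query with the same column names and types: the Data Flow validates its metadata once, so the dynamic part has to be limited to names, not the shape of the result set.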

SSIS trying to only load file if all rows are good

I am trying to use a SSIS package to insert data from a file into a table but only if all the data in the file is good. I have read around and realise that I can split my good data and bad data with a conditional split.
However, I cannot come up with a way to avoid writing the good data when there are any bad rows.
I can solve my problem using a staging table. I just thought I would ask whether I am missing a more elegant way to do this within the SSIS package, rather than load-then-transform with T-SQL.
Thanks
The SSIS way is to wrap the actions in a transaction. For your task, you count the bad rows in the data flow, and if there is at least one bad row you do nothing, i.e. roll back.
Below is how I would do it in pure SSIS. Create a Sequence Container, set TransactionOption=Required on it, and move your data flow into the sequence. Add a Row Count transformation to your bad-rows path and store its result in a variable. After the Data Flow inside the sequence, add a precedence constraint whose expression checks whether the bad_rowcount variable > 0, and have it lead to a small Script Task that raises an error to roll back the transaction.
Pure SSIS - yes! Simpler than using a staging table - not sure.
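The final Script Task only needs to fail, which fails the Sequence Container and rolls back its transaction. A minimal sketch, assuming a hypothetical User::bad_rowcount variable:

```csharp
// Runs only when the precedence-constraint expression
// @[User::bad_rowcount] > 0 evaluates to true. Failing this task
// fails the Sequence Container, rolling back its Required transaction.
public void Main()
{
    int badRows = (int)Dts.Variables["User::bad_rowcount"].Value;

    Dts.Events.FireError(0, "Load check",
        badRows + " bad row(s) found; rolling back the load.",
        string.Empty, 0);

    Dts.TaskResult = (int)ScriptResults.Failure;
}
```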

SQL Server 2008 - Save each "while loop" result to a different file

I'm using a double while loop to get a lot of results from several different tables. I get everything I need (500+ subjects, each with 1000+ rows), but each result comes back in a different grid. I would like to save each "while" result to a different .csv file. Is there any way to do this?
It might be possible to do this using SQLCMD or BCP, but it would be quite cumbersome to code, requiring quite a few variables and dynamic SQL.
If faced with this scenario, I would personally go with an SSIS package:
- Use package variables to generate destination filenames dynamically
- Use a Foreach Loop Container instead of the while loop
- Put a Data Flow Task inside the container, using your SELECT as the source and the file from step one as the destination
It is pretty easy to do.
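For step one, the flat file connection manager's ConnectionString property can be driven by a property expression, for example (the folder and subject variable names are illustrative):

```
@[User::OutputFolder] + @[User::SubjectName] + ".csv"
```

With the Foreach loop setting User::SubjectName on each iteration, every pass of the data flow writes to a fresh .csv file.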

SSIS Foreach ADO Enumerator with Data Flow Task & Script

Been working on this all day and it's driving me crazy. I'm new to SSIS (I'm usually a C# programmer), so things don't work the way I'd expect them to :)
(tl;dr - how can I use an ADO Foreach enumerator and pass the whole row as a variable to the Data Flow Task (which holds a Script Component) inside?)
Scenario:
I am using a CDC Splitter component in a Data Flow Task to retrieve and split out the INSERTs, UPDATEs and DELETEs from the CDC tables in my database. I assign each change type to a package-level Object variable holding a recordset.
Taking just the INSERTS object variable, I want to check for a condition on the provided string data, and if it's true I want to email the change to someone.
To achieve this I dropped a ForEach container on the surface and hooked it up to the INSERTS Object variable. Then inside that container I dropped a Data Flow Task, and then within that Data Flow Task I dropped a Script Component.
The idea is that if some condition is met for the current row, the row is added to the Script Component's output buffer. What's bugging me is that it seems the only way to achieve this is to map each column in the row to the Foreach Loop's Variable Mappings collection. Every example I've seen shows variables mapped starting at index 0, and so on. However, my row has a lot of columns and I don't want to create a variable for every column; it seems wrong to do it this way.
So the question is: within the Foreach Loop with the ADO enumerator, how do I pass the current row (all of its columns) to the Data Flow Task inside (and, in turn, to the Script Component within that Data Flow Task)?
Thanks for any advice...

How to insert a row into a dataset using SSIS?

I'm trying to create an SSIS package that takes data from an XML data source and for each row inserts another row with some preset values. Any ideas? I'm thinking I could use a DataReader source to generate the preset values by doing the following:
SELECT 'foo' as 'attribute1', 'bar' as 'attribute2'
The question is, how would I insert one row of this type for every row in the XML data source?
I'm not sure if I understand the question... My assumption is that you have n number of records coming into SSIS from your data source, and you want your output to have n * 2 records.
In order to do this, you can do the following:
- a Multicast to create multiple copies of your input data
- Derived Column transforms to set the "preset" values on the copies
- a Sort on each path
- a Merge to combine them
Am I on the right track with what you're trying to accomplish?
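In T-SQL terms, the multicast/derived-column/merge plan above is equivalent to something like this (table and column names are illustrative):

```sql
-- Every source row appears twice: once as-is and once with the
-- preset values, giving n * 2 output rows.
SELECT attribute1, attribute2 FROM SourceRows
UNION ALL
SELECT 'foo', 'bar' FROM SourceRows;
```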
I've never tried it, but it looks like you might be able to use a Derived Column transformation to do it: set the expression for attribute1 to "foo" and the expression for attribute2 to "bar".
You'd then transform the original data source, then only use the derived columns in your destination. If you still need the original source, you can Multicast it to create a duplicate.
At least I think this will work, based on the documentation. YMMV.
I would probably switch to using a Script Task and place your logic in there. You may still be able to leverage the file-reading and other objects in SSIS to save some code.
