Return dataset in dataflow - sql-server

Could I get ideas on retrieving a dataset using a lookup method? My scenario: I have source data that needs to look up another source table, and for each matching column value from the source I need to get all the matching records from the other source.
It's a one-to-many relationship. I tried Lookup, but it returns only one record per matching condition, and OLE DB Command doesn't retrieve any data, as it only performs Insert/Update operations.
Thanks
prav

If you want to use a Lookup component, then the two columns you match on must match exactly. To clarify, if you are doing a Lookup on a varchar-type column and only finding one match, it may be because there is only one exact match; try running a SELECT..FROM..JOIN..WHERE statement to confirm. If there are matches but they aren't going through the Lookup, check your source data after it comes out of the OLE DB source (it may need to be trimmed).
If exact matching isn't necessary, you could try Fuzzy Lookup, which allows you to specify how close (by giving a percentage) you want the matching columns to be.

This was solved using the Script Component, which prepares the SQL script and then executes it, so in a single hit I can get the full result set, since it is not possible to retrieve a result set with Lookup. On a match, Lookup returns only one row even when multiple keys match.
thanks
prav
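
As a rough illustration of the one-to-many retrieval described above, here is a minimal Python sketch. It assumes sqlite3 as a stand-in for SQL Server, and the table and column names (source_rows, other_source, source_id, detail) are hypothetical. It is not the Script Component code from the thread, only the same idea: run one joined query per key and take every matching row.

```python
import sqlite3

def fetch_all_matches(conn, key):
    """Return every matching row for a key, not just the first one."""
    sql = (
        "SELECT s.id, d.detail "
        "FROM source_rows AS s "
        "JOIN other_source AS d ON d.source_id = s.id "
        "WHERE s.id = ? "
        "ORDER BY d.detail"
    )
    return conn.execute(sql, (key,)).fetchall()

# Hypothetical sample data: key 1 has two matches, key 2 has one.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE source_rows (id INTEGER PRIMARY KEY);
    CREATE TABLE other_source (source_id INTEGER, detail TEXT);
    INSERT INTO source_rows VALUES (1), (2);
    INSERT INTO other_source VALUES (1, 'a'), (1, 'b'), (2, 'c');
""")
matches = fetch_all_matches(conn, 1)
```

With this sample data, `matches` holds both rows for key 1, which is the behavior a single-row lookup cannot give.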

Related

SSIS Join Recordset With Table

I have an SSIS package in which I'm reading the records from a Flat File and storing them in a recordset. Is it possible to compare the values in the recordset with the values in a database table and update the table?
I'm using SQL Server 2008 R2 and the same version of SSIS.
Leran2002's answer is right in general: the most straightforward way is to have a Lookup component set up to Redirect rows to no match output, and to use a destination and an OLE DB Command afterwards.
However depending on the size of the result sets, this might be slow, since the lookup component will check each row one-by-one and if your destination table has lots of records, this will take some time. Furthermore, depending on your cache settings in the lookup component, it can use lots of memory.
There are two more ways to achieve this:
Merge Join
Using your file source and your destination table as sources, you can use a Merge Join. The logic in the DFT is a bit more complex, but this is a more set-based approach, and it works better with large result sets.
You'll have to implement the logic that decides which records from the file have to be updated, inserted, deleted or discarded, using a Conditional Split component.
I highly recommend this question (not exactly your problem, but a good comparison in my opinion): What are the differences between Merge Join and Lookup transformations in SSIS?
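The Merge Join plus Conditional Split logic can be sketched in plain Python. This is only a model of the idea, not SSIS code: it assumes both inputs are already sorted by key, that keys are unique on each side, and that rows are simple (key, value) tuples. The function name and the action labels are made up for the example.

```python
def merge_join(file_rows, table_rows):
    """Full outer merge of two key-sorted lists of (key, value) tuples.

    Returns (action, key, value) tuples, where action plays the role of
    the Conditional Split: insert, update, delete or discard.
    """
    i = j = 0
    out = []
    while i < len(file_rows) or j < len(table_rows):
        file_done = i >= len(file_rows)
        table_done = j >= len(table_rows)
        if table_done or (not file_done and file_rows[i][0] < table_rows[j][0]):
            # Key exists only in the file: new record.
            out.append(("insert", file_rows[i][0], file_rows[i][1]))
            i += 1
        elif file_done or table_rows[j][0] < file_rows[i][0]:
            # Key exists only in the table: candidate for deletion.
            out.append(("delete", table_rows[j][0], table_rows[j][1]))
            j += 1
        else:
            # Same key on both sides: update only if the value changed.
            changed = file_rows[i][1] != table_rows[j][1]
            out.append(("update" if changed else "discard",
                        file_rows[i][0], file_rows[i][1]))
            i += 1
            j += 1
    return out

result = merge_join([(1, "a"), (2, "b"), (4, "d")],
                    [(2, "x"), (3, "c"), (4, "d")])
```

Each input is read once from top to bottom, which is why this style tends to scale better than checking rows one by one against a large table.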
Staging table
Another way is to use a staging table to temporarily store the records from a file. In this case, your DFT just loads the records from a file into the staging table, then with one or more Execute SQL Task you can do the merging of the two data sets. (UPDATE, INSERT, DELETE, MERGE, you can use what fits your needs).
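A minimal sketch of the staging-table merge, assuming sqlite3 in place of SQL Server and hypothetical tables named staging and target with columns id and value. On SQL Server the same two steps could be written as UPDATE and INSERT statements or as a single MERGE inside an Execute SQL Task; the exact statements depend on your schema.

```python
import sqlite3

def merge_staging(conn):
    """Apply staged rows to the target: update matches, insert the rest."""
    with conn:  # one transaction for both statements
        conn.execute("""
            UPDATE target
            SET value = (SELECT s.value FROM staging AS s
                         WHERE s.id = target.id)
            WHERE EXISTS (SELECT 1 FROM staging AS s
                          WHERE s.id = target.id)
        """)
        conn.execute("""
            INSERT INTO target (id, value)
            SELECT s.id, s.value FROM staging AS s
            WHERE NOT EXISTS (SELECT 1 FROM target AS t
                              WHERE t.id = s.id)
        """)

# Hypothetical sample data: id 1 changes, id 2 is untouched, id 3 is new.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE target (id INTEGER PRIMARY KEY, value TEXT);
    CREATE TABLE staging (id INTEGER PRIMARY KEY, value TEXT);
    INSERT INTO target VALUES (1, 'old'), (2, 'keep');
    INSERT INTO staging VALUES (1, 'new'), (3, 'added');
""")
merge_staging(conn)
target_rows = conn.execute("SELECT id, value FROM target ORDER BY id").fetchall()
```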
Usually I use the Lookup component with the option Redirect rows to no match output.
After that you can use the two rowsets, which are named Lookup No Match Output and Lookup Match Output.
PS. I have three articles about SSIS, but they are in Russian (though there are a lot of SQL scripts and pictures).
If you are interested, you can follow this link - https://habrahabr.ru/post/330618/

Query SQL database from Excel

I'm attempting to create an MS Query to return data from a SQL database based on a value from a cell in Excel. I have actually successfully accomplished this, but only for 1 row. I can't figure out how to get it to copy down to other rows.
I've created a connection as follows:
Notice that the SQL statement includes a parameter. The parameter is set to point to a specific cell:
I guess this makes sense as I'm only looking to return 1 value per row.
The problem is that I have multiple lines to return values for. How do I return a value per row for multiple rows?
I've tried changing the cell reference in the Parameters dialog box, but this does not work as the Excel Table is designed to grow dynamically.
Excel data connections work in such a way that every connection has only one SQL query. So in order to do what you're looking for, you would need to have many connections, and that's not best practice.
However, there are two ways you can solve this situation:
1. Make a single connection with all of the data and create a pivot table based on it. Then use VLOOKUP/INDEX to gather the data to your requested cells.
2. If the data is too big, you can use VBA code to create a smaller Query based on the cells you mentioned and then continue as described on the first option.
Good luck.
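The two options above share one idea: run a single query, then look values up per row. Here is a hedged Python sketch of that idea, using sqlite3 as a stand-in for the SQL database. The prices table, its columns and the function name are hypothetical, and the list of cell values stands in for the Excel column; in Excel itself this would be VBA plus VLOOKUP/INDEX rather than Python.

```python
import sqlite3

def lookup_for_cells(conn, cell_values):
    """Run one query for all distinct cell values, then map each cell."""
    keys = sorted(set(cell_values))
    if not keys:
        return []
    # One placeholder per key keeps the query parameterized.
    placeholders = ", ".join("?" for _ in keys)
    sql = f"SELECT item, price FROM prices WHERE item IN ({placeholders})"
    by_item = dict(conn.execute(sql, keys).fetchall())
    # Equivalent of a VLOOKUP per row; missing keys give None.
    return [by_item.get(value) for value in cell_values]

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE prices (item TEXT PRIMARY KEY, price REAL);
    INSERT INTO prices VALUES ('apple', 1.5), ('pear', 2.0), ('plum', 3.0);
""")
per_row = lookup_for_cells(conn, ["pear", "apple", "pear", "kiwi"])
```

The result has one entry per input cell in the original order, however many rows the Excel table grows to.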

SSIS, splitting a single row into multiple rows

My problem is as follows. I have a CSV file (~100k rows) containing history information with the column format of:
ID1,History1,ID2,History2...ID110,History110
Each row may have anywhere between 0 and 110 history entries. Each separate entry requires a stored procedure to be called.
If there were a small number of possible entries per row, I imagine the way to do this would be to transform the data using a script, and send it to a unique path. Creating 110 paths would probably work, but isn't very elegant (and quite time consuming).
What would the best way to approach this be?
Just load the data (raw csv unchanged, one row per file line) into a staging table. Then, call a stored procedure that will use a string splitter to break up and loop over the staging table rows and call your other procedure for each history entry.
see: Arrays and Lists in SQL Server 2005 and Beyond
also see this previous answer: SQL comma delimted column => to rows then sum totals?
If you want to solve this in SSIS without the staging tables, you could create a destination script component. You could use a switch statement or a hashtable to look up the right sproc to execute for the data row.
It is unclear whether this is a better solution than the staging table approach above, but it is an alternative.
I know you already accepted an answer, but couldn't you use an Unpivot task to achieve what you wanted to do here?
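To show what unpivoting this layout means, here is a small Python sketch that turns one wide row into one (ID, History) pair per entry. It assumes that unused slots are empty strings at the end of the row and that an entry counts as present when its ID is non-empty; the function name is hypothetical, and the stored procedure call itself is left out.

```python
import csv
import io

def unpivot_history(row):
    """Turn [ID1, History1, ID2, History2, ...] into (id, history) pairs."""
    pairs = []
    # Step through the row two columns at a time.
    for k in range(0, len(row) - 1, 2):
        entry_id = row[k].strip()
        history = row[k + 1].strip()
        if entry_id:  # skip empty slots
            pairs.append((entry_id, history))
    return pairs

# Hypothetical sample: two filled entries and one empty slot per line.
sample = io.StringIO("7,created,9,closed,,\n12,opened,,,,\n")
entries = [pair for row in csv.reader(sample) for pair in unpivot_history(row)]
```

Each resulting pair is one history entry, so the stored procedure can then be called once per pair instead of routing rows down 110 separate paths.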

what happens to the resultset if the lookup table in the lookup task is empty?

Will there be any further resultset to process if the lookup table in the lookup task is empty?
sagar
It depends on how you have your Lookup transform configured. It also depends on which version of SSIS you're using. SSIS 2008 allows you to configure the lookup so that a failure either goes to the error output, or to an alternate success output.
In either case, if configured, the row would go to one of those two outputs. I do not know what would happen if you have no error or alternate output configured. I think you're probably correct.
If you're trying to fill in a value from the lookup table if a match is made, but to do nothing if there is no match, then you need to configure one of the other two outputs, then use a Union All transform to bring the two branches back together.
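The pattern in the last paragraph can be modeled in a few lines of Python. This sketch only models a Lookup that is configured to redirect unmatched rows to a second output, followed by a Union All; it says nothing about what SSIS does when no such output is configured. The row layout (dicts with a "key" field) and the function name are assumptions for the example.

```python
def lookup_with_no_match(rows, lookup_table, default=None):
    """Fill in a value on a match; pass unmatched rows through unchanged."""
    matched, no_match = [], []
    for row in rows:
        if row["key"] in lookup_table:
            # Match output: add the looked-up value.
            matched.append({**row, "value": lookup_table[row["key"]]})
        else:
            # No-match output: keep the row, with a default value.
            no_match.append({**row, "value": default})
    # Union All: bring the two branches back together.
    return matched + no_match

rows = [{"key": 1}, {"key": 2}]
with_empty_lookup = lookup_with_no_match(rows, {})
with_one_match = lookup_with_no_match(rows, {2: "two"})
```

With an empty lookup table every row takes the no-match branch, so the same number of rows still reaches the downstream components.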

How to insert a row into a dataset using SSIS?

I'm trying to create an SSIS package that takes data from an XML data source and for each row inserts another row with some preset values. Any ideas? I'm thinking I could use a DataReader source to generate the preset values by doing the following:
SELECT 'foo' as 'attribute1', 'bar' as 'attribute2'
The question is, how would I insert one row of this type for every row in the XML data source?
I'm not sure if I understand the question... My assumption is that you have n number of records coming into SSIS from your data source, and you want your output to have n * 2 records.
In order to do this, you can do the following:
multicast to create multiple copies of your input data
derived column transforms to set the "preset" values on the copies
sort
merge
Am I on the right track w/ what you're trying to accomplish?
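The steps above can be sketched in Python to check the n * 2 row count. This is a model of the data flow, not SSIS code: list copies stand in for the Multicast, a dict update stands in for the Derived Column transform, and plain concatenation stands in for the sort and merge steps. The column names attribute1 and attribute2 come from the question; everything else is hypothetical.

```python
def add_preset_rows(rows, preset):
    """Return the original rows plus one preset-valued copy per row."""
    # Multicast: two independent copies of the input.
    originals = [dict(row) for row in rows]
    copies = [dict(row) for row in rows]
    # Derived Column: overwrite the preset columns on the second copy.
    for row in copies:
        row.update(preset)
    # Bring both branches back into one output.
    return originals + copies

source = [{"id": 1, "attribute1": "x", "attribute2": "y"},
          {"id": 2, "attribute1": "p", "attribute2": "q"}]
output = add_preset_rows(source, {"attribute1": "foo", "attribute2": "bar"})
```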
I've never tried it, but it looks like you might be able to use a Derived Column transformation to do it: set the expression for attribute1 to "foo" and the expression for attribute2 to "bar".
You'd then transform the original data source, then only use the derived columns in your destination. If you still need the original source, you can Multicast it to create a duplicate.
At least I think this will work, based on the documentation. YMMV.
I would probably switch to using a Script Task and place your logic in there. You may still be able to leverage the file reading and other objects in SSIS to save some code.

Resources