I need to extract data from a DB2 database into SQL Server. I have to build my queries from an Excel file that contains 176 records, so I need to run a repeating query for each record and load the results into SQL Server.
So, for example:
I have an Excel file with a Number, a From date, a To date, and a Country.
The query should use this information from each record:
SELECT *
FROM dbo.Test
WHERE Number = excel.Number1 AND Date BETWEEN excel.fromDate1 AND excel.toDate1 AND Country = excel.country1
And then another query with
SELECT *
FROM dbo.Test
WHERE Number = excel.Number2 AND Date BETWEEN excel.fromDate2 AND excel.toDate2 AND Country = excel.country2
Etc...
How should I do something like this in SSIS?
If needed, I can put both the DB2 and the Excel data into MS SQL Server first.
You can proceed with the following approach:
Extract the data rows from Excel and store them in an SSIS Object variable
Use a Foreach Loop to get each row from the Object variable, mapping the row's values to separate package variables
Inject the variable values into the SQL SELECT command with an expression, as shown in the sketch after this list
Run a Data Flow task based on that SQL command, transform the data, and load it into the target
Overall, your task seems feasible, but it requires some knowledge of iterating an Object variable in a Foreach Loop and of writing variable expressions.
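For illustration, here is a rough sketch of the expression on the SQL command variable (with EvaluateAsExpression set to True), assuming the Foreach Loop maps each Excel row into string variables named User::Number, User::FromDate, User::ToDate and User::Country; those names and the date formatting are placeholders, not a tested implementation:
"SELECT * FROM dbo.Test "
+ "WHERE Number = " + @[User::Number]
+ " AND Date BETWEEN '" + @[User::FromDate] + "' AND '" + @[User::ToDate] + "'"
+ " AND Country = '" + @[User::Country] + "'"
The OLE DB Source in the Data Flow then uses the "SQL command from variable" access mode pointing at that variable, so each loop iteration runs the query for one Excel row.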
I have an SSIS package with a Data Flow task and an Execute SQL task inside a Foreach Loop container. The package flow is: Data Flow task (flat file --> Conditional Split to insert data into SQL Server tables) --> Execute SQL task (perform some SQL operations on the inserted data and insert the calculated values into one final analysis table). The file name looks like name1_name2_yyyymmdd_1234.txt. I want to fetch the date from the file name and insert that date value into a table in SQL Server as FileDate. I am trying to do it using a Derived Column, but I can't figure out where to save the value so that it is available in the INSERT statement in the Execute SQL Task that runs after the Data Flow task.
This should be done outside the Data Flow but within the Foreach Loop.
Pass two package-scoped variables into a Script Task: one holding the #fileName (read only) from the Foreach Loop, and one to store the #fileDate (read/write).
Split creates a 0-based array, of which you only care about the third piece (index 2).
Dts.Variables["fileDate"].Value = DateTime.ParseExact(
    Dts.Variables["fileName"].Value.ToString().Split('_')[2],  // the "yyyymmdd" part of the file name
    "yyyyMMdd",
    System.Globalization.CultureInfo.InvariantCulture);
Now you can use #fileDate anywhere you would like.
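For example, in the Execute SQL Task that runs after the Data Flow, you can reference #fileDate through a positional parameter; the table and column names below are hypothetical, just to show the mapping:
-- Parameter Mapping: Variable Name = User::fileDate, Direction = Input, Data Type = DATE, Parameter Name = 0
INSERT INTO dbo.FinalAnalysis (FileDate)
VALUES (?)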
Objective: automate creating a table, along with all its columns and data types, given an SSIS source.
My guesses are:
1) Pointing the source to a destination via a SQL command
2) Using SELECT * INTO ... The problem is I don't know what the FROM equivalent of a source is
Alternative) Storing the results in a Recordset and passing them on to an Execute SQL task. The problem then is how to access that result from the Execute SQL task
I think you should use a Recordset Destination to store the data in a System.Object variable. Then use a Script Task (which starts after that Data Flow Task has executed), select the System.Object variable as a ReadOnly variable, and write your own code to insert the recordset into SQL Server using a System.Data.SqlClient.SqlCommand object.
You can refer to one of these links:
Issues with SSIS Script Task and SQL Bulk Insert - C#
Insert DataTable into SQL Table in C#
If you need just the structure of the table, use this trick:
select top 0 * into NewTable from YourTable
I have a stored procedure on a server which generates a table in my database. Then, in SSIS, I query some columns from that table and append some dummy columns filled with static values. I query the database by holding the query in a variable (SQL command from variable); in that query I use something like select a, b, c from X where #[User::variable1] = '' and #[User::variable2] = '', and so on for all 4 variables.
My question is: I need to be able to change the values of those variables (variable1 to variable4) for 48 different scenarios (possibly more), so replacing them manually would be a pain since it leads to over 130 combinations. Is there a way to pass the values from an Excel file to the package at runtime?
ex:
column1 column2 column3 column4
12.03.2015 def ghi jkl
12.04.2015 456 789 012
..
..
I need to loop through all the columns in the Excel file, and the results should be exported to files.
I have already built everything described above except the part where I get the values for those 4 variables from the Excel file. I need help only with that part.
Any help would be great.
Thank you,
Cristian
Yes, this is possible.
Create a connection to the Excel file
Create a transit table to store the Excel content (with matching column names)
Create a Data Flow task to transfer the content from Excel into the transit table
Create an Execute SQL Task
Get the rows from the transit table one by one in a loop or cursor
Dynamically build a SQL string with the values read from the transit table
Execute it using sp_executesql, as sketched below
Use a Result Set if you want to output any recordset
I have an OLEDB (SQL) data flow source (A) that pulls a result set from a stored procedure and throws the results into an OLEDB (Oracle) data flow destination (B).
Is there a way to capture an aggregate value from the dataset into a variable, all within the data flow task? Specifically, I'd want to capture the MAX(<DateValue>) from the entire dataset.
Otherwise, I'd have to pull the same data twice in a separate data flow task, whether I point to A or to its new location, B.
EDIT: I already know how to do this in the Control Flow from an Execute SQL task. I'm asking because I'm curious to know if I can get this done in the Data Flow task since I'm already collecting the data there. Is there a way to grab an aggregate value in the Data Flow?
One way of doing it would be to add a multicast transform between the source and destination that also feeds into a script component.
Whilst an Aggregate transform would also work, this method avoids adding a blocking transform.
Configure the Script Component as a destination, give it read/write access to the variable, and then edit the script to be something like:
// Instance-level field that accumulates the running maximum across all rows
DateTime? maxDate = null;

public override void PostExecute()
{
    base.PostExecute();

    // Write the result back to the read/write package variable
    if (maxDate.HasValue)
    {
        this.Variables.MaxDate = maxDate.Value;
    }

    // Debugging aid only; remove once the package works
    System.Windows.Forms.MessageBox.Show(this.Variables.MaxDate.ToString());
}

public override void Input0_ProcessInputRow(Input0Buffer Row)
{
    // Keep the largest non-null createdate seen so far
    if (!Row.createdate_IsNull)
    {
        maxDate = Row.createdate < maxDate ? maxDate : Row.createdate;
    }
}
Keep your current DFT as it is in the Control Flow (the source-to-destination mapping stays the same).
In the Control Flow, add an Execute SQL Task with the same source query, but with your desired MAX() function applied to it.
Eg:
-- Let this be your source query.
SELECT ColumnA,
ColumnB,
ColumnC,
DateValue
FROM SourceA
-- Your new query to calculate MAX() could be this:
SELECT MAX(DateValue)
FROM SourceA
Use the second SQL statement in the Execute SQL Task.
In the package, add a variable at package-level scope to hold the result; its type should match the aggregate, so DateTime here since you are taking MAX(DateValue) (e.g. name = maxDate).
In the Execute SQL Task, note the following.
a. General tab
Result Set = Single row
SQLStatement = SELECT MAX(DateValue) FROM SourceA
b. Result Set tab
Click Add
Result Name = 0
Variable Name = the variable created above (e.g. User::maxDate)
Your required result will be available in the variable from here onwards.
I have a table with 500 rows. I want to retrieve only 10 rows and insert them into another table using the Control Flow only. Through a Data Flow task we could use an OLE DB source and an OLE DB destination, but I want to get the result using an Execute SQL task and a Foreach Loop instead. Is it possible to do it that way? My idea is to get the set of ten records and, using a Foreach Loop, iterate over every row and insert it into the table with an Execute SQL task. The destination table needs to be created on the fly. I tried an approach but am not getting anywhere. Please find the image file attached.
Example taken from Northwind
Create variables (in the variable collection) which represent the columns of the table you will create at runtime.
Example:
Customer_ID as string
Order_Id as int
Then create an Execute SQL Task and write the query below to select the first 10 rows:
Select top 10 * from orders
Use Full result set, and in the Result Set configuration store the table rows in a variable: Variable Name = User::Result, Result Name = 0.
Drop in another Execute SQL Task to create the table on the fly:
IF OBJECT_ID('myOrders') IS not NULL
drop table myOrders
Create table myOrders
(OrderID int,
CustomerID varchar(50)
)
Combine the flows from the two Execute SQL Tasks and connect them to the Foreach Loop.
Drag in a Foreach Loop. In the Collection page, set the enumerator type to Foreach ADO Enumerator.
In the enumerator configuration, select the User::Result variable (which stores the top 10 rows from the Execute SQL Task) and select the radio button "Rows in the first table".
In Variable Mappings, map the column variables you created in the first step; the index will be 0 for the first column and 1 for the second column.
Drag an Execute SQL Task inside the Foreach Loop and write the query below:
Insert into myOrders( OrderID,CustomerID)
values
(?,?)
Map the parameters using the Parameter Mapping configuration in the Execute SQL Task:
Variable Name: User::Order_Id, Direction: Input, Data Type: LONG, Parameter Name: 0
Variable Name: User::Customer_ID, Direction: Input, Data Type: VARCHAR, Parameter Name: 1
I hope you are doing this in "study mode"; there is no reason to prefer the Control Flow over the Data Flow for this.
Anyway, your screenshot is correct; I would just add another Execute SQL Task at the beginning to create your destination table.
Then your Execute SQL Task should contain the query that brings back the 10 rows you want, its result set should be set to "Full result set", and on the Result Set tab you should map the result set to an Object variable.
Configure your Foreach Loop container as a Foreach ADO Enumerator over that variable.
On each iteration of the Foreach Loop you will have access to the values in the variables, and you can then use another Execute SQL Task to insert them into the newly created table.