SSIS loop over companies - sql-server

I have a package SSIS for a company . Now I need to repeat the same work for 30 companies .
Company name
[Montreal$G_L_ entry]
[Dubai$G_L_ entry]
I would like like to loop over all companies . I get all the company names and put them in an ado record set. e.g.: SELECT DISTINCT CompanyName FROM YourTable in an execute sql task. create an ssis variable to hold the recordset .
How to make table name dynamic , I mean companyname$G_L_ entry and companyname will be changed dynamically ?

You need to use a Foreach container to iterate through the variable that holds the ADO recordset. That will assign each company name to a variable once each time through the loop. In the body of the loop container you do whatever tasks you want and us that variable as a parameter to the tasks.

Related

How to do an inner join rather than for each loop in SSIS?

On the ETL server I have a DW user table.
On the prod OLTP server I have the sales database. I want to pull the sales only for users that are present in the user table on the ETL server.
Presently I am using an execute SQL task to fetch the DW users into a SSIS System.Object variable. Then using a for each loop to loop through each item (userid) in this variable and via a data flow task fetch the OLTP sales table for each user and dump it into the DW staging table. The for each is taking long time to run.
I want to be able to do an inner join so that the response is quicker, but I cant do this since they are on separate servers. Neither can I use a global temp table to make the inner join, for the same reason.
I tried to collect the DW users into a comma separated string variable and then using it (via string_split) to query into OLTP, but this is also taking more time at the pre-execute phase (not sure why exactly) even for small number of users.
I also am aware of lookup transform but that too will result in all oltp rows to be brought into the dw etl server to test the lookup condition.
Is there any alternate approach to be able to do an inner join by taking the list of users into the source?
Note: I do not have write permissions on the OLTP db.
Based on the comments, I think we can use a temporary table to solve this.
Can you help me understand this restriction? "Neither can I use a global temp table to make the inner join, for the same reason."
The restriction is since oltp server and dw server are separate so can't have global temp table common to both servers. Hope makes sense.
The general pattern we're going to do is
Execute SQL Task to create a temporary table on the OLTP server
A Data Flow task to populate the new temporary table. Source = DW. Destination = OLTP. Ensure Delay Validation = True
Modify existing Data Flow. Modify source to be a query that uses the temporary table i.e. SELECT S.* FROM oltp.sales AS S WHERE EXISTS (SELECT * FROM #SalesPerson AS SP WHERE SP.UserId = S.UserId); Ensure Delay Validation = True
A long form answer on using temporary tables (global to set the metadata, regular thereafter)
I don't use temp table in SSIS
Temporary tables, live in tempdb. Your OLTP and DW connection managers likely do not point to tempdb. To be able to reference a temporary table, local or global, in SSIS you need to either define an additional connection manager for the same server that points explicitly at tempdb so you can use the drop down in the source/destination components (technically accurate but dumb). Or, you use an SSIS Variable to hold the name of the table and use the ~From Variable~ named option in source/destination component (best option, maximum flexibility).
Soup to nuts example
I will use WideWorldImporters as my OLTP system and WideWorldImportersDW as my DW system.
One-time task
Open SQL Server Management Studio, SSMS, and connect to your OLTP system. Define a global temporary table with a unique name and the expected structure. Leave your connection open so the table structure remains intact during initial development.
I used the following statement.
DROP TABLE IF EXISTS #SO_70530036;
CREATE TABLE #SO_70530036(EmployeeId int NOT NULL);
Keep track of your query because we'll use it later on but as I advocate in my SSIS answers, perform the smallest task, test that it works and then go on to the next. It's the only way to debug.
Connection Managers
Define two OLE DB Connection Managers. WWI_DW uses points to the named instance DEV2019UTF8 and WWI_OLTP points to DEV2019EXPRESS. Right click on WWI_OLTP and select Properties. Find the property RetainSameConnection and flip that from the default of False to True. This ensures the same connection is used throughout the package. As temporary tables go out of scope when the connection goes away, closing and reopening a connection in a package will result in a fatal error.
These two databases on different instances so we can't cheat and directly comingle data.
Variables
Define 4 variables in SSIS, all of type String.
TempTableName - I used a value of ##SO_70530036 but use whatever value you specified in the One-time task section.
QuerySourceEmployees - This will be the query you run to generate the candidate set of data to go into the temporary table. I used SELECT TOP (3) E.[WWI Employee ID] AS EmployeeId FROM Dimension.Employee AS E WHERE E.[Is SalesPerson] = CAST(1 AS bit);
QueryDefineTables - Remember the drop/create statements from the on-time task? We're going to use the essence of them but use the expression builder to let us dynamically swap the table name. I clicked the ellipses, ..., on the Expression section and used the following "DROP TABLE IF EXISTS " + #[User::TempTableName] + "; CREATE TABLE " + #[User::TempTableName] + "( EmployeeId int NOT NULL);" You should be able to copy the Value from the row and paste it into SSMS to confirm it works.
QuerySales - This is the actual query you're going to use to pull your filtered set of sales data. Again, we'll use the Expression to allow us to dynamically reference the temporary table name. The prettified version of the expression would look something like
"SELECT
SI.InvoiceID
, SI.SalespersonPersonID
, SO.OrderID
, SOL.StockItemID
, SOL.Quantity
, SOL.OrderLineID
FROM
Sales.Invoices AS SI
INNER JOIN
Sales.Orders AS SO
ON SO.OrderID = SI.OrderID
INNER JOIN
Sales.OrderLines AS SOL
ON SO.OrderID = SOL.OrderID
WHERE
EXISTS (SELECT * FROM " + #[User::TempTableName] + " AS TT WHERE TT.EmployeeID = SI.SalespersonPersonID);"
Again, you should be able to pull the Value from the three queries and run them independently and verify they work.
Execute SQL Task
Add an Execute SQL task to the Control Flow. I named mine SQL Create temporary table My Connection Manager is WWI_OLTP and I changed the SQLSourceType to Variable and the SourceVariable is User::QueryDefineTables
Every time your package runs, the first thing it will do is establish create the temporary table. Which is good because SSIS is a metadata driven ETL engine and the next two steps would fail if the table didn't exist.
Data Flow Task - Prime the pump
This data flow is where we'll transfer DW data back to the OLTP system so can filter in the source system.
Drag a Data Flow Task onto the Control Flow. I named mine DFT Load Temp and before you click into it, right click on the Task and find the DelayValidation property and change this from the default of False to True. Normally, a package validates all metadata before actual execution begins as the idea is you want to know everything is good before any data starts moving. Since we're using temporary tables, we need to tell the execution engine "trust us, it'll be ready"
Double click inside the Data Flow Task.
Add an OLE DB Source. I named mine OLESRC SourceEmployees I use the connection manager WWI_DW. My data access mode changes to SQL command from variable and then I select my variable User::QuerySourceEmployees
Add an OLE DB Destination. I named mine OLEDST TempTableName and double clicked to configure it. The Connection Manager is WWI_OLTP and again, since the table lives in tempdb, we can't select it from the drop down. Change the Data access mode to Table name or view name variable - fast load and then select your variable name User::TempTableName. Click the Mapping tab and ensure source columns map to destination columns.
Data Flow Task - Transfer data
Finally, we will pull our source data, nicely filtered against the data from our target system.
Add an OLE DB Source. I named it OLESRC QuerySales. The Connection Manager is WWI_OLTP. Data access mode again changes to SQL command from variable and the variable name is User::QuerySales
From here, do whatever else you need to do to make the magic happen.
Instead of having 270k rows with an unfiltered query
I have 67k as there are only 3 employees in the temporary table.
Reference package
But wait, there's more!
Close out visual studio, open it back up and try to touch something in the data flows. Suddenly, there are red Xs everywhere! Any time you close a data flow component, it fires a revalidate metadata operation and guess what, it can't do that as the connection to the temporary table is gone.
The package will run fine, it will not throw VS_NEEDSNEWMETADATA but editing/maintenance becomes a pain.
If you switched from global temporary table to local, switch the table name variable's value back to a global and then run the define statement in SSMS. Once that's done, then you can continue editing the package.
I assure you, the local temporary table does work once you have the metadata set and you use queries via variables for source/destination.
No need for the global temporary table hack, or the SET FMTONLY OFF hack (which no longer works).
Just specify the result set metadata in the SQL query with WITH RESULT SETS. eg
EXEC ('
create table #t
(
ID INT,
Name VARCHAR(150),
Number VARCHAR(15)
)
insert into #t (Id, Name, Number)
select object_id, name, 12
from sys.objects
select * from #t
')
WITH RESULT SETS
(
(
ID INT,
Name VARCHAR(150),
Number VARCHAR(15)
)
)
If you need to parameterize the query, there's a bit of a catch because there are some limitations in how SSIS discovers parameters. SSIS runs sp_describe_undeclared_parameters, which doesn't really work with batches that call sp_executesql, because sp_executesql has a very unique way it handles parameters, one which you couldn't replicate with a user stored procedure.
So to parameterize the query you'll either need to pass the parameter values into the query using the "query from variable" and SSIS expressions, or push all this TSQL into a stored procedure.

SSIS: enrich query and table with input file as base

I need to extract data from a DB2 database to a SQL Server. I need to create my query based on a Excel file I have 176 records, which I need to create repeating queries & put in SQL server
So for example;
I have an Excel with a Number, From date, To date, and a Country
So the query should use these information from the records
SELECT *
FROM dbo.Test
WHERE Number = excel.Number1 AND Date BETWEEN excel.fromDate1 AND excel.toDate1 AND Country = excel.country1
And then another query with
SELECT *
FROM dbo.Test
WHERE Number = excel.Number2 AND Date BETWEEN excel.fromDate2 AND excel.toDate2 AND Country = excel.country2
Etc...
How should I do something like this in SSIS?
If needed I can put the DB2 and Excel data in MS SQL
You can proceed with the following approach:
Extract data rows from Excel and put it into SSIS Object Variable
Proceed with a Foreach loop to get each row from the Object Variable, parsing Object Variable to separate variables
Inject variable values into SQL Select command with Expressions
Perform Data Flow task based on SQL command, transform and put it into the target
Overall, your task seems to be feasible, but requires some knowledge on parsing Object Variable in Foreach Loop, and writing Variable Expressions.

SSIS SQL TASK MAX(DATE) to Variable in DATA FLOW

OK this seems like it should be insanely easy, but I cannot figure it out. Every where I look online says to create temp tables and VB scripts and I cannot believe I have to do that. My goal is to insert all the records in a table with a date later than the max date in that destination table.
UPDATE The 2 tables are in two different non linked SQL databases
So:
Select #[User::Dated] = MAX(Dateof) from Table2
Insert into Table2
Select *
From Table1
Where DateOf > #[User::Dated]
I am trying to do this in SSIS. I declared a variable, the SQL execution step looks like it is assigning the single row output to it. But when I got go into the data flow it give me no parameters to choose, when I force the known parameter which is in the project scope it says no parameter exists
Create two OLE DB data sources each pointing at you two databases.
Create a variable called max_date and make its data type String.
Place an Execute SQL Task on the Control Flow, change its connection type to OLE DB and for the connection select the name of the data source that contains Table2. Set the ResultSet to Single Row. Add the following for the SQLStatement:
SELECT CAST(MAX(Dateof) AS VARCHAR) AS max_date FROM Table2
Go to the Result Set pane, click Add and enter the following:
Result Name: max_date
Variable Name: User::max_date
You can now use the max_date variable in an expression to create a SQL statement, for example you could use it in another Execute SQL Task which would use the second Data Connection like so:
"INSERT INTO Table2
SELECT *
FROM Table1
WHERE DateOf > '" + #[User::max_date] + "'"
Or in an OLE DB Source in a data flow like so:
"SELECT *
FROM Table1
WHERE DateOf > '" + #[User::max_date] + "'"
You can do this in a single SQL Task if you want:
Insert into Table2
Select *
From Table1
Where DateOf > (Select MAX(Dateof) from Table2)
If you want to use multiple Execute SQL Task items in the control flow, or want to make use of the parameter in a data flow instead, you have to change the General > Result Set option for your MAX() query to Single Row, then move from General to Result Set and Add a new variable for your result set to occupy.
To use that variable in your INSERT INTO.... query via Execute SQL Task, you'll construct your query with a ? for each parameter and map them in the parameter mapping section. If a variable is used multiple times in a query it's easiest to use a stored procedure, so you can simply pass the relevant parameters in SSIS.

loop over company names ssis 2008

I have tables from navision which names are
dbo.[CompanyName$Cust_ Ledger Entry].[Customer No_]
I have 30 companies . I am looking to loop over companies
I follow this solution https://social.msdn.microsoft.com/forums/sqlserver/en-US/a13d39ab-968b-41f2-bf85-cb46db763d4e/variable-to-dynamically-change-tablename-for-data-flow-task
But how can I use it in the data flow task ? how can I change table names dynamically ?
Ok I can see from your other post that you have attempted things rather than just asking how to do something, so I will give you a hand.
Here is an example of how you can do this:
Here is an example of the SSIS package.
For starters you will need 3 variables:
Two string variables i've called them CompanyName and SQLStatement and an Object Variable which i've called CompanyList.
We first of all need to populate the CompanyList, we do so in the Execute SQL Task.
We need to set the Result Set to be "Full Result Set", then in the SQL Statement you need to add your way of getting the full company names, for instance "Select distinct Name from dbo.company", here we then store these results in the Object variable:
This will now store all the results into the Object Variable.
We now need to setup our SQLStatement Variable. The variable is just a string variable however what we need to do it set it to "EvaluateAsExpression" - "True":
In the expression we need to enter the following:
"Select * from dbo.[" + #[User::CompanyName] + "$Cust_ Ledger Entry].[Customer No_]"
Now this is setup we can get on with the ForeachLoop.
Set the ForEachLoop as above, where we want to evaluate "Rows in first table" against the "CompanyList" object variable.
The next step is to map that to our final variable "CompanyName" which will be used in out SQL Expression:
In the ForEachLoop go to the VariableMapping tab and enter Select the "CompanyName" variable with Index 0, this will change the "CompanyName" variable each time the loop runs, effectively changing the SQL Statement Variable allowing us to loop through each of the companies.
The final step is to go into the DFT and setup the OLEDB Source:
For the OLEDB Source we need to select "SQL command from variable", here we then select the "SQLStatement" variable.
You can then carry on your DFT as you wish.

How to insert retrieved rows into another table using ssis

I have a table and it has 500 rows. I want to retrieve only 10 rows and i want to insert into another table using control flow only. Through data flow task we can use OLEDB source and OLEDB destination. But i want result in such a way that by using execute sql task and for each loop. Is it possible to do in that way? My Idea is, get the set of ten records and and by using foreach loop iterate to every row and insert into the table by using execute sql task. The destination table need to create on the fly. I tried with some approach but not moving towards. Please find the image file.
Example taken from Northwind
Create variables (in variable collection) which represent the columns in the table which u ll create at runtime
Example :-
Customer_ID as string
Order_Id as int
Then u need to create Execute SQL Task and write the below query to select first 10 rows
Select top 10* from orders
Use FullResultSet and in Result Set configuration store the table rows in a variableName :- User::Result ResultName:0
Drop one Execute SQL Task and create a table on fly
IF OBJECT_ID('myOrders') IS not NULL
drop table myOrders
Create table myOrders
(OrderID int,
CustomerID varchar(50)
)
combine the 2 flows from Execute sql task and connect it to the Foreach loop
Drag a foreach loop .In collection use enumerator type as Foreach ADO Enumerator
In enumerator configuration select user::Result variable which stores the top 10 rows from the execute sql task and select the radio button " Rows in the first table"
In variable mapping ,map the column variables which u have created in the first step and the index will 0 for first column and 1 for 2nd column
Drag a execute sql task inside a foreach loop and write the below query :
Insert into myOrders( OrderID,CustomerID)
values
(?,?)
Map the parameters using parameter mapping configuration in execute sql task
VariableName : OrderID Direction : Input DataType=Long ParamterName=0
VariableName : CustomerID Direction : Input DataType=varchar ParamterName=1
I hope you are doing this on a "study-mode". There is no reason why to do this on the control flow over the data flow.
Anyway, your print screen is correct, I would just add another execute sql task in the beginning to create your destination table.
Then, your execute sql task should have the query to bring the 10 rows you want, its result set should be set to "Full result set" and on the resultset tab you should map the result set to a variable like this:
and configure your foreach loop container like this:
on each loop of the foreach you will have access to the values on the variables, then you can use another execute sql task to insert then on the new crated table

Resources