How to import files week by week using SSIS? - sql-server

I want to load files into a SQL Server database on a weekly basis. Each file name contains a date. Currently, I use a Foreach Loop Container to get each file name and store it in a table. The table has three columns: FileName, Date and Week. After loading FileName, I use an Execute SQL Task to extract the date and week from the file name and populate the Date and Week columns. Then I use another Execute SQL Task to SELECT all table data ORDER BY Date and Week and store it in an Object variable. Finally, I use a Foreach Loop Container with the ADO enumerator and the Object variable to load the actual files in date order. This works fine. However, I want to load the files week by week. For example, all the files that have week 15 in the table should be loaded first, then all the files for week 16, and so on. The reason is that after loading one week of files I want to process them with a stored procedure.

I think the problem can be solved by making two changes.
Loop over weeks:
1. Add an Execute SQL Task that retrieves the distinct weeks from the table.
2. Add a Foreach Loop Container to loop over the weeks.
3. Inside that loop, add an Execute SQL Task that retrieves the rows for the current week.
4. Use another Foreach Loop Container to loop over the result.
Ordered results:
You can simply add an ORDER BY clause to the query in the Execute SQL Task to get an ordered result set.
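The nested loop described above can be sketched outside SSIS to make the control flow concrete. This is only an illustration in Python with an in-memory style SQLite connection, not the SSIS implementation: the table name FileLog and the callbacks load_file and process_week are hypothetical stand-ins for the tracking table, the data flow, and the stored procedure call.

```python
import sqlite3


def load_by_week(conn, load_file, process_week):
    """Load files one week at a time, then run the per-week step.

    FileLog is a hypothetical name for the table with the
    FileName, Date and Week columns described in the question.
    """
    # Outer loop: the distinct weeks, in ascending order.
    weeks = [row[0] for row in conn.execute(
        "SELECT DISTINCT Week FROM FileLog ORDER BY Week")]
    for week in weeks:
        # Inner loop: only the files of the current week, in date order.
        files = conn.execute(
            "SELECT FileName FROM FileLog WHERE Week = ? ORDER BY Date",
            (week,)).fetchall()
        for (file_name,) in files:
            load_file(file_name)
        # Runs once per week, after all of that week's files are loaded.
        process_week(week)
```

In the package itself, the outer query feeds the first Foreach Loop Container, the inner query is parameterized with the current week, and the stored procedure call sits after the inner loop but still inside the outer one.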

This is a limitation of the Foreach Loop enumerators: there is no way to load files in a sorted/ordered manner. If you want to load files in order, there are two ways to do it:
1. Purchase an expensive package of components from a third-party vendor that provides a Foreach Loop enumerator that can process files in sorted order.
2. Do it yourself manually.
For option two, you will need to perform the following steps:
1. Create a Foreach Loop with the File enumerator to scan the folder for all files and insert the file names into a database table.
2. Create an Execute SQL Task that SELECTs all file names, ordered by file name. You can add constraints in the WHERE clause to control the date range of files that you want to process.
3. Load the result set into a variable of type Object.
4. Create a Foreach Loop with the ADO enumerator to loop through each file name stored in the Object variable.
5. Place a data flow in the loop and process the files.
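Ordering by file name only matches date order when the date stamp in the name sorts lexically, so it can help to sort on the parsed date instead. The sketch below assumes, purely for illustration, that each name carries an eight-digit YYYYMMDD stamp (for example Sales_20190408.csv); the question does not state the actual format, and the function names are hypothetical.

```python
import re
from datetime import datetime

# Assumed format: an eight-digit YYYYMMDD stamp somewhere in the name.
DATE_IN_NAME = re.compile(r"(\d{8})")


def file_date(file_name):
    """Return the date embedded in the file name."""
    match = DATE_IN_NAME.search(file_name)
    if match is None:
        raise ValueError(f"no date stamp in {file_name!r}")
    return datetime.strptime(match.group(1), "%Y%m%d").date()


def files_in_order(file_names, start=None, end=None):
    """Sort names by embedded date, optionally limited to a date range."""
    dated = sorted((file_date(name), name) for name in file_names)
    return [name for day, name in dated
            if (start is None or day >= start)
            and (end is None or day <= end)]
```

The start and end arguments play the role of the WHERE clause constraints, and file_date(name).isocalendar()[1] gives an ISO week number if a Week column is needed.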

Related

Dynamically choose which SSIS packages to launch

Because each SSIS load involves a different set of tables, I want to implement a routine that launches only the packages whose files are present.
I have a task that uploads file listing from a folder to a database table:
[dbo].[FileList]
Product.csv
Sales_2018.csv
Customer.csv
Delivery.csv
If my SSIS project, besides the Product, Sales, Customer and Delivery packages, also has Shipping, Returns and others, is it possible to disable those automatically based on a FileList match, so that only the Product, Sales, Customer and Delivery packages run?
Or should it be approached in a different way?
Thank you!
I've done this in the past using this simple control flow.
Note the 2nd for each only loops through one time. It is just a check to see if the one file exists without a script task.
A few more notes:
1. Store execute sql results into Object Variable
2. Outer foreach is on ADO Object (variable from step 1)
2a. Map the current iteration of the object to local variables
3. Inner foreach is on file based on local variable from step 2
4. Package expression is based on local variable from step 2
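The matching logic behind this control flow can be sketched in a few lines. The rule used here is an assumption made for illustration: a package runs when some file name starts with the package name, ignoring case, so that Sales_2018.csv selects the Sales package. The function name is hypothetical.

```python
from pathlib import PurePath


def packages_to_run(package_names, file_names):
    """Return the packages that have at least one matching file.

    Assumed rule: the file name (without extension) starts with
    the package name, compared case-insensitively.
    """
    stems = [PurePath(name).stem.lower() for name in file_names]
    return [package for package in package_names
            if any(stem.startswith(package.lower()) for stem in stems)]
```

With the FileList contents from the question, only Product, Sales, Customer and Delivery are returned. A plain prefix match is loose (a Sales package would also match a file named SalesReturns.csv), so a stricter pattern may be needed in practice.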

How to Unpivot Columns dynamically in foreach loop in ssis

I am using a Foreach Loop to dynamically load files into a database, but my source files store the data in year- and month-wise columns. When 2020 data arrives, this will cause a problem. I want to unpivot all of those columns so that, if 2020 and 2021 data comes in, the sales values go into one column and the month and year into separate columns.
With the Foreach Loop I can read all the files, but I want to unpivot them dynamically and store the result in the database.
If anyone has an idea how to resolve this, or which process I should follow, please let me know.
I am using a Foreach Loop in SSIS to loop through all the files in the folder. But when I try to unpivot the files, I get an error because the year-wise columns change.
The folder contains files for different years. How can I unpivot them dynamically?
These are my source files; I want to unpivot all of them dynamically in SSIS.

Delete old backup files in pentaho etl tool

I want to know how to delete files based on their creation date using a Kettle job. I have a log folder that contains log files from the last four years, but I want to keep only the last week's log files. The job should delete all the log files that are more than one month old. There is a delete file option in a Pentaho job, but how do we get the file creation date and delete the files accordingly?
Step by step process I used to create kettle:
Get file name
Get system info
Add constants
Database lookup: I am using PostgreSQL here. It looks up the entity_name and attribute_name fields from the database, and the date is inserted into the database through this lookup.
Select values:
Calculator
Filter rows
Set files in result
Process files with option delete.
I also want to ask: I have file names such as abcd_2018_06_05.backup.
I have to use a hard-coded regular expression to match this file name. Could anybody help me define it so that it takes right(file_name, len(file_name)-7)?
I know how to do this in a SQL query, but not in Pentaho.
The get filenames step also returns the last modified timestamp. Can’t you use that instead?
Something like this:
Get filenames -> Get system info (to get the current date) -> Calculator (subtract 7 days from the current date) -> Filter rows (let only files older than 7 days through) -> Process files: delete (delete the old files).
Alternatively, using the regex step you can parse the filename and then filter rows.
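For the regex route, the parse-then-filter logic can be sketched as follows. This is plain Python rather than a Kettle transformation, and the function names are hypothetical. The pattern is written for names shaped like the abcd_2018_06_05.backup example and uses only common regex syntax, so something similar should be usable in the regex step, though that is not verified here.

```python
import re
from datetime import date, timedelta

# Matches the trailing _YYYY_MM_DD.backup part of the file name.
STAMP = re.compile(r"_(\d{4})_(\d{2})_(\d{2})\.backup$")


def backup_date(file_name):
    """Return the date encoded in the name, or None if it does not match."""
    match = STAMP.search(file_name)
    if match is None:
        return None
    year, month, day = (int(part) for part in match.groups())
    return date(year, month, day)


def files_to_delete(file_names, today, keep_days=7):
    """Return the names whose embedded date is older than the cutoff."""
    cutoff = today - timedelta(days=keep_days)
    old = []
    for name in file_names:
        stamp = backup_date(name)
        # Names without a recognizable date are never selected.
        if stamp is not None and stamp < cutoff:
            old.append(name)
    return old
```

Passing today explicitly mirrors the Get system info step and keeps the filter easy to test; keep_days corresponds to the value subtracted in the Calculator step.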

For Each Loop SSIS. Dependent on SQL Query

I have an SSIS package that checks a tracking table for an unprocessed file and then processes it. Until now only one file would come in at a time, so the process was designed accordingly.
However, multiple files can now arrive at once. We store them in the tracking table, which has a column that keeps track of the unprocessed files.
I am trying to use the Foreach Loop to process all the unprocessed files. I get the count of the unprocessed files and would like to simply run Point 1 by passing a parameter to step 1, but I have not been successful in doing so with the Foreach From Variable Enumerator. Am I missing something?
You can do this using the following steps:
Add an Execute SQL Task to get unprocessed files and store the resultset inside a variable of type System.Object
Add a Foreach loop container, change the type to ADO enumerator and select the variable as source
In the variable mapping tab map the result (each file path) to a variable of type string
Inside the foreach loop container add a dataflow task that contains the Flat File source and implement the processing logic you need
Add a flat file connection manager and define the columns
Click on the flat file connection manager, press F4 to show the Properties tab, and go to Expressions
Select the ConnectionString property and use the variable that holds the file path as the expression
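The shape of this loop, stripped of SSIS specifics, is sketched below. The table and column names (FileTracking, FilePath, IsProcessed) and the process_file callback are hypothetical, and the UPDATE that flags each file as processed is an assumption about how the tracking column is maintained.

```python
import sqlite3


def process_unprocessed(conn, process_file):
    """Process every unprocessed file listed in the tracking table.

    FileTracking, FilePath and IsProcessed are hypothetical names.
    Returns the number of files handled.
    """
    # Equivalent of the Execute SQL Task filling the Object variable.
    rows = conn.execute(
        "SELECT FilePath FROM FileTracking "
        "WHERE IsProcessed = 0 ORDER BY FilePath").fetchall()
    # Equivalent of the Foreach Loop with the ADO enumerator.
    for (file_path,) in rows:
        process_file(file_path)
        # Assumption: the file is flagged once it has been processed.
        conn.execute(
            "UPDATE FileTracking SET IsProcessed = 1 WHERE FilePath = ?",
            (file_path,))
    conn.commit()
    return len(rows)
```

The loop iterates over the rows themselves rather than over a count, which is why the ADO enumerator is used here instead of the From Variable enumerator.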
Detailed articles
Implementing Foreach Looping Logic in SSIS
Looping Through a Result Set with the ForEach Loop
Using SSIS to Loop Over Result Set and Dynamically Generate Output Files
How to loop through full result set using foreach container in SSIS

How to automate export of multiple tables to delimited text in SSIS?

I am trying to create some sort of automation whereby I can generate a series of pipe-delimited text extracts for about 100 different tables each month. Each extract would be based on a simple query like this:
SELECT *
FROM tablename
WHERE AsOfDate = 'currentmonth'
where both tablename and currentmonth would be variables. The tablename variable would change for each table, but currentmonth would remain the same throughout the execution.
I have been attempting to build an SSIS package that uses a ForEach Loop container that runs through a list of all the table names and passes that variable into a SQL string, which is then used by the OLE DB Data source in the data flow.
However, all of these tables have different columns. Based on what I can tell, it would not be feasible to do a simple OLE DB Source to a Flat File Destination within that loop container since the Flat File Connection Manager must be configured to account for the different columns of each table.
Would there be any feasible way to do this outside of configuring the process manually for each of the 100+ tables?
You could look into Biml, which programmatically creates your data flows based on metadata.
Or you could use a Script task that loops through the tables, loops through their columns, and generates text files instead of using any dataflow at all.
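The Script Task idea amounts to reading the column list from each query result instead of configuring it up front. The sketch below shows that in Python against SQLite rather than in an SSIS Script Task; the function name is hypothetical, and the catalog check against sqlite_master is specific to SQLite (another database would need its own catalog query).

```python
import csv
import io
import sqlite3


def export_table(conn, table_name, as_of_date, out):
    """Write one table's rows for a given AsOfDate as pipe-delimited text."""
    # A table name cannot be bound as a query parameter, so check it
    # against the catalog before putting it into the SQL text.
    known = {row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'")}
    if table_name not in known:
        raise ValueError(f"unknown table: {table_name!r}")
    cursor = conn.execute(
        f'SELECT * FROM "{table_name}" WHERE AsOfDate = ?', (as_of_date,))
    writer = csv.writer(out, delimiter="|", lineterminator="\n")
    # Column names come from the result set, so each table can differ.
    writer.writerow([column[0] for column in cursor.description])
    writer.writerows(cursor)
```

Calling export_table once per table name, each time with a freshly opened output file, covers the whole list without any per-table configuration.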
