check if file exists then import using script task in SSIS - sql-server

I have a static table that contains names like these:
xy_Jan10
yz_Feb11
xx_March14
by_Aug09
etc.
These names are static and are stored in a table. First I read all of the names into an Object variable. Then a Foreach Loop container iterates over them and saves each name into a string variable called strFileName. Inside the loop, a Script Task is supposed to check whether the file exists, and this is where I have the problem.
For each file name that comes into the variable, I want to check whether that file exists on the network drive. If it exists, I want to load it into my table; if not, I want to move on to the next name in the list, and so on until I have gone through every name in my static list. My issue is that the Script Task stops at the first name with no match, even though many later names do match and never get loaded.
Please note that the files I am loading are SAS files. Here is my Script Task:
Public Sub Main()
    '
    ' Add your code here
    '
    Dim directory As DirectoryInfo = New DirectoryInfo("\\840KLM\DataMart\CPTT\Cannon")
    Dim file As FileInfo() = directory.GetFiles("*.sas7bdat")
    If file.Length > 0 Then
        Dts.Variables("User::var_IsFileExist").Value = True
    Else
        Dts.Variables("User::var_IsFileExist").Value = False
    End If
    Dts.TaskResult = ScriptResults.Success
End Sub

It looks like you need to wrap the script task inside a ForEach loop container. There's plenty of information about how to do this on the web, or even on Stack Overflow: How to loop through Excel files and load them into a database using SSIS package?
or
http://www.sqlis.com/sqlis/post/Looping-over-files-with-the-Foreach-Loop.aspx
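For the per-file check described in the question, a minimal sketch (in C#, whereas the question's script is VB.NET) might test only the name held in the loop variable instead of listing every .sas7bdat file in the folder. It assumes the table stores names without an extension, that the files end in .sas7bdat, and that the variables are User::strFileName and User::var_IsFileExist as in the question. The helper class FileProbe is hypothetical and is not part of SSIS.

```csharp
using System;
using System.IO;

// Hypothetical helper for illustration; not part of SSIS.
public static class FileProbe
{
    // Builds "<folder>\<baseName><extension>" and reports whether that file exists.
    public static bool Exists(string folder, string baseName, string extension)
    {
        if (string.IsNullOrWhiteSpace(folder) || string.IsNullOrWhiteSpace(baseName))
            return false;
        string fullPath = Path.Combine(folder, baseName.Trim() + extension);
        return File.Exists(fullPath);
    }
}

// Assumed wiring inside the Script Task's Main() (not tested against your package):
//   string name = Dts.Variables["User::strFileName"].Value.ToString();
//   Dts.Variables["User::var_IsFileExist"].Value =
//       FileProbe.Exists(@"\\840KLM\DataMart\CPTT\Cannon", name, ".sas7bdat");
//   // Report success even when the file is missing so the loop keeps going.
//   Dts.TaskResult = (int)ScriptResults.Success;
```

If the task always reports success, a precedence constraint from the Script Task to the Data Flow Task with an expression such as @[User::var_IsFileExist] == TRUE should skip the load for a missing file while the loop moves on to the next name.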

Related

How to program SSIS to read a bunch of files sequentially in a directory?

I'm working on an SSIS package where I need to read several different CSV files in order to insert their data into a SQL Server Database.
There will be roughly 500 csv files, all in the same folder. They will have an ordered naming pattern like:
tFile1.csv
tFile2.csv
tFile3.csv
tFile4.csv
tFile5.csv
etc
How can I program SSIS to automatically start with tFile1.csv, then automatically do tFile2.csv then tFile3.csv etc in order?
Try using a Foreach Loop Container.
If the order in which the files are processed matters, note that the Foreach File Enumerator generally processes them in file name order. That order is alphabetical, so tFile10.csv comes before tFile2.csv. If the name order is not the correct processing order, it is probably easier to rename the files than to build a workaround for the processing order.
For example, if you want to process the files in creation date order, rename them with their creation date as a prefix in yyyymmdd format.
This can be done with a combination of a Script Task to obtain and sort the file names and a Foreach Loop to load each file in order. I'm not sure what version you're using, but this worked on SSDT 2017 with no issues for me. Also note that the extension comparison in the script is case sensitive and expects the leading period (i.e. ".csv" in lowercase for your case). More details on this are below.
Create an object-type SSIS variable and if necessary string variables for the file location, prefix, and extension. Also create an empty string variable to use for the Flat File Connection Manager. If you're using an expression for the file location make sure to add a \ after the final folder and an extra \ to escape this, for example "C:\\Your Folder\\File Source Folder\\"
If you haven't already, create a Flat File Connection Manager with an expression for the ConnectionString property. This will update on each iteration of the Foreach Loop. Add an expression that puts the source file location variable together with the current file name.
Then add a Script Task (the example uses C#) on the control flow with the location, prefix, and extension variables in the ReadOnlyVariables field and the object variable in the ReadWriteVariables field. Don't forget to add the references from the using statements in the Script Task as well. More details on the code are in the example.
Add a Foreach Loop with the Foreach ADO Enumerator and the object variable as the ADO Object Source Variable.
On the Variable Mappings pane add the empty string variable for the current file name at Index 0.
Inside the Foreach Loop create a Data Flow Task that loads the necessary destination object from the Flat File Connection Manager.
Flat File Connection Manager Expression:
@[User::FileLocation] + @[User::CurrentFile]
C# Script Task:
using System;
using System.IO;
using System.Data.OleDb;
using System.Data;
using System.Windows.Forms;
using System.Text.RegularExpressions;
//Add these as ReadOnlyVariables in the Script Task
string fileLocation = Dts.Variables["User::FileLocation"].Value.ToString();
string filePrefix = Dts.Variables["User::FilePrefix"].Value.ToString();
//make sure to use the . in ".csv" for the extension
string fileExt = Dts.Variables["User::FileExt"].Value.ToString();
DataTable preSortDT = new DataTable();
OleDbDataAdapter adapter = new OleDbDataAdapter();
preSortDT.Columns.Add("FileName", typeof(string));
preSortDT.Columns.Add("FileNumber", typeof(int));
DirectoryInfo sourceDirectoryInfo = new DirectoryInfo(fileLocation);
foreach (FileInfo fi in sourceDirectoryInfo.EnumerateFiles())
{
    //prefix check ignores case; extension check is case sensitive
    if (fi.Name.ToLower().StartsWith(filePrefix.ToLower()) && fi.Extension == fileExt)
    {
        //regex to get last numeric digits before final . (i.e. .csv)
        int fileNumber = Convert.ToInt32(
            Regex.Match(fi.Name.Substring(0, fi.Name.LastIndexOf('.')), @"\d+$").Value);
        preSortDT.Rows.Add(fi.Name, fileNumber);
    }
}
//create DataView for sort of records
DataView preSortDV = preSortDT.DefaultView;
preSortDV.Sort = "FileNumber asc";
//create final data table to hold sorted records
DataTable postSortDT = preSortDV.ToTable();
DataSet postSortDS = new DataSet();
postSortDS.Tables.Add(postSortDT);
//Add object variable as a ReadWriteVariable and populate via sorted data set
Dts.Variables["User::FileNames"].Value = postSortDS;

SSIS Foreach Loop Container Not Looping through Excel Files

The Foreach Loop container in my process isn't pushing the new filename into the established variable. It loops through the process as many times as there are files that meet the criteria, I just need the file name to be dynamic.
I have created a variable named ExcelFileName that contains the full file path of the first file in my desired directory. It looks something like C:\Somepath\ExcelFile.xlsx
I have also created a variable named ExtProperties, to be used in the ConnectionString, with the value "Excel 12.0;HDR=Yes"
The Foreach Loop Container has the following settings:
The Enumerator is set to the Foreach File Enumerator
The Folder is the directory location of my files
The Files is currently set to *.xlsx
Retrieve file name is set to Fully Qualified
The ExcelFileName variable I mentioned previously has been set at Index 0
I've created an Excel connection manager pointing to the initial file with the following relevant properties:
DelayValidation: True
Expression: I have tried both setting the ExcelFilePath to the ExcelFileName variable and using the following for the ConnectionString:
"Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + @[User::ExcelFileName] + ";Extended Properties=\"" + @[User::ExtProperties] + "\""
Right now it is using only the ConnectionString.
Retain Same Connection: False
The data flow uses an Excel source with the Excel connection manager. Its purpose is to pull the number of records from each Excel file, get the name of the file and the user performing the load, and push that information into the DB. When the rows reach the DB, however, the file name and record count are always those of the first file, repeated once for every file that meets the criteria.
I get no error messages or warnings. I have used the following script in my control flow to see if the value of the variable has been changing, but the message box popping up shows that I still get the initial value.
MessageBox.Show(Dts.Variables["User::ExcelFile"].Value.ToString());
Dts.TaskResult = (int)ScriptResults.Success;
I've been reading threads about this for days and these were the settings that were proposed to work, but this is still an issue for me. Any help would be appreciated.
From the first image, it looks like you have set the variable ExcelFileName to be evaluated as an expression, since the expression mark (fx) is shown on the variable icon.
Remove the expression from the variable and check that its EvaluateAsExpression property is set to False.
In the Foreach Loop editor, select Name and extension instead of Fully qualified,
and check Traverse subfolders if you have subfolders.

How to pick dynamic files from a specific folder and then export to SQL server using SSIS

I have been trying to create an SSIS task that picks up an MS Access file from a specific folder and then exports it to SQL Server (if that file/table is already found on the server, skip it; otherwise export it).
I am new to SSIS. I have used a Script Task to select the file names dynamically and then tried to move them, but I keep getting unsatisfactory results. I have googled and found a few ideas, but I still cannot get it to work the way I want. Any detailed help would be appreciated.
Note: I am not always sure of the file name in that folder (i.e. it is dynamic).
There are many options for dynamically selecting files. Since you're unsure about the filename, I'm assuming this is a parameter or variable. The following is an example of checking a folder from a variable for the given file name and loading it to an SSIS object variable. These files are then loaded into a SQL Server table using the Foreach Loop. You mentioned files as opposed to a single file, so this example assumes that only part of the file name is passed in, such as would be the case if the date/UID was appended to the beginning or end of the file name.
Add a Script Task, with the parameters/variables holding the file and folder name as ReadOnlyVariables and the object variable which will store the file names during execution as a ReadWriteVariable. The code for this is at the end of this post.
The string.IndexOf method is used to check for files containing the given text, with the StringComparison.CurrentCultureIgnoreCase parameter used to make this search case-insensitive. This example uses a variable for the file path and a parameter for the file name (denoted by $Package in the parameter name).
Add a Foreach Loop of the Foreach From Variable Enumerator type. Add the object variable that was populated in the Script Task as the Variable on the Collection page. On the Variable Mappings pane, add a string variable at index 0. This will need to be an empty string variable that will hold the name of each file.
Create a Flat File Connection Manager from an example data file. Make sure that the column names and data types are appropriately configured. To set the file name dynamically, choose the ConnectionString expression (click the ellipsis of the Expression property in the Properties window of the connection manager) and add the same string variable from the Mappings Pane of the Foreach Loop.
Inside the Foreach Loop, add a Data Flow Task with a Flat File Source using the same connection manager. Then add either an OLE DB or SQL Server Destination with your destination connection and connect the flat file source to this. I've found SQL Server Destinations to perform better, but you'll want to verify this in your own environment before making the choice. Choose the necessary table and map the columns from the flat file source accordingly.
using System;
using System.Collections.Generic;
using System.IO;
//body of Main() in the Script Task
List<string> fileList = new List<string>();
//get files from input directory
DirectoryInfo di = new DirectoryInfo(Dts.Variables["User::FilePathVariable"].Value.ToString());
foreach (FileInfo f in di.GetFiles())
{
    //check for files with name containing text (case-insensitive)
    if (f.Name.IndexOf(Dts.Variables["$Package::FileNameParameter"].Value.ToString(), 0, StringComparison.CurrentCultureIgnoreCase) >= 0)
    {
        fileList.Add(f.FullName);
    }
}
//populate object variable
Dts.Variables["User::YourObjectVariable"].Value = fileList;

how to move files to different folders , based on matching filename and foldername in ssis

I have four files: xxxxxxCd999, xxxxCf999, xxxxC999, xxxxD999, and so on. I need to move these files to their respective folders based on the file name. For example, file xxxxxCd999 should be moved to folder Cd999, file xxxxCf999 should be moved to folder Cf999, file xxxC999 should be moved to folder C999, and so on.
How do I achieve this in SSIS?
I have used a Foreach Loop container, assigned some variables for the source path and destination path, and added a File System Task to use these variables, but I am lost now and have no idea how to proceed.
Kindly help me.
Try this:
The Foreach Loop enumerates the source folder and stores each file's path in a variable. In the Script Task, write code that extracts the folder name using a regular expression. The Script Task stores that value in another variable, which is then used by the File System Task.
The package design will be:
Create 3 variables:
Name        DataType  Expression
FolderName  string
DestLoc     string    "D:\\" + @[User::FolderName]
LoopFiles   string
Change the expression for the DestLoc variable to match your location.
Foreach Loop configuration:
Change the source folder location as needed.
Script Task: add the 2 variables (LoopFiles is read and FolderName is written by the code below).
You need to extract the folder name from the variable LoopFiles.
Example:
At runtime the LoopFiles variable will hold a value such as D:\ForLoop\SampleFolder1.txt
To extract the folder name from that value, use a regular expression.
Open Edit Script and write the following code:
using System.Collections.Generic;
using System.IO;
using System.Text.RegularExpressions;
List<string> filePatterns = null;
public void Main()
{
    //folder names to look for inside the file name
    filePatterns = new List<string>();
    filePatterns.Add("Folder1");
    filePatterns.Add("Folder2");
    string fileName = Path.GetFileNameWithoutExtension(Dts.Variables["User::LoopFiles"].Value.ToString());
    //match.Value is the first folder name found in the file name (empty if none)
    Match match = Regex.Match(fileName, string.Join("|", filePatterns.ToArray()));
    Dts.Variables["User::FolderName"].Value = match.Value;
    Dts.TaskResult = (int)ScriptResults.Success;
}
In the above code, you extract the folder name and store it in the variable FolderName. If you have multiple folders, just add the folder names to the filePatterns collection.
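For folder names like the ones in the question (Cd999, Cf999, C999, D999), a suffix comparison may be easier to reason about than a regex alternation. The sketch below is one possible approach, not the answer's method: it returns the longest candidate folder name that the file name ends with, ignoring case. The class FolderMatcher and its Match method are hypothetical names introduced for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical helper for illustration; not part of SSIS.
public static class FolderMatcher
{
    // Returns the longest folder name that the file name (without extension)
    // ends with, or null when no candidate matches. Case is ignored.
    public static string Match(string filePath, IEnumerable<string> folderNames)
    {
        string name = Path.GetFileNameWithoutExtension(filePath);
        return folderNames
            .Where(f => name.EndsWith(f, StringComparison.OrdinalIgnoreCase))
            .OrderByDescending(f => f.Length)   // prefer "Cd999" over "D999"
            .FirstOrDefault();
    }
}

// Assumed wiring inside the Script Task's Main() (variable names from the answer above):
//   string[] folders = { "Cd999", "Cf999", "C999", "D999" };
//   string path = Dts.Variables["User::LoopFiles"].Value.ToString();
//   Dts.Variables["User::FolderName"].Value = FolderMatcher.Match(path, folders) ?? "";
```

Preferring the longest match matters because the comparison ignores case: a file ending in Cd999 also ends in d999, which would otherwise match the D999 folder.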
File System Task Configuration

SSIS retrieving current folder within a foreach loop traversing subfolders

I use SSIS to read .txt input files and run my business logic over them, saving the results in an output file with the same name as the current input file (the file name is stored dynamically in a variable).
When all the files are in the same folder, I have no problem accessing them, since I use the following expression for the flat file connection string in the data flow: "path" + @[User::inputFileName] + ".txt"
Now I have to process a folder with subfolders (I set Traverse subfolders in the Foreach Loop), and I have an issue with the flat file connection string because I cannot use a wildcard such as "my path\\subfolder*" + @[User::inputFileName] + ".txt", where every subfolder name starts the same way and only the last portion changes.
How can I save the current subfolder name in a variable so that I can use it like this? "path\\" + @[User::currentSubFolder] + "\\" + @[User::inputFileName] + ".txt"
I was able to solve my issue, so I am writing my solution here in case someone else is in the same situation.
I used a script block before my foreach loop. From it I can retrieve the current full path (used afterwards in the flat file connection string) and the input file name without its extension, which is used as the name of the output file containing the results of the SSIS scripts.
To keep the values of interest I used 2 variables: one for the file name and one for the path.
Here is the script code:
Public Sub Main()
    'Variable Index 0 => FileName
    'Variable Index 1 => filePath
    Dim fullPath As String = Dts.Variables.Item(1).Value.ToString
    Dim fileName As String = Path.GetFileName(fullPath)
    fileName = fileName.Substring(0, fileName.Length - 4)
    Dts.Variables.Item(0).Value = fileName
    Dim x As String = Dts.Variables.Item(0).Value.ToString
    Dts.TaskResult = Dts.Results.Success
End Sub
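If the subfolder name itself is needed, as asked in the question, it can be derived from the same fully qualified path. The following is a minimal C# sketch rather than the poster's VB code. It assumes the Foreach Loop returns fully qualified paths, and the class PathParts is a hypothetical name introduced for illustration.

```csharp
using System;
using System.IO;

// Hypothetical helper for illustration; not part of SSIS.
public static class PathParts
{
    // Name of the folder that directly contains the file,
    // or an empty string when the path has no directory part.
    public static string ParentFolderName(string fullPath)
    {
        string dir = Path.GetDirectoryName(fullPath);
        return string.IsNullOrEmpty(dir) ? string.Empty : Path.GetFileName(dir);
    }

    // File name without its extension, whatever the extension length.
    public static string BaseName(string fullPath)
    {
        return Path.GetFileNameWithoutExtension(fullPath);
    }
}

// Assumed wiring inside the Script Task's Main(); User::filePath is a hypothetical
// name for the variable mapped to the Foreach Loop's fully qualified path:
//   string fullPath = Dts.Variables["User::filePath"].Value.ToString();
//   Dts.Variables["User::currentSubFolder"].Value = PathParts.ParentFolderName(fullPath);
//   Dts.Variables["User::inputFileName"].Value = PathParts.BaseName(fullPath);
```

Using Path.GetFileNameWithoutExtension avoids the assumption that the extension is always four characters long.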
