I'm a little new to SSIS, and I need to import some flat files into SQL tables with the same structure.
(Assume each table already exists with the same structure as its file, and the table name and flat file name are the same.)
I thought I'd create a generic package (SQL Server 2014) to import all of those files by looping through a folder.
I created a Data Flow Task inside a Foreach Loop Container; in the Data Flow Task I dropped a Flat File Source and an ADO.NET Destination.
I set the file source to a variable so that each iteration of the loop picks up the next file. Similarly, I set the ADO.NET destination's table name to a variable so that each iteration selects a different table according to the file name.
Since the source and destination column names are the same, I assumed it would map the columns automatically.
But it wouldn't let me run the package with no mapping at all, so I added the columns on the source, selected a table, and mapped them.
When I ran the package, I assumed it would automatically re-map everything.
It ran for the first file, but failed on the second, complaining about mapping issues.
Can someone let me know whether this is achievable with some kind of dynamic mapping, or in any other way?
Any help would be much appreciated.
Thanks,
Ned
I've chased my tail for a full 12 hours. Haven't found the right solution.
I'm locked into using SSIS. I have a SQL Server table with full paths and filenames already concatenated. Examples:
\\MydevServer1\C$\ABC\App_Data\Sample.pdf
\\MydevServer2\E$\Garth\App_Data\Morefiles.txt
\\MydevServer3\D$\Paths\App_Data\MySS.xlsx
etc.
I need to read each row of the table, get the path and filename and move that file to a new static destination directory.
The rows in the table will remain unchanged. I only use it as a source to locate the file to be moved.
I've tried:
1) Feeding a result set from an OLE DB Source to a Recordset Destination, then into an Object variable that connects via variable to a Foreach Loop Container holding a File System Task. (Very problematic.)
2) Sending the table rows to a .csv file and reading each line of the CSV with a Foreach Loop Container holding a File System Task.
3) Reading directly from the table rows with a Foreach Loop Container holding a File System Task. (Preferred.)
and many other scenarios.
I have viewed a hundred examples online, but most of them involve loading a table, sending results to flat files, or moving files from one folder to another based on extension type, etc. I haven't found anything on configuring a File System Task to read a table-supplied path and move the file based on that table value as the source.
I'm rambling. :-)
Any insight or help will be appreciated. I'm not new to SSIS, but I sure feel like it right now.
Create two string variables to store the source and destination paths.
Use an Execute SQL Task to populate a Full result set into a variable of Object data type (a query sketch follows below).
Use a Foreach Loop Container (Foreach ADO enumerator) to go through each row of the recordset and set those two variables.
Inside the Foreach Loop Container, use a File System Task. Set IsSourcePathVariable = True and IsDestinationPathVariable = True, point SourceVariable / DestinationVariable at your two path variables, and choose the operation (Copy file, Move file, etc.).
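As a sketch, the Execute SQL Task's query could return the source path and derive the destination path in one go, so the loop only has to assign variables. The table, column, and destination share names here are hypothetical:

    -- dbo.FilePaths holds the concatenated UNC paths from the question.
    -- RIGHT(...) peels the file name off after the last backslash so the
    -- destination can point at the static target folder.
    SELECT FullPath AS SourcePath,
           N'\\StaticServer\Destination\'
               + RIGHT(FullPath, CHARINDEX('\', REVERSE(FullPath)) - 1) AS DestinationPath
    FROM dbo.FilePaths;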
It turns out I'd been chasing my tail because of the error "Source is empty".
This was caused by a blank first row in my recordset. I was searching for a fix to the "Object variable is empty" issue, when in reality the problem was that the Object variable couldn't find data right off the bat.
Insert shameful smug here.
Thanks to Anton for the help.
I keep running into issues creating an SSIS project that does the following:
inspect a folder for .csv files -> for each CSV file -> insert into [db].[each CSV file's name]
Each CSV and its corresponding table in the database has its own unique columns.
I've tried the foreach loop found in many write-ups, but the issue comes down to the flat file connection: it expects each CSV file to have the same columns as the file before it, and errors out when the column names differ.
Is anyone aware of a workaround for this?
Every flat file format has to have its own connection, because the connection is what tells SSIS how to interpret the data set contained in the file. If it didn't exist, it would be the same as telling SQL Server you want data out of a database without specifying a table or its columns.
The thing you have to consider is how you are going to tell a data flow task which column in a source component maps to which column in a destination component. Will it always be the same column name? Without a connection manager there is no way to map the columns unless you do it dynamically.
There are still a few ways to do what you want; you just need to search around, because I know there are answers on this subject:
You could create a Script Task and do the import in .NET.
You could use an Execute SQL Task with BULK INSERT or OPENROWSET into a temporary staging table, and then use dynamic SQL to map and import into the final table.
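For illustration, a minimal sketch of that staging approach, assuming the file name matches the target table name and both values are passed in from the package (all object names here are hypothetical):

    -- Values supplied by the package, e.g. as Execute SQL Task parameters.
    DECLARE @FilePath  nvarchar(260) = N'\\Server\Share\Customers.csv';
    DECLARE @TableName sysname       = N'Customers';

    -- Build and run a dynamic BULK INSERT; QUOTENAME guards the table name
    -- and the doubled-up quotes guard the file path.
    DECLARE @sql nvarchar(max) =
        N'BULK INSERT ' + QUOTENAME(@TableName) +
        N' FROM ''' + REPLACE(@FilePath, '''', '''''') + N'''' +
        N' WITH (FIELDTERMINATOR = '','', ROWTERMINATOR = ''\n'', FIRSTROW = 2);';
    EXEC sys.sp_executesql @sql;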
Try keeping a mapping table with the columns below (a sketch follows after these steps):
FileLocation
FileName
TableName
Add all the file details to this table.
Create user variables for each of the column names, plus one for the result set.
Read the data from the table with an Execute SQL Task and store it in the single result set variable (Full result set, Object data type).
In the Foreach Loop Container's variable mappings, map all the columns to the user variables.
Create two Connection Managers, one for Excel and the other for the CSV file.
Set the CSV connection manager's ConnectionString expression to @[User::FileLocation] + @[User::FileName].
Inside the Foreach Loop Container, use a Bulk Insert Task, assign the source and destination connections, and supply the table name from the @[User::TableName] variable.
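A sketch of what the mapping table and the Execute SQL Task's query might look like (object names are hypothetical):

    -- Mapping table that drives the loop.
    CREATE TABLE dbo.FileMapping (
        FileLocation nvarchar(260) NOT NULL,
        FileName     nvarchar(128) NOT NULL,
        TableName    sysname       NOT NULL
    );

    INSERT dbo.FileMapping (FileLocation, FileName, TableName)
    VALUES (N'\\Server\Share\', N'Customers.csv', N'Customers');

    -- Query for the Execute SQL Task (Full result set).
    SELECT FileLocation, FileName, TableName
    FROM dbo.FileMapping;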
If you need any details, please post and I'll try to help.
You could look into BimlScript, which dynamically creates and executes packages based on available metadata.
I have two options for you here:
1) A Script Component, to dynamically create the table structures in SQL Server.
2) Within a Foreach Loop Container, an Execute SQL Task using an OPENROWSET query.
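A hedged sketch of option 2, assuming ad hoc distributed queries are enabled and a format file describes each CSV's layout (all names are hypothetical):

    -- OPENROWSET(BULK ...) reads the file server-side; the format file tells
    -- SQL Server how to parse this particular CSV's columns.
    INSERT INTO dbo.TargetTable
    SELECT src.*
    FROM OPENROWSET(BULK N'\\Server\Share\file.csv',
                    FORMATFILE = N'\\Server\Share\file.fmt',
                    FIRSTROW = 2) AS src;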
I receive files in a shared location. Every file has different metadata, i.e. file name and date created.
I have to extract the data using SSIS if and only if the file content is different from previously processed files.
This should be fairly straightforward:
Use a Foreach Loop Container configured with the Foreach File enumerator. The folder would be the shared location, and the file name a wildcard (for example, *.csv).
Create a table in SQL Server called LoadedFiles to hold the names of the files already loaded. Note that when you created the Foreach container you will also have created a variable that holds the file name. Inside the container, check whether the value in this variable already exists in the LoadedFiles table, and only load the file if it doesn't.
I've assumed that all the files have the same metadata (column names and data types). Even if they do not, you can employ the same logic.
Also, if it isn't obvious: for this to work you need to insert a new row into the LoadedFiles table every time you do decide to load a file.
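A minimal sketch of that bookkeeping, assuming an ADO.NET connection so named parameters work, with the loop's file-name variable mapped to @FileName (names are hypothetical):

    -- Table of files already processed.
    CREATE TABLE dbo.LoadedFiles (
        FileName nvarchar(260) NOT NULL PRIMARY KEY
    );

    -- Run once per iteration; capture AlreadyLoaded into an SSIS variable
    -- and use it in a precedence constraint or expression.
    SELECT COUNT(*) AS AlreadyLoaded
    FROM dbo.LoadedFiles
    WHERE FileName = @FileName;

    -- After a successful load, record the file.
    INSERT dbo.LoadedFiles (FileName) VALUES (@FileName);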
EDIT: It seems the same file name does not equate to the same content for the OP. In that case, he should just do a MERGE on the SQL table instead of a blind insert:
MERGE on the primary key: if matched, do nothing; else, INSERT.
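A hedged sketch of that MERGE, with hypothetical table and key names:

    -- Insert only rows whose primary key is not already present; matched
    -- rows are left untouched because the WHEN MATCHED clause is omitted.
    MERGE dbo.TargetTable AS t
    USING dbo.StagingTable AS s
        ON t.Id = s.Id
    WHEN NOT MATCHED BY TARGET THEN
        INSERT (Id, Col1, Col2)
        VALUES (s.Id, s.Col1, s.Col2);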
I found a workaround: an SSIS Execute Process Task calling FC.exe, the Windows file-compare utility:
http://www.howtogeek.com/206123/how-to-use-fc-file-compare-from-the-windows-command-prompt/
I have been searching for about a week now and I was wondering if anyone may have a clue. I wrote a package to do the following:
Loop through a parent folder and its subfolders for CSVs with a particular naming structure (works).
Create a table for each .csv based on the enumeration of each file (works).
Import the data into SQL Server, each file into the table named after it, via an OLE DB Destination (which does not work). It works when there is a single fixed destination table for everything, but when I use a table-name variable it does not.
What I did was add an Execute SQL Task to the foreach container to create each table, with a variable for the file path mapped as an expression in the foreach container, in a CREATE TABLE query under the SqlStatementSource expression property. The tables are created, but when I use the variable that was mapped in the foreach loop as the table name (or variable) in the OLE DB Destination, I get an error asking me to check whether the table exists. The tables are created, yet I cannot get the data inserted into them, even when I bypass the "Destination table has not been provided" error and run the package. I set DelayValidation to True and still nothing. From what I have seen so far, SSIS does some cool things; however, I am stuck right now. What else am I doing wrong?
I forgot to mention that the data is going to SQL Server.
Thanks for everything.
You can't create an OLE DB Destination at design time with a variable for the table name. The OLE DB Destination needs to know the table name and the columns so that it can pre-map the data flow to the table columns.
You have a couple of other options:
You can use Biml to dynamically create your data flows and destinations.
You can use an OLE DB Command transformation as your data flow destination and write a dynamic SQL statement that inserts each row in the data flow into the desired table.
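If per-row SQL proves too slow, one variant (assuming the files share a column layout) is to stage the rows through a fixed destination and run a single dynamic INSERT per file; a sketch with hypothetical object names, the table name coming from the loop variable:

    -- @TableName would be set from the Foreach loop's file-name variable.
    DECLARE @TableName sysname = N'MyCsvFileName';
    DECLARE @sql nvarchar(max) =
        N'INSERT INTO ' + QUOTENAME(@TableName) +
        N' SELECT * FROM dbo.Staging;';
    EXEC sys.sp_executesql @sql;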
How do I use SSIS to iterate over the image files in a directory and, using the file name, run a query to insert each image into SQL Server?
I realise that with a Foreach File enumerator I can loop over the files and get the file name into a variable. How do I use this variable to run a query that finds the record for that file name in my table and then imports the image from disk into my SQL Server image-type column?
Once I have the file in my database, I will delete it from disk.
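For reference, the query half of this can be sketched in T-SQL with OPENROWSET ... SINGLE_BLOB, run from an Execute SQL Task with the path and file name mapped in from the loop variable (all names here are hypothetical; the answer below takes a data-flow route instead):

    -- OPENROWSET(BULK ...) requires a literal path, hence the dynamic SQL.
    DECLARE @FilePath nvarchar(260) = N'\\Server\Share\pic.png';
    DECLARE @FileName nvarchar(260) = N'pic.png';
    DECLARE @sql nvarchar(max) =
        N'UPDATE dbo.ImageStore
          SET ImageData = (SELECT BulkColumn
                           FROM OPENROWSET(BULK ''' + REPLACE(@FilePath, '''', '''''') +
        N''', SINGLE_BLOB) AS f)
          WHERE FileName = @FileName;';
    EXEC sys.sp_executesql @sql, N'@FileName nvarchar(260)', @FileName = @FileName;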
If I'm understanding the problem correctly, you would like to sweep all the files in some location into SQL Server using SSIS?
Data Flow Task
Your Data Flow Task will be responsible for the actual import of files into the database. The approach is the same as outlined in "Import varbinary data" (the pretty-picture version is at "insert XML file in SQL via SSIS").
Your source will be a Script Component operating as a source. Its job will be to add all the file names into the data flow. Change the filter in the second link to *.png (or whatever your filter is) and it should work.
Use the Import Column transformation on the generated file names. This adds the file pointer into the data flow so the contents can be imported into the database. You will need to ensure the data type is DT_IMAGE; even if you're using varbinary(max)/varchar(max)/nvarchar(max), it's all DT_IMAGE within the context of the pipeline's metadata.
Route all of that data into your target table and you will have imported your file data.
File cleanup
At this point, you have imported all this data and now you want to remove the files from disk. Assuming you stored the file name in the database along with the image bits, I'd use an Execute SQL Task to retrieve the list of file names. Change the output type from None to Full Result Set and store that into a variable of type Object.
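As a sketch, assuming the file path was stored alongside the image column (names are hypothetical), the task's query could be as simple as:

    -- Full result set feeding the Foreach (ADO) enumerator described below.
    SELECT FilePath
    FROM dbo.ImageStore
    WHERE ImageData IS NOT NULL;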
Connect a Foreach Loop Container to the output of the SQL task; inside it you'll want to "shred" the results. Google that term and you'll find a variety of blog posts and previous SO questions on how to do it. The end result is that each file name is pulled from the recordset object and assigned to a local variable.
Inside the Foreach Loop Container, use a File System Task to delete the file referenced in the variable set by the enumerator.