SSIS Package with XML Configuration, Dynamic File Location Source - sql-server

I have an SSIS package that picks up a file from a directory and imports it into a database, pretty straightforward stuff. The only issue is that the input file is named after the current date, e.g. \path\to\file\filename_010115.txt, where 010115 stands for January 1, 2015.
I was calling this package with a custom bat file that set the connection manager for a Flat File Source to the current date-formatted filename and passed it into dtexec.exe directly; however, our environment demands that we use XML configuration files and not bat files. Is there a simple way to set the file source "prefix" in the XML and append the date-formatted filename before attempting to pick up the file?

Have you considered rearchitecting your approach? You're not going to be able to gracefully do this with a Configuration, be it XML, table, environment, or registry key. In the simplest case, a table, a process would need to update that table with the current date before you could even start your SSIS package. A scheduling tool like SQL Agent can run SQL commands. If you're going the XML route though, you're looking at a tiny little app or PowerShell command to modify your configuration file. Again, that has to modify your file every day, and before the SSIS package begins, because SSIS reads configuration values once at the start and never consults the configuration source again.
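As a rough illustration of that "tiny little app", here is a minimal sketch in Python (run outside SSIS, before the package starts). It assumes the date pattern is MMddyy, which the 010115 example does not actually disambiguate from ddMMyy, and it assumes a connection manager named SourceFile; both are placeholders, as is the sample configuration content.

```python
import datetime as dt
import xml.etree.ElementTree as ET

# Hypothetical dtsConfig content. The connection manager name "SourceFile"
# and the folder are placeholders, not taken from the original package.
SAMPLE_CONFIG = r"""<DTSConfiguration>
  <Configuration ConfiguredType="Property"
                 Path="\Package.Connections[SourceFile].Properties[ConnectionString]"
                 ValueType="String">
    <ConfiguredValue>\path\to\file\filename_010115.txt</ConfiguredValue>
  </Configuration>
</DTSConfiguration>"""


def dated_file_name(prefix: str, day: dt.date) -> str:
    """Build prefix + MMddyy + .txt (the MMddyy order is an assumption)."""
    return f"{prefix}{day:%m%d%y}.txt"


def set_connection_string(config_xml: str, property_path: str, value: str) -> str:
    """Return the config XML with the matching ConfiguredValue replaced."""
    root = ET.fromstring(config_xml)
    for config in root.iter("Configuration"):
        if config.get("Path") == property_path:
            config.find("ConfiguredValue").text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    path = r"\Package.Connections[SourceFile].Properties[ConnectionString]"
    name = dated_file_name("\\path\\to\\file\\filename_", dt.date.today())
    print(set_connection_string(SAMPLE_CONFIG, path, name))
```

A scheduler would have to run something like this every day ahead of the package, which is exactly the fragility described above.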
You could use an Expression in SSIS to use the current date as part of your flat file source's connection string but in the event you need to process more than one day's file or reprocess yesterday's file, you're humped and either have to manually rename source files or change the system clock and nobody's going to do that.
A more canonical approach would be to use a Foreach (file) Loop Container. Point it at the source folder and find all the file(s) that match your pattern. I'm assuming here you move processed files out of the same folder. The Foreach Container finds all matching files and enumerates through the list popping the current one into whatever variable you've chosen.
See a working example on
https://stackoverflow.com/a/19957728/181965
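The enumerate-and-move logic of the Foreach (file) Loop can be sketched outside SSIS as follows. This is only an illustration in Python, not how the container is implemented; the function names, the "processed" folder, and the load callback standing in for the Data Flow Task are all hypothetical.

```python
import fnmatch
import shutil
import tempfile
from pathlib import Path
from typing import Callable, List


def matching_files(folder: Path, pattern: str) -> List[Path]:
    """List files in folder whose names match a wildcard pattern, sorted by name."""
    return sorted(p for p in folder.iterdir()
                  if p.is_file() and fnmatch.fnmatch(p.name, pattern))


def process_all(source: Path, processed: Path, pattern: str,
                load: Callable[[Path], None]) -> List[str]:
    """Call load(path) for every match, then move the file out of the source folder."""
    processed.mkdir(parents=True, exist_ok=True)
    handled = []
    for path in matching_files(source, pattern):
        load(path)  # stands in for the Data Flow Task that imports the file
        shutil.move(str(path), str(processed / path.name))
        handled.append(path.name)
    return handled


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        source = Path(tmp) / "incoming"
        source.mkdir()
        (source / "filename_010115.txt").write_text("demo")
        (source / "filename_010215.txt").write_text("demo")
        names = process_all(source, Path(tmp) / "processed",
                            "filename_*.txt", lambda p: None)
        print(names)
```

Because the loop works from whatever files are present rather than from today's date, reprocessing yesterday's file only requires dropping it back into the source folder.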

Related

SSIS File System Task Doesn't Work but Reports Success

I created an SSIS package that extracts two files from a .zip file, imports data from them and then attempts to delete the files that were extracted.
The package works, data is imported and all tasks report success. However the File System Tasks that attempt to delete the files don't delete them, even though they report success.
I use the CozyRoc Zip Task to extract the files. When I remove the Zip Task, the File System Tasks actually do delete the files. I'm not certain that CozyRoc is causing the problem, but the existence of that task may be causing other issues with the package.
Can anyone help me figure out how to reliably delete the files?
Do I need to put in some sort of pause after the Data Flow Tasks to allow them to release whatever locks they might have on the files?
Is there a way to view the DOS commands that the File System tasks use at run time, to verify that they are actually attempting to delete the correct files?
Thank You,
Robbie
Control Flow:
Details:
Visual Studio 2019 v16.11.3
File Names are from Flat File Connection Managers (See image below).
Flat File Connection Managers use Expressions to set their connection strings.
The same connection managers are used to import the data, so I presume that they accurately refer to the correct files and their correct locations.
File System Task Editor for one of the delete tasks:

Read file in SSIS Project into a variable

My SSIS projects tend to run queries that require changes as they move between environments, like the table schema might change or a value in the Where clause. I've always either put my SQL into a Project Parameter, which is hard to edit since formatting is lost, or just put it directly into the Execute SQL Task/Data Flow Source then manually edited it between migrations which is also not ideal.
I was wondering, though: if I added my SQL scripts as files within the project, could they be read back in? For example, if I put in a query like this:
select id, name from %schema%.tablename
I'd like to read this into a variable then it's easy to use an expression as I do with Project Parameters to replace %schema% with the appropriate value. Then the .sql files within the project can be edited with little effort or even tested through an Execute SQL Task that's disabled/removed before the project goes into the deployment flow. But I've been unable to find how to read in a file using a relative path within the project. Also I'm not even sure these get deployed to the SSIS Server.
Thanks for any insight.
I've added a text file query.sql to an SSIS (SQL 2017) Project in Visual Studio, but I've found no way to pull the contents of query.sql into a variable.
Native tooling approach
For an Execute SQL Task, there's an option to source your query directly from a file.
Set your SQLSourceType to File Connection and then specify a file connection manager in the FileConnection section.
Do be aware that while this is handy, it's also ripe for someone escalating their permissions. If I had access to the file the SSIS package is looking for, I can add a drop database, create a new user and give them SA rights, etc - anything the account that runs the SSIS package can do, a nefarious person could exploit.
Roll your own approach
If you're adamant about reading the file yourself, add two Variables to your SSIS package and supply values like the following
User::QueryPath -> String -> C:\path\to\file.sql
User::QueryActual -> String -> SELECT 1;
Add a Script Task to the package. Specify as a ReadOnly variable User::QueryPath and specify as a ReadWrite variable User::QueryActual
Within the Main you'd need code like the following
// Read the file at User::QueryPath and store its full text in User::QueryActual
string filePath = this.Dts.Variables["User::QueryPath"].Value.ToString();
this.Dts.Variables["User::QueryActual"].Value = System.IO.File.ReadAllText(filePath);
The meat of the matter is System.IO.File.ReadAllText. Note that this doesn't handle checking whether the file exists, you have permission to access, etc. It's just a barebones read of a file (and also open to the same injection challenges as the above method - just this way you own maintaining it versus the fine engineers at Microsoft)
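To show the checks that the barebones read leaves out, plus the %schema% substitution from the question, here is a minimal sketch in Python rather than in a Script Task. The load_query name and the identifier rule (letters, digits, and underscores only) are assumptions for illustration; a stricter or looser rule may suit your schemas.

```python
import re
import tempfile
from pathlib import Path

# Conservative rule for a schema name: letters, digits, underscore (an assumption).
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def load_query(query_path: Path, schema: str) -> str:
    """Read a .sql file and replace the %schema% token with a validated name."""
    if not query_path.is_file():
        raise FileNotFoundError(f"Query file not found: {query_path}")
    if not _IDENTIFIER.fullmatch(schema):
        # Refuse anything that could smuggle extra SQL into the statement.
        raise ValueError(f"Unexpected schema name: {schema!r}")
    template = query_path.read_text(encoding="utf-8")
    return template.replace("%schema%", schema)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        query_file = Path(tmp) / "query.sql"
        query_file.write_text("select id, name from %schema%.tablename",
                              encoding="utf-8")
        print(load_query(query_file, "dbo"))
```

Validating the substituted value narrows, but does not remove, the injection risk: anyone who can edit the .sql file itself can still change what runs.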
You can also build your query from a Variable and a Parameter.
For example:
Parameter A: dbo
Build variable A (string type) with the expression: "Select * FROM server.DB." + ParameterA + ".Table"
Then, if you need to change the schema, changing Parameter A alone gives you the corresponding query in variable A.

Change the file encoding of the file which is created using SSIS Log provider for Text Files

I am new to SSIS, I have already designed a package and configured SSIS Log provider for Text Files.
This works fine and log files are generated successfully.
We have a monitoring team, they use this log file for monitoring. They are unable to read the log files since the file encoding is in Unicode format.
They are expecting a non unicode format for their monitoring.
I tried to change the existing log file encoding to ANSI but when I re-run the package my log file has been created again with UNICODE encoding.
Is any way we can create log files using SSIS Log provider for Text Files with non unicode encoding. Kindly suggest me any workaround. I am unable to find solution for the past two days.
Trying to figure out the issue
Because the SSIS Log provider for Text Files logs through a File connection manager, you cannot set the file encoding within the SSIS package: this type of connection manager is used for many different file formats (Excel, text, ...).
From searching on this issue, it looks like SSIS writes Unicode data when it creates the log file itself. See:
why are my log files getting generated with a space between every two characters?
Why is my SSIS text logfile formatted in this way?
Possible workaround
Try creating an empty text file in Notepad and saving it with ANSI encoding.
Then select this file from the SSIS logging configuration.
Other helpful links
Change the default of encoding in Notepad
Add Logging with SSIS
Update 1 - Experiments
To test the workaround I provided, I ran the following experiments:
I added SSIS logging and created a new log file.
After executing the package, the file was created in Unicode (to check, I opened the file in Notepad and clicked Save As; the encoding shown in the combo box was Unicode).
I created a new file in Notepad and saved it with ANSI encoding, as mentioned above.
In SSIS, I changed the File connection manager to Use Existing instead of Create New and selected the file I had created.
After executing the package, the log was written to the file and the encoding was still ANSI.
I executed the package several more times and the encoding did not change.
TL;DR: Create a file with ANSI encoding outside the SSIS package. Within the package, create a File connection manager, select the Use Existing option, and choose that file. Use this File connection manager for logging.
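If you would rather script the file creation than use Notepad, a minimal Python sketch follows. It rests on two assumptions: that an empty file saved as ANSI is simply a zero-byte file with no byte order mark, and that "ANSI" means Windows-1252 on your machine (it depends on the system code page). The convert_to_ansi helper is an extra option not covered in the answer above, for re-encoding a log that was already written as Unicode.

```python
import codecs
import tempfile
from pathlib import Path


def create_empty_log(path: Path) -> None:
    """Create a zero-byte file, i.e. one without a byte order mark."""
    path.write_bytes(b"")


def looks_utf16(path: Path) -> bool:
    """True if the file starts with a UTF-16 byte order mark."""
    head = path.read_bytes()[:2]
    return head in (codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)


def convert_to_ansi(source: Path, target: Path) -> None:
    """Re-encode a UTF-16 log as Windows-1252; unmappable characters become '?'."""
    text = source.read_bytes().decode("utf-16")
    target.write_bytes(text.encode("cp1252", errors="replace"))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        log = Path(tmp) / "ssis_log.txt"  # hypothetical log file name
        create_empty_log(log)
        print(log.stat().st_size, looks_utf16(log))
```

Working on bytes rather than text keeps the original line endings intact, which matters if the monitoring tool expects Windows-style line breaks.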

Spoon - read SQL code from txt file and execute on DB

I'm learning to develop ETL using Pentaho Spoon, and I'm still pretty new to it.
Instead of storing SQL operations inside the transformation file, I'd like to keep them in their own .sql files. That makes it easier to track changes in Subversion, and if needed I can just open the .sql file in a DB manager and execute it directly.
How could I do that? I suppose I could use one component to read a txt file into a variable, and another component to take that variable and execute it on the DB.
What's the simplest way to achieve that?
In the standard SQL Table input, you can define the query to be a parameter, ${my_query}, and this parameter has to be defined (without the ${...} decoration) in the transformation properties: right-click anywhere, select Properties on the popup menu, then the Parameter tab.
Each time you run the transformation, you'll be presented with the list of parameters, including my_query, which you can overwrite.
To automate this, follow the example shipped with the installation zip. In the same directory as your spoon.bat/spoon.sh, there is a folder named sample, in which you will find a job to read_all_files or read all_tables. Basically, this job lists the files in a directory and, for each one, puts it in a variable and uses it as a parameter to run the transformation. It is much easier to do than to explain.
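The "one file, one parameter value" idea behind that sample job can be sketched outside Spoon like this. It is only an illustration in Python: the transformation file name run_query.ktr is hypothetical, and the argument list assumes Pan's -file= and -param:NAME=value options as used on Linux. The sketch builds the arguments without running anything.

```python
import tempfile
from pathlib import Path
from typing import Iterator, List, Tuple


def read_queries(folder: Path) -> Iterator[Tuple[str, str]]:
    """Yield (file name, SQL text) for every .sql file in folder, sorted by name."""
    for path in sorted(folder.glob("*.sql")):
        yield path.name, path.read_text(encoding="utf-8").strip()


def pan_arguments(transformation: str, query: str) -> List[str]:
    """Build an argument list that passes the query as the my_query parameter."""
    return ["pan.sh", f"-file={transformation}", f"-param:my_query={query}"]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        folder = Path(tmp)
        (folder / "customers.sql").write_text("select id, name from customers\n",
                                              encoding="utf-8")
        for name, sql in read_queries(folder):
            # run_query.ktr is a hypothetical transformation that uses ${my_query}
            print(name, pan_arguments("run_query.ktr", sql))
```

Keeping the arguments as a list rather than one shell string avoids quoting problems when a query contains spaces or quotes.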

SQL Server script to read many csv files at certain intervals and insert records

I have some CSV files. I want to write a SQL Server script that reads the files at a set interval and inserts each record into a SQL Server database if the record is not already there, ignoring any file that the scheduler has already read. Each CSV file contains exactly one record.
Like:
1.csv => John,2000,2012/12/12
2.csv => Tom,3000,2012/12/11
It will be great if someone can provide examples of script.
Thanks!
If I were you, I would create an SSIS package that uses the multi-file input. This input lets you pull data from every file in a directory.
Here are the basic steps for your SSIS package.
Check if there are any files in the "working" directory. If not end the package.
Move every file from your "working" directory to a "staging" directory.
You will do this so that if additional files appear in your "working" directory while you are in the midst of the package you won't lose them.
Read all of the files in the "staging" directory. Use a data flow with the multi file input.
Once the reading has been completed then move all of the files to a "backup" directory.
This of course assumes you want to keep them for some reason. You could just as easily delete them from the "staging" directory.
Once you have your package completed then schedule it using SQL Server agent to run the package at whatever interval you are interested in.
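The working, staging, and backup flow described above can be sketched as follows. This is an illustration in Python, not an SSIS package: an in-memory SQLite database stands in for SQL Server, and the people table with its name, amount, and day columns is a guess at what the sample rows mean.

```python
import csv
import shutil
import sqlite3
import tempfile
from pathlib import Path
from typing import List


def move_all(src: Path, dst: Path, pattern: str = "*.csv") -> List[Path]:
    """Move every matching file from src to dst and return the new paths."""
    dst.mkdir(parents=True, exist_ok=True)
    moved = []
    for path in sorted(src.glob(pattern)):
        target = dst / path.name
        shutil.move(str(path), str(target))
        moved.append(target)
    return moved


def load_staged(conn: sqlite3.Connection, staging: Path) -> int:
    """Insert each staged record unless an identical row already exists."""
    inserted = 0
    for path in sorted(staging.glob("*.csv")):
        with path.open(newline="") as handle:
            for row in csv.reader(handle):
                if not row:
                    continue  # skip blank lines
                name, amount, day = row
                found = conn.execute(
                    "SELECT 1 FROM people WHERE name = ? AND amount = ? AND day = ?",
                    (name, int(amount), day)).fetchone()
                if found is None:
                    conn.execute(
                        "INSERT INTO people (name, amount, day) VALUES (?, ?, ?)",
                        (name, int(amount), day))
                    inserted += 1
    conn.commit()
    return inserted


def run_once(conn: sqlite3.Connection, working: Path,
             staging: Path, backup: Path) -> int:
    """One scheduled run: stop if nothing is waiting, else stage, load, back up."""
    if not any(working.glob("*.csv")):
        return 0
    move_all(working, staging)
    inserted = load_staged(conn, staging)
    move_all(staging, backup)
    return inserted


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        working = Path(tmp) / "working"
        working.mkdir()
        (working / "1.csv").write_text("John,2000,2012/12/12\n")
        (working / "2.csv").write_text("Tom,3000,2012/12/11\n")
        db = sqlite3.connect(":memory:")
        db.execute("CREATE TABLE people (name TEXT, amount INTEGER, day TEXT)")
        print(run_once(db, working, Path(tmp) / "staging", Path(tmp) / "backup"))
```

Since files leave the working folder as soon as a run starts, a file is never read twice, and the existence check covers the case where the same record arrives again under a different file name.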
