I'm trying to run a very simple SSIS package where I am dumping the contents of an Excel file into a database. The job is scheduled, but the file it is reading from is manually moved to the source location.
As it stands, if no one puts the Excel file in the staging area before the package runs, the whole process fails.
Is there a way to not kill the job if the import file is missing? Maybe just log an error and try again later?
Thanks
An easy way to handle this is to use a ForEach enumerator and set a variable to the count of files in the folder. If the count is 0, gracefully exit the ForEach and set the precedence constraint accordingly.
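If it helps, the count can also come from a small Script Task ahead of the loop; a sketch of its Main() body follows (the variable names, folder path, and file pattern are assumptions, not anything from your package):

```csharp
// Runs before the ForEach loop: count the Excel files and expose the result
// so a precedence constraint such as @[User::FileCount] > 0 decides whether
// the import runs at all. Variable names and the file pattern are assumptions.
using System.IO;

public void Main()
{
    // ReadOnly: User::SourceFolder, ReadWrite: User::FileCount
    string folder = (string)Dts.Variables["User::SourceFolder"].Value;

    int count = Directory.Exists(folder)
        ? Directory.GetFiles(folder, "*.xls*").Length
        : 0;

    Dts.Variables["User::FileCount"].Value = count;

    if (count == 0)
    {
        // Log rather than fail, so the job can simply try again later.
        bool fireAgain = true;
        Dts.Events.FireInformation(0, "FileCheck",
            "No Excel file found in the staging folder; skipping this run.",
            string.Empty, 0, ref fireAgain);
    }

    Dts.TaskResult = (int)ScriptResults.Success;
}
```

With that in place the package exits cleanly and the next scheduled run picks the file up once someone drops it in.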
I receive 5 txt files daily in an Arrived folder, and I'd like to design an SSIS package that loads each txt file's data into a stage table and then moves the file to an archive folder. If a file fails to load into stage, it should not be moved to archive. Once every file has either loaded or failed, this package's work is done. I have another package that loads from stage to the final table, but I only want that package to run once all files have been loaded into stage; otherwise it should not run.
How should I design this SSIS package?
Can somebody help me?
Since the stage-to-final-table load depends on the files-to-staging load, it is better to combine them into a single SSIS package.
Define a Foreach Loop Container which processes the files.
On successful completion of the Foreach Loop Container, you can have the tasks that load the data to the final destination.
Note: I have kept the control flow simple. You need appropriate error logging so that failed files and successful files can each be identified and moved accordingly.
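As a rough sketch of the gating step (my own illustration; the variable names and folder path are assumptions): the Foreach Loop archives each successfully staged file, so a Script Task after the loop can decide whether the stage-to-final load is allowed to run.

```csharp
// Runs after the Foreach Loop Container: any file still left in the Arrived
// folder means its load to stage failed, so the stage-to-final tasks are
// skipped. Variable names and the folder path are assumptions.
using System.IO;

public void Main()
{
    string arrivedFolder = (string)Dts.Variables["User::ArrivedFolder"].Value;

    // True only when every arrived file was staged and moved to Archive.
    bool allLoaded = Directory.GetFiles(arrivedFolder, "*.txt").Length == 0;

    Dts.Variables["User::AllFilesLoaded"].Value = allLoaded;
    Dts.TaskResult = (int)ScriptResults.Success;
}
```

A precedence constraint with the expression @[User::AllFilesLoaded] would then gate the tasks that load from stage to the final table.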
Probably a basic SSIS question ...
I have a Loop Container which loops over a directory of Excel files, and imports and moves each individual file to another folder (about 380 Excel files).
Please no comments about not using a robust file format for importing that volume of files in SSIS, as I completely agree.
The problem I have is that many of the Excel files, about 80, had Excel import errors, and I had to stop, put the problem file aside, and restart about 20 times before I could identify all the problem files to fix manually.
So is there any way I can get the process to skip problem Excel files that cause the Excel Source in the Data Flow to go RED, and just process all the good files?
Finding a solution to this would save me at least an hour when running the process.
This is SSIS 2008.
You can achieve this by setting the task property ForceExecutionResult to Success.
If you are looking for something more robust:
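One way to get there is to pre-validate each workbook in a Script Task before the Data Flow touches it, and only let readable files through; a sketch follows (the variable names and provider string are assumptions, and on SSIS 2008 with .xls files you may need the Jet provider instead of ACE):

```csharp
// Try to open the current workbook before the Data Flow runs. If it cannot
// be opened, set a flag so a precedence constraint skips the import and a
// File System Task moves the file to a "problem" folder instead.
using System.Data.OleDb;

public void Main()
{
    string path = (string)Dts.Variables["User::CurrentFile"].Value;
    string connStr = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + path +
                     ";Extended Properties='Excel 12.0;HDR=YES;IMEX=1'";

    bool isReadable = true;
    try
    {
        using (var conn = new OleDbConnection(connStr))
        {
            conn.Open();   // throws for files the provider cannot open
        }
    }
    catch
    {
        isReadable = false;
    }

    Dts.Variables["User::FileIsReadable"].Value = isReadable;
    Dts.TaskResult = (int)ScriptResults.Success;
}
```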
I have created my first SSIS package. I have a Foreach container that loops through a folder. The data flow has a derived column task and 4 lookups. When I run the package in Visual Studio (2013), it starts with the first file and arrives at the destination, but it does not insert the data; it only hangs with the text "Validating OLE DB Destination 1" in the status bar.
The files are located on my hard drive and the destination database is on the local network. I'm using a sysadmin account to be sure that the user has sufficient access rights.
I'm unable to query the destination database table from SSMS as well.
Anyone have some idea what could be the problem and how I could solve it?
Sorry for the unspecific question. In my SSIS control flow I have a Foreach Loop Container that contains a data flow task to import the data from every file the container loops over. Connected to that task are two move-file tasks, dependent on success or failure of the import. The strange part is that one file is moved, yet no data is inserted in the database, and the Foreach loop hangs after the first iteration (the folder contains 150 files). While the SSIS process hangs, I'm unable to query the database with SELECT *; there is no error, it just says "executing query".
The latter. It finishes the first round (moves the file to my success folder) and then halts with the "still working" icon on the data import task. But the data is not inserted even though the file is moved. Will the transaction only commit once it has finished processing all the files?
The answer was found in the "Table lock" option. Since both destinations point to the same table, I guess the first destination locks it; when the second destination hits the same table, it is locked, and it waits until it is unlocked. That never happens, since the first destination is not ready to commit yet.
I need to import a flat file daily. The file changes its name every day. After the file is processed, it needs to be moved to another folder.
I noticed I can schedule jobs in SQL Server Agent, tell them to run every hour or so, and add CMD commands to them.
The solution I found was to run a script that checks whether the file exists, since the folder should either be empty or contain the file to import.
If the file exists, the script renames it to the name used in the SSIS package and then runs the package.
After the whole thing is done, it should rename the file again based on today's date and move it to another folder.
If the file does not exist, then it should do nothing and wait another hour or so to run again.
What's the best solution for this scenario? Is the script a good idea? Is it perhaps possible to add the if/else (for whether the file exists) into the SSIS package itself? Or even run the script from the SSIS package instead of adding it to SQL Server Agent?
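For what it's worth, the if/else can live inside the package as a Script Task ahead of the Data Flow; a sketch, where the folder, fixed file name, and variable names are all placeholders:

```csharp
// Look for a dropped file, rename the first match to the fixed name the flat
// file connection manager expects, and set a flag for a precedence constraint
// to run or skip the import. Folder, file names, and variables are placeholders.
using System.IO;

public void Main()
{
    string dropFolder = (string)Dts.Variables["User::DropFolder"].Value;
    string fixedPath  = Path.Combine(dropFolder, "import.txt");

    string[] candidates = Directory.GetFiles(dropFolder, "*.txt");
    bool fileFound = candidates.Length > 0;

    if (fileFound && !File.Exists(fixedPath))
    {
        File.Move(candidates[0], fixedPath);   // rename to the name the package uses
    }

    Dts.Variables["User::FileFound"].Value = fileFound;
    Dts.TaskResult = (int)ScriptResults.Success;
}
```

After a successful load, a File System Task (or another Script Task) can rename the file with today's date and move it to the archive folder; if nothing was found, the package just ends and the next scheduled run tries again.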
EDIT:
It seems I was a little naïve; it's possible to run VB scripts from the server. Would that be the recommended solution? It does solve my problem, but I'm just wondering if it's a good idea.
This solves all my questions:
http://www.sqlservercentral.com/articles/Integration+Services+%28SSIS%29/90571/
Background:
I've a folder that files are pumped into continuously. My SSIS package needs to process the files and delete them. The SSIS package is scheduled to run once every minute. I'm picking up the files in ascending order of file creation time, building an array of files, and then processing and deleting them one at a time.
Problem:
If an instance of my package takes longer than one minute to run, the next instance of the SSIS package will pick up some of the files the previous instance has in its buffer. By the time the second instance of the package gets around to processing a file, it may already have been deleted by the first instance, creating an exception condition.
I was wondering whether there was a way to avoid the exception condition.
Thanks.
How are you scheduling the job? If you are using the SQL Server Job Scheduler, I'm under the impression it should not re-run a job that is already running; see this SO question: Will a SQL Server Job skip a scheduled run if it is already running?
Alternatively, rather than trying to move the files around, you could build a step of your job that tests whether it is already running. I've not done this myself, but it appears to be possible; have a read of the article Detecting The State of a SQL Server Agent Job.
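For example, a check along these lines against msdb would tell you whether a given job still has an active execution (the job name and connection string are assumptions; see the article for the details and caveats):

```csharp
// Ask msdb whether the named Agent job has an execution that started but has
// not yet stopped in the current Agent session.
using System.Data.SqlClient;

static bool JobIsRunning(string connectionString, string jobName)
{
    const string sql = @"
        SELECT COUNT(*)
        FROM msdb.dbo.sysjobactivity AS a
        JOIN msdb.dbo.sysjobs AS j ON j.job_id = a.job_id
        WHERE j.name = @jobName
          AND a.session_id = (SELECT MAX(session_id) FROM msdb.dbo.syssessions)
          AND a.start_execution_date IS NOT NULL
          AND a.stop_execution_date IS NULL;";

    using (var conn = new SqlConnection(connectionString))
    using (var cmd = new SqlCommand(sql, conn))
    {
        cmd.Parameters.AddWithValue("@jobName", jobName);
        conn.Open();
        return (int)cmd.ExecuteScalar() > 0;
    }
}
```

Keep in mind that if this runs as a step of the same job it is checking, the current run counts as active, so the test would need to look for more than one active execution.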
Can you check for the existence of the file before you delete it?
File.Exists(filepathandname)
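Worth noting that the check alone still leaves a small window, so it may be worth treating a vanished file as "already handled" rather than a failure; a sketch (the folder path and the import step are placeholders):

```csharp
// Re-check each file just before processing it; if it disappears in between,
// assume the other package instance handled it and move on.
using System.IO;

foreach (string file in Directory.GetFiles(@"\\server\inbox"))
{
    if (!File.Exists(file))
        continue;                          // the other instance got here first

    try
    {
        using (FileStream stream = File.OpenRead(file))
        {
            // hypothetical import of the stream goes here
        }
        File.Delete(file);
    }
    catch (FileNotFoundException)
    {
        // deleted between the check and the open; safe to ignore
    }
}
```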
To make sure your packages are not messing with the same files, you could create an empty file named like the data file but with another extension (like mydata.csv.being_processed) and make sure your Data Flow Task runs only on files that don't have such a companion file.
This acts as a lock.
Of course you could change the way you're scheduling your jobs, but often, when we encounter such an issue, it's because we have no leverage over those things :)
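A sketch of that claim step (the marker extension and paths are assumptions):

```csharp
// Claim a data file by creating a marker next to it; only the instance that
// creates the marker processes the file.
using System.IO;

static bool TryClaim(string dataFile)
{
    string marker = dataFile + ".being_processed";
    if (File.Exists(marker))
        return false;                      // another instance already owns this file

    try
    {
        // CreateNew fails if another instance creates the marker first.
        using (File.Open(marker, FileMode.CreateNew)) { }
        return true;
    }
    catch (IOException)
    {
        return false;
    }
}
```

Whoever wins the claim processes the file and then deletes both the data file and its marker.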
You can create a "lock file" to prevent parallel execution of packages. To protect yourself against the case of a crashed package, consider using the file's creation date to emulate a lock timeout.
That is, at the beginning of the package you check for the existence of the lock file. If it does not exist, OR it was created more than X hours ago, continue with the import; otherwise exit.
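A sketch of that check (the lock path and timeout are assumptions):

```csharp
// Package-level lock file with a timeout, so a crashed run cannot block
// imports forever. Delete the lock file at the end of a successful run.
using System;
using System.IO;

static bool TryAcquireLock(string lockPath, TimeSpan timeout)
{
    if (File.Exists(lockPath) &&
        DateTime.UtcNow - File.GetCreationTimeUtc(lockPath) < timeout)
    {
        return false;                      // a recent run still holds the lock
    }

    try
    {
        File.Delete(lockPath);             // stale or missing lock; reset it
        using (File.Open(lockPath, FileMode.CreateNew)) { }
        return true;
    }
    catch (IOException)
    {
        return false;                      // another instance re-created it first
    }
}
```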
I have a similar situation. What you do is have your SSIS package read all the files within the folder and write a work file like 'process.txt'. This creates a list of the valid files at that point in time. If you have multiple packages, create the file with a name like 'process_.txt', one per package. Each package then only processes the files named in its own process file. This prevents overlap.
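Roughly, the snapshot step looks like this (the folder, file pattern, and work-file name are assumptions):

```csharp
// Write the list of files that exist right now to a work file; later steps
// loop over that list instead of re-reading the folder, so files that arrive
// mid-run wait for the next execution.
using System.IO;

string inbox    = @"\\server\inbox";                      // placeholder path
string workFile = Path.Combine(inbox, "process.txt");

string[] snapshot = Directory.GetFiles(inbox, "*.csv");   // data-file pattern is an assumption
File.WriteAllLines(workFile, snapshot);

string[] toProcess = File.ReadAllLines(workFile);         // what this run will handle
```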