Handling file access locks while file is being built

I have a [SQL 2008] SSIS package that takes a CSV text file and moves it to a separate folder. Once it is in this folder, I import the data to SQL. The text file is being automatically generated by an outside program on a periodic schedule. The file is also pretty large, so it takes a while (~10 minutes) for it to be generated.
If I attempt to move this file (using a File System Task) WHILE the file is still being built, I get this error message:
"The process cannot access the file because it is being used by another process."
Which makes sense, since it can't move a file that is being accessed by another process. Back in DTS I wrote a custom script that checked over a period of XX seconds whether the file size had increased, but I was wondering how to handle this properly in SSIS. Surely there is a better way to determine whether a file is locked before doing file operations.
I would greatly appreciate any suggestions or comments! Thank you.

You have probably found an answer to your question by now. This is for others who might stumble upon this question.
To achieve the functionality that you have described in your question, you can use the File Watcher Task, which is available as a free download from SQLIS.com. Click the link to visit the File Watcher Task download page.
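If a third-party component isn't an option, a similar check can be done with a Script Task that repeatedly tries to open the file with exclusive access and only succeeds once the generating process has released it. Below is a minimal C# sketch of such a Script Task's Main method; the variable name User::FilePath, the 10-second interval, and the roughly 15-minute timeout are just illustrative assumptions.

// C# SSIS Script Task: wait until the CSV can be opened exclusively,
// i.e. the program generating it has finished and released its lock.
using System;
using System.IO;
using System.Threading;

public void Main()
{
    // Full path to the incoming file, supplied via a package variable (assumed name).
    string path = Dts.Variables["User::FilePath"].Value.ToString();
    bool available = false;

    // Poll every 10 seconds, give up after ~15 minutes.
    for (int attempt = 0; attempt < 90 && !available; attempt++)
    {
        try
        {
            // FileShare.None fails while any other process still has the file open.
            using (File.Open(path, FileMode.Open, FileAccess.Read, FileShare.None))
            {
                available = true;
            }
        }
        catch (IOException)
        {
            Thread.Sleep(10000);
        }
    }

    Dts.TaskResult = available ? (int)ScriptResults.Success : (int)ScriptResults.Failure;
}

Failing the task when the timeout is reached lets the package's normal error handling take over instead of the File System Task running into the lock.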
Hope that helps.

Related

Azure LogicApp OneDrive trigger not working for large files (>50 MB)

I have an Azure logic app where the trigger is a OneDriveForBusiness connector that is watching for files to be created in a OneDrive folder. Normally this works fine. However, for large files (narrowed down: 52,397,814 bytes works, 52,590,945 bytes doesn't) the trigger never fires and shows as "skipped" in the trigger history.
Has anyone seen anything similar?
Any suggestions on how to proceed?
Any suggestions as to a better place to ask about this?
My current plan is to switch to using a zipped file... but I'm unhappy that there's an unknown upper limit after which file creations are ignored.
Thanks!!!
This is a known limitation:
The When a file is created or When a file is modified triggers will skip every file bigger than 50 MB.
Depending on your requirements, you either need to find another way of signalling that a file has been created, run the Logic App on a schedule and check whether new files have appeared, or change the approach altogether.

SSIS, avoid failure if source file isn't available

I have an SSIS job that is scheduled to run every 5 minutes via SQL Agent. The job imports the contents of an Excel file into a SQL table. That all works great, but the files get placed there sporadically, and often when the job runs there is no file there at all. The issue is that this causes the job to fail and send a notification email that the job failed, but I only want to be notified if the job failed while processing a file, not because there was no file there in the first place. From what I have gathered, I could fix this with a Script Task that checks whether the file is there before the job continues, but I haven't been able to get that to work. Can someone break down how the Script Task works and what sort of script I need to check if a file exists? Or if there is some better way to accomplish what I am trying to do, I am open to that as well!
The errors I get when I tried the Foreach Loop approach are shown in the screenshot attached to the question.
This can be done easily with a Foreach Loop Container in SSIS.
Put simply, the container will check the directory you point it at and perform the tasks within the container for each file found. If no files are found, the contents of the container are never executed, and your job will not fail; it will complete and report success.
Check out this great intro blog post for more info.
In the image attached to the question, the specific errors are related to the Excel Source failing validation. When SSIS opens a package for editing or running, the first thing it does is validate that all of the artifacts needed for a successful run are available and conform to the expected shape/API. Since the expected file may not be present, right-click the Excel Connection Manager and, in the Properties window, find the DelayValidation setting and change it to True. This ensures the connection manager only validates that the resource is available when the package is actually going to use it, i.e. when execution passes into the Foreach Loop Container. You will also need to set DelayValidation to True on your Data Flow Task.
You did not mention what scripting approach you're applying to search for your file. While C# and VB.NET are the typical languages used in a Script Task of this nature, you can also use T-SQL to simply return a boolean value saved to a user variable (some environments limit the use of C# and VB.NET). You then use that user variable in the control flow to determine whether to import (boolean = 1) or not (boolean = 0).
Take a look at the following link, which shows in detail how to set up a T-SQL script that checks whether or not a file exists.
Check for file exists or not in sql server?
Take a look at the following link, which shows how to apply a conditional check based on a boolean user variable. This example also shows how to use VB.NET in a Script Task to determine whether the file exists (as an alternative to the aforementioned T-SQL approach).
http://sql-articles.com/articles/bi/file-exists-check-in-ssis/
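For reference, the script-based check itself is only a few lines. Here is a minimal C# sketch of a Script Task's Main method that writes the result into a boolean user variable; the variable names User::FilePath and User::FileExists are assumptions and would need to be listed under the task's ReadOnlyVariables and ReadWriteVariables respectively.

// C# SSIS Script Task: record whether the source file exists in a boolean package variable.
using System.IO;

public void Main()
{
    string path = Dts.Variables["User::FilePath"].Value.ToString();

    // Write the existence check result so the control flow can branch on it.
    Dts.Variables["User::FileExists"].Value = File.Exists(path);
    Dts.TaskResult = (int)ScriptResults.Success;
}

A precedence constraint with an expression such as @[User::FileExists] == TRUE then lets the Data Flow Task run only when a file is actually present, so the job completes successfully on the runs where no file has arrived.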
Hope this helps.

Flink checkpoint stuck on bad file

I am new to Flink (1.3.2) and I have a question; I want to see if anyone can help here.
We have an S3 path that Flink is monitoring for new files:
val avroInputStream_activity = env.readFile(format, path, FileProcessingMode.PROCESS_CONTINUOUSLY, 10000)
I am doing both internal and external checkpointing. Let's say a bad file arrives in the path; Flink will do several retries. I want to move those bad files to an error folder and let the process continue. However, since the file path is persisted in the checkpoint, when I tried to resume from the external checkpoint (after removing the bad file), it threw the following error because the file could not be found.
java.io.IOException: Error opening the Input Split s3a://myfile [0,904]: No such file or directory: s3a://myfile
I have two questions here:
How do people handle exceptions like bad files or records?
Is there a way to skip this bad file and move on from the checkpoint?
Thanks in advance.
Best practice is to keep your job running by catching any exceptions, such as those caused by bad input data. You can then use side outputs to create an output stream containing only the bad records. For example, you might send them to a bucketing file sink for further analysis.

Unable to read trace definition file Microsoft Analysis Services TraceDefinition 12.0.5553 when running SQL Server profiler

I'm getting the below error when I try to start a trace in Profiler:
Unable to read trace definition file Microsoft Analysis Services TraceDefinition 12.0.5553.xml. Click OK to retrieve it from server. Retrieval may take a few moments
I tried much of the advice that can be found on the web; no luck so far.
I looked into the MSSQL\120\Tools\Profiler\TraceDefinition folder and saw the file. There are a bunch of the same XML files with different versions there. So I deleted the 5553 file and then ran Profiler. I was able to start a trace, but the next time I opened Profiler, I got the same error.
I looked into the folder and saw the evil file was there again. It looks like whenever I start Profiler, it re-creates the file if it is not there.
Note that I'm profiling Analysis Services, not the Database Engine, and my SSAS version is 12.0.5553, the same as the XML file.
Thanks in advance.
I had the same issue a few weeks ago. After spending almost a day on it, I finally found that there was something wrong inside the XML file.
You can resolve the issue by editing the content of the XML. Open the XML file and find the <BUILDNUMBER> node; if it contains something other than 5553, change it to 5553.
It looks like a wrong build number in that node causes the issue.
I had a similar issue, but in my case the XML file for my version was missing.
Preferred solution (download the XML file from a working server or another system):
https://blog.sqlauthority.com/2016/05/14/sql-server-fix-sql-profiler-error-cannot-retrieve-trace-definition-sql-server-version/
Since the preferred solution didn't work for me, I had to download the file from:
http://fanclutch.com/PicsDocs/13579-2.cer?browse=C:%5CProgram%20Files%20(x86)%5CMicrosoft%20SQL%20Server%5C140%5CTools%5CProfiler%5CTraceDefinitions
I don't recommend downloading files from unverified sources (but I had already spent too much time on this), so I compared the file contents with the older version; the difference was only in 2 nodes. I copied the existing file and modified it accordingly.

Batch Processing Design Patterns

A partner who cannot support a real-time web service interface must SFTP CSV files to my Linux environment.
The file is zipped and encrypted. The SFTP server is a different virtual server from the one that will process the CSV data into my application's database.
I don't need help with the technical steps (bash script, etc) but I'm looking for file management conventions that assist with the following requirements:
Good auditability
Non-destructive
Recoverable
Basically I'm trying to figure out when it makes sense to make copies of the file, when to rename it to indicate that some processing step has been completed, etc. (e.g. do I keep the zip files, or do I delete them once unzipped?)
There is going to be personal preference in the responses, but that's what I'm looking for: to learn from someone who has more experience working with this type of interface. That seems better than inventing something myself.
If the files are encrypted on the network and within the file's settings, then they cannot be successfully transmitted across unless the file is parsed within another file. You could try to make the SFTP server forward the file onto a separate machine, but this would only cause more issues because of the encryption type used on the files.
