File arrival in blob storage to trigger a Data Factory pipeline

I need to invoke a Data Factory V2 pipeline when a file is placed in a blob container.
I have tried using PowerShell to check whether the file is present. The issue I have there is that if the file is not there, the script tells me it's not there; if I then place the file in the container, PowerShell will still tell me it's not there. Perhaps if it reruns, the variable will get a fresh value and report that it's there? Maybe there is a way around that? If so, I can then use the result to invoke the pipeline from the PowerShell script. Am I along the right lines here?
Another option would be to write a T-SQL query that returns a true/false result when the row condition is met, but I am not sure how I can use that result within/against Data Factory V2. In the If Condition activity?
I tried a Logic App, but it was kind of useless. It would be great to get some suggestions for ways to trigger the pipeline on the arrival of the file in the blob container; there is more than one way to skin a cat, so I'm open to any and all ideas. Thank you.

This is now available as an event trigger in ADF V2, as announced in this blog post on June 21, 2018.
Current documentation on how to set it up is available here: Create a trigger that runs a pipeline in response to an event.
From the documentation:
As soon as the file arrives in your storage location and the corresponding blob is created, this event triggers and runs your Data Factory pipeline. You can create a trigger that responds to a blob creation event, a blob deletion event, or both events, in your Data Factory pipelines.
There is a note to be wary of:
This integration supports only version 2 Storage accounts (General purpose).
An event trigger can fire on one or both of:
Microsoft.Storage.BlobCreated
Microsoft.Storage.BlobDeleted
With firing conditions from the following:
blobPathBeginsWith
blobPathEndsWith
The documentation also provides the following examples of event trigger firing conditions over blobs:
Blob path begins with('/containername/') – Receives events for any blob in the container.
Blob path begins with('/containername/foldername') – Receives events for any blobs in the containername container and foldername folder.
Blob path begins with('/containername/foldername/file.txt') – Receives events for a blob named file.txt in the foldername folder under the containername container.
Blob path ends with('file.txt') – Receives events for a blob named file.txt at any path.
Blob path ends with('/containername/file.txt') – Receives events for a blob named file.txt under container containername.
Blob path ends with('foldername/file.txt') – Receives events for a blob named file.txt in foldername folder under any container.
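For reference, the JSON behind such a trigger looks roughly like the following. Treat this as a sketch rather than a copy-paste definition: the trigger name, paths, pipeline name, and storage account scope are all placeholders.

```json
{
  "name": "FileArrivalTrigger",
  "properties": {
    "type": "BlobEventsTrigger",
    "typeProperties": {
      "blobPathBeginsWith": "/containername/foldername/",
      "blobPathEndsWith": ".txt",
      "events": [ "Microsoft.Storage.BlobCreated" ],
      "scope": "/subscriptions/<sub-id>/resourceGroups/<rg>/providers/Microsoft.Storage/storageAccounts/<account>"
    },
    "pipelines": [
      {
        "pipelineReference": {
          "referenceName": "MyPipeline",
          "type": "PipelineReference"
        }
      }
    ]
  }
}
```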

Related

Is it possible to edit Logic app file system or SFTP trigger conditions to fire a logic app based on filename or extension?

I would like to trigger my logic app which is reading files from SFTP only if files with a certain name or extension are uploaded/modified. I want to avoid using multiple actions to check file name. Is there any possible way to edit File System/SFTP trigger conditions to check file name and accordingly trigger the logic app?
Yes, you can. If you want to use a trigger condition to check the file name, you have to use the When a file is added or modified (properties only) trigger.
I tested the trigger without properties only, checking the output for a property containing the file name, and used @equals('47.txt', trigger()['outputs']['headers']['x-ms-file-name']) as the trigger condition; however, I got an error message.
So that trigger cannot meet your requirement. I then tested the properties only trigger: its output body has a DisplayName property containing the file name. So I changed the condition to @equals('47.txt', trigger()['outputs']['body']['DisplayName']). With this condition, if the file name doesn't match, the trigger condition is evaluated but the logic app run is not fired.
Hope this could help you.

Logic Apps scenario: log on, download, zip

I access a 3rd party website using forms authentication (username and password).
Once logged on I make a call to an HTTP endpoint and receive XML in the body. The XML contains 1000 XML elements; within each element there is a text value, a code.
For each of these codes I make a further call to a different HTTP endpoint. The response is more XML.
When all 1000 responses have been received I would like to add all the XML responses as files to a zip container and make it available for download.
I would like to see how LogicApps could do this as quickly as possible.
1. Make the call to the first HTTP endpoint (auth set to Basic Auth with user/pass inputted).
2. Use the xpath(xml(<body var here>), '//elementNameHere') expression on the Body of the result from the call to get all the elements of the return value that have the code in it.
3. Foreach over this return value and:
   1. make the HTTP call
   2. append the result to an array variable, or concat on to a string variable.
4. Submit this value to blob storage.
Because you're messing w/ vars in the foreach loop, though, you'll have to run it sequentially (set Concurrency Control on the Foreach loop to 'on' with a degree of '1'), else you could end up with a bad result.
I don't know of a way to "zip contents" here so you may have to send the result to an Azure Function that uses a .Net zip lib to do the work (or js zip lib, whatever your flavor) and does the put to blob storage for you.
This would also all be much easier in Durable Functions land; I encourage you to look into that if you're so inclined.
One mild alternative you might consider is for step 3.2, instead upload that result to a blob storage container, then make the entire container available for download via an Azure Function call which gets the container & zips up the contents (or does the Blob Storage URL for a container do this for you already? not sure)
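The zipping step inside such an Azure Function could be sketched like this with java.util.zip from the standard library. This is only the in-memory archive part; the blob download/upload calls are omitted, and the class name, file names, and contents are placeholders.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

public class ZipResponses {

    // Zip a set of named XML payloads into a single archive held in memory.
    public static byte[] zip(Map<String, String> files) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(bos)) {
            for (Map.Entry<String, String> e : files.entrySet()) {
                zos.putNextEntry(new ZipEntry(e.getKey()));   // entry name = file name in the zip
                zos.write(e.getValue().getBytes(StandardCharsets.UTF_8));
                zos.closeEntry();
            }
        }
        return bos.toByteArray(); // upload this byte[] to blob storage as the downloadable zip
    }

    public static void main(String[] args) throws IOException {
        Map<String, String> responses = new LinkedHashMap<>();
        responses.put("code-0001.xml", "<response>...</response>");
        responses.put("code-0002.xml", "<response>...</response>");
        byte[] archive = zip(responses);
        System.out.println(archive.length);
    }
}
```

The function would then do the put to blob storage with this byte array, as described above.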

Filtering the list of files based on the value obtained from another file (Apache Camel File component)

I need to create a file filter (to pick files from a folder) based on the content received from another file.
I set up a route like this:
File1 Url -> pollEnrich(File2 Url with filter, aggregationStrategy) -> log
But the issue is that in pollEnrich, the value obtained from File1 is not available, hence I am not able to create a filter with which to pick the files from Folder2.
I tried both the filter option in the URL as well as a programmatic filter (by extending the GenericFileFilter class). Any suggestions are very much appreciated.
Recall this fact from the Content Enricher documentation:
pollEnrich only accepts one message as the response
pollEnrich collects a single file when called with the file component. Thus, you should use the file component with the fileName option inside pollEnrich to collect a single file, and loop to call pollEnrich multiple times if you need more than one.
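Putting that advice together, a route sketch (requires camel-core on the classpath; the folder names, the way the wanted file name is derived from File1's body, and the timeout are assumptions):

```java
import org.apache.camel.builder.RouteBuilder;

public class EnrichRoute extends RouteBuilder {
    @Override
    public void configure() {
        from("file:folder1")
            // assumption: File1's body is (or yields) the name of the file wanted from folder2
            .setHeader("wantedFile", body())
            // dynamic pollEnrich (Camel 2.16+): the endpoint URI is computed per exchange,
            // and the fileName option restricts the poll to that single file
            .pollEnrich()
                .simple("file:folder2?fileName=${header.wantedFile}")
                .timeout(5000)
            .log("Enriched with ${header.CamelFileName}");
    }
}
```

With the older fixed-URI form of pollEnrich, the consumer endpoint cannot see the current exchange at all, which is exactly the problem described in the question; the expression-based form above is what makes File1's value usable.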

Can we drop a file to a folder location automatically using Camel, or at a set time (not intervals)?

I am trying to automate the testing of a Java bundle, which runs its processing once a file is dropped in a particular folder.
Can we drop a file to a folder location automatically using Camel, or at a set time (not intervals)?
Is this possible purely with Camel, or should we incorporate other frameworks?
Sure, you can use the camel-file component to produce (create files somewhere) and consume (read/process files from somewhere), and optionally control the initial/polling delays easily with endpoint options...
here is a simple example of consuming->processing->producing
from("file://inputdir").process(<dosomething>).to("file://outputdir")
alternatively, you could periodically produce a file and drop it somewhere
from("timer://foo?fixedRate=true&period=60000").process(<createFileContent>).to("file://inputdir");
Although camel could do this by creating a timer endpoint, then setting the file content and writing to a file endpoint, my answer would be to simply use a bash script. No camel needed here.
Pseudo bash script:
while true
do
  cp filefrom fileto
  sleep 10
done

Orbeon - how to validate file content uploaded by user

I need to validate the content of a PDF file sent by the "File Attachment" component and uploaded by the user, using a web service.
How can I do that?
The action on value change is not called.
Version: orbeon-4.4.0.201311042036-PE
Thanks
Piotr
That probably requires a few steps:
Determine when the upload is complete. With recent versions of Orbeon Forms, the event xxforms-upload-done can be used.
Send the content of the uploaded file to a service. The file can be a binary file, but there is a way to submit binary content.
Depending on what the service returns, mark the control valid or invalid. You could do this with an attribute on the element holding the uploaded file's URL, e.g. <my-upload valid="true"/>, and then use a constraint like constraint="@valid = 'true'".
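A minimal sketch of those steps in XForms markup; the submission id, the service URL, and the element/attribute names are assumptions, not Orbeon-provided names:

```xml
<!-- bind: the control is valid only when the service has marked it so -->
<xf:bind ref="my-upload" constraint="@valid = 'true'"/>

<xf:upload ref="my-upload">
    <!-- when the upload completes, send the file to the validation service -->
    <xf:send ev:event="xxforms-upload-done" submission="validate-file"/>
</xf:upload>

<!-- submission posting the uploaded content to a (hypothetical) validation service -->
<xf:submission id="validate-file" method="post"
               resource="http://example.org/validate-pdf"
               ref="my-upload" replace="none">
    <!-- record the outcome in the valid attribute the constraint checks -->
    <xf:setvalue ev:event="xforms-submit-done"  ref="my-upload/@valid">true</xf:setvalue>
    <xf:setvalue ev:event="xforms-submit-error" ref="my-upload/@valid">false</xf:setvalue>
</xf:submission>
```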
