Logic Apps scenario: log on, download, zip

I access a 3rd party website using forms authentication (username and password).
Once logged on, I make a call to an HTTP endpoint and receive XML in the body. The XML contains 1000 elements, and within each element there is a text value, a code.
For each of these codes I make a further call to a different HTTP endpoint. The response is more XML.
When all 1000 responses have been received I would like to add all the XML responses as files to a zip container and make it available for download.
I would like to see how Logic Apps could do this as quickly as possible.

1. Make the call to the first HTTP endpoint (auth set to basic auth, with the user/pass supplied).
2. Use the xpath(xml(<body var here>), '//elementNameHere') expression on the Body of the result from the call to get all the elements of the return value that have the code in it.
3. Foreach over this return value and:
   1. make the HTTP call
   2. append the result to an array variable, or concat it on to a string variable
4. Submit this value to blob storage.
Because you're messing with variables in the foreach loop, though, you'll have to do it sequentially (set concurrency control on the Foreach loop to 'on' and the degree of parallelism to '1'), or else you could end up with a bad result.
I don't know of a way to "zip contents" here, so you may have to send the result to an Azure Function that uses a .NET zip library (or a JS zip library, whatever your flavor) to do the work and does the put to blob storage for you.
This would also all be much easier in Durable Functions land; I encourage you to look into that if you're so inclined.
One mild alternative you might consider for step 3.2: instead, upload each result to a blob storage container, then make the entire container available for download via an Azure Function call which gets the container and zips up the contents (or does the Blob Storage URL for a container do this for you already? not sure). A rough sketch of that kind of function follows.
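To make that last option concrete, here is a minimal Python sketch of the kind of zipping function an Azure Function could run, assuming the azure-storage-blob v12 SDK; the connection string variable, container names and archive name are all made-up placeholders, not anything from the original scenario.

# Sketch only: zip every blob in a source container and upload the archive
# to a target container. All names below are illustrative placeholders.
import io
import os
import zipfile

from azure.storage.blob import BlobServiceClient

def zip_container_to_blob(connection_string,
                          source_container="results",
                          target_container="archives",
                          archive_name="responses.zip"):
    service = BlobServiceClient.from_connection_string(connection_string)
    source = service.get_container_client(source_container)

    # Build the zip in memory; for very large result sets you would want to
    # stream to a temporary file instead.
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for blob in source.list_blobs():
            data = source.download_blob(blob.name).readall()
            archive.writestr(blob.name, data)

    buffer.seek(0)
    service.get_blob_client(target_container, archive_name).upload_blob(
        buffer, overwrite=True)

if __name__ == "__main__":
    zip_container_to_blob(os.environ["AZURE_STORAGE_CONNECTION_STRING"])

A client could then download the single archive blob (or be handed a SAS URL for it) instead of 1000 individual files.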

Related

File arrival in blob storage to trigger Data Factory pipeline

I need to invoke a Data Factory V2 pipeline when a file is placed in a blob container.
I have tried using PowerShell to check if the file is present. The issue there is that if the file is not there and the script tells me it's not there, and I then place the file in the container, PowerShell will still tell me it's not there; though perhaps if it reruns, the variable will get a fresh value and tell me it's there? Maybe there is a way around that? If so, I can then use the result to invoke the pipeline with the PowerShell script. Am I along the right lines here?
Another option would be to write a T-SQL query that gives a true/false result if the row condition is met, but I am not sure how I can use this result within/against ADF V2. In the If Condition activity?
I tried a Logic App, but it was kind of useless. It would be great if I could get some suggestions for ways to trigger the pipeline on the arrival of the file in the blob container; there is more than one way to skin a cat, so I'm open to any and all ideas. Thank you.
This is now available as an event trigger with ADF V2, as announced in this blog post on June 21, 2018.
Current documentation on how to set it up is available here: Create a trigger that runs a pipeline in response to an event.
From the documentation:
As soon as the file arrives in your storage location and the corresponding blob is created, this event triggers and runs your Data Factory pipeline. You can create a trigger that responds to a blob creation event, a blob deletion event, or both events, in your Data Factory pipelines.
There is a note to be wary of:
This integration supports only version 2 Storage accounts (General purpose).
An event trigger can fire on one, or both, of the following events:
Microsoft.Storage.BlobCreated
Microsoft.Storage.BlobDeleted
With firing conditions from the following:
blobPathBeginsWith
blobPathEndsWith
The documentation also provides the following examples of event trigger firing conditions over blobs:
Blob path begins with('/containername/') – Receives events for any blob in the container.
Blob path begins with('/containername/foldername') – Receives events for any blobs in the containername container and foldername folder.
Blob path begins with('/containername/foldername/file.txt') – Receives events for a blob named file.txt in the foldername folder under the containername container.
Blob path ends with('file.txt') – Receives events for a blob named file.txt at any path.
Blob path ends with('/containername/file.txt') – Receives events for a blob named file.txt under container containername.
Blob path ends with('foldername/file.txt') – Receives events for a blob named file.txt in foldername folder under any container.
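For illustration, a rough Python sketch of creating such a trigger with the azure-mgmt-datafactory SDK might look like the following; every name (subscription, resource group, factory, storage account, pipeline, paths) is a placeholder, and the exact model and argument names should be double-checked against the SDK version you are using.

# Sketch only: register a blob-created event trigger on an ADF V2 factory.
# Assumes azure-mgmt-datafactory and azure-identity; all names are placeholders.
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import (
    BlobEventsTrigger,
    PipelineReference,
    TriggerPipelineReference,
    TriggerResource,
)

client = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")

trigger = BlobEventsTrigger(
    events=["Microsoft.Storage.BlobCreated"],
    blob_path_begins_with="/containername/foldername/",
    blob_path_ends_with=".txt",
    # Scope must be the resource ID of a version 2 (General purpose) storage account.
    scope=("/subscriptions/<subscription-id>/resourceGroups/<rg>"
           "/providers/Microsoft.Storage/storageAccounts/<storage-account>"),
    pipelines=[TriggerPipelineReference(
        pipeline_reference=PipelineReference(reference_name="MyPipeline"))],
)

client.triggers.create_or_update(
    "<rg>", "<factory-name>", "FileArrivalTrigger",
    TriggerResource(properties=trigger))

Remember that the trigger also has to be started before it will fire.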

Apache Camel Interceptor with regular expression

This is my route. I want to send a file to an Azure blob, setting the name of the blob to the file name without its extension. I also want to filter out the whitespace from the file names. I am thinking of using an interceptor:
from("file://C:/camel/source1").recipientList(simple("azure-blob://datastorage/container1/${header.fileName}?credentials=#credentials&operation=updateBlockBlob"))
I want to invoke the interceptor only for the updateBlockBlob operation:
interceptSendToEndpoint("^(azure-blob:).+(operation=updateBlockBlob)").setHeader("fileName",simple("${file:onlyname.noext}")).convertBodyTo(File.class)
The above code works with interceptFrom().
I tried replacing the regular expression with a wildcard like azure*, i.e. interceptSendToEndpoint("azure*"). It did not work.
What's wrong with the above code? Is it because of recipientList?
Also, what features does simple have to remove whitespace?
Is there a better way to generate blob names dynamically?
Here is the documentation from Camel on interceptors:
http://camel.apache.org/intercept.html
interceptFrom that intercepts incoming Exchange in the route.
interceptSendToEndpoint that intercepts when an Exchange is about to be sent to the given Endpoint.
So I suspect the Exchange is already formed and Camel expects the URL to be resolved, which means the header needs to be set before the exchange is created for the Azure endpoint.
I did the following: to set the header I use interceptFrom, and to convert the body into a File I use interceptSendToEndpoint:
interceptSendToEndpoint("^(azure-blob:).+(operation=updateBlockBlob)").convertBodyTo(File.class)
interceptFrom()
    .setHeader("fileName", simple("${file:onlyname.noext}"))
    .setHeader("fileName", simple("${header.fileName.replaceAll('[^a-zA-Z0-9]', '')}"))
That also got rid of the whitespace (along with any other non-alphanumeric characters) in the blob name.

libs3 for listing bucket does not return all bucket contents

http://docs.ceph.com/docs/hammer/radosgw/s3/cpp/#creating-and-closing-a-connection
I used "LISTING A BUCKET’S CONTENT" section from above link. But I am not able to list all contents of bucket. isTruncated comes to 1 in call back, but nextMarker is null. Any help ?
I will try using aws-sdk but that is too large for my simple needs. And it has gcc 4.9 as requirement.
You need to save the last key returned by the first request and use it as the marker for the second request.
The nextMarker is only set in the response if, in your request, you set a delimiter, because when you are using a delimiter, it's not always possible to determine where you should start back up based on the contents of the response.
http://docs.aws.amazon.com/AmazonS3/latest/API/RESTBucketGET.html
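The same marker-based pagination is sketched below in Python with boto3, purely to illustrate the idea (the question uses libs3, so treat this as a conceptual analogue, not a drop-in fix): keep listing, passing the last key of the previous page as the marker, until the response is no longer truncated. The bucket name and endpoint are placeholders.

# Sketch only: V1 list-objects pagination driven by the last key of each page.
# boto3 is used for illustration; bucket name and endpoint are placeholders.
import boto3

s3 = boto3.client("s3", endpoint_url="http://my-rados-gateway:7480")

marker = ""
while True:
    resp = s3.list_objects(Bucket="my-bucket", Marker=marker)
    for obj in resp.get("Contents", []):
        print(obj["Key"])
    if not resp.get("IsTruncated"):
        break
    # Without a delimiter there is no NextMarker in the response, so the
    # marker for the next request is simply the last key of this page.
    marker = resp["Contents"][-1]["Key"]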

How secure is str(BlobKey)?

I have implemented a generic blob serving handler as mentioned in the appengine docs. The handler will serve any blob to you, as long as you know that blob's key string. I am using it to easily compose URLs that clients can use to download their files. If client A inspects the URL to download their file and finds their blob key (i.e. 1CX2kh468IDYKGcDUiq5c69u8BRXBtKBYcIaJkmSbSa4QY096gGVaYCZJjGZUpDz == str(BlobKey)), can they somehow reverse-engineer this key and easily construct another key that can be used to download client B's files? Or does the key have a random component added?
For reference, there is this note about str(db.Key), which is what raises my question:
Note: The string representation of a key looks cryptic, but is not
encrypted! It can be converted back to the raw key data, both kind and
identifier. If you don't want to expose this data to your users (and
allow them to easily guess other entities' keys), then encrypt these
strings or use something else.
I am creating the files like this, which does not specify a filename parameter, so I think the question boils down to: how does create() "pick" a filename when one is not specified? I suppose I could generate a random filename and pass it in here to be doubly sure this is secure.
file_name = files.blobstore.create(mime_type='application/octet-stream')
BlobKeys are not guessable. If a user has one key, that in no way enables them to guess another key. Unlike datastore keys, which contain full path information, BlobKeys do not encode any such data. You can share them safely without risk of a user mounting the attack you describe.
(I could not locate docs for these claims - this is based on my recollection.)
Assign a filename when creating a blob:
name = .....
file_name = files.blobstore.create(mime_type='application/octet-stream', _blobinfo_uploaded_filename=name)
And you do not need to use str(BlobKey): the BlobKey can be part of your serving URL, as in the sketch below.
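As a quick illustration of putting the key in the serving URL, here is a minimal Python sketch based on the standard Blobstore download handler (webapp2 on the Python 2 App Engine runtime); the route and class name are just examples.

# Sketch only: serve a blob whose key is carried in the URL path.
# Route and class name are examples; uses the standard BlobstoreDownloadHandler.
import urllib

import webapp2
from google.appengine.ext import blobstore
from google.appengine.ext.webapp import blobstore_handlers

class ServeBlobHandler(blobstore_handlers.BlobstoreDownloadHandler):
    def get(self, resource):
        blob_key = blobstore.BlobKey(urllib.unquote(resource))
        if not blobstore.get(blob_key):
            self.error(404)
            return
        # Streams the blob to the client without loading it into the app.
        self.send_blob(blob_key)

app = webapp2.WSGIApplication([
    (r"/serve/([^/]+)", ServeBlobHandler),
], debug=True)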

PDFs, WCFs and iFrames

Well now, just when I think I'm done with this little project they throw me another curve...
I have two WCFs. One hosted in IIS and the other is in a self-hosted service on a different server.
A function in the self-hosted service returns a PDF in the form of Byte(). The WCF service in IIS calls the function, then uses System.IO.FileStream to write the PDF to inetpub. The aspx performs a callback, and the page is reloaded with a dynamic iFrame displaying the PDF. Works well enough for me, but not good enough for the boss.
Somehow, I have to get the second WCF to pass the PDF back to my ASP app WITHOUT saving it to disk.
I need something like:
iFrameControl.Attributes.Add("src", ServiceReference1.GetPDF_Byte())
Any way to do this?
Thanks in advance,
Jason
If I understand you correctly, there is some action in the ASPX page which causes a call (possibly passing some parameter) to be made to the first service (WCF1, hosted in IIS), which in turn calls the second service (WCF2, from a different machine); WCF1 retrieves the PDF from WCF2, saves it locally in inetpub and returns the URL of the saved file; the callback call on the ASPX page then uses that URL to display the PDF on the iFrame.
A short answer: you can't use a service reference to do what you need (ServiceReference1.GetPDF_Byte()) - the "src" attribute for the control (or for any XML) needs to be a string, which in this case represents the URL of the resource which is the actual source for the control. You can, however, use WCF to implement that - a REST endpoint in the "raw" mode (http://blogs.msdn.com/b/carlosfigueira/archive/2008/04/17/wcf-raw-programming-model-web.aspx) can be used to return a PDF file.
You would change the structure of your application as follows: some action in the ASPX page causes it not to make a call to WCF1 directly, but to simply set the "src" property of the iFrame control to a call to a REST endpoint in WCF1. This call would take the parameters, and call WCF2 to retrieve the PDF file, and that call would return the PDF file directly (as a Stream). This way you don't incur the buffering cost that you would in your buffer solution (if many clients are requesting the page at the same time you may have some memory issues, and in this case you don't need to manage buffer lifetimes either).
Found it somewhere else in C and did a conversion, posting here just in case someone else needs it.
Answer: Create a new class (Globals.vb) to house a byte array that can be accessed from both pages, then create a new page that does a Response.BinaryWrite of your byte array in Page_Load, and set the iFrame's src to the new (blank) page.
Protected Sub Page_Load(ByVal sender As Object, ByVal e As System.EventArgs) Handles Me.Load
    ' Clear anything already written and buffer the output
    Response.Clear()
    Response.Buffer = True
    ' Serve the shared byte array as a PDF so the iFrame renders it inline
    Response.ContentType = "application/pdf"
    Response.BinaryWrite(Globals.PDF_Data.ToArray)
End Sub
