Uploading Large Amounts of Data from C# Windows Service to Azure Blobs - wpf

Can someone please point me in the right direction.
I need to create a Windows timer service that will upload files from the local file system to Azure blobs.
Each file (video) may be anywhere between 2 GB and 16 GB. Is there a limit on the size? Do I need to split the file?
Because the files are very large, can I throttle the upload speed to Azure?
Is it possible for another application (WPF) to see the progress of the upload, i.e. a progress bar showing how much data has been transferred and what speed it is transferring at?

The upper limit for a block blob, the type you want here, is 200GB. Page blobs, used for VHDs, can go up to 1TB.
Block blobs are so called because upload is a two-step process - upload a set of blocks and then commit that block list. Client APIs can hide some of this complexity. Since you want to control the uploads and keep track of their status you should look at uploading the files in blocks - the maximum size of which is 4MB - and manage that flow and success as desired. At the end of the upload you commit the block list.
Kevin Williamson, who has written a number of spectacular blog posts, has one showing how to do "Asynchronous Parallel Blob Transfers with Progress Change Notification 2.0."
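For illustration, here is a minimal sketch of that block-by-block flow using the classic WindowsAzure.Storage client library; the connection string and container name are placeholders, and retries, error handling and cancellation are left out:

    // Sketch only: upload a large file as 4 MB blocks, then commit the block list.
    // "videos" and the connection string are assumed names, not from the question.
    using System;
    using System.Collections.Generic;
    using System.IO;
    using Microsoft.WindowsAzure.Storage;
    using Microsoft.WindowsAzure.Storage.Blob;

    class BlockUploader
    {
        public static void Upload(string filePath, string connectionString)
        {
            var account = CloudStorageAccount.Parse(connectionString);
            var container = account.CreateCloudBlobClient().GetContainerReference("videos");
            container.CreateIfNotExists();

            var blob = container.GetBlockBlobReference(Path.GetFileName(filePath));
            const int blockSize = 4 * 1024 * 1024;          // 4 MB maximum block size
            var blockIds = new List<string>();
            var buffer = new byte[blockSize];

            using (var file = File.OpenRead(filePath))
            {
                int read, blockNumber = 0;
                while ((read = file.Read(buffer, 0, buffer.Length)) > 0)
                {
                    // Block ids must be base64 strings of equal length within one blob.
                    string blockId = Convert.ToBase64String(BitConverter.GetBytes(blockNumber++));

                    using (var chunk = new MemoryStream(buffer, 0, read))
                    {
                        blob.PutBlock(blockId, chunk, null);    // upload one block
                    }
                    blockIds.Add(blockId);

                    // Progress/throttling hook: report bytes sent so far, or sleep here
                    // to cap the effective upload rate.
                }
            }

            blob.PutBlockList(blockIds);    // commit: the blob only becomes visible here
        }
    }

Until PutBlockList is called the uploaded blocks are not part of the visible blob, which is what makes it safe to retry or resume individual blocks, and the loop is a natural place to raise the progress numbers a WPF client could display.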

Related

Medium to large file uploads with progress updates in AspNet Core

By medium to large I mean anything from 10 MB to 200 MB (sound files, if that is important).
Basically, I want to make an API that does some spectral analysis on the file itself, which would require a file upload. But for UI/UX reasons it would be nice to have a progress bar for the upload process. What are the common architectures for achieving this interaction?
The client application uploading the file will be a javascript client (reactjs/redux) and the API is written in ASP.NET Core. I have seen some examples which use websockets to update the client on progress, and other examples where the client polls for status updates given a resource url to query the status. Are there any best practices (or the "modern way of doing this") for doing such a thing that I should know of? TIA
In general, you just need to save the progress status to some variable while reading the input stream in your controller (a session-specific variable, because there might be several file upload sessions at the same time) and then fetch that status from the client side with AJAX requests (or SignalR).
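As a rough sketch of that idea (the controller, routes and temp-file target below are illustrative, not taken from any particular sample, and the file is assumed to be sent as the raw request body rather than multipart):

    // Sketch only: read the body in chunks, record bytes received in a shared
    // dictionary keyed by an upload id, and expose that value to a polling endpoint.
    using System.Collections.Concurrent;
    using System.IO;
    using System.Threading.Tasks;
    using Microsoft.AspNetCore.Mvc;

    [ApiController]
    [Route("api/upload")]
    public class UploadController : ControllerBase
    {
        // uploadId -> bytes received so far (shared across requests)
        private static readonly ConcurrentDictionary<string, long> Progress =
            new ConcurrentDictionary<string, long>();

        [HttpPost("{uploadId}")]
        [DisableRequestSizeLimit]   // large bodies; tune limits for real use
        public async Task<IActionResult> Upload(string uploadId)
        {
            var buffer = new byte[64 * 1024];
            long total = 0;
            var target = Path.Combine(Path.GetTempPath(), Path.GetFileName(uploadId));

            using (var output = System.IO.File.Create(target))
            {
                int read;
                while ((read = await Request.Body.ReadAsync(buffer, 0, buffer.Length)) > 0)
                {
                    await output.WriteAsync(buffer, 0, read);
                    total += read;
                    Progress[uploadId] = total;   // the status endpoint below reads this
                }
            }

            Progress.TryRemove(uploadId, out _);
            return Ok(new { received = total });
        }

        // Polled by the client (AJAX) to drive a progress bar.
        [HttpGet("{uploadId}/progress")]
        public IActionResult GetProgress(string uploadId)
        {
            long bytes;
            Progress.TryGetValue(uploadId, out bytes);
            return Ok(new { bytesReceived = bytes });
        }
    }

The client then POSTs to /api/upload/{uploadId} and polls /api/upload/{uploadId}/progress (or receives the same value over SignalR) to drive the progress bar.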
You could take a look at this example: https://github.com/DmitrySikorsky/AspNetCoreUploadingProgress
I have tried 11 MB files with no problems. There is a line
await Task.Delay(10); // It is only to make the process slower
there; don't forget to remove it in the real solution.
In this sample files are uploaded via AJAX, so I didn't try really large files, but you can use the iframe solution from this sample:
https://github.com/DmitrySikorsky/AspNetCoreFileUploading
The other part will be almost the same.
Hope this helps you. Feel free to ask if you have any additional questions.

Storing Images on a Database vs Fileserver vs Zip file on Server

I am creating a simple system where users can view a small image by entering the name of that image into a search box. I am having trouble deciding how to store the images in the most efficient way. I have thought of 3 solutions, and I am wondering which one is the best solution for this problem:
Storing the Images as blobs or base64 string in a Database and Loading the Image based on the user input with a simple query. Doing it this way will increase the load on the database and will result in longer load times.
Storing the Images as separate files on a file server. And just loading it by assigning the image.src attribute based on the user input: image.src = "./server/images/" + userInput; Doing it this way however will increase the number of file requests on my server, so it will be more expensive.
Or lastly, I could store the Images in a single zip file on the fileserver and download them all at once at the start of the program. The advantage of this is that there will only be a single request when loading the page. However, it will take some time to load all the files.
Note that each image is around 1-3KB in size. And all images will be placed on the server/db manually. Furthermore, there will only be around 100-200 Images max. So all these considerations may not matter too much. But I want to know what the recommended way of doing this is.
NOTE: The server I am running is an AWS server, and I found that having too many requests will increase the cost of the server. This is why I am sceptical about approach 2.
I too manage stored images and retrieve them from AWS EC2. My solution, and my suggestion to you, is similar to option 2, but adds caching as a way to reduce server requests.
Keep the images in a folder, or better in S3 storage, and call them by name from either a database query that holds the URL or just the image name. Then place it inside a placeholder in HTML.
SELECT url FROM img.images WHERE image_name = 'blue_ocean'
Then I bind it to a placeholder in HTML:
<img src="/images/blue_ocean.jpg" alt="">
As for the many requests to the server, you can cache images. I suggest using Service Workers, a powerful Web API that allows you to cache images and therefore reduce the amount of data served.
Another approach is to use sprites: a single file or image sheet that contains all of the requested images, so instead of many requests there is just one, and you then grab each required image by its X,Y coordinates. This method is very efficient and is used in games to reduce the overhead of requesting multiple images in short spans of time.

Session variable or temporary file on client post?

My ASP.NET web application can launch a lot of reports, and I need to store the PDF file temporarily before showing the report.
So, which is the better way to store a PDF file temporarily? It is only needed for a few seconds, or at most a few minutes; the PDF can have several hundred pages, and a lot of users can launch reports.
In a session variable or in a temporary file?
Thank you.
In my opinion it should be a temporary file. Storing that in session memory is not good practice even if it is only one or two pages, and it would be difficult to scale to multiple users. Imagine having 10 users using the app and 10 files held in memory. You will be better off using the memory for other things than files.
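A minimal sketch of the temporary-file approach, with an illustrative folder name and token scheme; only the small token would go into the session:

    // Sketch only: write the generated PDF to a uniquely named temp file, hand the
    // client a token, and delete the file after serving it. Paths are illustrative.
    using System;
    using System.IO;

    public static class ReportTempStore
    {
        private static readonly string Folder =
            Path.Combine(Path.GetTempPath(), "report-cache");

        public static string Save(byte[] pdfBytes)
        {
            Directory.CreateDirectory(Folder);
            string token = Guid.NewGuid().ToString("N");
            File.WriteAllBytes(Path.Combine(Folder, token + ".pdf"), pdfBytes);
            return token;                       // store only this small token in session
        }

        public static byte[] TakeAndDelete(string token)
        {
            string path = Path.Combine(Folder, token + ".pdf");
            byte[] bytes = File.ReadAllBytes(path);
            File.Delete(path);                  // the file only lives for a few minutes
            return bytes;
        }
    }

A scheduled cleanup (or deleting on read, as above) keeps the folder from growing when users abandon reports.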

Silverlight streaming upload

I have a Silverlight application that needs to upload large files to the server. I've looked at uploading using both WebClient as well as HttpWebRequest; however, I don't see an obvious way to stream the upload with either option. Due to the size of the files, loading the entire contents into memory before uploading is not reasonable. Is this possible in Silverlight?
You could go with a "chunking" approach. The Silverlight File Uploader on Codeplex uses this technique:
http://www.codeplex.com/SilverlightFileUpld
Given a chunk size (e.g. 10k, 20k, 100k, etc.), you can split up the file and send each chunk to the server using an HTTP request. The server will need to handle each chunk and re-assemble the file as each chunk arrives. In a web farm scenario where there are multiple web servers, be careful not to use the local file system on the web server for this approach.
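For illustration, a minimal sketch of the server side of that chunking approach as a classic ASP.NET generic handler; the query-string parameter is an assumption, and it presumes chunks are posted sequentially and in order:

    // Sketch only: each HTTP request carries one chunk in its body plus a "file"
    // query parameter, and chunks are appended in the order they arrive.
    using System.IO;
    using System.Web;

    public class ChunkUploadHandler : IHttpHandler
    {
        public bool IsReusable { get { return true; } }

        public void ProcessRequest(HttpContext context)
        {
            string folder = context.Server.MapPath("~/App_Data/uploads");
            Directory.CreateDirectory(folder);

            // Sanitize the client-supplied name before using it as a path segment.
            string fileName = Path.GetFileName(context.Request.QueryString["file"]);
            string target = Path.Combine(folder, fileName);

            // Append this chunk to the file being re-assembled on the server.
            using (var output = new FileStream(target, FileMode.Append, FileAccess.Write))
            {
                context.Request.InputStream.CopyTo(output);
            }

            context.Response.StatusCode = 200;
        }
    }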
It does seem extraordinary that the WebClient in Silverlight fails to provide a means to pump a Stream to the server with progress events. It's especially amazing since this is offered for a string upload!
It is possible to code what appears to do what you want with an HttpWebRequest.
In the callback for BeginGetRequestStream you can get the Stream for the outgoing request, then read chunks from your file's Stream and write them to the output stream. Unfortunately, Silverlight does not start sending the output to the server until the output stream has been closed. Where all this data ends up being stored in the meantime I don't know; it's possible that if it gets large enough SL might use a temporary file so as not to stress the machine's memory, but then again it might just store it all in memory anyway.
The only solution to this that might be possible is to write the HTTP protocol via sockets.
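For reference, a rough sketch of the BeginGetRequestStream pattern described above; the endpoint URL and chunk size are placeholders, and, per the caveat above, Silverlight may still buffer the whole body before sending:

    // Sketch only: open the request stream asynchronously, copy the file into it in
    // small chunks, then close it to start the transfer.
    using System;
    using System.IO;
    using System.Net;

    public class StreamingUploader
    {
        private readonly Stream _file;   // e.g. the stream from an OpenFileDialog

        public StreamingUploader(Stream fileStream)
        {
            _file = fileStream;
        }

        public void Start()
        {
            var request = (HttpWebRequest)WebRequest.Create(
                new Uri("http://example.com/upload"));   // placeholder endpoint
            request.Method = "POST";
            request.BeginGetRequestStream(OnRequestStream, request);
        }

        private void OnRequestStream(IAsyncResult ar)
        {
            var request = (HttpWebRequest)ar.AsyncState;
            using (Stream output = request.EndGetRequestStream(ar))
            {
                var buffer = new byte[64 * 1024];
                int read;
                while ((read = _file.Read(buffer, 0, buffer.Length)) > 0)
                {
                    output.Write(buffer, 0, read);
                    // A progress callback could be raised here with bytes written so far.
                }
            }   // closing the request stream is what actually starts the send

            request.BeginGetResponse(OnResponse, request);
        }

        private void OnResponse(IAsyncResult ar)
        {
            var request = (HttpWebRequest)ar.AsyncState;
            var response = (HttpWebResponse)request.EndGetResponse(ar);
            // Inspect the server's reply here if needed, then release it.
            response.Close();
        }
    }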

Google App Engine Large File Upload

I am trying to upload data to Google App Engine (using GWT). I am using the FileUploader widget and the servlet uses an InputStream to read the data and insert directly to the datastore. Running it locally, I can upload large files successfully, but when I deploy it to GAE, I am limited by the 30 second request time. Is there any way around this? Or is there any way that I can split the file into smaller chunks and send the smaller chunks?
By using the Blobstore you have a 1 GB size limit and a special handler, called, unsurprisingly, BlobstoreUploadHandler, that shouldn't give you timeout problems on upload.
Also check out http://demofileuploadgae.appspot.com/ (sourcecode, source answer) which does exactly what you are asking.
Also, check out the rest of GWT-Examples.
Currently, GAE imposes a limit of 10 MB on file upload (and response size) as well as 1 MB limits on many other things; so even if you had a network connection fast enough to pump up more than 10 MB within a 30 secs window, that would be to no avail. Google has said (I heard Guido van Rossum mention that yesterday here at Pycon Italia Tre) that it has plans to overcome these limitations in the future (at least for users of GAE which pay per-use to exceed quotas -- not sure whether the plans extend to users of GAE who are not paying, and generally need to accept smaller quotas to get their free use of GAE).
You would need to do the upload to another server - I believe that the 30 second timeout cannot be worked around. If there is a way, please correct me! I'd love to know how!
If your request is running out of request time, there is little you can do. Maybe your files are too big and you will need to chunk them on the client (with something like Flash or Java, or an upload framework like Plupload).
Once you get the file to the application there is another issue - the datastore limitations. Here you have two options:
you can use the Blobstore service, which has quite a nice API for handling uploads up to 50 megabytes
you can use something like bigblobae which can store virtually unlimited size blobs in the regular appengine datastore.
The 30 second response time limit only applies to code execution. So the uploading of the actual file as part of the request body is excluded from that. The timer will only start once the request is fully sent to the server by the client, and your code starts handling the submitted request. Hence it doesn't matter how slow your client's connection is.
Uploading file on Google App Engine using Datastore and 30 sec response time limitation
The closest you could get would be to split it into chunks as you store it in GAE and then when you download it, piece it together by issuing separate AJAX requests.
I would agree with chunking the data into smaller blobs and having two tables: one contains the metadata (filename, size, number of downloads, etc.) and the other contains the chunks, which are associated with the metadata table by a foreign key. I think it is doable...
Or, when you have uploaded all the chunks, you can simply put them together in one blob and keep just one table.
But the problem is, you will need a thick client to do the chunking, like a Java applet, which needs to be signed and trusted by your clients so it can access the local file system.
