Silverlight streaming upload - silverlight

I have a Silverlight application that needs to upload large files to the server. I've looked at uploading using both WebClient as well a HttpWebRequest, however I don't see an obvious way stream the upload with either option. Do to the size of the files, loading the entire contents into memory before uplaoding is not reasonable. Is this possible in Silverlight?

You could go with a "chunking" approach. The Silverlight File Uploader on Codeplex uses this technique:
http://www.codeplex.com/SilverlightFileUpld
Given a chunk size (e.g. 10k, 20k, 100k, etc), you can split up the file and send each chunk to the server using an HTTP request. The server will need to handle each chunk and re-assemble the file as each chunk arrives. In a web farm scenario when there are multiple web servers - be careful to not use the local file system on the web server for this approach.

It does seem extraordinary that the WebClient in Silverlight fails to provide a means to pump a Stream to the server with progress events. Its especially amazing since this is offered for a string upload!
It is possible to code what would be appear to be doing what you want with a HttpWebRequest.
In the call back for BeginGetRequestStream you can get the Stream for the outgoing request and then read chunks from your file's Stream and write them to the output stream. Unfortunately Silverlight does not start sending the output to the server until the output stream has been closed. Where all this data ends up being stored in the meantime I don't know, its possible that if it gets large enough SL might use a temporary file so as not to stress the machines memory but then again it might just store it all in memory anyway.
The only solution to this that might be possible is to write the HTTP protocol via sockets.

Related

Client side JS: Persist blob to disk before saving/prompting user for save location

1. The setup
I'm currently initiating a GET request to an S3 bucket (not important) to download a very large file using the browser fetch(). This file is, in it's stored form, raw and unusable binary data, not structured.
2. The task and problem
There are a few things I want to do on the client-side with this data:
I need to process this data as it streams into the client to perform transformations on it (decryption, for example).
Once the data is processed and downloaded, it might still not be of any immediate use to the user outside the context of the web UI. Maybe the data should stay stored within the web app's sandbox disk space unless a user explicitly exports it?
3. The question
Where can I store this blob of unstructured data in both or either of the use cases listed above? There appear to be many options but none that fit this use case precisely. Any thoughts?
EDIT:
I feel like an idiot. I totally forgot about the FileSystem API. I'll take a look and answer my own question with a pseudo-implementation of this works.
EDIT 2:
I feel the need to reiterate what I stated in 2.2 above:
within the web app's sandbox disk space
I don't care about accessing the user's whole file system. I just want a space I can work with large files in on disk, similar to the app space directories provided to mobile applications by Android and iOS.
If you want to save and process a file at client level, and Blob is not an option, you may consider File System Access API (https://developer.mozilla.org/en-US/docs/Web/API/File_System_Access_API#writing_to_files), even if this will introduce an interaction with the user.
Another option would be to take the advantages of PWAs client-side storage (https://developer.mozilla.org/en-US/docs/Learn/JavaScript/Client-side_web_APIs/Client-side_storage), this is also about your application architecture.
Before to check if to process your file at client level can be done as you need with the existing technologies, check if you really need to do that because it is only option, or, instead, if you are able to move such logic at server level, depending on your use cases.

Medium to large file uploads with progress updates in AspNet Core

By medium to large I mean anything from 10mb -> 200mb (sound files if that is important)
basically I want to make an API that does some spectral analysis on the file itself, this would require a file upload. But for UI/UX reasons it would be nice to have a progress bar for the upload process. What are the common architectures for achieving this interaction.
The client application uploading the file will be a javascript client (reactjs/redux) and the API is written in ASP.NET Core. I have seen some examples which use websockets to update the client on progress, and other examples where the client polls for status updates given a resource url to query the status. Are there any best practices (or the "modern way of doing this") for doing such a thing that I should know of? TIA
In general, you just need to save progress status while reading the input stream in your controller to some variable (session-specific variable, because there might be a few file uploading sessions at the same time) and then get this status from the client-side by ajax requests (or signalr).
You could take a look at this example: https://github.com/DmitrySikorsky/AspNetCoreUploadingProgress
I have tried 11 MB files with no problems. There is line
await Task.Delay(10); // It is only to make the process slower
there, don't forget to remove it in the real solution.
In this sample files are loaded by the ajax, so I didn't try really large files, but you can use iframe solution from this sample:
https://github.com/DmitrySikorsky/AspNetCoreFileUploading
The other part will be almost the same.
Hope this helps you. Feel free to ask if have any additional questions.

Processing a large (>32mb) xml file over appengine

I'm trying to process large (~50mb) sized xml files to store in the datastore. I've tried using backends, sockets (to pull the file via urlfetch), and even straight up uploading the file within my source code, but again keep running into limits (i.e. the 32 mb limit).
So, I'm really confused (and a little angry/frustrated). Does appengine really have no real way to process a large file? There does seem to be one potential work around, which would involve remote_apis, amazon (or google compute I guess) and a security/setup nightmare...
Http ranges was another thing I considered, but it'll be painful to somehow connect the different splitted parts together (unless I can manage to split the file at exact points)
This seems crazy so I thought I'd ask stackover flow... am I missing something?
update
Tried using range requests and it looks like the server I'm trying to stream from doesn't use it. So right now I'm thinking either downloading the file, hosting it on another server, then use appengine to access that via range http requests on backends AND then automate the entire process so I can run it as a cron job :/ (the craziness of having to do all this work for something so simple... sigh)
What about storing it in the cloud storage and reading it incrementally, as you can access it line by line (in Python anyway) so it wont' consume all resources.
https://developers.google.com/appengine/docs/python/googlecloudstorageclient/
https://developers.google.com/storage/
The GCS client library lets your application read files from and write
files to buckets in Google Cloud Storage (GCS). This library supports
reading and writing large amounts of data to GCS, with internal error
handling and retries, so you don't have to write your own code to do
this. Moreover, it provides read buffering with prefetch so your app
can be more efficient.
The GCS client library provides the following functionality:
An open method that returns a file-like buffer on which you can invoke
standard Python file operations for reading and writing. A listbucket
method for listing the contents of a GCS bucket. A stat method for
obtaining metadata about a specific file. A delete method for deleting
files from GCS.
I've processed some very large CSV files in exactly this way - read as much as I need to, process, then read some more.
def read_file(self, filename):
self.response.write('Truncated file content:\n')
gcs_file = gcs.open(filename)
self.response.write(gcs_file.readline())
gcs_file.seek(-1024, os.SEEK_END)
self.response.write(gcs_file.read())
gcs_file.close()
Incremental reading with standard python!

WCF File upload from Silverlight client

I am trying to upload files to my WCF service using streaming to upload large files. All of this works fine using a normal client (like a ASP.net page). In Silverlight however I get the following error:
Timeouts are not supported on this stream
I am uploading via a memorystream and I assume the issue is basically because instead of calling the synchronous method in Silverlight I am forced to call the async method. So it is this that doesn't like the normal memorystream. I have tried to find some other stream to use but it seems like either they are not supported in silverlight (bufferedstream, networkstream) or break the method (generic stream which for some reason MUST be the only parameter of the method to be used). Am I missing something here? I originally was using a byte array but there are too many size limitations there for what I need to allow to be uploaded.
I can insert my code here but since everything works flawlessly with my ASP.net test client I am assuming my bindings and code are fine.
There are three separate issues here:
1) Can you use the Stream type in your contract?
2) Can you get true streaming behavior on the client? (E.g. upload 2GB file without allocating 2GB of memory anywhere in the stack - including the underlying HTTP stack)
3) Can you get true streaming behavior on the server?
As far as I remember, the answers to #1 and #2 are "no" in Silverlight (though it may have changed in SL4.0). So the best you can achieve is #3. For example, you can try some trick with having a byte[]-based contract on the Silverlight side that results in the same XML projection as a Stream-based contract on the server side. Or, have byte[] on the client side and read from the Message class directly on the server side.
But my recollection about #1/#2 may be wrong...

Google App Engine Large File Upload

I am trying to upload data to Google App Engine (using GWT). I am using the FileUploader widget and the servlet uses an InputStream to read the data and insert directly to the datastore. Running it locally, I can upload large files successfully, but when I deploy it to GAE, I am limited by the 30 second request time. Is there any way around this? Or is there any way that I can split the file into smaller chunks and send the smaller chunks?
By using the BlobStore you have a 1 GB size limit and a special handler, called unsurprisingly BlobstoreUpload Handler that shouldn't give you timeout problems on upload.
Also check out http://demofileuploadgae.appspot.com/ (sourcecode, source answer) which does exactly what you are asking.
Also, check out the rest of GWT-Examples.
Currently, GAE imposes a limit of 10 MB on file upload (and response size) as well as 1 MB limits on many other things; so even if you had a network connection fast enough to pump up more than 10 MB within a 30 secs window, that would be to no avail. Google has said (I heard Guido van Rossum mention that yesterday here at Pycon Italia Tre) that it has plans to overcome these limitations in the future (at least for users of GAE which pay per-use to exceed quotas -- not sure whether the plans extend to users of GAE who are not paying, and generally need to accept smaller quotas to get their free use of GAE).
you would need to do the upload to another server - i believe that the 30 second timeout cannot be worked around. If there is a way, please correct me! I'd love to know how!
If your request is running out of request time, there is little you can do. Maybe your files are too big and you will need to chunk them on the client (with something like Flash or Java or an upload framework like pupload).
Once you get the file to the application there is another issue - the datastore limitations. Here you have two options:
you can use the BlobStore service which has quite nice API for handling up 50megabytes large uploads
you can use something like bigblobae which can store virtually unlimited size blobs in the regular appengine datastore.
The 30 second response time limit only applies to code execution. So the uploading of the actual file as part of the request body is excluded from that. The timer will only start once the request is fully sent to the server by the client, and your code starts handling the submitted request. Hence it doesn't matter how slow your client's connection is.
Uploading file on Google App Engine using Datastore and 30 sec response time limitation
The closest you could get would be to split it into chunks as you store it in GAE and then when you download it, piece it together by issuing separate AJAX requests.
I would agree with chunking data to smaller Blobs and have two tables, one contains th metadata (filename, size, num of downloads, ...etc) and other contains chunks, these chunks are associated with the metadata table by a foreign key, I think it is doable...
Or when you upload all the chunks you can simply put them together in one blob having one table.
But the problem is, you will need a thick client to serve chunking-data, like a Java Applet, which needs to be signed and trusted by your clients so it can access the local file-system

Resources