Uploading multiple files to blobstore (redux)

Yes, I've seen this question already, but I'm finding information in the GAE docs that contradicts its accepted answer and Nick Johnson's blog.
The docs talk about uploading more than one file at the same time - the function to get uploaded files returns a list:
The get_uploads() method returns a list of BlobInfo objects, one for each uploaded file in the request.
But everywhere I've looked, the going assumption is that only one file a time can be uploaded, and a new upload url needs to be created each time.
Is it even possible to upload more than one file at the same time using HTML5/Flash with Plupload?

Currently, the blobstore service upload URLs only support one file upload per post. In order to upload multiple files, you need to use the pattern documented in my blog posts. In future, we may extend the blobstore API to support more flexible upload URLs, supporting multiple uploaded files in a single request.
Edit: The blobstore now supports multiple file uploads in a single request.

Here's how I use the get_uploads() method for more than one file:
uploads = self.get_uploads()  # one BlobInfo per uploaded file, in form-field order
blob_info = uploads[0]
blob_info2 = uploads[1]
Nick Johnson's dropbox service is another example and I hope you find what suits your needs.
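If you don't know in advance how many files will arrive, here is a minimal sketch that loops over the whole list instead of hard-coding indexes; it assumes webapp2 and the standard blobstore_handlers module, and the route and response format are illustrative only:
from google.appengine.ext.webapp import blobstore_handlers
import webapp2


class MultiUploadHandler(blobstore_handlers.BlobstoreUploadHandler):
    def post(self):
        # get_uploads() returns one BlobInfo per file in the request,
        # so iterate rather than indexing fixed positions.
        uploads = self.get_uploads()
        keys = [str(blob_info.key()) for blob_info in uploads]
        self.response.out.write('\n'.join(keys))


app = webapp2.WSGIApplication([('/upload', MultiUploadHandler)])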

S3 multipart upload with React JS

I am trying to upload image/video files into an S3 bucket from my React JS application, so I referred to some of the React S3 uploader npm packages, react-dropzone-s3-uploader and react-s3-uploader-multipart, but both keep giving errors when imported into my React JS component. I have already posted this error message in another Stack Overflow question (please refer to that question). I would like to do this multipart upload directly from my React application to the S3 bucket. If anyone knows the solution, please share it with me.
Thanks in advance.
The only library that worked for me out of the box and supported AWS S3 multipart uploads with minimal work was Uppy. Highly recommended to try out:
https://uppy.io/docs/aws-s3-multipart/
You will need to provide a couple of endpoints for it though, so read the docs. You will see "Companion" mentioned there; you can safely ignore it, provide the five endpoints of your custom API listed below, and it will all work. I would suggest running the UI part first, putting in some dummy URLs for these five functions, and watching the browser's network activity to understand more quickly how it works.
A function that calls the S3 Multipart API to create a new upload
A function that calls the S3 Multipart API to list the parts of a file that have already been uploaded
A function that generates a batch of signed URLs for the specified part numbers
A function that calls the S3 Multipart API to abort a Multipart upload, and removes all parts that have been uploaded so far
A function that calls the S3 Multipart API to complete a Multipart upload, combining all parts into a single object in the S3 bucket
Yet no matter which way you build multipart upload, you will always need to start the upload, list the parts, get signed URLs to upload each part, and abort or complete the upload. So it will never be a three-minute task to build, but with Uppy I had the most success.
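For reference, here is a rough sketch (not Uppy's own code) of what those five server-side operations might look like with boto3; the bucket name, key handling, and how you expose these as HTTP endpoints are all assumptions of mine:
import boto3

s3 = boto3.client('s3')
BUCKET = 'my-upload-bucket'  # placeholder bucket name


def create_upload(key, content_type):
    # Start a new multipart upload and hand the id back to the client.
    resp = s3.create_multipart_upload(Bucket=BUCKET, Key=key, ContentType=content_type)
    return {'uploadId': resp['UploadId'], 'key': key}


def list_uploaded_parts(key, upload_id):
    # Parts already uploaded, used when resuming an interrupted upload.
    resp = s3.list_parts(Bucket=BUCKET, Key=key, UploadId=upload_id)
    return resp.get('Parts', [])


def sign_parts(key, upload_id, part_numbers):
    # One pre-signed URL per part number; the browser PUTs each chunk directly to S3.
    return {
        n: s3.generate_presigned_url(
            'upload_part',
            Params={'Bucket': BUCKET, 'Key': key, 'UploadId': upload_id, 'PartNumber': n},
            ExpiresIn=3600,
        )
        for n in part_numbers
    }


def abort_upload(key, upload_id):
    # Cancels the upload and discards any parts uploaded so far.
    s3.abort_multipart_upload(Bucket=BUCKET, Key=key, UploadId=upload_id)


def complete_upload(key, upload_id, parts):
    # `parts` is a list of {'ETag': ..., 'PartNumber': ...} collected from the client.
    return s3.complete_multipart_upload(
        Bucket=BUCKET, Key=key, UploadId=upload_id,
        MultipartUpload={'Parts': parts},
    )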
You can use React Dropzone Uploader, which gives you file previews (including image thumbnails) out of the box, and also handles uploads for you.
Uploads have progress indicators, and they can be cancelled or restarted. The UI is fully customizable.
Here's an example of how to upload files directly to an S3 bucket, using pre-signed URLs.
Full disclosure: I wrote this library.
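That example covers the client side; on the server, generating a pre-signed PUT URL with boto3 might look roughly like this (the bucket name, key handling, and expiry here are placeholders):
import boto3

s3 = boto3.client('s3')


def get_upload_url(key, content_type):
    # Short-lived URL the browser can PUT the file to directly.
    return s3.generate_presigned_url(
        'put_object',
        Params={'Bucket': 'my-upload-bucket', 'Key': key, 'ContentType': content_type},
        ExpiresIn=600,
    )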
Here's a way to do it full stack with MERN and Express file upload. The server code here is minimal. This might be helpful; if not, no worries!
https://link.medium.com/U1SdsoHMy2

Download large file on Google App Engine Python

On my appspot website, I use a third-party API to query a large amount of data. The user then downloads the data as CSV. I know how to generate a CSV and download it; the problem is that because the file is huge, I get a DeadlineExceededError.
I have tried increasing the fetch deadline to 60 seconds (urlfetch.set_default_fetch_deadline(60)), but it doesn't seem reasonable to increase it any further.
What is the appropriate way to tackle this problem on Google App Engine? Is this something where I have to use Task Queue?
Thanks.
DeadlineExceededError means that your incoming request took longer than 60 seconds to handle, not that your URLFetch call did.
Deploy the code that generates the CSV file into a different module that you set up with basic or manual scaling. The URL to download your CSV will then become http://module.domain.com
Requests can run indefinitely on modules with basic or manual scaling.
Alternatively, consider creating the file dynamically in Google Cloud Storage (GCS) with your CSV content. At that point the file resides in GCS, and you can generate a URL from which the user can download it directly. There are also options for different auth methods.
You can see documentation on doing this at
https://cloud.google.com/appengine/docs/python/googlecloudstorageclient/
and
https://cloud.google.com/appengine/docs/python/googlecloudstorageclient/functions
Important note: do not use the Files API (which was a common way of dynamically creating files in Blobstore/GCS) as it has been deprecated. Use the Google Cloud Storage Client API referenced above instead.
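For illustration, a minimal sketch of writing the generated CSV with that Cloud Storage client library; the bucket and object names are placeholders:
import cloudstorage as gcs


def write_csv_to_gcs(rows):
    # GCS object paths use the '/bucket/object' form; names here are placeholders.
    filename = '/my-bucket/exports/report.csv'
    with gcs.open(filename, 'w', content_type='text/csv') as f:
        for row in rows:
            f.write(','.join(row) + '\n')
    return filename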
Of course, you can delete the generated files after they've been successfully downloaded and/or you could run a cron job to expire links/files after a certain time period.
Depending on your specific use case, this might be a more effective path.

Uploading to Google Cloud Storage using Blobstore: Blobstore doesn't retain file name upon upload

I'm trying to upload to GCS using the Blobstore. I have set the GCS bucket name while generating the upload url, and the file gets uploaded successfully.
In the upload handler, blobInfo.getFilename() returns the right file name, but the file actually gets saved in the GCS bucket under a different name. Each time, the file name is some random hash like this one:
L2FwcGhvc3RpbmdfcHJvZC9ibG9icy9BRW5CMlVvbi1XNFEyWEJkNGlKZHNZRlJvTC0wZGlXVS13WTF2c0g0LXdzcEVkaUNEbEEyc3daS3Vham1MVlZzNXlCSk05ZnpKc1RudDJpajF1TmxwdWhTd2VySVFLdUw3US56ZXFHTEZSLVoxT3lablBI
Is this how it will work? Is this an anomaly?
I store the file name in the datastore based on the value returned from blobInfo.getFilename(), which is the correct file name. But I'm unable to access the file using the GcsFilename, since the file is stored in GCS with that random hash as its name.
Any pointers would be greatly helpful.
Thanks!
PS: The Blobstore page says that BlobInfo is currently not available for GCS objects, yet BlobInfo.getFilename() returns the right value for me. Is something wrong on my end?
It's how it works, see https://cloud.google.com/appengine/docs/python/blobstore/fileinfoclas ...:
FileInfo metadata is not persisted to datastore [...] You must save the gs_object_name yourself in your upload handler or this data will be lost
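A minimal sketch of what saving it in the upload handler could look like, assuming webapp2's BlobstoreUploadHandler and an ndb model of your own naming:
from google.appengine.ext import ndb
from google.appengine.ext.webapp import blobstore_handlers


class UploadedFile(ndb.Model):
    filename = ndb.StringProperty()        # the name the user gave the file
    gs_object_name = ndb.StringProperty()  # where it actually lives in GCS


class UploadHandler(blobstore_handlers.BlobstoreUploadHandler):
    def post(self):
        # FileInfo carries the real GCS object name; persist it ourselves.
        file_info = self.get_file_infos()[0]
        UploadedFile(filename=file_info.filename,
                     gs_object_name=file_info.gs_object_name).put()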
I personally recommend that new applications use https://cloud.google.com/appengine/docs/python/googlecloudstorageclient/ directly, rather than the blobstore emulation on top of it.
The latter is currently provided essentially only for (limited, partial) backwards compatibility: it's not really all that suitable for new applications.

From Drive to Blobstore using Picker

I have the Google Picker set up, as well as the Blobstore. I'm able to upload files from my local machine to the Blobstore, and now the Picker is set up and working, but I don't know how to use the info it returns (URL? file id?) to load the selected file into the Blobstore. Any tips on how to do this? I haven't been able to find much of anything on it in Google's resources.
There isn't a direct link between the Google Picker and the App Engine Blobstore; they are different tools for different jobs. The Google Picker is designed as an end-user tool to select data from a user's Google account. It just so happens that the Picker also provides an upload interface (to Google Drive). The Blobstore, on the other hand, is designed as a blob storage mechanism for your App Engine application.
In theory, you could write a script to connect the two, but there are a few considerations:
Your app would need access to the user's Google Drive account using OAuth2. This is necessary because the Picker API is a client-side API, whereas the Blobstore API is a server-side API. You would need to send the selected document URL to the server, then download the document and finally save it to Blobstore.
Unless you then deleted the data from Drive (very risky due to point 3), your data would be persisted in two places
You cannot know for sure if the user selected an existing file, or uploaded a new one
Not a great user experience - the user thinks they are uploading to Drive
In essence, this sounds like a bad idea! What is your use case?
@Gwyn - I don't have enough reputation to add a comment to your solution, but I had an idea about problem #3: you cannot know for sure if the user selected an existing file or uploaded a new one.
Would it be possible to use Response.VIEW to see what view they were using when the file was selected? If you have one view constructor for Drive files and one for Upload files, something like
var driveView = new google.picker.View(google.picker.ViewId.DOCS);
var uploadView = new google.picker.DocsUploadView();
would that allow you to know whether the file was a new upload (safe to delete) or an existing file (leave it alone)?
Assuming that you want to pick a file from your own Google Drive and move it to the Blobstore:
1) First you have to perform OAuth for the Google Drive API.
2) Using the Picker, when you select a file from Drive you need to get its id.
3) Using the id obtained in step 2, you can programmatically download the file using the Drive API.
4) After downloading the file, you can use FileService (deprecated though) to upload it to the Blobstore.
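Here is a rough sketch of steps 3 and 4, assuming you already have an authorized Drive service object and the file id from the Picker; note that instead of the deprecated FileService it writes to GCS with the Cloud Storage client, and all names are illustrative:
import io

import cloudstorage as gcs
from googleapiclient.http import MediaIoBaseDownload


def copy_drive_file_to_gcs(drive, file_id, bucket):
    # `drive` is an authorized googleapiclient Drive v3 service object (step 1).
    buf = io.BytesIO()
    downloader = MediaIoBaseDownload(buf, drive.files().get_media(fileId=file_id))
    done = False
    while not done:
        _, done = downloader.next_chunk()

    # Write the downloaded bytes into your GCS bucket (placeholder naming).
    object_name = '/%s/%s' % (bucket, file_id)
    with gcs.open(object_name, 'w') as f:
        f.write(buf.getvalue())
    return object_name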

Can I use Google App Engine to implement this project?

I'm taking a web application course this semester and I want to use Google App Engine to implement my course project, but I'm wondering whether GAE can satisfy the project's requirements.
The course project is a homework submission system that lets users (students) upload homework to the server and teachers check the homework online.
Assume the homework students upload is HTML and CSS files. What confuses me is how to implement the teachers' online checking function. For example:
Student A uploads an HTML file hello.html and the teacher wants to open http://xxx.xx/xx/xx/hello.html to check this homework.
Can GAE satisfy this requirement? As far as I know, GAE uses app.yaml to point to different files or HTML pages, but when students upload their homework they can't change app.yaml, right?
I'm stuck here. Please help me. Thank you!
Yes, you can use GAE to create this application, but you'll have to move away from the idea that you are uploading and serving an HTML file as if it were living directly on the filesystem. You can't do that.
What you can do -- relatively easily -- is store the submitted file or files as datastore objects and provide a URL which takes the desired filename as a parameter and serves it out of the datastore.
You could store the submitted files in a model like this:
class HomeworkItem(db.Model):
    author = db.UserProperty()
    filename = db.StringProperty()
    content = db.TextProperty()
    submitted_on = db.DateProperty()
The content field is declared as a TextProperty, assuming that you are dealing with HTML and CSS files; if you were ever going to deal with binary data, you'd want to use a BlobProperty instead.
You'd need to have two URLs to handle upload and download of assets. You can use a web framework or write some code to handle parameterized URLs, allowing you to encode things like the filename into the URL itself, like this:
http://homeworkapp.edu/review/hello.html
And then the method that handles /review/* URLs would retrieve the data from the datastore and send it back as the reply.
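A minimal sketch of those two handlers, assuming webapp2 and the HomeworkItem model above; the routes and form-field names are illustrative, not prescriptive:
import datetime

import webapp2
from google.appengine.api import users


class UploadHandler(webapp2.RequestHandler):
    def post(self):
        # Store the submitted file as a datastore entity rather than on disk.
        HomeworkItem(
            author=users.get_current_user(),
            filename=self.request.get('filename'),
            content=self.request.get('content'),
            submitted_on=datetime.date.today(),
        ).put()


class ReviewHandler(webapp2.RequestHandler):
    def get(self, filename):
        # Look the submission up by filename and serve its content back.
        item = HomeworkItem.all().filter('filename =', filename).get()
        if item is None:
            self.abort(404)
        self.response.headers['Content-Type'] = 'text/html'
        self.response.out.write(item.content)


app = webapp2.WSGIApplication([
    ('/upload', UploadHandler),
    (r'/review/(.*)', ReviewHandler),
])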
GAE would satisfy your requirement but you would need to save each “hello.html” file in either the Blobstore or the Datastore and build some system to retrieve and serve the uploaded files. See this Q&A for further reference.
