I have a requirement to access large pandas DataFrame files to run some analytics in an App Engine worker triggered via Google Cloud Tasks.
Can somebody please suggest which Google Cloud component can be used for storing and accessing the files quickly?
Any reference to an example would be a great help.
I think Google Cloud Storage is the best place to store and access the files quickly:
https://cloud.google.com/storage/docs/discover-object-storage-console
GCS can store large files and a very large number of files.
You can also use:
gsutil to move or copy files between buckets.
Storage Transfer Service:
https://cloud.google.com/storage-transfer/docs/overview
Your application on App Engine can read from and write to Cloud Storage directly.
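As a minimal sketch (assuming the DataFrame is stored as Parquet in a hypothetical bucket named my-analytics-bucket, and that pandas and gcsfs are installed in the worker), the Cloud Tasks handler could read the file straight from the bucket:

    # Sketch: read a DataFrame from Cloud Storage inside a task handler.
    # Bucket name, object path and column names are illustrative placeholders.
    import pandas as pd

    GCS_PATH = "gs://my-analytics-bucket/frames/input.parquet"

    def handle_task(request):
        # pandas delegates gs:// URLs to gcsfs, so no local copy is needed
        df = pd.read_parquet(GCS_PATH)
        result = df.groupby("category")["value"].sum()  # placeholder analytics
        return result.to_json()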
Related
I'm trying to build a system where a user selects a large dataset from their dropbox, and this data is downloaded to a google cloud storage bucket.
The problem is that my backend code runs on AppEngine, and therefore I cannot download the large file to disk for uploading to the bucket.
Is there a way to programmatically tell Cloud storage to retrieve data from a URL?
Or is there another way to download this data on an AppEngine instance, and upload it from there?
You can't directly tell GCS to download a file from the Internet and save it in a bucket.
On the other hand, moving a large collection of objects is the business of Google's Storage Transfer service. It may suit your needs, depending on what you mean by "a large dataset." https://cloud.google.com/storage-transfer/docs/overview
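If Storage Transfer Service doesn't fit, one possible workaround (not part of the answer above) is to stream the file through the App Engine instance into the bucket without writing it to local disk. A minimal sketch with the requests and google-cloud-storage libraries, using a placeholder URL and bucket name:

    # Sketch: stream an HTTP download directly into a GCS object, no local file.
    # The source URL and bucket/object names are illustrative placeholders.
    import requests
    from google.cloud import storage

    def copy_url_to_gcs(source_url, bucket_name, object_name):
        client = storage.Client()
        blob = client.bucket(bucket_name).blob(object_name)
        with requests.get(source_url, stream=True) as resp:
            resp.raise_for_status()
            # upload_from_file reads from the streaming response in chunks
            blob.upload_from_file(resp.raw, content_type=resp.headers.get("Content-Type"))

    copy_url_to_gcs("https://example.com/big-dataset.zip", "my-bucket", "big-dataset.zip")

Memory use stays bounded because the response is streamed, but the instance still pays the network cost of the transfer.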
I am migrating my Google App Engine project to the Windows Azure platform. Please help me migrate all of the Google App Engine Blobstore files to Azure Blob storage.
I have one solution in Python, but I am not very familiar with Python. Please help me if it is possible with JavaScript, Java or any tool.
A simple way to do this (if you are not working with a huge amount of data) would be to download the google app engine blobstore files to a local drive, and then upload them using a tool like Azure Storage Explorer (http://storageexplorer.com/). Azure Storage Explorer lets you upload a directory through the user interface. Or you can use the AzCopy command-line interface to upload local files to Blob storage (https://azure.microsoft.com/en-us/documentation/articles/storage-use-azcopy/).
For downloading google blobstore files, you may be able to use a tool or interface like the ones described here: How can I get to the blobstore in the new Google Cloud Console?
If you have a very large amount of data, then you may need a different option. For uploading very large amounts of data to Azure Blob storage, you can use the Import/Export service, and send Microsoft a drive containing your data (https://azure.microsoft.com/en-us/documentation/articles/storage-import-export-service/).
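If you do end up scripting the Azure side yourself, a minimal sketch with the azure-storage-blob Python package (connection string, container name and local directory are placeholders; the Azure SDKs for Java and JavaScript expose equivalent clients):

    # Sketch: upload a local directory of downloaded Blobstore files to Azure Blob storage.
    # The connection string, container name and directory are placeholders.
    import os
    from azure.storage.blob import BlobServiceClient

    service = BlobServiceClient.from_connection_string("<your-connection-string>")
    container = service.get_container_client("migrated-blobstore-files")

    local_dir = "blobstore_export"  # files previously downloaded from Blobstore
    for name in os.listdir(local_dir):
        with open(os.path.join(local_dir, name), "rb") as f:
            # overwrite=True lets the script be re-run safely
            container.upload_blob(name=name, data=f, overwrite=True)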
I am developing an app where users can upload images. The app has a NodeJS backend and an Angular frontend, with Redis and Neo4j, all dockerized and run by Kubernetes. Now I would like to store images, but there are so many services that I think could do the job that I don't know what to do... Can I use my Google Drive account and the Drive SDK to upload the images of my users? Should I look into Google Cloud Storage? What about the persistent storage options in Kubernetes? Or can I use my Flickr account??? Could someone point me in the right direction... Thanks
For uploading and storing static files such as images in the cloud on GCP, you should probably be using Cloud Storage.
While both Google Drive and Google Cloud Storage provide an API to upload files, Cloud Storage is better suited to your use case. I took this excerpt from here:
Cloud Storage is intended to be accessed primarily through its API and provides all the functionality necessary for developers to use it as a backing store for their own applications.
and
Cloud Storage enables developers to store their application data in the Google cloud (and they're responsible for the storage their app consumes), whereas in Drive, users allow apps to interact with the user's private storage and content.
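As a rough sketch of what that API usage looks like (bucket and object names are placeholders, and this is Python; the Node.js @google-cloud/storage package exposes the same operations for your backend):

    # Sketch: store an uploaded image in a Cloud Storage bucket.
    # Bucket name and object path are illustrative placeholders.
    from google.cloud import storage

    def save_user_image(user_id, filename, image_bytes):
        client = storage.Client()
        bucket = client.bucket("my-app-user-images")
        blob = bucket.blob(f"users/{user_id}/{filename}")
        blob.upload_from_string(image_bytes, content_type="image/jpeg")
        return blob.public_url  # or generate a signed URL for private access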
I'm new to AppEngine and I'm building an app that accept user image uploads from Android devices.
I built it with Cloud Storage but then I realized that I have problems uploading large files (maybe because of request time limits?)
so I figured out I should use Blobstore's upload URL to properly upload multiple large files.
Blobstore also has the on-the-fly image resizing feature which is very nice.
The thing is, Cloud Storage is cheaper than Blobstore.
should I move the uploaded files from Blobstore to Cloud Storage after uploading ?
is there a way to upload multiple large files to AppEngine without going through the Blobstore upload URL way ?
I'm using Go, if it matters.
The simplest answer is probably to use a signed URL to allow the user to upload directly to Cloud Storage. This lets you bypass App Engine entirely for your upload, which in turn simplifies the network usage and allows you to take full advantage of all of Cloud Storage's upload infrastructure.
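A minimal sketch of generating such a signed upload URL on the server with the google-cloud-storage Python client (bucket and object names are placeholders; the Go client's SignedURL function works similarly):

    # Sketch: create a V4 signed URL the client can use to PUT a file directly to GCS.
    # Bucket and object names are illustrative placeholders.
    import datetime
    from google.cloud import storage

    def make_upload_url(bucket_name, object_name):
        client = storage.Client()
        blob = client.bucket(bucket_name).blob(object_name)
        return blob.generate_signed_url(
            version="v4",
            expiration=datetime.timedelta(minutes=15),
            method="PUT",
            content_type="application/octet-stream",
        )

    url = make_upload_url("my-upload-bucket", "uploads/big-file.bin")
    # The client then PUTs the file to this URL with the matching Content-Type.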
Currently, Blobstore is $0.0009/GB-hour, while Cloud Storage is $0.0027/GB-hour, so it seems that Blobstore is now 3 times cheaper than Cloud Storage. So while there may be reasons to move to Cloud Storage, cost is not currently one of them. Note that the prices changed recently.
If you need the richer API provided by Cloud Storage, then that's another story of course.
I'm using boto in an App Engine backend instance to get the file and the GAE Storage APIs to store it. I raised the default fetch deadline.
However, for files that go over the 32 MB limit there's a problem, and ResumableDownloadHandler doesn't help there because the GCS file handle has an incompatible interface.
Can anyone suggest an existing solution that would be resumable and not require a file-system?
This is a known problem, which we're working on. In the meantime you can use the Google Python client library as an alternative to the Files API.
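A rough sketch of that alternative, using the client library's resumable media upload to stream data into GCS in chunks rather than going through the Files API (bucket, object name and chunk size are placeholders):

    # Sketch: resumable, chunked upload to GCS via the JSON API client library,
    # avoiding the Files API and any local filesystem.
    # Bucket/object names and chunk size are illustrative placeholders.
    import io
    from googleapiclient.discovery import build
    from googleapiclient.http import MediaIoBaseUpload

    def upload_stream(stream, bucket_name, object_name):
        service = build("storage", "v1")  # uses application default credentials
        media = MediaIoBaseUpload(
            stream,
            mimetype="application/octet-stream",
            chunksize=1024 * 1024,  # 1 MB chunks, well under the 32 MB fetch limit
            resumable=True,
        )
        request = service.objects().insert(
            bucket=bucket_name, name=object_name, media_body=media
        )
        response = None
        while response is None:
            # next_chunk() sends one chunk and can be retried on failure
            status, response = request.next_chunk()
        return response

    upload_stream(io.BytesIO(b"example data"), "my-bucket", "large-object.bin")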