Storing data in a Google App Engine App - google-app-engine

I'm reading up on Google App Engine and I'm thinking of using it as a CDN for a project I'm working on. As far as I can tell, there are two ways to store data: I could use the datastore, or I could put files in a directory.
I was brought up believing it's a bad idea to store large binary data in a database, but according to Google, the datastore isn't an RDBMS, it just acts like one.
So my gut is telling me to upload files to a directory. However, I thought I'd best canvass opinion on here before making up my mind.
Has anyone used GAE for stuff like this? And if so, what method did you choose for storing files, and why?

You cannot write to the file system in App Engine. You need to use the Datastore to store any data.
Note that if your "large binary files" are actually large, you're going to run into the 1MB limit on all API calls. An API for storing larger blobs is on the roadmap, but there's no way of knowing when it will be released. At present, you need to split blobs larger than 1MB into multiple datastore entities.
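The splitting itself is plain byte slicing, independent of the datastore API. A minimal sketch of the chunking step (the `MAX_ENTITY_BYTES` name and margin are illustrative, and the actual datastore put() calls are omitted):

```python
# Split a large blob into pieces that each fit under the ~1MB
# datastore entity limit; each (index, chunk) pair would then be
# stored as its own entity, with the index preserving the order.
MAX_ENTITY_BYTES = 1000000  # stay safely under the 1MB API limit

def split_blob(data):
    """Return (index, chunk) pairs, one per datastore entity."""
    return [(i // MAX_ENTITY_BYTES, data[i:i + MAX_ENTITY_BYTES])
            for i in range(0, len(data), MAX_ENTITY_BYTES)]
```

Each entity then stays under the per-call limit, at the cost of one datastore write per chunk.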

The Blobstore API lets you store files up to 50 MB, though it's an experimental API and requires billing to be enabled. Also, it's different from Bigtable.
http://code.google.com/appengine/docs/java/blobstore/

Nowadays Google Cloud Storage is the way to go for large files.

Related

How to host large files on Google-App-Engine

I would like my server running on Google App Engine to host large files such as audio scripts and images. Is it possible to store them as a column in a database? If not, what mechanisms may I use?
You have two options:
Blobstore (currently available in Java, Python and Go).
Google Cloud Storage (currently available in Java, Python and PHP).
Blobstore and GCS are most likely what you are looking for.
Both services are not covered by the GAE SLAs, however. If you need that kind of reliability promise, you're stuck with the GAE datastore.
You can put your files in a BLOB property of a datastore entity and serve them from there. Datastore entities have a size limit of 1MB, however.
To work around that, you must split and re-assemble your files across multiple entities. There is also a size limit on any GAE response, which is 32MB.
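On the serving side, re-assembly is a sort-and-join over the chunk entities, with a guard for the response limit. A rough sketch (entity fetching omitted, names illustrative):

```python
MAX_RESPONSE_BYTES = 32 * 1024 * 1024  # GAE limit on any response

def reassemble(chunks):
    """Join (index, bytes) chunk pairs back into the original file.

    Raises ValueError if the result would exceed the 32MB response
    limit, in which case the file cannot be served in one request.
    """
    data = b"".join(c for _, c in sorted(chunks))
    if len(data) > MAX_RESPONSE_BYTES:
        raise ValueError("file too large for a single GAE response")
    return data
```

Sorting by index means the chunks can be fetched in any order, e.g. from a keys-only query followed by a batch get.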

Is it better to store a physical Image into datastore or a link to it?

I need to store images and I have 2 options:
store the image into GAE datastore.
store the image somewhere (maybe also on Dropbox or another website) and store its link into GAE datastore.
What's the best practice when we need to store an image in the DB, on the hypothesis that each image is bijectively linked to a specific element of the datastore?
I think it depends heavily on the use case.
I have a small company website running on App Engine, and the content images are all stored in the datastore; for that application it works well (they are all relatively small images).
If you have a high-traffic site, you may find that storing them in GCS, or some other mechanism that supports a more cost-effective CDN, is more appropriate.
If the images are large (more than 1MB) then the datastore isn't a practical solution.
There will be no hard and fast rule. Understand your use cases, your cost structure, how complex the solution will be to manage, and then choose the most appropriate solution.
Neither of the above. Google's cloud platform includes a service specifically for storing files, Google Cloud Storage, which is well integrated into GAE. You should use that.

Should GAE Blobstore be used in place of a traditional database

I am writing a web application that requires a database which will have entities like users, friends etc. Since the Cloud SQL service is not free, I am looking for alternatives. Amazon RDS is one option, since they have a free tier which would suit my needs in the short term, but before I get into it I would like to know more about blobstores.
Is it appropriate to use the blobstore to store this kind of information?
There are questions like:
How will read/write latency compare to a traditional DB?
If I start with the blobstore and later want to move to a relational DB, what problems could I face?
Most important of all: is the blobstore a good fit for my scenario?
After looking at the documentation on google dev site I have found that blobstores are used to store large/medium files like images and videos.
You can't and shouldn't try to use the blobstore for structured data. That's what the datastore is for. Blobstore is for unstructured data such as files.

Google App Engine BlobStore as image host?

If I were to make a project with the Google App Engine (using Python), and this project contained small user-generated images (which, once uploaded, will be accessed a lot, but won't change anymore), would the Google App Engine BlobStore make sense to use (in terms of costs, speed etc.)? Or would it make more sense for GAE, or the client, to connect to Amazon S3 and store the images there, as these files will end up being static?
For what it's worth, the generated image files are all considered to be public, not user-private, and it would be perfectly fine for them to be on another subdomain. All files will be fixed-palette 16-color PNGs of exactly 19x19 pixels. Their URL/ID would be referenced in the GAE datastore, with a couple more attributes (like creatorId), for handling/showing them in the web app.
Thanks!
If you are concerned about speed and cost, by far the best way is to store them in the blobstore and use get_serving_url() (link). These images are served by google's high performance image servers, and will never cost you instance hours, just bandwidth, and you don't have to worry about memcache.
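A nice property of these serving URLs is that resized variants can be requested just by appending a size suffix to the URL that get_serving_url() returned. A small helper for building such URLs (the suffix format follows the documented `=sNN` / `=sNN-c` pattern; the function name is illustrative):

```python
def sized_url(base_url, size=None, crop=False):
    """Build a variant of an image serving URL.

    With no size, '=s0' requests the full-resolution image; with a
    size, '=sNN' requests a proportionally resized version, and
    adding '-c' additionally requests a center crop to a square.
    """
    if size is None:
        return base_url + "=s0"
    suffix = "=s%d" % size
    if crop:
        suffix += "-c"
    return base_url + suffix
```

The resizing happens on Google's image servers, so thumbnails cost no instance hours either.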
I asked a similar question a few days ago.
I'm sticking with storing the images in the Datastore as BLOBs (not in the BlobStore) and making sure I set a Cache-Control header so they aren't requested too many times.
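Since the images never change, the header can advertise a long lifetime. A sketch of the header computation (the one-year figure is just an example choice; in a webapp handler you would copy these into the response headers):

```python
def cache_headers(days):
    """Headers telling browsers/proxies to cache an immutable image.

    'public' lets shared caches keep a copy; max-age is expressed
    in seconds, so the day count is converted accordingly.
    """
    seconds = days * 24 * 60 * 60
    return {"Cache-Control": "public, max-age=%d" % seconds}
```

With a year-long max-age, repeat visitors hit their browser cache instead of your datastore.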
For such small images, you can simply use the Datastore. Even with the 1GB of space it gives you in the free quotas, you should be able to store a few 19x19 pixel images easily. Using BlobStore is slightly more complicated, as the APIs are more complex and the actual storage procedure involves more steps than just storing binary data in the Datastore. I do recommend, however, that you use memcache for retrieval of these images, since you say they will not be modified afterwards. You don't want to query the same 19*19*4 bytes out of the database for each image over and over.

Back up AppEngine database (Google cloud storage?)

I have an AppEngine application that currently has about 15GB of data, and it seems to me that it is impractical to use the current AppEngine bulk loader tools to back up datasets of this size. Therefore, I am starting to investigate other ways of backing up, and would be interested in hearing about practical solutions that people may have used for backing up their AppEngine Data.
As an aside, I am starting to think that the Google Cloud Storage might be a good choice. I am curious to know if anyone has experience using the Google Cloud Storage as a backup for their AppEngine data, and what their experience has been, and if there are any pointers or things that I should be aware of before going down this path.
No matter which solution I end up with, I would like a backup solution to meet the following requirements:
1) Reasonably fast to backup, and reasonably fast to restore (ie. if a serious error/data deletion/malicious attack hits my website, I don't want to have to bring it down for multiple days while restoring the database - by fast I mean hours, as opposed to days).
2) A separate location and account from my AppEngine data - ie. I don't want someone with admin access to my AppEngine data to necessarily have write/delete access to the backup data location - for example if my AppEngine account is compromised by a hacker, or if a disgruntled employee were to decide to delete all my data, I would like to have backups that are separate from the AppEngine administrator accounts.
To summarize, given that getting the data out of the cloud seems slow/painful, what I would like is a cloud-based backup solution that emulates the role that tape backups would have served in the past - if I were to have a backup tape, nobody else could modify the contents of that tape - but since I can't get a tape, can I store a secure copy of my data somewhere, that only I have access to?
Kind Regards
Alexander
There are a few options here, though none are (currently) quite what you're looking for.
With the latest release of version 1.5.5 of the SDK, we now support interfacing with Google Storage directly - you can see how, here. With this you can write data to Google Storage, but to the best of my knowledge there's no way to write a file that the app will then be unable to delete.
To actually gather the data, you could use the App Engine mapreduce API. It has built in support for writing to the App Engine blobstore; writing to Google Storage would require you to implement your own output writer, currently.
Another option, as WoLpH suggests, is to use the Datastore Admin tool to back up data to another app. With a little extra effort you could modify the remote_api stub to prohibit deletes to the target (backup) app.
One thing you should definitely do regardless is to enable two-factor authentication for your Google account; this makes it a lot harder for anyone to get control of your account, even if they discover your password.
The bulkloader is probably one of the fastest ways to back up/restore your data.
The problem with App Engine is that you have to do everything through views, so you have the restrictions that views have... the result is that a fast backup/restore still has to use the same APIs as the rest of your app. So the bulkloader (possibly with a few modifications) is definitely your best option here.
Perhaps, though (I haven't tried it yet), you can use the new Datastore Admin to copy the data to another app, one which only you control. That way you can copy it back from the other app when needed.