How can I associate images with entities in Google App Engine - database

I'm working on a Google App Engine application, and have come to the point where I want to associate images on the filesystem to entities in the database.
I'm using the bulkupload_client.py script to upload entities to the database, but am stuck trying to figure out how to associate filesystem files to the entities. If an entity has the following images: main, detail, front, back, I think I might want a naming scheme like this: <entity_key>_main.jpg
I suppose I could create a GUID for each entity and use that, but I'd rather not have to do that.
Any ideas?
I think I can't use the entity key since it might be different between local and production datastores, so I would have to rename all my images after a production bulkupload.

There is a GAE tutorial on how to Serve Dynamic Images with Google App Engine. Includes explanations & downloadable source code.

I see two options here based on my very limited knowledge of GAE.
First, you can't actually write anything to the file system in GAE, right? That would mean that any images you want to include would have to be uploaded as a part of your webapp and would therefore have a static name and directory structure that is known and unchangeable. In this case, your idea of <entity_key>_main.jpg, or /entity_key/main.jpg, would work fine.
The second option is to store the images as a blob in the database. This may allow for uploading images dynamically rather than having to upload a new version of the webapp every time you need to update images. It would quickly eat into your free database space. Here's some information on serving pictures from the database. http://code.google.com/appengine/articles/images.html

If you're uploading the images statically, you can use the key based scheme if you want: Simply assign a key name to the entities, and use that to find the associated images. You specify a key name (in the Python version) as the key_name constructor argument to an entity class:
myentity = EntityClass(key_name="bleh", foo="bar", bar=123)
myentity.put()
and you can get the key name for an entity like so:
myentity.key().name()
It sounds like the datastore entities in question are basically static content, though, so perhaps you'd be better simply encoding them as literals in the source and thus having them in memory, or loading them at runtime from local datafiles, bypassing the need to query the datastore for them?
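As a rough illustration of the key-name scheme described above, here is a minimal sketch in plain Python. The helper name image_paths, the /images base directory and the .jpg extension are assumptions made for the example, not anything defined by App Engine; the only input is the key name you assigned to the entity, which stays the same across local and production datastores.

```python
# Hypothetical helper: derive static image paths from an entity's key name.
IMAGE_VARIANTS = ("main", "detail", "front", "back")


def image_paths(key_name, base_dir="/images", ext="jpg"):
    """Return a {variant: path} mapping for one entity."""
    return {
        variant: "%s/%s_%s.%s" % (base_dir, key_name, variant, ext)
        for variant in IMAGE_VARIANTS
    }


paths = image_paths("bleh")
print(paths["main"])  # /images/bleh_main.jpg
```

In a handler you would pass myentity.key().name() as key_name, so the file names never depend on datastore-generated IDs.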

Related

How can I create an add folder functionality to my React CMS application?

Click here to see a picture of what I mean
I haven't tried anything yet because I'm not sure how to even approach this problem. I'm not even sure what to Google. I do, however, have a pretty good handle on React. Thanks!
Update: The folders will not be storing files, just hyperlinks.
You need to model the problem space first. i.e. models for folders, and files. Each having properties (name, etc.) and associations (folders can have many files and subfolders).
To store the physical files you can use a third-party service like Amazon S3.
This would get you started at least.
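To make the modelling advice a little more concrete, here is a minimal sketch of the data shape, assuming (per the update) that folders hold only hyperlinks and subfolders. It is written in Python purely to show the structure; the class and method names (Link, Folder, add_folder, count_links) are hypothetical, and the same shape could be expressed as plain objects in React state.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Link:
    name: str
    url: str


@dataclass
class Folder:
    name: str
    links: List[Link] = field(default_factory=list)
    subfolders: List["Folder"] = field(default_factory=list)

    def add_folder(self, name):
        """Create a child folder and return it."""
        child = Folder(name)
        self.subfolders.append(child)
        return child

    def count_links(self):
        """Count links in this folder and all nested folders."""
        return len(self.links) + sum(f.count_links() for f in self.subfolders)


root = Folder("root")
docs = root.add_folder("docs")
docs.links.append(Link("Example", "https://example.com"))
```

An "add folder" button then only needs to call something like add_folder on the currently selected folder and re-render the tree.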

Cloud Datastore - Exclude indexes from index.yaml file

I would like to have indexes on only a few fields in a kind. Rather than excluding all the other fields in the Java code during the creation of the Entity as described here, I was wondering if there is a way I can define it in the index.yaml file and not worry about it during creation of entities.
App Engine applications written in Java do not have an index.yaml file, they have a datastore-indexes.xml file instead. However, the concept is the same.
By default, most properties are indexed. Any composite indexes must be defined in your index config file (yaml or xml depending on language). When defining your models, you can tell App Engine to prevent auto-indexing a property. This will save write-ops and speed up your app.
To answer your question more specifically, you cannot use the index config file to prevent index creation, rather it is used to tell App Engine which indexes to create.
Also, indexes are only created as entities are saved. So if you add more indexes after entities have been created, you will need to run a script to re-save those entities.
Similarly, to remove indexes after they have been created you need to do this from the command line using the sdk. See here.

How to save text, SVG, HTML, CSS all together efficiently

In an application, I am using Fabric.js, which lets users write text, draw SVG's, insert images etc.
I want to know, what is the best way to store this data.
Requirements are:
Ability to query the data (text), which tells me that I should store it in a DB (MySQL as of now)
I have images, and I am targeting the iPad as well, so how the images are stored is important.
SVG's and HTML/CSS to be saved as well.
I also want to do versioning of the content, as Quora does it, so that a user can see the changes from the past version to the current version. This also includes the versioning of images and SVG's.
I am wondering how Google Docs does it, because they also store our documents, drawings etc.
What is the best way of doing this?
I don't know if it helps, but the Opera browser offers an option to save web pages to a single file (.mht extension). This stores all the files (CSS, images, scripts, etc.) as base64-encoded text for later use when the document is opened. Maybe this can be a way to store the data :P
I manage a webapp where users generate reports, and found it more efficient to store images and binary files in the filesystem, and link to them from the database. Elements that are in xml or text are kept in the database for easier searching - in your case this would include css/html and svg (which is xml). Use the database for managing revisions.
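A minimal sketch of that layout, using SQLite from the Python standard library so it runs anywhere (the question mentions MySQL; the idea is the same). The table and column names (revisions, body, image_path) and the helper names are assumptions for illustration. Text-like content goes in the database, binary files stay on disk and are referenced by path, and each save adds a new version row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE revisions (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        doc_id INTEGER NOT NULL,
        version INTEGER NOT NULL,
        body TEXT,        -- searchable text, SVG, HTML or CSS
        image_path TEXT,  -- the binary file itself lives on the filesystem
        UNIQUE (doc_id, version)
    )
""")


def save_revision(doc_id, body, image_path=None):
    """Store a new version of a document and return its version number."""
    row = conn.execute(
        "SELECT COALESCE(MAX(version), 0) FROM revisions WHERE doc_id = ?",
        (doc_id,),
    ).fetchone()
    version = row[0] + 1
    conn.execute(
        "INSERT INTO revisions (doc_id, version, body, image_path) "
        "VALUES (?, ?, ?, ?)",
        (doc_id, version, body, image_path),
    )
    conn.commit()
    return version


def search(term):
    """Return (doc_id, version) pairs whose body contains the term."""
    return conn.execute(
        "SELECT doc_id, version FROM revisions WHERE body LIKE ? "
        "ORDER BY doc_id, version",
        ("%" + term + "%",),
    ).fetchall()


save_revision(1, "<svg><circle r='5'/></svg>", "uploads/1/v1.png")
save_revision(1, "<svg><rect width='5'/></svg>", "uploads/1/v2.png")
```

Showing the difference between two versions is then a matter of loading both rows for the same doc_id and comparing their bodies.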
Might also check out this thread on storing images in a database.
It looks like Fabric.js is using the node.js JavaScript webserver on the backend. I haven't used this before, but you might investigate which databases are easiest to use with node.js:
node.js database
nodejs and database communication - how?
nodejs where to start?
If you want to query the text efficiently, then perhaps putting all bits of information into the DB separately is the most efficient. Maybe you wish to play with OOXML or ODF, which may serve as a container for all the information you require, and then use XML storage (e.g. eXist) to store and query it (e.g. the text). As these standards are XML-based, you can transform them into HTML (e.g. here or here), but writing an online editor for this is something that a monster like Google can do.
You can take a look at NoSQL databases like MongoDB or CouchDB.
See also Storing images in NoSQL stores

creating a video database

I am interested in creating a video database. My goal is to have a folder where my videos will be kept, and each time I copy/delete a video, the website that presents them should be updated too. The problem is I have no idea how to approach it.
Should I..
Use Sql and store a reference to each video location?
Have a script that checks all the time if new changes happen in that folder?
A package like joomla?
I am using ubuntu btw. I already have a simple html5 page, and I am presenting the videos using html5 video.
It depends on the size and the performance you want.
Way 1: Use PHP to scan the folder and generate links on the fly.
Way 2: Use a database to store the file names, then retrieve the names from the database and generate URLs.
Pros and cons:
Way 1: Simple to implement; no changes to the upload or download script; no database required.
Way 2: You need a database, and a little coding is required for upload and also when generating a page.
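The first approach (scan the folder on every request) can be sketched in a few lines. The answer suggests PHP; this is the same idea in Python, and the function name video_links, the extension list and the /videos/ URL prefix are assumptions for the example.

```python
import os
from html import escape
from urllib.parse import quote

VIDEO_EXTENSIONS = (".mp4", ".webm", ".ogv")


def video_links(folder, url_prefix="/videos/"):
    """Scan a folder and return one HTML link per video file, sorted by name."""
    names = sorted(
        name
        for name in os.listdir(folder)
        if name.lower().endswith(VIDEO_EXTENSIONS)
        and os.path.isfile(os.path.join(folder, name))
    )
    return [
        '<a href="%s%s">%s</a>' % (url_prefix, quote(name), escape(name))
        for name in names
    ]
```

Because the folder is read each time, copying or deleting a file shows up on the next page load with no extra bookkeeping, at the cost of a directory scan per request.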
You should make a db (the format does not matter) and store in it only the file names of the videos; the videos themselves would be stored on the hard drive.
Any operation on the web site will pass first through the db to insert/update/delete video records and then (maybe in a transaction context) to the file system.
This would be the standard approach to your question.
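A minimal sketch of that database-first flow, assuming SQLite and a single videos table holding only file names. The function names (open_db, add_video, delete_video, list_videos) are made up for the example. The file operation happens inside the database transaction, so if the copy or delete fails, the record change is rolled back.

```python
import os
import shutil
import sqlite3


def open_db(path=":memory:"):
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE IF NOT EXISTS videos (filename TEXT PRIMARY KEY)")
    return conn


def add_video(conn, src_path, video_dir):
    """Record the video in the db, then copy the file into the video folder."""
    filename = os.path.basename(src_path)
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute("INSERT INTO videos (filename) VALUES (?)", (filename,))
        shutil.copy(src_path, os.path.join(video_dir, filename))
    return filename


def delete_video(conn, filename, video_dir):
    """Remove the db record, then remove the file from the video folder."""
    with conn:
        conn.execute("DELETE FROM videos WHERE filename = ?", (filename,))
        os.remove(os.path.join(video_dir, filename))


def list_videos(conn):
    """Return the file names the website should present."""
    return [
        row[0]
        for row in conn.execute("SELECT filename FROM videos ORDER BY filename")
    ]
```

The page that presents the videos would call list_videos and never look at the folder directly, which is the trade-off against the folder-scanning approach: files copied in by hand are invisible until they are registered.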

Question on serving Images on App Engine ( 2 Alternatives )

I'm planning to launch a comic site which serves comic strips (images).
I have little prior experience with serving/caching images.
So these are the 2 methods I'm considering:
1. Using LinkProperty
from google.appengine.ext import db

class Comic(db.Model):
    image_link = db.LinkProperty()  # URL of the statically served image
    timestamp = db.DateTimeProperty(auto_now=True)
Advantages:
The images are fetched from the disk space itself (and disk space is cheap, I take it?)
I can easily set up app.yaml with an expiration date to cache the content in user's browser
I can set up memcache to retrieve the entities faster (for high traffic)
2. Using BlobProperty
I used this tutorial , it worked pretty neat. http://code.google.com/appengine/articles/images.html
Side question: Can I say that using BlobProperty sort of "protects" my images from outside linkage? That means people can't just link directly to the comic strips
I have a few worries for method 2.
I can obviously memcache these entities for faster reads.
But then:
Is memcaching images a good thing? My images are large (100-200 KB per image). I think memcache allows only up to 4 GB of cached data? Or is it 1 MB per memcached entity, with unlimited entities...
What if appengine's memcache fails? -> Solution: I'd have to go back to the datastore.
How do I cache these images in the user's browser? If I was doing method no. 1, I could just easily add to my app.yaml the expiration date for the content, and pictures get cached user side.
I would like to hear your thoughts.
Should I use method 1 or 2? Method 1 sounds dead simple and straightforward; should I be wary of it?
[EDITED]
How do I solve this dilemma?
Dilemma: The last thing I want to sort out is how to prevent people from getting the direct link to the image and putting it up on bit.ly, because the user would then be directed to only the image on my server (and not the advertising/content around it that the user would see when accessing it from the main page itself).
You're going to be using a lot of bandwidth to transfer all these images from the server to the clients (browsers). Remember appengine has a maximum number of files you can upload, I think it is 1000 but it may have increased recently. And if you want to control access to the files I do not think you can use option #1.
Option #2 is good, but your bandwidth and storage costs are going to be high if you have a lot of content. To solve this problem people usually turn to Content Delivery Networks (CDNs). Amazon S3 and edgecast.com are two such CDNs that support token-based access URLs. Meaning, you can generate a token in your appengine app that is good for a given IP address, time, geography or some other criteria, and then give your CDN URL with this token to the requestor. The CDN serves your images and does the access checks based on the token. This will help you control access, but remember that if there is a will, there is a way, and you can't 100% secure anything - but you can probably get reasonably close.
So instead of storing the content in appengine, you would store it on the cdn, and use appengine to create urls with tokens pointing to the content on the cdn.
Here are some links about the signed urls. I've used both of these :
http://jets3t.s3.amazonaws.com/toolkit/code-samples.html#signed-urls
http://www.edgecast.com/edgecast_difference.htm - look at 'Content Security'
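To illustrate the general idea of a token-based URL, here is a generic HMAC sketch in Python. It is not the actual signing scheme used by Amazon S3 or EdgeCast (each has its own documented format); the secret, the parameter names expires and token, and the function names are all assumptions for the example. The app signs the path plus an expiry time, and whoever serves the image recomputes the signature before responding.

```python
import hashlib
import hmac
import time
from urllib.parse import urlencode

SECRET_KEY = b"replace-with-a-real-secret"  # hypothetical shared secret


def _signature(path, expires):
    message = ("%s:%d" % (path, expires)).encode("utf-8")
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def sign_url(base_url, path, ttl_seconds=300, now=None):
    """Build a URL that is only valid until the expiry timestamp."""
    now = int(time.time()) if now is None else now
    expires = now + ttl_seconds
    query = urlencode({"expires": expires, "token": _signature(path, expires)})
    return "%s%s?%s" % (base_url, path, query)


def verify(path, expires, token, now=None):
    """Check the token on the serving side; reject expired or altered links."""
    now = int(time.time()) if now is None else now
    if now > expires:
        return False
    return hmac.compare_digest(token, _signature(path, expires))
```

A link copied to another site keeps working only until it expires, which limits hotlinking without requiring any per-request state.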
In terms of solving your dilemma, I think that there are a couple of alternatives:
You could cause the images to be rendered in a Flash object that would download the images from your server in some kind of encrypted format that it would know how to decode. This would involve quite a bit of up-front work.
You could have a valid-one-time link for the image. Each time that you generated the surrounding web page, the link to the image would be generated randomly, and the image-serving code would invalidate that link after allowing it one time. If you have a high-traffic web-site, this would be a very resource-intensive scheme.
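The valid-one-time link idea can be sketched as a small token store. This version keeps tokens in an in-process dictionary purely for illustration; on App Engine, requests may be handled by different instances, so a real version would presumably need memcache or the datastore instead. The class and method names are hypothetical.

```python
import secrets


class OneTimeLinks:
    """Issue random tokens that can each be redeemed for an image exactly once."""

    def __init__(self):
        self._tokens = {}

    def issue(self, image_id):
        """Called while rendering the page; returns a token to embed in the image URL."""
        token = secrets.token_urlsafe(16)
        self._tokens[token] = image_id
        return token

    def redeem(self, token):
        """Called by the image handler; returns the image id, or None if already used."""
        return self._tokens.pop(token, None)
```

Every page view writes a token and every image request deletes one, which is where the resource cost mentioned above comes from.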
Really, though, you want to consider just how much work it is worth to force people to see ads, especially when a goodly number of them will be coming to your site via Firefox, and there's almost nothing that you can do to circumvent AdBlock.
In terms of choosing between your two methods, there are a couple of things to think about. With option one, where you are storing the images as static files, you will only be able to add new images by doing an appcfg.py update. Since App Engine applications do not allow you to write to the filesystem, you will need to add new images to your development code and do a code deployment. This might be difficult from a site management perspective. Also, serving the images from memcache would likely not offer you a performance improvement over having them served as static files.
Your second option, putting the images in the datastore does protect your images from linking only to the extent that you have some power to control through logic if they are served or not. The problem that you will encounter is that making that decision is difficult. Remember that HTTP is stateless, so finding a way to distinguish a request from a link that is external to your application and one that is internal to your application is going to require trickery.
My personal feeling is that jumping through hoops to make sure that people can't see your comics without seeing ads is solving the problem the wrong way. If the content that you are publishing is worth protecting, people will flock to your website to enjoy it anyway. Through high volumes of traffic, you will more than make up for anyone who directly links to your image, thus circumventing a few ad serves. Don't try to outsmart your consumers. Deliver outstanding content, and you will make plenty of money.
Your method #1 isn't practical: You'd need to upload a new version of your app for each new comic strip.
Your method #2 should work fine. It doesn't automatically "protect" your images from being hotlinked - they're still served up on a URL like any other image - but you can write whatever code you want in the image serving handler to try and prevent abuse.
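One common check an image-serving handler might make is to look at the Referer header, sketched below as a plain function. This is an assumption about what "prevent abuse" could mean here, not something the answer prescribes, and the host names are placeholders. The header can be missing or spoofed, so this only discourages casual hotlinking.

```python
from urllib.parse import urlparse

ALLOWED_HOSTS = {"www.example.com", "example.com"}  # hypothetical site hosts


def is_allowed_referer(referer, allowed_hosts=ALLOWED_HOSTS):
    """Return True only when the Referer header points at one of our own pages."""
    if not referer:
        # Policy choice for this sketch: treat a missing Referer as external.
        return False
    host = urlparse(referer).hostname
    return host in allowed_hosts
```

The handler would call this with the request's Referer header and respond with the image, or with an error or a redirect to the comic's page, depending on the result.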
A third option, and a variant of #2, is to use the new Blob API. Instead of storing the image itself in the datastore, you can store the blob key, and your image handler just instructs the blobstore infrastructure what image to serve.
