I am currently using the Memcache service provided by GAE to cache content on the server. The current size of the cache is close to 20~30MB.
Initially the cache had a lifetime of 6-7 hours .. with increasing traffic, the lifetime of the cache has fallen to 20 minutes.
We are planning to increase the cache size to 1-2GBs. Are there any alternative implementations of Distributed Caching on GAE we can use?
The List of methods that I have already tried are mentioned below. But, these steps do not fix our need to have better caching service on GAE
Using Memcache (cons - limited cache size)
Store object in the Instance Memory (cons - data consistency across instances cannot be maintained)
Compressing JAVA Objects being stored ( slight improvement - only 20% improvement in lifetime of cache)
Since you were originally relying on a cache of 6-7 hours, this sounds like an excellent use case for taking advantage of Google's Edge Cache. This is, in theory, a free cache based on Google's distributed caching of websites.
Basically, you want to set caching headers such as:
Cache-Control: public, max-age=600
See this SO answer and this Google Groups post.
If you are a Python developer, maybe this blog post from Nick Johnson will help you: http://blog.notdot.net/2010/11/Storage-options-on-App-Engine
Related
I've almost completed migrating based on google's instructions.
It's very nice to not have to call into the app-engine libraries whatsoever.
However, now I must replace my calls to app-engine-standard memcached.
Here's what the guide says: "To use a memcache service on App Engine, use Redis Labs Memcached Cloud instead of App Engine Memcache."
So is this my only option; a third party? They don't even list pricing on their page if GCE is selected.
I also see in the standard environment how-to guides there is a guide on Connecting to internal resources in a VPC network.
From that link it mentions Cloud Memorystore. I can't find any examples if this is advisable or possible to do on GAE standard. Of course it wasn't previously possible but now that GAE standard has become much more "standard", I think it should be possible?
Thanks for any advice on the best way forward.
Memorystore appears to be Google's replacement:
https://cloud.google.com/memorystore/
You connect to it using this guide:
https://cloud.google.com/appengine/docs/standard/go/using-memorystore
Alas it costs about $1.20/GB per day with no free quota.
Thus, if your data doesn't change, and requires less than 100MB of cache at a time, the first answer might be better (free). Also, your data won't explode the instance as you can control the max size of the cache.
However, if your data changes or you need more cache, MemoryStore is a more direct replacement to MemCache - just costs money.
I've been thinking about this. 2nd gen instances have twice the ram, so if global cache isn't required (as in items don't change once created - (name items using their sha256)), you can run your own local threadsafe memcache (such as https://github.com/dgraph-io/ristretto) and allocate some of the extra ram to it. It'll be faster than Memcache was, so requests can be serviced even faster, keeping the number of instances low.
You could make it global for data that does change, by using pub/sub between instances, but I think that's significantly more work.
To ease the migration to 1.12, I have been thinking of using this solution:
create a dedicated app using the 1.11 runtime.
setup twirp endpoints to act as a proxy for all the deprecated app engine services (memcache, mail, search...)
I'm considering building an app on App Engine, and I'm trying to decide if I should store data in the datastore or on google cloud storage.
Each object is going to be typically no more than around a kilobyte, perhaps a few kilobytes at most (and often less). It won't change too often.
I could have the client directly access the data, but though I might live without there would be some benefit to the app engine app accessing the data and using it as part of serving a response.
What are the performance characteristics of google cloud storage? How quickly do requests come back? I was able to find a status dashboard for the datastore which indicates that they are usually reasonably quick at handling requests but I've had trouble getting guidance on how fast GCS is.
Under the most recent price reductions, it seems like the datastore might actually be cheaper for my use case of relatively small chunks of data ($0.06/100,000 requests vs $0.01/10,000 class b operations). Am I interpreting that correctly?
This thread might give you some insights. Back in september 2013 I had about 200-250ms on average for "blank" sequential inserts. You can get a great speedup if you combine your requests. You can insert up to 500 entities in a single request. Which takes roughly 500ms-900ms if I remember correctly.
We currently have our application hosted in Google App Engine. Billing is enabled to that application. This application is still in beta that we are using for testing purpose. We have a logic of serving data from the Memcache if present, if not then we get the data from the datastore and update the memcache and serve the data. We are encountering strange behaviour related to Memcache. The data related to some keys in Memcache is getting dropped after few minutes after being set. We tried setting expiration time for the keys in the memcache, even that does not seem to work. Since the data is getting dropped from the memcache the data is again from the datastore which is increasing the billing for our application.
Currently nearly 80% of the billing is related to datastore read. The datastore read is high as the memcache is not working efficiently as it should be. Any insight why we are facing this issue would be really helpful.
Just an FYI, we are having around 75000 keys in the memcache with total size of 100 MB data. Our structure demands keeping such large number of keys in memcache, which I think should not be an issue.
Our application is being by 10 users and the billing amount per day is coming to around $40.
Thanks,
Krish
Unfortunately memcache will evict keys as and when it requires. Setting the time to expire only means the item will be in memcache for up to the expiry time.
Take a look at the docs regarding eviction.
Also, take a look at this for some more insight into ways around memcache issues.
Regarding your data structure, perhaps you could post a new question and we can see if others have advice for you.
After uploading a app new version appengine, memcache reduced in size. When I log the memcache statistics(memcache.get_stats())`, I see that the oldest_item_age is less the a minute old and the cache size is not more then 3 meg. In the older version of the app, the oldest_item 3600 second old and the cache size was ~30 meg.
I work with backends and when I stop them the problem disappear.
also I use django-nonrel
Thanks
Uri
I don't know if the backends actually affect memcache capacity, but I do know that it's variable and changes over time. It might also be a shared pool of some sort, and might also vary depending on which of your instances the request happens to hit. It could have also been because you'd just pushed a new version.
I usually don't worry about check the capacity. To reduce server load, it's in the GAE team's own best interests to apply the best optimization, allocation and eviction policies to your memcache.
I've just finished watching the Google IO 2011 presentation on AppEngine backends (http://www.google.com/events/io/2011/sessions/app-engine-backends.html) which piqued my curiosity about using a backend instance for somewhat more reliable and configurable in-memory caching. It could be an interesting option as a third layer of cache, under in-app caching and memcache, or perhaps as a substitute for some cases where higher reliability is desirable.
Can anyone share any experience with this? Googling around doesn't reveal much experimentation here. Does the latency of a URLfetch to retrieve a value from a backend's in-memory dictionary render it less attractive, or is it not much worse than a memcache RPC?
I am thinking of whipping up some tests to see for myself, but if I can build on the shoulder of giants...thanks for any help :)
Latency between a backend and frontend instance is extremely low.
If you think about it, all App Engine RPC's are fulfilled with "backend instances". The backends for the Datastore and Memcache are just run by Google for your convenience.
Most requests, according to the App Engine team, stay within the same datacenter - meaning latency is inter-rack and much lower than outside URLFetches.
A simple request handler and thin API layer for coordinating the in memory storage is all you need - in projects where I've set up backend caching, it's done a good job of fulfilling the need for more flexible in-memory storage - centralizing things definitely helps. The load balancing doesn't hurt either ;)