I have an app that stores ~20000 entries in the memcache. Each entry is a Serializable with one String and two integers. I set the expiration time to 6 hours.
At first I was using the shared (free) memcache. It only seemed to hold ~5000 entries (~7 MB), and the oldest entry was always just a few minutes old. Why is that?
Then I thought: let's switch to dedicated memcache. With that, the cache runs fine; it stores all entries, the oldest entry is 6 hours old, and everything is as expected. Except for the quota: after just a few hours it says I have already used 18 "Gbyte hours".
But my total cache size is ~11 MB, so I would have guessed the cost at $0.12/GB/hr × ~0.01 GB × 24 hrs ≈ $0.03 per day.
What am I doing wrong? Is my calculation wrong? Do I misunderstand the meaning of "Gbyte hours"?
App Engine dedicated memcache is priced in 1 GB chunks, not by how much of your allocated 1 GB you actually use. See https://developers.google.com/appengine/docs/adminconsole/memcache
Your dedicated memcache will cost you a flat $2.88 per day.
The "dedicated" gives you space but operations per seconds (10,000 ops per second).
Regarding your experience with the shared (and free) memcache: what you are seeing is not what you should generally expect. You are unfortunately most likely on an App Engine cluster where some other apps are abusing the shared memcache.
Dedicated memcache bills by the gigabyte, so anything less than 1 GB is billed as a full 1 GB, and you probably had it running for 18 hours. Yeah, kinda sucks.
Related
Google is extremely vague about when entries might expire from the shared memcache. This is completely understandable, as it is free and shared, but does anybody have any practical guidance on how long my entries will probably survive before they are removed from the shared cache? 1 hour? 1 day? 1 week? 1 month?
I plan to store some session tokens and am trying to figure out how long a session can remain usable between visits.
Memcache can discard all your data at any time; you can't really rely on the expiration time.
According to Google, your application must work correctly even without memcache.
Memcache is just a bonus that helps you avoid a lot of datastore queries and keep within your quotas.
So for your use case, memcache is not the place to store session tokens.
If you are using Python, you can use webapp2 sessions instead. They work wonderfully with App Engine.
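A minimal sketch of that setup, assuming the standard webapp2_extras.sessions pattern (the secret key, route, and handler names are placeholders):

```python
import webapp2
from webapp2_extras import sessions

class BaseHandler(webapp2.RequestHandler):
    def dispatch(self):
        # Bind a session store to this request.
        self.session_store = sessions.get_store(request=self.request)
        try:
            webapp2.RequestHandler.dispatch(self)
        finally:
            # Persist any session changes onto the response.
            self.session_store.save_sessions(self.response)

    @webapp2.cached_property
    def session(self):
        # The default backend keeps the session in a secure cookie.
        return self.session_store.get_session()

class MainHandler(BaseHandler):
    def get(self):
        self.session['token'] = 'example-token'  # placeholder value
        self.response.write('session stored')

config = {'webapp2_extras.sessions': {'secret_key': 'replace-with-a-real-secret'}}
app = webapp2.WSGIApplication([('/', MainHandler)], config=config)
```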
I've honestly seen memcache entries last for a day, and for as little as an hour... And I never actively monitored memcache, so I wouldn't be surprised if your timeframes are much bigger or smaller.
It's really hard to say, since (I guess) it depends on the internal load of the machine hosting your instance and how many resources are needed by the other machines around it.
Unfortunately there is no real answer to this question besides: code defensively. Always check memcache first, and if you don't find your data there, fetch it from your permanent store (the datastore, I guess? or Cloud Storage) and push it back into memcache for the next use, as in the sketch below.
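A minimal sketch of that defensive read-through pattern, assuming Python with ndb as the permanent store (the model and key names are illustrative):

```python
from google.appengine.api import memcache
from google.appengine.ext import ndb

class Record(ndb.Model):
    payload = ndb.JsonProperty()

def get_record(record_id):
    cache_key = 'record:%s' % record_id
    data = memcache.get(cache_key)
    if data is None:
        # Cache miss: fall back to the permanent store.
        entity = Record.get_by_id(record_id)
        if entity is None:
            return None
        data = entity.payload
        # Repopulate the cache for the next request; add() won't
        # clobber a value another request wrote in the meantime.
        memcache.add(cache_key, data, time=6 * 3600)
    return data
```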
So basically, I am trying to decide if I should go for dedicated memcache.
My scenario is as follows:
I am working on an application that provides real time analysis for some public data.
I am going to maintain a total of ~15 KB of key/value data in memcache (20 keys, variable values).
The values change constantly (around 20 key updates every 3 seconds in total).
Visitors to the website request these keys (also about one request every 3 seconds per user).
Assuming 10,000 users hit the website simultaneously, this produces about 20 × 10,000 key reads every 3 seconds.
Considering the relatively small size of the cache, but also the resulting rate of roughly 67,000 key/value accesses per second, would dedicated memcache be the more risk-averse deal in this situation?
Thanks,
The cached data seems crucial to the correct operation of your web application. If you lost data it might be a disservice to thousands of users! Hopefully your app also periodically persists the cached data and automatically recovers from cache erasure.
Although the size of the data is small, shared memcache still carries more risk than dedicated memcache of evicting some or all of the data at unpredictable times, so the design must also deal correctly with partial data loss. Memory pressure from other applications and cloud operational factors, not only your own, makes App Engine more likely to discard the shared cache.
With this data size you will not get any extra benefit from using dedicated memcache.
The rate at which you access memcache is not relevant to this decision.
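As a side note, independent of the shared/dedicated decision, fetching all 20 keys in one batched call rather than 20 separate gets keeps the RPC count per page view down. A sketch (the key names are illustrative):

```python
from google.appengine.api import memcache

# Illustrative names for the ~20 dashboard keys.
DASHBOARD_KEYS = ['metric_%02d' % i for i in range(20)]

def read_dashboard():
    # One batched RPC; the result dict contains only the keys found,
    # so anything missing must be recomputed and written back.
    values = memcache.get_multi(DASHBOARD_KEYS)
    missing = [k for k in DASHBOARD_KEYS if k not in values]
    return values, missing
```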
I am developing an application using App Engine to collect, store and deliver data to users.
During my tests, I have 4 data sources which send HTTP POST requests to the server every 5s (all requests are exactly uniform).
The server stores received data to the datastore using Objectify.
At the beginning, all requests are handled by 1 instance (class F1) at 0.8 QPS, with a latency of 80 ms and 80 MB of memory used.
But over the following hours, the memory used increases and goes over the F1 instance limit.
However, the scheduler doesn't start another instance, and when I stop all traffic, the average memory never decreases.
Now the instance shows 150 MB of memory instead of 128 MB (the F1 class limit), even though I have stopped all traffic.
I tried setting the performance settings both manually and automatically, and disabling Appstats, without any improvement.
I use Memcache and datastore, don't have any cron or task queues and the traffic is always the same.
What are the possible reasons for the average memory increase?
Is it a bug of the admin console?
What determines the amount of memory used per request?
Another question:
Does Google offer any special discount for datastore reads/writes (>30 million ops/day)?
Thank you,
Joel
Regarding a special price, I don't think there is one. If your app needs this much read/write quota, you should look into optimizing to minimize writes, and perhaps implement some sort of bulk writing if possible (see the sketch below).
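A minimal sketch of one bulk-writing approach with ndb (the model and field names are illustrative): packing a window of readings into a single entity means one billed entity write covers many data points instead of one write per reading.

```python
from google.appengine.ext import ndb

class SampleBatch(ndb.Model):
    source = ndb.StringProperty()
    # A whole window of (timestamp, value) pairs stored in one entity.
    readings = ndb.JsonProperty()

def flush_window(source, buffered_readings):
    # One entity write for the window; with 4 sources posting every 5 s,
    # buffering even a minute's worth cuts the write count substantially.
    SampleBatch(source=source, readings=buffered_readings).put()
```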
On the memory issue: you should post your code, since there are too many possible causes to discuss memory usage in the abstract. Knowing more about your case will make a straight answer possible.
Cheers,
Kjartan
I enabled billing because my daily bandwidth seemed to exceed 1 GB per day.
But I noticed in the Billing History section that on days before billing was enabled, the daily bandwidth had already exceeded 1 GB (it even reached 2.5 GB once, and the last column says $0.18), yet I was not charged anything.
How come the free version of App Engine allows more than 1 GB of bandwidth? If that is so, what is the purpose of enabling billing?
If you want your application to work reliably and serve all requests, you should enable billing when you are about to go over the limits. Sometimes (for example, if you are at 0.8 GB and someone starts a 500 MB download) you may get a little over quota for free, but this is rare and I would not try to build a business on it.
Memcache in general, and on App Engine in particular, is unreliable in the sense that data may be deleted from the cache for any reason at any point in time. However, in some cases a small risk may be worth the added performance memcache gives, such as accumulating data in memcache that gets saved periodically to some other, more reliable storage. Are there any numbers from Google that indicate the actual probability of a memcache entry being lost before its expiration time, given that I keep within my quotas?
Are there any reasons, other than hardware failure or administrative operations (machines at the data centers being upgraded, moved, or replaced), that would cause entries to be removed from memcache prematurely?
Memcache, like any cache, should be used as... a cache. If you can't find something in the cache, there must be a strategy to find it in permanent storage.
In addition to the reasons you mention, memcache and other caching approaches have limits on the number of items they will hold (usually discarding the least recently used ones when the cache is full), and often also apply other invalidation policies (e.g. flushing anything unused for an hour).
If you don't configure and operate the cache yourself, you have NO guarantee of when and how items might be removed from the cache intentionally / by design.
Any concrete answer you get to this question is 100% subject to change.
That said, I've used memcache under light loads to accumulate data for 15 minutes or so before writing it all to the Datastore. This was for totally non-critical analytic data though. Do not depend on it.
The point is not whether data can be lost, but whether, if it is lost, it can be easily regained.
For example, using memcache to cache data from the datastore is ideal: if a piece of data is not in the cache, it can easily be fetched again.
If you're storing data such as a hit counter in the cache, it can't be regained if the cache is cleared, so you'll lose data.
If you're concerned about the load from a common job, how about scheduling a job to update the counter later using the task queue? (A sketch follows.)
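A sketch of that idea, assuming the deferred library over the task queue (the model and key names are illustrative, and the flush is deliberately simple rather than race-free):

```python
from google.appengine.api import memcache
from google.appengine.ext import deferred, ndb

class HitCounter(ndb.Model):
    count = ndb.IntegerProperty(default=0)

def record_hit():
    # Cheap in-memory increment on the hot path.
    memcache.incr('hits', initial_value=0)

def flush_hits():
    # Runs later on the task queue: move the cached delta to the datastore.
    delta = memcache.get('hits') or 0
    if delta:
        counter = HitCounter.get_or_insert('global')
        counter.count += delta
        counter.put()
        # Subtract what we persisted (not atomic with the get above;
        # a sketch, not production code).
        memcache.decr('hits', delta=delta)

# e.g. enqueue a flush from a request handler, one minute out:
# deferred.defer(flush_hits, _countdown=60)
```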
I have implemented a shared-memcache-based stats counter that collects to the DB hourly and can identify (and log) cache loss. So far I consistently see <10% total cache losses per day, with cache times of at most 1 hour (30 minutes on average) and about 60 active counters. The losses appear to be random single counters. I suspect that counters incremented only once (which happens quite often in my case) may have a higher probability of being dropped.
My app uses <1 MB of memcache in total in the shared system. Unfortunately, dedicated memcache, with its 1 GB minimum and substantial yearly cost, is out of the question for this stats counter.
I have also created a Stackdriver metric that records memcache losses for a counter saved every full hour (accumulating a few counts per hour). The graph shows successful saves in red and memcache failures in blue.
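For what it's worth, here is a guess at how such loss detection might be implemented (this is not the author's actual code): each counter gets a companion marker key, and at flush time a marker that survived without its counter indicates an eviction.

```python
import logging
from google.appengine.api import memcache

def bump(name):
    # Create the counter and its marker together on first use;
    # add() is a no-op if the key already exists.
    memcache.add('ctr:' + name, 0)
    memcache.add('mark:' + name, True)
    memcache.incr('ctr:' + name)

def flush(name, save_to_db):
    # save_to_db is whatever persists the hourly total (placeholder).
    count = memcache.get('ctr:' + name)
    marker = memcache.get('mark:' + name)
    if count is None and marker is not None:
        # The marker outlived the counter: the counter was evicted.
        logging.warning('cache loss detected for counter %s', name)
    elif count:
        save_to_db(name, count)
    memcache.delete_multi(['ctr:' + name, 'mark:' + name])
```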