NDB: What happens when 1/s write is exceeded? - google-app-engine

I am evaluating how to use GAE + NDB for a new project, and I'm concerned about the limit of 1 write per second for ancestor writes. I might be missing information, so I'm happy to ask for help.
Say several users work with orders. If all new "order" entities share the same unique ancestor, what would happen if, say, 5 users each create a new order and all 5 hit "save" at the same time?
Do you know what the consequences could be?
Thanks!

In your use case, nothing bad would happen: all of your writes will succeed. Some of them may be retried internally by App Engine, but you should not worry about that. You should only get concerned when you expect this rate to be exceeded for a substantial period of time; then retries would pile on top of previous retries and commits may start failing. Given your example, you would probably need a few million people working on those orders like crazy before it becomes an issue.
From the documentation (emphasis mine):
The first type of timeout occurs when you attempt to write to a single entity group too quickly. Writes to a single entity group are serialized by the App Engine datastore, and thus there's a limit on how quickly you can update one entity group. In general, this works out to somewhere between 1 and 5 updates per second; a good guideline is that you should consider rearchitecting if you expect an entity group to have to sustain more than one update per second for an extended period.

Related

Is there an Entity Group Max Size?

I have an Entity that represents a Payment Method. I want to have an entity group for all the payment attempts performed with that payment method.
The 1 write-per-second limitation is fine and actually good for my use case, as there is no good reason to charge a specific credit card more frequently than that, but I could not find any specifications on the max size of an entity group.
My concern is would a very active corporate account hit any limitations in terms of number of records within an entity group (when they perform their 1 millionth transaction with us)?
No, there isn't a limit on entity group size; all datastore-related limits are documented at Limits.
But be aware that entity group size matters when it comes to data contention, see Keep entity groups small. Please note that contention happens not only when writing entities, but also when reading them inside a transaction (see Contention problems in Google App Engine) or, occasionally, maybe even outside transactions (see TransactionFailedError on GAE when no transaction).
IMHO your use case is not worth the risk of dealing with these issues (which are fairly difficult to debug and address); I wouldn't use a single entity group in this case.
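If I were modeling it, I'd sketch it roughly like this (Go; the kind and field names are my assumptions): payment attempts as root entities that merely reference the payment method key, trading the ancestor guarantee for an eventually consistent listing:

    package payments

    import (
        "time"

        "appengine"
        "appengine/datastore"
    )

    // PaymentAttempt is a root entity (its own entity group); it points to
    // the payment method by key instead of being its descendant.
    type PaymentAttempt struct {
        MethodKey *datastore.Key
        ChargedAt time.Time
    }

    func recordAttempt(c appengine.Context, method *datastore.Key) error {
        key := datastore.NewIncompleteKey(c, "PaymentAttempt", nil) // nil parent: no shared group
        _, err := datastore.Put(c, key, &PaymentAttempt{MethodKey: method, ChargedAt: time.Now()})
        return err
    }

    // attemptsFor lists attempts with an eventually consistent property query
    // (requires a composite index on MethodKey + ChargedAt); acceptable here,
    // since charge history rarely needs read-your-write semantics.
    func attemptsFor(c appengine.Context, method *datastore.Key) ([]PaymentAttempt, error) {
        var out []PaymentAttempt
        _, err := datastore.NewQuery("PaymentAttempt").
            Filter("MethodKey =", method).
            Order("-ChargedAt").
            GetAll(c, &out)
        return out, err
    }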

Entity Group - deciding on how to group

I've read throughout the Internet that the Datastore has a limit of 1 write per second for an Entity Group. Most of what I read indicates a "write to an entity", which I would understand as an update. Does the 1 write per second limit also apply to adding entities into the group?
A simple case would be a Thread where multiple posts can be added by different users. The way I see it, it's logical to have the Thread be the ancestor of the Posts. Thus, forming a wide entity group. If the answer to my question above is yes, a "trending" thread would be devastated by the write limit.
That said, would it make sense to get rid of the ancestry altogether or should I switch to the user as the ancestor? What I'd like to avoid is having the user be confused when they don't see the post due to eventual consistency.
A quick clarification to start with
1 write per second doesn't mean 1 entity per second. You can batch writes together, up to a maximum of 500 entities (transactions also have a 10 MiB limit). So if you can batch posts, you can improve your write rate.
Note: you can technically go higher than 1 per second, but the longer you exceed that limit, the greater your risk of contention errors and the further behind the system's eventual consistency may lag.
You can read more on the limits here.
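As a sketch of what batching looks like in Go (the Post type and thread key are assumptions), one PutMulti call counts as a single write against the group:

    package posts

    import (
        "errors"

        "appengine"
        "appengine/datastore"
    )

    type Post struct {
        Body string
    }

    // putPostsBatch writes many posts under one thread in a single batch,
    // which counts as one write against the entity group's rate limit.
    func putPostsBatch(c appengine.Context, thread *datastore.Key, posts []Post) error {
        if len(posts) > 500 {
            return errors.New("split batches above the 500-entity limit")
        }
        keys := make([]*datastore.Key, len(posts))
        for i := range keys {
            keys[i] = datastore.NewIncompleteKey(c, "Post", thread)
        }
        _, err := datastore.PutMulti(c, keys, posts)
        return err
    }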
Client-side sharding
If you need to use ancestor queries for strong consistency AND 1 write per second is not enough, you could implement client-side sharding. This essentially means that you write the posts to up to N different entity groups using a known key scheme. For example:
Primary parent: "AncestorA"
Optional shard 1: "AncestorA-1"
Optional shard N: "AncestorA-(N-1)"
To query for your posts, issue N ancestor queries. Naturally, you'll need to merge these results on the client side to display them in the correct order.
This will allow you to do N writes per second.
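A rough Go sketch of this scheme (the kind names, shard count, and Created ordering are my assumptions):

    package posts

    import (
        "fmt"
        "math/rand"
        "sort"
        "time"

        "appengine"
        "appengine/datastore"
    )

    type Post struct {
        Body    string
        Created time.Time
    }

    const nShards = 4 // N, tuned to the write rate you need

    // shardKey returns one of N ancestor keys: "AncestorA", "AncestorA-1", ...
    func shardKey(c appengine.Context, threadID string, shard int) *datastore.Key {
        name := threadID
        if shard > 0 {
            name = fmt.Sprintf("%s-%d", threadID, shard)
        }
        return datastore.NewKey(c, "Thread", name, 0, nil)
    }

    // writePost picks a shard at random, spreading writes over N entity groups.
    func writePost(c appengine.Context, threadID string, p *Post) error {
        parent := shardKey(c, threadID, rand.Intn(nShards))
        _, err := datastore.Put(c, datastore.NewIncompleteKey(c, "Post", parent), p)
        return err
    }

    // readPosts issues N strongly consistent ancestor queries and merges them.
    func readPosts(c appengine.Context, threadID string) ([]Post, error) {
        var all []Post
        for shard := 0; shard < nShards; shard++ {
            var batch []Post
            q := datastore.NewQuery("Post").Ancestor(shardKey(c, threadID, shard))
            if _, err := q.GetAll(c, &batch); err != nil {
                return nil, err
            }
            all = append(all, batch...)
        }
        // Merge on the client side into the correct display order.
        sort.Slice(all, func(i, j int) bool { return all[i].Created.Before(all[j].Created) })
        return all, nil
    }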

Google Datastore - Not Seeing 1 Write per Second per Entity Group Limitation

I've read a lot about strong vs eventual consistency, using ancestor / entity groups, and the 1 write per second per entity group limitation of Google Datastore.
However, in my testing I have never hit the exception "Too much contention on these datastore entities. please try again." and am trying to understand whether I'm misunderstanding these concepts or missing a piece of the puzzle.
I'm creating entities like so:
    // usersKey returns the common ancestor shared by all User entities,
    // which places every user in a single entity group.
    func usersKey(c appengine.Context) *datastore.Key {
        return datastore.NewKey(c, "User", "default_users", 0, nil)
    }

    // UserCreateOrUpdate writes one user under that shared ancestor.
    func (a *UserDS) UserCreateOrUpdate(c appengine.Context, user models.User) error {
        key := datastore.NewKey(c, "User", user.UserId, 0, usersKey(c))
        _, err := datastore.Put(c, key, &user)
        return err
    }
And then reading them with datastore.Get. I know I won't have issues reading since I'm doing a lookup by key, but if I have a high volume of users creating and updating their information, I would theoretically hit the max of 1 write per second constantly.
To test this, I attempted to create 25 users at once (using the above methods, no batching), yet I don't log any exceptions, which this post implies I should: Google App Engine HRD - what if I exceed the 1 write per second limit for writing to the entity group?
What am I missing? Does the contention only apply to querying, is 25 not a high enough volume, or am I missing something else entirely?
From the documentation:
Writes to a single entity group are serialized by the App Engine datastore, and thus there's a limit on how quickly you can update one entity group. In general, this works out to somewhere between 1 and 5 updates per second; a good guideline is that you should consider rearchitecting if you expect an entity group to have to sustain more than one update per second for an extended period.
Note the words "extended period". 1 update per second is basically a minimum guaranteed throughput. At any given moment you may be able to achieve significantly higher levels, but Google is warning you not to architect for those levels to be always available.
The limitation is per entity group, that means you could create as many users as you need without limitation (that's where scaling shines), as long as they don't share the same ancestor.
Things change once you start using the user key as the ancestor of other entities, making them part of the same group and thus having a limit on how many changes you can make to it per second.
Btw, this is a generalization; most likely you will be able to make ~5 changes per second. This limitation exists because of the transactional properties of an entity group: there is effectively a log of changes that must be applied sequentially, so writes have to take a lock, and throughput is therefore limited.
Still, the rule of thumb is to assume you can only do 1 per second, to force yourself to think about how to work under these conditions.
And as mentioned, this is only relevant when you update the datastore; gets and queries should scale as needed.
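One way to see the difference is to make the 25 writes actually concurrent and transactional; here is a sketch reusing usersKey from the question (the goroutine fan-out and the local User stand-in for models.User are my assumptions):

    package userds

    import (
        "fmt"
        "sync"

        "appengine"
        "appengine/datastore"
    )

    // User stands in for the question's models.User.
    type User struct {
        Name string
    }

    func usersKey(c appengine.Context) *datastore.Key {
        return datastore.NewKey(c, "User", "default_users", 0, nil)
    }

    // hammerGroup runs 25 transactions against the same entity group at
    // once. A sequential loop of plain Puts rarely shows contention; truly
    // concurrent read-modify-write transactions are what surface it.
    func hammerGroup(c appengine.Context) {
        var wg sync.WaitGroup
        for i := 0; i < 25; i++ {
            wg.Add(1)
            go func(n int) {
                defer wg.Done()
                err := datastore.RunInTransaction(c, func(tc appengine.Context) error {
                    key := datastore.NewKey(tc, "User", fmt.Sprintf("user-%d", n), 0, usersKey(tc))
                    var u User
                    if e := datastore.Get(tc, key, &u); e != nil && e != datastore.ErrNoSuchEntity {
                        return e
                    }
                    _, e := datastore.Put(tc, key, &u)
                    return e
                }, nil)
                if err != nil {
                    c.Errorf("txn %d: %v", n, err) // e.g. the contention error
                }
            }(i)
        }
        wg.Wait()
    }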
I don't think you're missing anything here. Previously, I had seen the same limitations when writing to the same entity group but recently (this week, in fact) I have not seen the delays. I'm willing to suggest that Google has solved this problem, and I'm hoping that someone will prove me correct.

What's the meaning of the 1 per second limit on ancestor transactions in Google App Engine?

Near the end of the following document:
https://developers.google.com/appengine/docs/java/datastore/structuring_for_strong_consistency?csw=1
It says:
This approach achieves strong consistency by writing to a single
entity group per guestbook, but it also limits changes to the
guestbook to no more than 1 write per second (the supported limit for
entity groups)
Does this mean that this write limit is on the specific guestbook, or across all guestbooks?
i.e., if for example I have "Logs" and "Log_entries" that use the Logs as ancestors, and let's say I have 10 different Logs, and suppose I get 5 parallel requests to write to 5 different logs, will it be a problem?
Or is the problem only if I get more than one request per second to write entries that belong to the same specific log?
[my app does not deal with logs or entries - it's just an example....]
Answer: the write limit is on the guestbook (entity group).
More info: a batch put or a transaction counts as 1 write (against the per-second limit).
Clarification: http://www.youtube.com/watch?v=xO015C3R6dw#t=335
The limitation is PER ENTITY GROUP.
In your example that is PER LOG. So you can write 1 log entry per second per log. If you have 5 logs, you can write at most 5 log entries per second if and only if the log entries belong to 5 different logs.
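In code terms (a Go sketch; the Log/Entry kinds mirror the example), the parent key is what determines the group:

    package logs

    import (
        "appengine"
        "appengine/datastore"
    )

    type Entry struct {
        Message string
    }

    // putEntry parents each entry under its own log, so entries for
    // different logs land in different entity groups and the 1/sec limit
    // applies per log, not across all logs.
    func putEntry(c appengine.Context, logID string, e *Entry) error {
        logKey := datastore.NewKey(c, "Log", logID, 0, nil)
        _, err := datastore.Put(c, datastore.NewIncompleteKey(c, "Entry", logKey), e)
        return err
    }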
The one write per second rule is like the pirate code on parlay... it's more what you'd call a "guideline" than an actual precise rule. Transactions always get applied serially to an entity group (which takes some time), so if too many transactions get queued up for a single entity group, bad things may happen; hence I do not think it would be good to ignore the 'rule'.
Google offers more information on this rule and a technique for working around it (in some cases) by using sharding here:
https://cloud.google.com/appengine/articles/sharding_counters
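The linked article's sample is Python; a minimal Go rendition of the same idea (the kind name and shard count are my choices) looks like:

    package counter

    import (
        "fmt"
        "math/rand"

        "appengine"
        "appengine/datastore"
    )

    const numShards = 20

    type shard struct {
        Count int
    }

    // Increment bumps one randomly chosen shard in a transaction, spreading
    // the write load over numShards entity groups instead of a single one.
    func Increment(c appengine.Context, name string) error {
        shardName := fmt.Sprintf("%s-%d", name, rand.Intn(numShards))
        return datastore.RunInTransaction(c, func(tc appengine.Context) error {
            key := datastore.NewKey(tc, "CounterShard", shardName, 0, nil)
            var s shard
            if err := datastore.Get(tc, key, &s); err != nil && err != datastore.ErrNoSuchEntity {
                return err
            }
            s.Count++
            _, err := datastore.Put(tc, key, &s)
            return err
        }, nil)
    }

    // Total reads every shard by key and sums them.
    func Total(c appengine.Context, name string) (int, error) {
        total := 0
        for i := 0; i < numShards; i++ {
            key := datastore.NewKey(c, "CounterShard", fmt.Sprintf("%s-%d", name, i), 0, nil)
            var s shard
            err := datastore.Get(c, key, &s)
            if err == datastore.ErrNoSuchEntity {
                continue
            }
            if err != nil {
                return 0, err
            }
            total += s.Count
        }
        return total, nil
    }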

Transactional counter with 5+ writes per second in Google App Engine datastore

I'm developing a tournament version of a game where I expect 1000+ simultaneous players. When the tournament begins, players will be eliminated quite fast (possibly more than 5 per second), but the process will slow down as the tournament progresses. Depending when a player is eliminated from the tournament a certain amount of points is awarded. For example a player who drops first, gets nothing, while player who is 500th, receives 1 point and the first place winner receives say 200 points. Now I'd like to award and display the amount of points right away after a player has been eliminated.
The problem is that when I push a new row into the datastore after a player has been eliminated, the row entity has to be in a separate entity group so I don't hit the GAE datastore limit of 1-5 writes per second for a single entity group. Also, I need to be able to read and write a count of rows consistently so I can determine the prize correctly for all the players that get eliminated.
What would be the best way to implement the datamodel to support this?
Since there's a limited number of players, contention issues over a few writes a second are not likely to be sustained for very long, so you have two options:
1. Simply ignore the issue. Clusters of eliminations will occur, but as long as it's not a sustained situation, the retry mechanics for transactions will ensure they all get executed.
2. When someone goes out, record this independently, and update the tournament status, assigning ranks, asynchronously. This means you can't inform them of their rank immediately, but rather need to send an asynchronous reply or have them poll for it.
I would suggest the former, frankly: Even if half your 1000 person tournament went out in the first 5 minutes - a preposterously unlikely event - you're still looking at less than 2 eliminations per second. In reality, any spikes will be smaller and shorter-lived than that.
One thing to bear in mind is that due to how transaction retries work, transactions on the same entity group that occur together will be resolved in semi-random order - that is, it's not a strict FIFO queue. If you require that, you'll have to enforce it yourself, though that's a far from trivial thing to do in a distributed system of any sort.
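For the first option, here is a sketch of what the transactional elimination might look like (Go; the entity names and the rank formula are my assumptions):

    package tourney

    import (
        "appengine"
        "appengine/datastore"
    )

    // Tournament keeps the running elimination count; Result records one
    // player's final rank. Both live in the tournament's entity group so a
    // single transaction can update them atomically.
    type Tournament struct {
        Eliminated int
    }

    type Result struct {
        Player string
        Rank   int
    }

    // eliminate relies on automatic transaction retries: clustered
    // eliminations simply queue up and commit one after another.
    func eliminate(c appengine.Context, tournamentID, playerID string, totalPlayers int) (rank int, err error) {
        err = datastore.RunInTransaction(c, func(tc appengine.Context) error {
            tKey := datastore.NewKey(tc, "Tournament", tournamentID, 0, nil)
            var t Tournament
            if e := datastore.Get(tc, tKey, &t); e != nil && e != datastore.ErrNoSuchEntity {
                return e
            }
            t.Eliminated++
            rank = totalPlayers - t.Eliminated + 1 // first player out gets the worst rank
            rKey := datastore.NewKey(tc, "Result", playerID, 0, tKey) // same group: atomic
            if _, e := datastore.Put(tc, rKey, &Result{Player: playerID, Rank: rank}); e != nil {
                return e
            }
            _, e := datastore.Put(tc, tKey, &t)
            return e
        }, nil)
        return rank, err
    }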
The existing comments and answers address the specific question pretty well.
At a higher level, take a look at this post and open source library from the Google Code Jam team. They had a similar problem and ended up developing a scalable scoreboard based on the datastore that handles both updates and requests for arbitrary pages efficiently.
