Can I define a read-only transaction using GAE's JDO? - google-app-engine

I'm using the latest versions of the GWT/GAE stack with JDO.
I have a task queue updating persistent objects in the datastore.
I also have a GWT user interface displaying the saved objects (without modifying them).
Given tightly defined transaction (start/commit) boundaries, is there a way for me to define a read-only transaction for the GUI that does not conflict with the task updating the objects?
I believe they are conflicting and throwing these exceptions (abridged):
javax.jdo.JDODataStoreException: Transaction rolled back due to failure during commit
at org.datanucleus.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:402)
at org.datanucleus.jdo.JDOTransaction.commit(JDOTransaction.java:132)
....
NestedThrowablesStackTrace:
java.sql.SQLException: Concurrent Modification
at org.datanucleus.store.appengine.DatastoreTransaction.commit(DatastoreTransaction.java:70)

The App Engine datastore actually uses optimistic concurrency, not locking. That means a transaction that does only reads will not interfere or cause contention with other writes, or with transactions that include writes.
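To see why reads don't contend, here is a minimal toy model of the optimistic check (plain Python, not the GAE or JDO API; all class names are made up for illustration). The datastore notes the entity group's version when a transaction starts and re-checks it at commit; a transaction with no writes has nothing to check, so it can never fail that comparison.

```python
class ConcurrentModificationError(Exception):
    pass

class Datastore:
    """Toy entity group: the version bumps on every committed write."""
    def __init__(self):
        self.version = 0

class Txn:
    """Toy optimistic transaction (illustrative only)."""
    def __init__(self, store):
        self.store = store
        self.start_version = store.version  # snapshot at txn start
        self.writes = []

    def put(self, value):
        self.writes.append(value)

    def commit(self):
        if not self.writes:
            return True  # read-only: nothing to validate, never conflicts
        # Optimistic check: fail if someone else committed in between.
        if self.store.version != self.start_version:
            raise ConcurrentModificationError("group changed since txn start")
        self.store.version += 1
        return True

store = Datastore()
reader = Txn(store)   # GUI request: reads only
writer = Txn(store)   # task-queue request: writes
writer.put("updated entity")
writer.commit()            # succeeds and bumps the group version
assert reader.commit()     # read-only commit is unaffected by the write
```

In this model the reader and writer overlap freely; only two overlapping *writers* would collide, which matches the exception in the question.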

Related

Isolation Level vs Optimistic Locking-Hibernate , JPA

I have a web application where I want to ensure concurrency with a DB-level lock on the object I am trying to update. I want to make sure that a batch change, another user, or another process cannot introduce inconsistency into the DB.
I see that isolation levels ensure read consistency, and that an optimistic lock with the @Version field can ensure data is written in a consistent state.
My question is: can't we ensure consistency with the isolation level alone? By making every transaction that updates the record Serializable (setting performance aside), will I not ensure that a proper lock is taken by the transaction, and that any other transaction trying to update or lock the same record will fail?
Do I really need version or timestamp management for this?
Depending on the isolation level you've chosen, a specific resource is locked until the given transaction commits or rolls back; the lock may cover a whole table, a row, or a range. This is pessimistic locking, and it is enforced at the database level while the transaction runs.
Optimistic locking, on the other hand, assumes that multiple transactions rarely interfere with each other, so no locks are required. It is an application-side check that uses the @Version attribute to establish whether the version of a record has changed between fetching it and attempting to update it.
It is reasonable to use the optimistic locking approach in web applications, because most operations span multiple HTTP requests. Usually you fetch some information from the database in one request and update it in another. It would be very expensive and unwise to keep transactions open, with locks held on database resources, for that long. That's why we assume that nobody else is going to touch the set of data we're working on; it's cheaper. If the assumption turns out to be wrong and the version has changed between requests, Hibernate won't update the row and will throw an OptimisticLockException. As a developer, you are responsible for handling this situation.
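The @Version check described above can be sketched independently of any ORM. This toy Python model (not the Hibernate/JPA API; the row class and helper names are invented) mirrors the compare-and-bump the provider performs in its `UPDATE ... WHERE version = ?` statement:

```python
class OptimisticLockException(Exception):
    pass

class VersionedRow:
    """Toy row with a version column, as maintained by @Version."""
    def __init__(self, data):
        self.data = data
        self.version = 0

def fetch(row):
    # Request 1: read the row together with its current version.
    return dict(row.data), row.version

def update(row, new_data, expected_version):
    # Request 2: the optimistic check -- only succeed if the version
    # still matches what was read, then bump it.
    if row.version != expected_version:
        raise OptimisticLockException("row changed since it was read")
    row.data = new_data
    row.version += 1

item = VersionedRow({"price": 100})
data_a, ver_a = fetch(item)   # user A reads version 0
data_b, ver_b = fetch(item)   # user B also reads version 0
update(item, {"price": 120}, ver_a)      # A's update wins; version -> 1
rejected = False
try:
    update(item, {"price": 110}, ver_b)  # B's stale update is rejected
except OptimisticLockException:
    rejected = True
assert rejected and item.data["price"] == 120
```

No lock is held between the two requests; only the loser of the race pays a cost, which is the trade-off the answer describes.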
A simple example: an online auction service. You're viewing an item page, reading its description and specification; all of this takes, say, 5 minutes. With pessimistic locking and some isolation levels you'd block other users from this particular item page (or even from all items!). With optimistic locking everybody can access it. After reading about the item you decide to bid on it, so you click the appropriate button. If any other user watching this item changed its state in the meantime (the owner edited the description, someone else bid on it), you will probably (depending on the app's implementation) be informed about the change before the application accepts your bid, because the version you read is no longer the version persisted in the database.
Hope that clarifies a few things for you.
Unless we are talking about a small, isolated web application (the only app working against the database), making all of your transactions Serializable would mean having a lot of confidence in your design, and would ignore the fact that yours may not be the only application hitting that database.
In my opinion, incorporating the Serializable isolation level (in other words, a pessimistic lock) should be a very well-thought-out decision, applied for:
Large databases and short transactions that update only a few rows
Where the chance that two concurrent transactions will modify the same rows is relatively low.
Where relatively long-running transactions are primarily read-only.
Based on my experience, in most cases using just optimistic locking is the most beneficial decision, as frequent concurrent modifications occur in only a small percentage of cases.
Optimistic locking definitely also helps other applications run faster (don't think only of yourself!).
So on the pessimistic-to-optimistic locking spectrum, in my opinion the truth lies somewhere towards optimistic locking, with a flavor of Serializable here and there.
I can't really cite anything here, as this answer is based on my personal experience with many complex web projects and on my notes from when I was preparing for my JPA certification.
Hope that helps.

GAE: How to rollback a transaction?

I just read this great summary of GAE best practices: https://cloud.google.com/datastore/docs/best-practices
One of them is:
If a transaction fails, ensure you try to rollback the transaction.
The rollback minimizes retry latency for a different request
contending for the same resource(s) in a transaction. Note that a
rollback itself may fail, so the rollback should be a best-effort
attempt only.
I thought that transaction rollback was something that GAE did for you, but the above quote says that you should do it yourself.
The documentation here also says you should do a rollback but does not say how.
So, how do I rollback a transaction in GAE Python?
The best practices document is for using Cloud Datastore directly through its API or client libraries.
Manual rollback is only necessary in the flexible App Engine environment, and even there the Cloud Datastore client library provides a context manager that handles rollbacks automatically. This example code is from the docs:
def transfer_funds(client, from_key, to_key, amount):
    with client.transaction():
        from_account = client.get(from_key)
        to_account = client.get(to_key)
        from_account['balance'] -= amount
        to_account['balance'] += amount
        client.put_multi([from_account, to_account])
The docs state:
By default, the transaction is rolled back if the transaction block exits with an error
Be aware that the client library is still in Beta, so the behaviour could change in future.
In the standard App Engine environment, the ndb library provides automatic transaction rollback:
The NDB Client Library can group multiple operations in a single transaction. The transaction cannot succeed unless every operation in the transaction succeeds; if any of the operations fail, the transaction is automatically rolled back.
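The automatic-rollback behaviour both libraries describe boils down to a context manager that commits on clean exit and issues a best-effort rollback on error. Here is a minimal stand-alone sketch of that pattern (a hypothetical `Transaction` class, not the `google.cloud.datastore` or ndb API):

```python
class Transaction:
    """Toy transaction with commit/rollback bookkeeping."""
    def __init__(self):
        self.state = "active"

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "rolled back"

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        if exc_type is None:
            self.commit()
        else:
            try:
                self.rollback()  # best-effort: the rollback itself may fail
            except Exception:
                pass
        return False  # do not swallow the original error

txn = Transaction()
try:
    with txn:
        raise RuntimeError("operation failed mid-transaction")
except RuntimeError:
    pass
assert txn.state == "rolled back"
```

The `try/except Exception: pass` around the rollback is exactly the "best-effort attempt only" the best-practices quote asks for.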

Does the Google Cloud Datastore create transactions implicitly?

In many databases, when an operation is performed without explicitly starting a transaction, the database creates a new transaction implicitly.
Does the datastore do this?
If it does not, is there any model for reasoning about how the data changes in the absence of transactions? How do puts, fetches, and reads, work outside of transactions?
If it does, is there any characterization of when and how? Does it always do this? What is the scope of the transaction?
A mutation (put, delete) of a single entity will always be atomic (succeed entirely or fail entirely). You can think of the single mutation as transactional, even if you did not provide a transaction.
However, if you send multiple mutations in the same non-transactional request, that overall request is not atomic. Each mutation may succeed or fail independently -- one failure will not cause the other mutations to be reverted.
"Transactions are an optional feature of the Datastore; you're not required to use transactions to perform Datastore operations."
So no automatic transaction is opened for you across more than a single-entity datastore operation.
A single-entity commit behaves like a transaction internally. So if you are changing more than one entity, or committing the same entity more than once, it's as if you open and close a transaction each time.
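The distinction can be sketched as follows: without a transaction each mutation in a request is applied (or fails) independently, while inside a transaction either all apply or none do. This is a toy model in plain Python, not the Datastore API:

```python
def apply_mutations(store, mutations, transactional):
    """Toy model: each mutation is (key, value); value None simulates
    a mutation that fails."""
    if transactional:
        staged = dict(store)          # work on a copy...
        for key, value in mutations:
            if value is None:
                raise ValueError("bad mutation")  # nothing is applied
            staged[key] = value
        store.clear()
        store.update(staged)          # ...then apply all-or-nothing
        return []
    errors = []
    for key, value in mutations:      # non-transactional: independent
        try:
            if value is None:
                raise ValueError("bad mutation")
            store[key] = value
        except ValueError as err:
            errors.append((key, err))
    return errors

store = {}
apply_mutations(store, [("a", 1), ("b", None), ("c", 3)], transactional=False)
assert store == {"a": 1, "c": 3}   # "a" and "c" survive despite "b" failing

store2 = {}
try:
    apply_mutations(store2, [("a", 1), ("b", None)], transactional=True)
except ValueError:
    pass
assert store2 == {}                # nothing applied
```

The first call mirrors a non-transactional multi-mutation request; the second mirrors grouping the same mutations in a transaction.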

Google Datastore Transactions Optimistic Concurrency Control or not?

The question is fairly simple: do Google Datastore transactions use optimistic concurrency control or not?
One part of the documentations says that it does:
When a transaction starts, App Engine uses optimistic concurrency control by checking the last update time for the entity groups used in the transaction. Upon committing a transaction for the entity groups, App Engine again checks the last update time for the entity groups used in the transaction. If it has changed since our initial check, an exception is thrown. Source
Another part of the documentation indicates that it doesn't:
When a transaction is started, the datastore rejects any other attempts to write to that entity group before the transaction is complete. To illustrate this, say you have an entity group consisting of two entities, one at the root of the hierarchy and the other directly below it. If these entities belonged to separate entity groups, they could be updated in parallel. But because they are part of the same entity group, any request attempting to update one of the entities will necessarily prevent a simultaneous request from updating any other entity in the same group until the original request is finished. Source
As I understand it, the first quote tells me that it is fine to start a transaction, read an entity, and skip closing the transaction if I saw no reason to update the entity.
The second quote tells me that if I start a transaction and read an entity, I should always remember to close it again; otherwise I cannot start a new one on the same entity.
Which part of the documentation is correct?
BTW, in case the correct quote is the second one: I am using Objectify to handle all my transactions. Will it remember to close all started transactions, even though no changes were made?
The commenter (Greg) is correct. Whether or not you explicitly close a transaction, all transactions are closed by the container at the end of a request. You can't "leak" transactions (although you could screw up transactions within a single request).
Furthermore, with Objectify's transaction API, transactions are automatically opened and closed for you when you execute a unit of Work. You don't manage transactions yourself.
To answer your root question: Yes, all transactions in the GAE datastore are optimistic. There is no pessimistic locking in the datastore; you can start as many transactions as you want on a single entity group but only the first commit will succeed. All subsequent attempts to commit will rollback with ConcurrentModificationException.
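The "only the first commit will succeed" rule reconciles the two quotes, and can be modeled in a few lines (a toy model, not the datastore API): several transactions open against the same entity group, the first commit wins, and the rest fail at commit time rather than being blocked at start time.

```python
class ConcurrentModificationException(Exception):
    pass

class EntityGroup:
    """Toy entity group; last_update bumps on each successful commit."""
    def __init__(self):
        self.last_update = 0

class Transaction:
    def __init__(self, group):
        self.group = group
        self.seen = group.last_update  # timestamp noted at txn start

    def commit(self):
        # Optimistic check at commit time, as in the first quote.
        if self.group.last_update != self.seen:
            raise ConcurrentModificationException()
        self.group.last_update += 1

group = EntityGroup()
t1, t2, t3 = (Transaction(group) for _ in range(3))
t1.commit()          # first commit wins
failed = 0
for t in (t2, t3):
    try:
        t.commit()   # stale snapshots are rejected, not blocked
    except ConcurrentModificationException:
        failed += 1
assert failed == 2
```

Note that all three transactions were allowed to *start* concurrently; the contention only surfaces at commit, which is what makes this optimistic rather than pessimistic.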

app engine datastore transaction exception

In app engine transactions documentation I have found the following note:
Note: If your app receives an exception when submitting a transaction,
it does not always mean that the transaction failed. You can receive
Timeout, TransactionFailedError, or InternalError exceptions in cases
where transactions have been committed and eventually will be applied
successfully. Whenever possible, make your Datastore transactions
idempotent so that if you repeat a transaction, the end result will be
the same.
This is quite general information and I wasn't able to find more details. I have the following questions regarding this issue:
Does it affect NDB transactions? The NDB documentation doesn't
mention it, but I suppose this behavior is inherited.
What can cause this type of situation?
How often can it happen?
Can I prevent it, or decrease probability?
Are transactional tasks enqueued in this situation?
Is this situation a bug, which will be fixed in the future, or a feature, which I should just get used to?
Yes, it affects ndb too.
Potential causes include network partitions where the datastore server commits successfully but cannot communicate the result to the app.
It is rare, but cannot be prevented, and will never be fixed. It is inherent to all distributed systems.
Task queue adds are committed with the transaction by the datastore server.
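The idempotency advice from the quoted note can be made concrete: since a commit may have succeeded even though the client saw a Timeout or InternalError, a safe retry must not re-apply the effect. A toy sketch (plain Python, not the GAE API; the operation-id scheme is one possible approach) deduplicates by operation id:

```python
def credit(store, op_id, account, amount):
    """Idempotent mutation: each op_id is applied at most once."""
    if op_id in store["applied"]:
        return  # already committed on an earlier attempt; retry is a no-op
    store[account] = store.get(account, 0) + amount
    store["applied"].add(op_id)

store = {"applied": set()}
credit(store, "op-1", "alice", 50)
credit(store, "op-1", "alice", 50)  # retry after an ambiguous error
assert store["alice"] == 50         # applied exactly once
```

With this shape, blindly retrying after any of the ambiguous exceptions is safe, which is exactly what "make your Datastore transactions idempotent" buys you.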
