Google NDB update only specific property? - google-app-engine

Is there a way to update only specific property on NDB entity?
Consider this example.
Entity A has following property:
property B
property C
Let's assume both of these properties have values of 1 at the moment.
Two different request are trying to update same entity and they are happening at the same time.
So when Request#1 and #2 are retrieving this entity, value of B and C were 1.
Now Request #1 tries to update property B so it sets the value B to 2 and put into Datastore. Now B = 2 and C = 1 in the datastore.
But, Request #2 has B=1 and C=1 in the memory and when it change C to 2 and put into DB, it put's B=1 and C=2 which overwrites B value written by Request #1.
How do you get around this? Is there way to only write specific property into datastore?

I believe you may want to look into transactions.
As per the documentation:
If the transaction "collides" with another, it fails; NDB automatically retries such failed transactions a few times. Thus, the function may be called multiple times if the transaction is retried.
Link: https://developers.google.com/appengine/docs/python/ndb/transactions

Related

Two-way synchronization between View Objects

I have two view objects in Oracle ADF.
LineVO represents order lines -- with one line per product.
Products are differentiated by several attributes... say "model" and "color". So, VO #1 contains a row for each model/color combination.
ModelVO represents a model-level summary of the lines.
Both VOs have a "quantity" field (an Integer).
There is a ViewLink between them and each has a row accessor to the other.
I want to achieve two-way coordination between these two view objects, such that:
When a user queries data, ModelVO.Quantity equals the sum of LineVO.Quantity, for the associated rows
When a user updates any LineVO.Quantity, the ModelVO.Quantity is immediately updated to reflect the new total
When a user updates a ModelVO.Quantity, the quantity is spread among the associated LineVO rows (according to complex business logic which I hope is not relevant here).
I have tried many different ways to do this and cannot get it perfect.
Right now, I am working on variations where ModelVO.Quantity is set to a Groovy expression "LineVO.sum('Quantity')". Unfortunately, everything I try either has the summing from LineVO->ModelVO working or the spreading from ModelVO->LineVO working, but never both at the same time.
Can someone suggest a way to do this? I want to do it in the model layer (either a EO or VO or combination).
Nevermind.. it turns out to be simple:
ModelVO.Quantity must be set to a Groovy "LineVO.sum('Quantity')" and it must have a recalcExpression set to a method where I can control things so it only recalculates when I am changing a LineVO.Quantity value.
The reason my approach didn't work initially was because, when the user updated a LineVO.Quantity value and I wanted to recalculate, I was getting the ModelVO row by lineVORow.getModelVO()... i.e., via a view accessor.
Apparently, that returns some sort of internal copy of the row and not the actual row.
When I got the parent row via the applicationModule.getModelVO().getCurrentRow(), the whole thing works perfectly.
I've posted another question about why accessing the row via the view accessor did not work. That part is still a mystery to me.

Cost of updating entities in datastore (and, possible to append properties)?

I have a two part question.
Let's say I have a entity with a blob property...
# create entity
Entity(ndb.Model):
blob = ndb.BlobProperty(indexed=False)
e = Entity()
e.blob = 'abcd'
e_key = e.put()
# update entity
e = e_key.get()
e.blob += 'efg'
e.put()
So questions are:
The first time I put() that entity, the cost is 2 Write Ops; how many Ops does it cost to update the entity, as in the above example?
When I added 'efg' to the property, the old property had to be read into memory first, does app engine provide a way to append the old value without reading it first?
There are no partial updates. Every time you overwrite the whole entity. Numbers of indexes will also have an impact on cost. You might like to have a look at https://developers.google.com/appengine/articles/life_of_write for a detailed breakdown of what happens.

Datastore - Resource contention in a single entity group -

I am getting lost on the following regarding the Datastore :
It is recommended to denormalize data as the Datastore does not support join queries. This means that the same information is copied in several entities
Denormalization means that whenever you have to update
data, it must be updated in different entities
But there is a limit of 1 write / second in a single entity group.
The problem I have is therefore the following :
In order to update records, I open a transaction then
Update all the required entities. The entities to be updated are within the same entity group but relate to different kinds
I am getting a "resource contention" exception
==> It seems therefore that the only way to update denormalized data is outside of a transaction. But doing this is really bad as some entities could be updated whereas other entities wouldn't.
Am I the only one having this problem ? How did you solve it ?
Thanks,
Hugues
The (simplified version of the ) code is as follows :
Objectify ofy=ObjectifyService.beginTransaction();
try {
Key<Party> partyKey=new Key<Party>(realEstateKey, Party.class, partyDTO.getId());
//--------------------------------------------------------------------------
//-- 1 - We update the party
//--------------------------------------------------------------------------
Party party=ofy.get(partyKey);
party.update(partyDTO);
//---------------------------------------------------------------------------------------------
//-- 2 - We update the kinds which have Party as embedded field, all in the same entity group
//---------------------------------------------------------------------------------------------
//2.1 Invoices
Query<Invoice> q1=ofy.query(Invoice.class).ancestor(realEstateKey).filter("partyKey", partyKey);
for (Invoice invoice: q1) {
invoice.setParty(party);
ofy.put(invoice);
}
//2.2Payments
Query<Payment> q2=ofy.query(Payment.class).ancestor(realEstateKey).filter("partyKey", partyKey);
for (Payment payment: q2) {
payment.setParty(payment);
ofy.put(payment);
}
}
ofy.getTxn().commit();
return (RPCResults.SUCCESS);
}
catch (Exception e) {
final Logger log = Logger.getLogger(InternalServiceImpl.class.getName());
log.severe("Problem while updating party : " + e.getLocalizedMessage());
return (RPCResults.FAILURE) ;
}
finally {
if (ofy.getTxn().isActive()) {
ofy.getTxn().rollback();
partyDTO.setCreationResult(RPCResults.FAILURE);
return (RPCResults.FAILURE) ;
}
}
This is happening because multiple requests to update the same entity group are occurring in a short period of time, not because you are updating many entities in the same entity group at once.
Since you have not shown your code, I can assume one of two things are happening:
The method you describe above is not actually using a transaction and you are running put_multi() with many entities of the same entity group. (If I had to guess, it'd be this.)
You have a high-traffic site and many other updates are simultaneously occurring at the same time.
Just in case someones gets in the same issue.
The problem was in the party.update(partyDTO) where under some specific conditions, I was initiating another transaction.
What I learned today is that :
--> Inside a transaction, you are allowed to include multiple puts even getting over the 1 entity / second
--> However, you should take care not initiating another transaction within your transaction

Django: Avoiding ABA scenario in Database and ORM’s

I have a situation where I need to update votes for a candidate.
Citizens can vote for this candidate, with more than one vote per candidate. i.e. one person can vote 5 votes, while another person votes 2. In this case this candidate should get 7 votes.
Now, I use Django. And here how the pseudo code looks like
votes = candidate.votes
vote += citizen.vote
The problem here, as you can see is a race condition where the candidate’s votes can get overwritten by another citizen’s vote who did a select earlier and set now.
How can avoid this with an ORM like Django?
If this is purely an arithmetic expression then Django has a nice API called F expressions
Updating attributes based on existing fields
Sometimes you'll need to perform a simple arithmetic task on a field, such as incrementing or decrementing the current value. The obvious way to achieve this is to do something like:
>>> product = Product.objects.get(name='Venezuelan Beaver Cheese')
>>> product.number_sold += 1
>>> product.save()
If the old number_sold value retrieved from the database was 10, then the value of 11 will be written back to the database.
This can be optimized slightly by expressing the update relative to the original field value, rather than as an explicit assignment of a new value. Django provides F() expressions as a way of performing this kind of relative update. Using F() expressions, the previous example would be expressed as:
>>> from django.db.models import F
>>> product = Product.objects.get(name='Venezuelan Beaver Cheese')
>>> product.number_sold = F('number_sold') + 1
>>> product.save()
This approach doesn't use the initial value from the database. Instead, it makes the database do the update based on whatever value is current at the time that the save() is executed.
Once the object has been saved, you must reload the object in order to access the actual value that was applied to the updated field:
>>> product = Products.objects.get(pk=product.pk)
>>> print product.number_sold
42
Perhaps the select_for_update QuerySet method is helpful for you.
An excerpt from the docs:
All matched entries will be locked until the end of the transaction block, meaning that other transactions will be prevented from changing or acquiring locks on them.
Usually, if another transaction has already acquired a lock on one of the selected rows, the query will block until the lock is released. If this is not the behavior you want, call select_for_update(nowait=True). This will make the call non-blocking. If a conflicting lock is already acquired by another transaction, DatabaseError will be raised when the queryset is evaluated.
Mind that this is only available in the Django development release (i.e. > 1.3).

How to filter rows with null refrences in Google app engine DB

I have a Model UnitPattern, which reference another Model UnitPatternSet
e.g.
class UnitPattern(db.Model):
unit_pattern_set = db.ReferenceProperty(UnitPatternSet)
in my view I want to display all UnitPatterns having unit_pattern_set refrences as None, but query UnitPattern.all().filter("unit_pattern_set =", None) returns nothing, though I have total 5 UnitPatterns, out of which 2 have 'unit_pattern_set' set and 3 doesn't have
e.g.
print 'Total',UnitPattern.all().count()
print 'ref set',UnitPattern.all().filter("unit_pattern_set !=", None).count()
print 'ref not set',UnitPattern.all().filter("unit_pattern_set =", None).count()
outputs:
Total 5
ref set 2
ref not set 0
Shouldn't sum of query 2 and 3 be equal to query 1 ?
Reason seems to be that I added reference property unit_pattern_set later on, and these UnitPattern objects existed before that, but then how can I filter such entities?
This is described succinctly in the docs:
An index only contains entities that
have every property referred to by the
index. If an entity does not have a
property referred to by an index, the
entity will not appear in the index,
and will never be a result for the
query that uses the index.
Note that
the App Engine datastore makes a
distinction between an entity that
does not possess a property and an
entity that possesses the property
with a null value (None). If you want
every entity of a kind to be a
potential result for a query, you can
use a data model that assigns a
default value (such as None) to
properties used by query filters.
In your case, you have 3 entities that don't have the unit_pattern_set property set at all (because that property wasn't defined in the Model at the time those entities were created) - therefore those properties doesn't exist in the database representation of that entity, therefore that entity does not appear in the index of that property for that kind of entity.
Dan Sanderson's book Programming Google App Engine explains this in great detail on ~page 150 (unfortunately not available in the Google Books preview)
To fix the models you already have, you'll have to iterate over a query on UnitPattern (I've not tested the following code, please check it before you run it on your live data):
patterns = UnitPattern.all()
for pattern in patterns:
if not pattern.unit_pattern_set:
pattern.unit_pattern_set = None
pattern.put()
Edit: Also, the Updating you model's schema article discuss strategies you can use to handle schema changes such as this in future. However, that article is quite old and its method requires a web browser to keep hitting a url to trigger the next job to update more records - now that Task Queues exist, you could use a series of Tasks to make the change. The article on using deferred.defer has a framework you could utilise - it does a small amount of work, catches the DeadlineExceededError, and uses the handler to queue a new task which picks up where the current task left off.

Resources