I am building an order system on Google App Engine.
Suppose I have the following model of Order:
class Order(ndb.Model):
    time = ndb.DateTimeProperty()
The date plus a serial number, as a string, is used as the entity's id:
today = str(datetime.datetime.now())[:10].replace('-', '')
# Fetch today's latest order, if any
orders = Order.query(...).order(-Order.time).fetch(1)
if len(orders) == 0:  # No entities: today's first order
    orderNum = today + '00001'
else:
    orderNum = str(int(orders[0].key.id()) + 1)
order = Order(id=orderNum)
order.time = datetime.datetime.now()
order.put()
Suppose there are multiple clerks and they may execute the program at the same time. An obvious problem is that they may obtain the same orderNum and actually write to the same entity.
How do I prevent this situation from happening?
You could use something similar to the code below.
@ndb.transactional(retries=3)
def create_order_if_not_exists(order):
    exists = order.key.get()
    if exists is not None:
        raise Exception("Order already exists: ID=%s" % order.key.id())
    order.put()
Or you can use the Model class method get_or_insert(), which transactionally retrieves an existing entity or creates a new one. See the NDB documentation for details.
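For example, a minimal sketch using get_or_insert() with the order number from the question (the time property comes from the Order model above):
import datetime

# Transactionally fetch the Order with this id, or create it if absent.
# Unlike the transactional function above, get_or_insert() silently returns
# an existing entity instead of raising, so inspect the result if you need
# to detect that the order number was already taken.
order = Order.get_or_insert(orderNum, time=datetime.datetime.now())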
I am trying to consume an HTTP API in my Dagster code. The API provides a log of "changes" which contain an incrementing ID. It supports an optional parameter fromUpdateId, which lets you only fetch updates that have a higher ID than some value.
This should let me do incremental loads, by looking at the highest update ID I have seen so far, and providing this as the parameter.
How can I accomplish this in Dagster? I am thinking it should be possible to write the highest ID as metadata when I materialize the asset. The metadata would then be available the next time the asset is materialized.
"I am thinking it should be possible to write the highest ID as metadata when I materialize the asset. The metadata would then be available the next time the asset is materialized."
That sounds like the right approach to me.
Here's some Python code that implements that approach:
from dagster import asset, Output

@asset
def asset1(context):
    asset_key = context.asset_key_for_output()
    latest_materialization_event = context.instance.get_latest_materialization_events(
        [asset_key]
    ).get(asset_key)
    if latest_materialization_event:
        materialization = (
            latest_materialization_event.dagster_event.event_specific_data.materialization
        )
        metadata = {entry.label: entry.entry_data for entry in materialization.metadata_entries}
        cursor = metadata["cursor"].value
    else:
        cursor = 0
    return Output(value=..., metadata={"cursor": cursor + 1})
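To close the loop with the API described in the question, the recovered cursor can then be passed as fromUpdateId when computing the asset's value. A minimal sketch, with a hypothetical endpoint URL and response shape:
import requests

cursor = 0  # recovered from the previous materialization's metadata, as above
# Hypothetical endpoint and JSON shape; the real API will differ.
response = requests.get(
    "https://example.com/api/changes",
    params={"fromUpdateId": cursor},
)
updates = response.json()
# Advance the cursor to the highest update ID seen so far.
new_cursor = max((u["id"] for u in updates), default=cursor)
new_cursor would then go into the Output metadata in place of cursor + 1.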
I have an NDB model class like the one below:
class Contest(ndb.Model):
    end_date = ndb.DateProperty(required=True, indexed=True)
    ended = ndb.BooleanProperty(required=False, indexed=True)
    ...
I will have a daily cron job that marks contests whose end_date has passed by setting ended to True. I've written the following code to do it:
contests = Contest.query()
current_datetime = datetime.datetime.utcnow()
today = datetime.date(current_datetime.year, current_datetime.month, current_datetime.day)
contests = contests.filter(Contest.end_date < today)
contests = contests.filter(Contest.ended == False)
contests = contests.fetch(limit=10)
for contest in contests:
    contest.ended = True
ndb.put_multi(contests)
But I don't like it, since I have to read all entities just to update one value. Is there any way to modify it to read keys_only?
The object data overwrites the existing entity. The entire object is sent to Datastore.
https://cloud.google.com/datastore/docs/concepts/entities#Datastore_Updating_an_entity
So you cannot send only one field of an entity; doing so would "remove" all the existing fields. More precisely, it would replace the entity, with all its fields, with a new version of the entity that has only that one field.
You have to load each entity you want to update with all of its properties, not just the keys, set the new value of the property, and put it back into the database.
I think a Python property is a good solution here:
from datetime import date

class Contest(ndb.Model):
    end_date = ndb.DateProperty(required=True, indexed=True)

    @property
    def ended(self):
        return self.end_date < date.today()
This way you don't ever need to update your entities. The value is automatically computed whenever you need it.
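And wherever the ended contests are still needed (say, in place of the cron job's query), you can filter on end_date directly, since a plain Python property can't be used in a datastore query. A minimal sketch:
from datetime import date

# Select contests whose end_date has passed; no stored "ended" flag needed.
ended_contests = Contest.query(Contest.end_date < date.today()).fetch(limit=10)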
We have an app that is made up of almost entirely custom objects. Creating a smooth workflow for our users is vital for this app. I'm new to apex, but familiar with basic code writing.
Here are the relationships between the objects: (--< = 1 to many, M/D = Master/Detail)
Object_a --< Object_b --M/D--< Object_c;
Object_a --M/D--< Object_d
When a user populates a date field on Object_c and saves, we'd like a date field on all related records on Object_d (i.e. all Object_d's records for that specific Object_a record) to be updated with the same value.
Any help is appreciated.
You'll need to write a trigger on insert/update of Object_C__c, since workflow rules can't update multiple related objects and roll-up summary fields won't help (master-detail relationships aren't set everywhere).
Before you dive deep into coding, please consider what should happen if more than one Object_C__c record is modified at the same time (with Data Loader, for example, through an integration with some external system, or from another piece of code). If you have two records that are both related to the same Object_A and one has "today" while the other has "today + 7", which value should "win"?
One way of solving it would be to start with a map of Object_A ids and the date values to set. The way I'll be building the map means I'll keep overwriting the values in case of duplicate ids, which might not be what you need (select the later of the dates, maybe?).
Map<Id, Date> datesToSet = new Map<Id, Date>(); // you could make it Map<Id, Object_C__c> too, principle would be similar
for(Object_C__c c : [SELECT Id, Date_Field__c, Object_B__r.Object_A__c FROM Object_C__c WHERE Id IN :trigger.new AND Object_B__r.Object_A__c != null]){
    datesToSet.put(c.Object_B__r.Object_A__c, c.Date_Field__c);
}
System.debug(datesToSet);
Now we have a map of unique A ids and the values we should apply to all of their child D records.
List<Object_D__c> childsToUpdate = [SELECT Id, Some_Other_Date_Field__c, Object_A__c FROM Object_D__c WHERE Object_A__c IN :datesToSet.keyset()];
System.debug('BEFORE: ' + childsToUpdate);
for(Object_D__c d : childsToUpdate){
    d.Some_Other_Date_Field__c = datesToSet.get(d.Object_A__c);
}
System.debug('AFTER: ' + childsToUpdate);
update childsToUpdate;
I'm trying to wrap my head around eventual consistency and the one-write-per-second-per-entity-group principle in the GAE datastore. I have a scenario and two questions:
# Python-like pseudo-code
class User:
    user_id = StringProperty
    last_update_time = DateTimeProperty

class Comment:
    user_id = StringProperty
    comment = StringProperty
    ...

def AddCommentAndReturnAllComments(user_id):
    user = db.GqlQuery("SELECT * FROM User where user_id = :1", user_id)
    user.last_update_time = datetime.now()
    user.put()
    comment = Comment(parent=User(user_id))
    comment.put()
    comments = db.GqlQuery("SELECT * FROM Comment where user_id = :1", user_id)
    return comments
Questions:
1. Will I get an exception here because I make two writes into the same entity group within one second (user.put and comment.put)? Is there a simple way around it?
2. If I remove the parent=User(user_id), the two entities will no longer belong to the same entity group. Does it mean that the list of comments returned from the function might not contain the last added comment?
3. Am I doing something inherently wrong?
I know that I got the entity referencing part wrong. It doesn't matter for the question (or does it?)
Taking the questions in order:
1. This seems to be a soft limit; in practice I see up to 5 writes per second allowed.
2. Yes, and it can also happen now, even with the parent set, because you are not using an ancestor query.
3. Nothing, except as mentioned in point 2.
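For point 2, a minimal sketch of an ancestor query against the question's models, assuming the comment was created with the user's key as its parent (user_key here is that key):
# Ancestor queries are strongly consistent, so a comment that was just
# put() into the same entity group will show up in the results.
comments = db.GqlQuery("SELECT * FROM Comment WHERE ANCESTOR IS :1", user_key)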
How do I use this query counter with an existing class? The main purpose: I have a datastore model of members with more than 3,000 records, and I just want to count the total number of records. I found this in the App Engine cookbook:
def query_counter(q, cursor=None, limit=500):
    if cursor:
        q.with_cursor(cursor)
    count = q.count(limit=limit)
    if count == limit:
        return count + query_counter(q, q.cursor(), limit=limit)
    return count
My existing model is:
class Members(search.SearchableModel):
    group = db.ListProperty(db.Key, default=[])
    email = db.EmailProperty()
    name = db.TextProperty()
    gender = db.StringProperty()
Further, I want to count the members that joined a certain group via the list reference. That might also be more than 1,000 records.
Anyone have experience with query.cursor for this purpose?
To find all members you would use it like this:
num_members = query_counter(Members.all())
However, you may find that this runs slowly, because it's making a lot of datastore calls.
A faster way would be to have a separate model class (e.g. MembersCount) and maintain the count in there (i.e. add 1 when you create a member, subtract 1 when you delete a member).
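A minimal sketch of that idea, using the db API from the question (the MembersCount model and its key_name are illustrative):
class MembersCount(db.Model):
    count = db.IntegerProperty(default=0)

def change_members_count(delta):
    # Run in a transaction so concurrent updates don't lose increments.
    def txn():
        counter = MembersCount.get_by_key_name('members')
        if counter is None:
            counter = MembersCount(key_name='members')
        counter.count += delta
        counter.put()
    db.run_in_transaction(txn)
You would call change_members_count(1) when a member is created and change_members_count(-1) when one is deleted.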
If you are creating and deleting members frequently you may need to create a sharded counter in order to get good performance - see here for details:
http://code.google.com/appengine/articles/sharding_counters.html
To count members in a certain group, you could do something like this:
group = ...
num_members = query_counter(Members.all().filter('group =', group.key()))
If you expect to have large numbers of members in a group, you could also do that more efficiently by using a counter model which is sharded by the group.