Objectify - test if object exists using only the key?

I want to test if an object exists in the datastore. I know its key. I am doing this right now by loading the entire object:
public boolean doesObjectExist(String knownFooId) {
    Key<Foo> key = Key.create(Foo.class, knownFooId);
    Foo foo = ofy().load().key(key).now();
    if (foo != null) {
        // Yes, it exists.
        return true;
    }
    return false;
}
That must cost 1 read operation from the datastore. Is there a cheaper way to do this without having to load the entire object? In other words, a way that would only cost 1 "small" operation?
Thanks

There's no way to do it cheaper.
Even if you just do a keys-only query, the query costs 1 Read operation + 1 Small operation per key fetched. (https://cloud.google.com/appengine/pricing#costs-for-datastore-calls)
Keep doing a get by key, which is just 1 Read.
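For reference, the question's get-by-key check can be condensed to a one-liner (same Foo entity and ofy() helper as above; this is just a compacted sketch of the question's own code):
public boolean doesObjectExist(String knownFooId) {
    // A get by key costs 1 Read; null means the entity does not exist.
    return ofy().load().key(Key.create(Foo.class, knownFooId)).now() != null;
}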

public boolean doesObjectExist(String knownFooId) {
    Key<Foo> fooKey = Key.create(Foo.class, knownFooId);
    Key<Foo> datastoreKey = ofy().load().type(Foo.class).filterKey(fooKey).keys().first().now();
    // Compare in this order: datastoreKey is null when no entity matches.
    return fooKey.equals(datastoreKey);
}
From the documentation:
QueryKeys keys()
Switches to a keys-only query. Keys-only responses are billed as "minor datastore operations" which are faster and free compared to fetching whole entities.

You could try to fetch just the key; as far as I understand, that would only be a small operation.
// You can query for just keys, which will return Key objects much more efficiently than fetching whole objects
Iterable<Key<Foo>> allKeys = ofy().load().type(Foo.class).filter("id", knownFooId).keys();
It should work. Also take a look at the Objectify docs: https://github.com/objectify/objectify/wiki/Queries

Related

objectify cache not working

I am using GAE for my server, where I have all my entities in Datastore. One of my entities has more than 2000 records, and it is taking almost 30 secs to read them all. So I wanted to use a cache to improve performance.
I have tried the Datastore Objectify @Cache annotation, but I am not finding how to read from the stored cache. I have declared the entity as below:
@Entity
@Cache
public class Devices {
}
The second thing I tried is Memcache. I am storing the whole List<Devices> under a key, but it is not being stored; I couldn't see it in the console Memcache, but at the same time there are no errors or exceptions while storing the objects.
putvalue("temp", List<Devices>)
public void putValue(String key, Object value) {
Cache cache = getCache();
logger.info(TAG + "getCache() :: storing memcache for key : " + key);
try {
if (cache != null) {
cache.put(key, value);
}
}catch (Exception e) {
logger.info(TAG + "getCache() :: exception : " + e);
}
}
When I try to retrieve it using getValue("temp"), it returns null or empty:
Object object = cache.get(key);
My main objective is to get all the records of the entity within 5 secs.
Can anyone suggest what I am doing wrong here? Or any better solution to retrieve the records quickly from Datastore?
Datastore Objectify actually uses the App Engine Memcache service to cache your entity data globally when you use the @Cache annotation. However, as explained in the doc here, only get-by-key, save(), and delete() interact with the cache; query operations are not cached.
Regarding the App Engine Memcache method, you may be hitting the limit for the maximum size of a cached data value, which is 1 MiB, although I believe that would indeed raise an exception.
Regarding the query itself, you may be better off using a keys-only query and then doing a get on each returned key. That way, Memcache will be used for each record.
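A minimal sketch of that keys-only-plus-batch-get pattern with Objectify, assuming the Devices entity from the question and the standard ofy() helper:
// Keys-only query: cheap, but always served by the datastore.
List<Key<Devices>> keys = ofy().load().type(Devices.class).keys().list();

// Batch get by key: entities annotated with @Cache are served from Memcache
// when present (and populate it on a miss).
Map<Key<Devices>, Devices> devices = ofy().load().keys(keys);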

id() not working (not completely) in Objectify

One of my entities has the following declaration for its ID:
@Id
private String oInstID;

public String getInstID() { return oInstID; }

public void initID() {
    oInstID = OfyController.makeID(Partner.class, null);
}
Keep in mind that I have the same declaration for my other entities as well.
I have the following testing statements after the ofy().save():
Sticky persisted = OfyController.ofy().load().type(Sticky.class).first().now();
String id = persisted.getInstID();
Sticky queried = OfyController.ofy().load().type(Sticky.class).id(id).now();
Sticky queried2 = OfyController.ofy().load().entity(persisted).now();
The persisted returned the entity.
The id returned the ID of the entity.
The queried returned null... which is my problem.
The queried2 returned the same entity as persisted.
Any idea why queried returned null?
Thanks!
My prior experience with Objectify is tiny and very stale, but what you're describing is consistent with eventual consistency. There's a bit of useful info in Storing Data with Objectify and Datastore.

How to increase query speed in db4o?

OutOfMemoryError caused when db4o database has 15000+ objects
My question is in reference to my previous question (above). For the same PostedMessage model and same query.
With 100,000 PostedMessage objects, the query takes about 1243 ms to return first 20 PostedMessages.
Now, I have saved 1,000,000 PostedMessage objects in db4o. The same query took 342,132 ms, which is a non-linear increase.
How can I optimize the query speed?
FYR:
The timeSent and timeReceived fields are indexed.
I am using SNAPSHOT query mode.
I am not using TA/TP.
Do you sort the result? Unfortunately, db4o doesn't use the index for sorting/orderBy. That means it runs a regular sort algorithm, which is O(n*log(n)); it won't scale linearly.
Also, db4o doesn't support a TOP operator. That means that even without sorting, it takes quite a bit of time to copy the ids to the result set, even when you never read the entities afterwards.
So there's no really good solution for this, except trying to use some criteria which cut down the result size; see the sketch below.
Some adventurous people might use a different query evaluation, but I personally don't recommend that.
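For illustration, a hedged sketch of that idea: stop iterating as soon as you have enough results instead of materializing the whole set. The helper name firstN is made up; in SNAPSHOT mode the id set is still built up front, but this at least avoids activating entities you never use.
// Hypothetical helper: take only the first `limit` results of a db4o query.
public static List<PostedMessage> firstN(ObjectSet<PostedMessage> results, int limit) {
    List<PostedMessage> page = new ArrayList<PostedMessage>(limit);
    Iterator<PostedMessage> it = results.iterator();
    while (it.hasNext() && page.size() < limit) {
        page.add(it.next()); // stop early; remaining entities are never activated
    }
    return page;
}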
@Gamlor No, I am not sorting at all. The code is as follows:
public static ObjectSet<PostedMessage> getMessagesBetweenDates(
        Calendar after,
        Calendar before,
        ObjectContainer db) {
    if (after == null || before == null || db == null) {
        return null;
    }
    Query q = db.query(); // db is pre-configured to use SNAPSHOT mode.
    q.constrain(PostedMessage.class);
    Constraint from = q.descend("timeReceived").constrain(new Long(after.getTimeInMillis())).greater().equal();
    q.descend("timeReceived").constrain(new Long(before.getTimeInMillis())).smaller().equal().and(from);
    ObjectSet<PostedMessage> results = q.execute();
    return results;
}
The arguments to this method are as follows:
after = 13-09-2011 10:55:55
before = 13-09-2011 10:56:10
And I expect only 10 PostedMessages to be returned between "after" and "before". (I am generating dummy PostedMessages with timeReceived incremented by 1 sec each.)

GAE caching objectify queries

I have a simple question.
In the Objectify documentation, it says that "Only get(), put(), and delete() interact with the cache. query() is not cached":
http://code.google.com/p/objectify-appengine/wiki/IntroductionToObjectify#Global_Cache
What I'm wondering: if you have one root entity (I did not use @Parent due to all the scalability issues that it seems to have) that all the other entities have a Key to, and you do a query such as
ofy.query(ChildEntity.class).filter("rootEntity", rootEntity).list()
is this completely bypassing the cache?
If this is the case, is there an efficient way to cache a query on conditions? Or, for that matter, can you cache a query with a parent, where you would have to make an actual ancestor query like the following:
Key<Parent> rootKey = ObjectifyService.factory().getKey(root)
ofy.query(ChildEntity.class).ancestor(rootKey)
Thank you
In response to one of the comments below, I've added an edit.
Sample DAO (ignore the validate method; it just does some null & quantity checks). This is a sample findAll method inside a delegate, called from the DAO that the RequestFactory ServiceLocator is using:
public List<EquipmentCheckin> findAll(Subject subject, Objectify ofy, Event event) {
    final Business business = (Business) subject.getSession().getAttribute(BUSINESS_ATTRIBUTE);
    final List<EquipmentCheckin> checkins = ofy.query(EquipmentCheckin.class).filter(BUSINESS_ATTRIBUTE, business)
            .filter(EVENT_CONDITION, event).list();
    return validate(ofy, checkins);
}
Now, when this is executed, I find that the following method is actually being called in my AbstractDAO:
/**
 * @param id
 * @return
 */
public T find(Long id) {
    System.out.println("finding " + clazz.getSimpleName() + " id = " + id);
    return ObjectifyService.begin().find(clazz, id);
}
Yes, all queries bypass Objectify's integrated memcache and fetch results directly from the datastore. The datastore provides the (increasingly sophisticated) query engine that understands how to return results; determining cache invalidation for query results is pretty much impossible from the client side.
On the other hand, Objectify4 does offer a hybrid query cache whereby queries are automagically converted to a keys-only query followed by a batch get. The keys-only query still requires the datastore, but any entity instances are pulled from (and populate on miss) memcache. It might save you money.
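For illustration, here is roughly what that hybrid pattern looks like done by hand (a sketch assuming Objectify 4's ofy() API and the ChildEntity/rootKey names from the question):
// Keys-only ancestor query: still requires the datastore, but is cheap.
List<Key<ChildEntity>> keys = ofy().load().type(ChildEntity.class).ancestor(rootKey).keys().list();

// Batch get by key: @Cache-annotated entities are pulled from Memcache on a hit
// and populate it on a miss, so repeated queries get cheaper.
Map<Key<ChildEntity>, ChildEntity> children = ofy().load().keys(keys);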

Autoincrement ID in App Engine datastore

I'm using the App Engine datastore and would like to make sure that row IDs behave similarly to "auto-increment" fields in a MySQL DB.
I've tried several generation strategies, but can't seem to take control over what happens:
the IDs are not consecutive; there seem to be several "streams" growing in parallel.
the IDs get "recycled" after old rows are deleted.
Is such a thing at all possible?
I really would like to refrain from keeping (indexed) timestamps for each row.
It sounds like you can't rely on IDs being sequential without a fair amount of extra work. However, there is an easy way to achieve what you are trying to do:
We'd like to delete old items (older than two months' worth, for example)
Here is a model that automatically keeps track of both its creation and its modification times. Simply using the auto_now_add and auto_now parameters makes this trivial.
from google.appengine.ext import db

class Document(db.Model):
    user = db.UserProperty(required=True)
    title = db.StringProperty(default="Untitled")
    content = db.TextProperty(default=DEFAULT_DOC)  # DEFAULT_DOC is defined elsewhere
    created = db.DateTimeProperty(auto_now_add=True)
    modified = db.DateTimeProperty(auto_now=True)
Then you can use cron jobs or the task queue to schedule your maintenance task of deleting old stuff. Finding the oldest stuff is as easy as sorting by created or modified date:
db.Query(Document).order("modified")
# or
db.Query(Document).order("created")
What I know is that auto-generated IDs as Long integers are available in Google App Engine, but there's no guarantee that the values are increasing, and there's also no guarantee that the numbers are real one-increments.
So, if you need timestamping and increments, add a DateTime field with milliseconds, but then you don't know that the numbers are unique.
So, the best thing to do (what we are using) is the following (sorry, but this is indeed IMHO the best option):
use an auto-generated ID as a Long (we use Objectify in Java)
use a timestamp on each entity and an index to query the entities (use a descending index) to get the top X; see the sketch below
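A minimal sketch of that setup, assuming Objectify 4-style annotations; the entity and field names here are illustrative, not from the question:
// Illustrative entity: the Long ID is auto-generated, the timestamp is indexed.
@Entity
public class Row {
    @Id Long id;                       // auto-generated; no ordering guarantee
    @Index Date created = new Date();  // indexed timestamp used for ordering instead
}

// Fetch the top X most recently created rows via the descending index:
public static List<Row> latest(int x) {
    return ofy().load().type(Row.class).order("-created").limit(x).list();
}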
I think this is probably a fairly good solution; however, be aware that I have not tested it in any way, shape or form. The syntax may even be incorrect!
The principle is to use memcache to generate a monotonic sequence, using the datastore to provide a fall-back if memcache fails.
from google.appengine.api import memcache
from google.appengine.ext import db

class IndexEndPoint(db.Model):
    # Datastore numeric ids must be > 0, so the sequence starts at 1.
    index = db.IntegerProperty(indexed=False, default=1)

def find_next_index(cls):
    """Finds the next free index for an entity type."""
    index_name = 'seqindex-%s' % cls.kind()

    def _from_ds():
        """A very naive way to find the next free key.
        We just take the last known end point and loop until it's free.
        """
        tmp_index = IndexEndPoint.get_or_insert(index_name).index
        index = None
        while index is None:
            key = db.Key.from_path(cls.kind(), tmp_index)
            if db.get(key) is None:
                index = tmp_index
            tmp_index += 1
        return index

    index = None
    while index is None:
        index = memcache.incr(index_name)
        if index is None:  # Our counter might have been evicted from memcache.
            index = _from_ds()
            if not memcache.add(index_name, index):  # If add fails, someone beat us to it.
                index = None
    # ToDo:
    # Use a named task to update IndexEndPoint so if the memcache index gets evicted
    # we don't have too many items to cycle over to find the end point again.
    return index

def make_new(cls):
    """Makes a new entity with an incrementing ID."""

    def txn(index):
        """Makes a new entity if the index is free.
        This should only fail if we had a memcache miss
        (does not have to be on this instance).
        """
        key = db.Key.from_path(cls.kind(), index)
        if db.get(key) is not None:
            return None
        entity = cls(key=key)
        entity.put()
        return entity

    result = None
    while result is None:
        index = find_next_index(cls)
        result = db.run_in_transaction(txn, index)
    return result
