Searching on arrays using Datastore Queries - google-app-engine

I'm interested in migrating from JDO queries to Datastore queries to make use of the AsyncDatastore API.
However, I'm unable to make the following query work in Datastore queries:
//JDO query (working correctly)
PersistenceManager pm = PMF.get().getPersistenceManager();
Query query = pm.newQuery("SELECT FROM "
+ Tasks.class.getName()
+ " WHERE archivado==false & arrayUsers=="
+ user.getId()
+ " & taskDate != null & taskDate > best_before_limit "
+ "PARAMETERS Date best_before_limit "
+ "import java.util.Date");
List <Tasks> results= (List<Tasks>) pm.newQuery(query).execute(new Date());
//Datastore query (returning zero entities)
AsyncDatastoreService datastore = DatastoreServiceFactory.getAsyncDatastoreService();
com.google.appengine.api.datastore.Query query = new com.google.appengine.api.datastore.Query("Tasks");
Filter userFilter = new FilterPredicate("arrayUsers", FilterOperator.EQUAL,user.getId());
Filter filterPendingTasks = new FilterPredicate("taskDate", FilterOperator.LESS_THAN_OR_EQUAL , new Date());
Filter completeFilter = CompositeFilterOperator.and(filterPendingTasks,userFilter);
query.setFilter(completeFilter);
List<Entity> results = datastore.prepare(query).asList(FetchOptions.Builder.withDefaults());
Apart from the fact that I have to build my Task objects out of the Entities resulting from the query, these should be the same.
The problem is that the query must look up if the passed user id (user.getId()) is present in the array (arrayUsers). JDO does this without any issues, but no joy with Datastore queries so far.
Any ideas about what is wrong with my code?

As was pointed out by the users commenting, you use different properties in your datastore query. If you have such a query and you don't have EXACTLY the index for this, it won't work. Without seeing what indexes you have, I say this query looks good to me, so either you don't have data that returns there (unlikely, since your JDO query does it), or you're missing a filter.
In general, in datastore when querying for one of the values to equal something specific, you indeed would use something like this :
new Query("Widget").setFilter(new FilterPredicate("x", FilterOperator.EQUAL, 1))
Since you're using an equality filter, you won't get funky results (as you can see in the docs (look for "Properties with multiple values can behave in surprising ways")).

Currently, there is no way to do this using Datastore Queries due to the lack of a "CONTAINS" operator or similar.
The alternative to keep using JDO (at least for this kind of queries).

Related

Not Able to retrive all data from datastore

In my datasore have one table EFlow and this table have 7000 entries but first 1000 entries have these fileds :
(ID/Name, appliedBy, approved, childEflowName, completed, completedApprovers, created_on, dueDate, eflowDispName, eflowName, isResubmitted, modified_on, nextApprover, parentEflowName, ruleEmailReceivers, ruleNames, upComingApprovers, workFlowName, workFlowVersion, approvalStateValues)
and remaining 6000 entries have these feilds:
(ID/Name, appliedBy, approvalStateValues, approved, childEflowName, completed, completedApprovers, created_on, draft, dueDate, dynamicApprovalStates, eflowApprovers, eflowDispName, eflowName, fieldValues, isResubmitted, modified_on, nextApprover, parentEflowName, ruleEmailReceivers, ruleNames, upComingApprovers, workFlowName, workFlowVersion)
I have added draft,dynamicApprovalStates,eflowApprovers and fieldValues this new field.
my problem is when I retrieve data from datastore then I got only first 1000 entries record.
How to retrieve all records?
My query is:
List<EFlow> lst = this.entityManager.createQuery("select from " + this.clazz.getName() + " i where i.completed = false and i.approved = false").getResultList();
First, it looks like you are using JPA. From our docs:
Warning: We think most developers will have a better experience using
the low-level Datastore API, or one of the open-source APIs developed
specifically for Datastore, such as Objectify. JPA was designed for
use with traditional relational databases, and so has no way to
explicitly represent some of the aspects of Datastore that make it
different from relational databases, such as entity groups and
ancestor queries. This can lead to subtle issues that are difficult to
understand and fix.
However, if you need to keep using JPA:
As the number of results can be large, you need to handle pagination with your query.
The best way to achieve this is with cursors.
import com.google.appengine.api.datastore.Cursor;
import com.google.appengine.datanucleus.query.JPACursorHelper;
...
Query query = this.entityManager.createQuery("select from " + this.clazz.getName() + " i where i.completed = false and i.approved = false")
Cursor cursor = Cursor.newBuilder().build();
do {
query.setHint(JPACursorHelper.CURSOR_HINT, cursor);
List<EFlow> lst = query.getResultList();
// ... Do stuff on lst here .. //
// Get the cursor so you can see if there are more results
cursor = JPACursorHelper.getCursor(lst);
} while (cursor != null)

DatastoreTimeoutException in app engine for basic query

We have started to get a lot of DatastoreTimeoutException lately for this basic query:
select id from Later where at < '2013-07-04' limit 500 (Some Pseudo SQL)
In code it looks like:
DatastoreService datastore = DatastoreServiceFactory.getDatastoreService();
Query query = new Query("Later").setKeysOnly();
Filter f = new FilterPredicate("at",FilterOperator.LESS_THAN, new Date());
query.setFilter(f);
query.addSort("at", SortDirection.DESCENDING);
PreparedQuery pq = datastore.prepare(query);
return pq.asList(FetchOptions.Builder.withLimit(500));
The table have aprox. 1.5 million entities. Is this normal datastore behavior?
UPDATE: If we remove the filter it is working better. Of course we need the filter so thats not an solution in the long run.
Change to using asIterator instead of asList.
From the docs:
When iterating through the results of a query using the
PreparedQuery.asIterable() and PreparedQuery.asIterator() methods, the
Datastore retrieves the results in batches. By default each batch
contains 20 results, but you can change this value using
FetchOptions.chunkSize(). You can continue iterating through query
results until all are returned or the request times out.

In AppEngine (JDO), what is the difference between equality (==) of an item with list and contains() function?

For example, if I have: List A; and a String B;
What is the difference, in JDO (AppEngine), between the following two conditions in a query: B == A; and A.contains(B);?
Also, does the query in Slides 23-25 of http://dl.google.com/io/2009/pres/W_0415_Building_Scalable_Complex_App_Engines.pdf work efficiently in AppEngine (JDO) for more than 30 receivers? How so, especially since I read in AppEngine documentation that each contains() query can have a maximum of 30 items in the list. Do I not use a contains() query to imitate the above slides (written in Python)? If not, then how can I achieve the same results in JDO?
Any suggestions/comments are highly welcome. I'm trying to build a messaging system in AppEngine but having trouble trying to get used to the platform.
Thanks.
There's no difference - in App Engine, equality checks on lists are the same as checking for containment, due to the way things are indexed in the datastore.
By the query on slides 23-25, I presume you mean this one?
indexes = db.GqlQuery(
"SELECT __key__ FROM MessageIndex "
"WHERE receivers = :1", me)
keys = [k.parent() for k in indexes]
messages = db.get(keys)
This works just fine, as it's a list containment check as described above, and results in a single datastore query. The limitation you're thinking about is on the reverse: if you have a list, and you want to find a record that contains any item in that list, a subquery will be created for each element in the list.

GQL query with "like" operator [duplicate]

Simple one really. In SQL, if I want to search a text field for a couple of characters, I can do:
SELECT blah FROM blah WHERE blah LIKE '%text%'
The documentation for App Engine makes no mention of how to achieve this, but surely it's a common enough problem?
BigTable, which is the database back end for App Engine, will scale to millions of records. Due to this, App Engine will not allow you to do any query that will result in a table scan, as performance would be dreadful for a well populated table.
In other words, every query must use an index. This is why you can only do =, > and < queries. (In fact you can also do != but the API does this using a a combination of > and < queries.) This is also why the development environment monitors all the queries you do and automatically adds any missing indexes to your index.yaml file.
There is no way to index for a LIKE query so it's simply not available.
Have a watch of this Google IO session for a much better and more detailed explanation of this.
i'm facing the same problem, but i found something on google app engine pages:
Tip: Query filters do not have an explicit way to match just part of a string value, but you can fake a prefix match using inequality filters:
db.GqlQuery("SELECT * FROM MyModel WHERE prop >= :1 AND prop < :2",
"abc",
u"abc" + u"\ufffd")
This matches every MyModel entity with a string property prop that begins with the characters abc. The unicode string u"\ufffd" represents the largest possible Unicode character. When the property values are sorted in an index, the values that fall in this range are all of the values that begin with the given prefix.
http://code.google.com/appengine/docs/python/datastore/queriesandindexes.html
maybe this could do the trick ;)
Altough App Engine does not support LIKE queries, have a look at the properties ListProperty and StringListProperty. When an equality test is done on these properties, the test will actually be applied on all list members, e.g., list_property = value tests if the value appears anywhere in the list.
Sometimes this feature might be used as a workaround to the lack of LIKE queries. For instance, it makes it possible to do simple text search, as described on this post.
You need to use search service to perform full text search queries similar to SQL LIKE.
Gaelyk provides domain specific language to perform more user friendly search queries. For example following snippet will find first ten books sorted from the latest ones with title containing fern
and the genre exactly matching thriller:
def documents = search.search {
select all from books
sort desc by published, SearchApiLimits.MINIMUM_DATE_VALUE
where title =~ 'fern'
and genre = 'thriller'
limit 10
}
Like is written as Groovy's match operator =~.
It supports functions such as distance(geopoint(lat, lon), location) as well.
App engine launched a general-purpose full text search service in version 1.7.0 that supports the datastore.
Details in the announcement.
More information on how to use this: https://cloud.google.com/appengine/training/fts_intro/lesson2
Have a look at Objectify here , it is like a Datastore access API. There is a FAQ with this question specifically, here is the answer
How do I do a like query (LIKE "foo%")
You can do something like a startWith, or endWith if you reverse the order when stored and searched. You do a range query with the starting value you want, and a value just above the one you want.
String start = "foo";
... = ofy.query(MyEntity.class).filter("field >=", start).filter("field <", start + "\uFFFD");
Just follow here:
init.py#354">http://code.google.com/p/googleappengine/source/browse/trunk/python/google/appengine/ext/search/init.py#354
It works!
class Article(search.SearchableModel):
text = db.TextProperty()
...
article = Article(text=...)
article.save()
To search the full text index, use the SearchableModel.all() method to get an
instance of SearchableModel.Query, which subclasses db.Query. Use its search()
method to provide a search query, in addition to any other filters or sort
orders, e.g.:
query = article.all().search('a search query').filter(...).order(...)
I tested this with GAE Datastore low-level Java API. Me and works perfectly
Query q = new Query(Directorio.class.getSimpleName());
Filter filterNombreGreater = new FilterPredicate("nombre", FilterOperator.GREATER_THAN_OR_EQUAL, query);
Filter filterNombreLess = new FilterPredicate("nombre", FilterOperator.LESS_THAN, query+"\uFFFD");
Filter filterNombre = CompositeFilterOperator.and(filterNombreGreater, filterNombreLess);
q.setFilter(filter);
In general, even though this is an old post, a way to produce a 'LIKE' or 'ILIKE' is to gather all results from a '>=' query, then loop results in python (or Java) for elements containing what you're looking for.
Let's say you want to filter users given a q='luigi'
users = []
qry = self.user_model.query(ndb.OR(self.user_model.name >= q.lower(),self.user_model.email >= q.lower(),self.user_model.username >= q.lower()))
for _qry in qry:
if q.lower() in _qry.name.lower() or q.lower() in _qry.email.lower() or q.lower() in _qry.username.lower():
users.append(_qry)
It is not possible to do a LIKE search on datastore app engine, how ever creating an Arraylist would do the trick if you need to search a word in a string.
#Index
public ArrayList<String> searchName;
and then to search in the index using objectify.
List<Profiles> list1 = ofy().load().type(Profiles.class).filter("searchName =",search).list();
and this will give you a list with all the items that contain the world you did on the search
If the LIKE '%text%' always compares to a word or a few (think permutations) and your data changes slowly (slowly means that it's not prohibitively expensive - both price-wise and performance-wise - to create and updates indexes) then Relation Index Entity (RIE) may be the answer.
Yes, you will have to build additional datastore entity and populate it appropriately. Yes, there are some constraints that you will have to play around (one is 5000 limit on the length of list property in GAE datastore). But the resulting searches are lightning fast.
For details see my RIE with Java and Ojbectify and RIE with Python posts.
"Like" is often uses as a poor-man's substitute for text search. For text search, it is possible to use Whoosh-AppEngine.

Google App Engine: Is it possible to do a Gql LIKE query?

Simple one really. In SQL, if I want to search a text field for a couple of characters, I can do:
SELECT blah FROM blah WHERE blah LIKE '%text%'
The documentation for App Engine makes no mention of how to achieve this, but surely it's a common enough problem?
BigTable, which is the database back end for App Engine, will scale to millions of records. Due to this, App Engine will not allow you to do any query that will result in a table scan, as performance would be dreadful for a well populated table.
In other words, every query must use an index. This is why you can only do =, > and < queries. (In fact you can also do != but the API does this using a a combination of > and < queries.) This is also why the development environment monitors all the queries you do and automatically adds any missing indexes to your index.yaml file.
There is no way to index for a LIKE query so it's simply not available.
Have a watch of this Google IO session for a much better and more detailed explanation of this.
i'm facing the same problem, but i found something on google app engine pages:
Tip: Query filters do not have an explicit way to match just part of a string value, but you can fake a prefix match using inequality filters:
db.GqlQuery("SELECT * FROM MyModel WHERE prop >= :1 AND prop < :2",
"abc",
u"abc" + u"\ufffd")
This matches every MyModel entity with a string property prop that begins with the characters abc. The unicode string u"\ufffd" represents the largest possible Unicode character. When the property values are sorted in an index, the values that fall in this range are all of the values that begin with the given prefix.
http://code.google.com/appengine/docs/python/datastore/queriesandindexes.html
maybe this could do the trick ;)
Altough App Engine does not support LIKE queries, have a look at the properties ListProperty and StringListProperty. When an equality test is done on these properties, the test will actually be applied on all list members, e.g., list_property = value tests if the value appears anywhere in the list.
Sometimes this feature might be used as a workaround to the lack of LIKE queries. For instance, it makes it possible to do simple text search, as described on this post.
You need to use search service to perform full text search queries similar to SQL LIKE.
Gaelyk provides domain specific language to perform more user friendly search queries. For example following snippet will find first ten books sorted from the latest ones with title containing fern
and the genre exactly matching thriller:
def documents = search.search {
select all from books
sort desc by published, SearchApiLimits.MINIMUM_DATE_VALUE
where title =~ 'fern'
and genre = 'thriller'
limit 10
}
Like is written as Groovy's match operator =~.
It supports functions such as distance(geopoint(lat, lon), location) as well.
App engine launched a general-purpose full text search service in version 1.7.0 that supports the datastore.
Details in the announcement.
More information on how to use this: https://cloud.google.com/appengine/training/fts_intro/lesson2
Have a look at Objectify here , it is like a Datastore access API. There is a FAQ with this question specifically, here is the answer
How do I do a like query (LIKE "foo%")
You can do something like a startWith, or endWith if you reverse the order when stored and searched. You do a range query with the starting value you want, and a value just above the one you want.
String start = "foo";
... = ofy.query(MyEntity.class).filter("field >=", start).filter("field <", start + "\uFFFD");
Just follow here:
init.py#354">http://code.google.com/p/googleappengine/source/browse/trunk/python/google/appengine/ext/search/init.py#354
It works!
class Article(search.SearchableModel):
text = db.TextProperty()
...
article = Article(text=...)
article.save()
To search the full text index, use the SearchableModel.all() method to get an
instance of SearchableModel.Query, which subclasses db.Query. Use its search()
method to provide a search query, in addition to any other filters or sort
orders, e.g.:
query = article.all().search('a search query').filter(...).order(...)
I tested this with GAE Datastore low-level Java API. Me and works perfectly
Query q = new Query(Directorio.class.getSimpleName());
Filter filterNombreGreater = new FilterPredicate("nombre", FilterOperator.GREATER_THAN_OR_EQUAL, query);
Filter filterNombreLess = new FilterPredicate("nombre", FilterOperator.LESS_THAN, query+"\uFFFD");
Filter filterNombre = CompositeFilterOperator.and(filterNombreGreater, filterNombreLess);
q.setFilter(filter);
In general, even though this is an old post, a way to produce a 'LIKE' or 'ILIKE' is to gather all results from a '>=' query, then loop results in python (or Java) for elements containing what you're looking for.
Let's say you want to filter users given a q='luigi'
users = []
qry = self.user_model.query(ndb.OR(self.user_model.name >= q.lower(),self.user_model.email >= q.lower(),self.user_model.username >= q.lower()))
for _qry in qry:
if q.lower() in _qry.name.lower() or q.lower() in _qry.email.lower() or q.lower() in _qry.username.lower():
users.append(_qry)
It is not possible to do a LIKE search on datastore app engine, how ever creating an Arraylist would do the trick if you need to search a word in a string.
#Index
public ArrayList<String> searchName;
and then to search in the index using objectify.
List<Profiles> list1 = ofy().load().type(Profiles.class).filter("searchName =",search).list();
and this will give you a list with all the items that contain the world you did on the search
If the LIKE '%text%' always compares to a word or a few (think permutations) and your data changes slowly (slowly means that it's not prohibitively expensive - both price-wise and performance-wise - to create and updates indexes) then Relation Index Entity (RIE) may be the answer.
Yes, you will have to build additional datastore entity and populate it appropriately. Yes, there are some constraints that you will have to play around (one is 5000 limit on the length of list property in GAE datastore). But the resulting searches are lightning fast.
For details see my RIE with Java and Ojbectify and RIE with Python posts.
"Like" is often uses as a poor-man's substitute for text search. For text search, it is possible to use Whoosh-AppEngine.

Resources