How to search entities effectively in Google Datastore

I am using the Datastore low-level API for an application (an internal job portal).
I have an entity called Candidate with the following properties:
Entity candidate = new Entity("Candidate", getEmpNum());
candidate.setProperty("name", "some name");
candidate.setProperty("skill", "c++,Java,C#,GAE");
candidate.setProperty("empNum", getEmpNum());
The 'skill' property holds a comma-separated string. If an HR user searches for candidates with a key such as
"GAE,Java", how do I search for matching entities effectively?
Will the following query work?
q.addFilter(searchBy, FilterOperator.GREATER_THAN_OR_EQUAL, searchFor);
q.addFilter(searchBy, Query.FilterOperator.LESS_THAN, searchFor+"Z");
If the above query works on, say, 1,000 entities, will it have the same latency when there are 50,000+ entities in the datastore?
Please advise; I am new to GAE and Datastore.
-thanks
Ma

You should use a list property.
List<String> skills = Arrays.asList("c++", "Java", "C#", "GAE");
candidate.setProperty("skill", skills);
Then query with the IN operator:
List<String> findSkills = Arrays.asList("Java","GAE");
q.addFilter("skill", FilterOperator.IN, findSkills);
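One point worth checking before relying on IN: it matches entities that have any of the listed values, whereas requiring every skill takes one equality filter per value (on a list property, different elements may satisfy different equality filters). The following is a minimal in-memory sketch of that difference in Python; it does not call Datastore, and the entity names e1 to e4 are made up for illustration.

```python
# In-memory model of list-property matching; no Datastore client is involved.
# The entity keys and skill lists below are hypothetical sample data.
candidates = {
    "e1": ["c++", "Java", "C#", "GAE"],
    "e2": ["Java", "Python"],
    "e3": ["GAE"],
    "e4": ["Ruby"],
}


def match_any(entities, wanted):
    """IN semantics: the entity has at least one of the wanted values."""
    return sorted(k for k, skills in entities.items() if set(skills) & set(wanted))


def match_all(entities, wanted):
    """One equality filter per value: the entity must have every wanted value."""
    return sorted(k for k, skills in entities.items() if set(wanted) <= set(skills))


any_hits = match_any(candidates, ["Java", "GAE"])
all_hits = match_all(candidates, ["Java", "GAE"])
print(any_hits)
print(all_hits)
```

If HR expects "GAE,Java" to mean candidates with both skills, the second form is the one to model the query on.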

Related

How to get the count of property in a Kind?

I have a kind called Student that stores each student's favorite color. Students pick their favorite color from a set of three colors: {Red, Blue, Green}.
Assume there are 100 students; my code looks like this for every student:
Entity arya = new Entity("Student","Arya");
arya.setProperty("Color","Red");
Entity robb = new Entity("Student","Robb");
robb.setProperty("Color","Green");
..
..
Entity jon = new Entity("Student","Jon");
jon.setProperty("Color","Blue");
How can I find out how many students chose a particular color (say Red) in this Student kind? What query should I write to fetch the count?
Thanks in advance
The number you seek would be the number of items in the result of a query with an equality filter on the Color property.
You could use a keys-only query (a special kind of projection query) for this purpose, which is faster and less expensive:
Keys-only queries
A keys-only query (which is a type of projection query) returns just
the keys of the result entities instead of the entities themselves, at
lower latency and cost than retrieving entire entities.
...
A keys-only query is a small operation and counts as only a single
entity read for the query itself.
Something along these lines (note that I'm not a Java user; the snippet is based only on the documentation examples):
Query<Key> query = Query.newKeyQueryBuilder()
    .setKind("Student")
    .setFilter(PropertyFilter.eq("Color", "Red"))
    .build();
I agree with Dan Cornilescu's answer. Here is a direct Datastore API usage. I have prepared the request body for your use case; you can run it by just adding your project ID. It returns the entities that match the filter, and you can then count them.
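The request body itself is not included in the text above. As a rough sketch only, and not the answerer's original, a keys-only body for the Datastore REST `runQuery` method could be assembled as below. The helper names `keys_only_request` and `count_results` are hypothetical, and a single response batch may not hold all results, so a real count would need to follow the paging information in the response.

```python
import json


def keys_only_request(kind, prop, value):
    """Build a runQuery body: equality filter plus a __key__ projection (keys-only)."""
    return {
        "query": {
            "kind": [{"name": kind}],
            "projection": [{"property": {"name": "__key__"}}],
            "filter": {
                "propertyFilter": {
                    "property": {"name": prop},
                    "op": "EQUAL",
                    "value": {"stringValue": value},
                }
            },
        }
    }


def count_results(response):
    """Count the entity results in one response batch (paging is not handled here)."""
    return len(response.get("batch", {}).get("entityResults", []))


body = keys_only_request("Student", "Color", "Red")
print(json.dumps(body, indent=2))
```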

Cloud Datastore 'like' query

I've an entity in Google Cloud Datastore. One of the properties is array of strings. For example:
property: skills
Entity 1:
value: ["mysql","sqlserver","postgresql","sqllite","sql-server-2008","sql"]
Entity 2:
value: ["css","css3"]
Now I need to query for entities whose array contains elements matching css*.
In typical SQL, this would be select * from kindName where skills like 'css%'.
I tried select * from kindName where skills = 'css', which works fine, but how can I get entities that have css* elements, as with the SQL query?
Or
What's the best way to model the data for this?
You can do inequality range checks on a single indexed property as given in the example below. Range checks on strings are essentially how you can perform prefix searching on strings.
SELECT * from yourKind WHERE skills >= "css" AND skills < "cst"
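To make the "css" to "cst" step concrete, here is a minimal Python sketch that derives the upper bound by incrementing the last character of the prefix and applies the range to the two sample entities from the question. It runs purely in memory; `prefix_bounds` and `query_prefix` are hypothetical helper names, and Python compares strings by code point, which agrees with Datastore's ordering for these ASCII values.

```python
def prefix_bounds(prefix):
    """Return (low, high) such that low <= s < high for strings starting with prefix."""
    # Assumes a non-empty prefix whose last character is not the largest code point.
    return prefix, prefix[:-1] + chr(ord(prefix[-1]) + 1)


# Sample data copied from the question.
entities = {
    "Entity 1": ["mysql", "sqlserver", "postgresql", "sqllite", "sql-server-2008", "sql"],
    "Entity 2": ["css", "css3"],
}


def query_prefix(entities, prefix):
    low, high = prefix_bounds(prefix)
    # A list-valued entity matches when a single element satisfies both bounds.
    return sorted(
        name for name, values in entities.items()
        if any(low <= v < high for v in values)
    )


print(prefix_bounds("css"))
print(query_prefix(entities, "css"))
print(query_prefix(entities, "sql"))
```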
As an example, here is the query performed on some sample data I created in the UI Console for Cloud Datastore:

GQL Query Not Returning Results on StringProperty Query Test for Equality

class MyEntity(db.Model):
    timestamp = db.DateTimeProperty()
    title = db.StringProperty()
    number = db.FloatProperty()
db.GqlQuery("SELECT * FROM MyEntity WHERE title = 'mystring' AND timestamp >= date('2012-01-01') AND timestamp <= date('2012-12-31') ORDER BY timestamp DESC").fetch(1000)
This should fetch ~600 entities on App Engine. On my dev server it behaves as expected and builds the index.yaml. I uploaded it and tested on the server, but on App Engine the query does not return anything.
Index:
- kind: MyEntity
  properties:
  - name: title
  - name: timestamp
    direction: desc
I tried splitting the query up in the Datastore Viewer to see where the issue is, and the timestamp constraints work as expected. The query returns nothing for WHERE title = 'mystring' when it should return a bunch of entities.
I vaguely remember fussy filtering where you had to call .filter("prop =",propValue) with the space between property and operator, but this is a GqlQuery so it's not that (and I tried that format with the GQL too).
Anyone know what my issue is?
One thing I can think of:
I added the list of MyEntity entities into the app via BulkLoader.py prior to the new index being created on my devserver & uploaded. Would that make a difference?
The last line you wrote is probably the problem.
Your entities in the actual real datastore are missing the index required for the query.
As far as I know, when you add a new index, App Engine is supposed to rebuild your indexes for you. This may take some time. You can check your admin page to check the state of your indexes and see if it's still building.
Turns out there's a slight bug in the bulkloader supplied with App Engine SDK - basically autogenerated config transforms strings as db.Text, which is no good if you want these fields indexed. The correct import_transform directive should be:
transform.none_if_empty(str)
This will instruct App Engine to index the uploaded field as a db.StringProperty().

In AppEngine (JDO), what is the difference between equality (==) of an item with list and contains() function?

For example, if I have: List A; and a String B;
What is the difference, in JDO (AppEngine), between the following two conditions in a query: B == A; and A.contains(B);?
Also, does the query in Slides 23-25 of http://dl.google.com/io/2009/pres/W_0415_Building_Scalable_Complex_App_Engines.pdf work efficiently in AppEngine (JDO) for more than 30 receivers? How so, especially since I read in AppEngine documentation that each contains() query can have a maximum of 30 items in the list. Do I not use a contains() query to imitate the above slides (written in Python)? If not, then how can I achieve the same results in JDO?
Any suggestions/comments are highly welcome. I'm trying to build a messaging system in AppEngine but having trouble trying to get used to the platform.
Thanks.
There's no difference - in App Engine, equality checks on lists are the same as checking for containment, due to the way things are indexed in the datastore.
By the query on slides 23-25, I presume you mean this one?
indexes = db.GqlQuery(
    "SELECT __key__ FROM MessageIndex "
    "WHERE receivers = :1", me)
keys = [k.parent() for k in indexes]
messages = db.get(keys)
This works just fine, as it's a list containment check as described above, and results in a single datastore query. The limitation you're thinking about is on the reverse: if you have a list, and you want to find a record that contains any item in that list, a subquery will be created for each element in the list.
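The asymmetry described above can be pictured with a toy index that holds one entry per list element. This is only an in-memory Python sketch under that assumption, not how the datastore is implemented; the keys and receiver names are invented for illustration.

```python
from collections import defaultdict


def build_index(entities, prop):
    """Map each individual list element to the set of entity keys containing it."""
    index = defaultdict(set)
    for key, props in entities.items():
        for value in props[prop]:
            index[value].add(key)
    return index


# Hypothetical MessageIndex entities, each with a list of receivers.
message_indexes = {
    "msg1/index": {"receivers": ["alice", "bob"]},
    "msg2/index": {"receivers": ["bob"]},
    "msg3/index": {"receivers": ["carol", "alice"]},
}
index = build_index(message_indexes, "receivers")


def receivers_equals(index, me):
    """receivers = me: a single lookup, however long each receivers list is."""
    return sorted(index.get(me, ()))


def receivers_in(index, people):
    """The reverse case: one lookup (subquery) per element of the supplied list."""
    hits = set()
    for person in people:
        hits |= index.get(person, set())
    return sorted(hits)


print(receivers_equals(index, "alice"))
print(receivers_in(index, ["bob", "carol"]))
```

The first function does the same amount of work whether a message has 3 receivers or 3,000; only the second grows with the size of the list passed to the query.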

Google App Engine: Is it possible to do a Gql LIKE query?

Simple one really. In SQL, if I want to search a text field for a couple of characters, I can do:
SELECT blah FROM blah WHERE blah LIKE '%text%'
The documentation for App Engine makes no mention of how to achieve this, but surely it's a common enough problem?
BigTable, which is the database back end for App Engine, will scale to millions of records. Due to this, App Engine will not allow you to do any query that will result in a table scan, as performance would be dreadful for a well populated table.
In other words, every query must use an index. This is why you can only do =, > and < queries. (In fact you can also do != but the API does this using a combination of > and < queries.) This is also why the development environment monitors all the queries you do and automatically adds any missing indexes to your index.yaml file.
There is no way to index for a LIKE query so it's simply not available.
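As an illustration of why =, > and < fit an index while LIKE '%text%' does not, the following Python sketch treats an index as a sorted list and answers each query as a contiguous slice found by binary search, with != built from a < scan plus a > scan as described above. It is a simplified model with made-up values, not the datastore's actual implementation.

```python
import bisect

# A toy "index": the property values kept in sorted order.
index = sorted(["apple", "banana", "banana", "cherry", "date"])


def less_than(index, value):
    """All entries strictly below value: one contiguous slice."""
    return index[:bisect.bisect_left(index, value)]


def greater_than(index, value):
    """All entries strictly above value: one contiguous slice."""
    return index[bisect.bisect_right(index, value):]


def not_equal(index, value):
    """!= expressed as the combination of a < scan and a > scan."""
    return less_than(index, value) + greater_than(index, value)


print(not_equal(index, "banana"))
```

A substring match has no such contiguous slice: entries containing "an" are scattered through the sorted order, so answering it would mean scanning every entry.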
Have a watch of this Google IO session for a much better and more detailed explanation of this.
I'm facing the same problem, but I found something in the Google App Engine documentation:
Tip: Query filters do not have an explicit way to match just part of a string value, but you can fake a prefix match using inequality filters:
db.GqlQuery("SELECT * FROM MyModel WHERE prop >= :1 AND prop < :2",
"abc",
u"abc" + u"\ufffd")
This matches every MyModel entity with a string property prop that begins with the characters abc. The unicode string u"\ufffd" represents the largest possible Unicode character. When the property values are sorted in an index, the values that fall in this range are all of the values that begin with the given prefix.
http://code.google.com/appengine/docs/python/datastore/queriesandindexes.html
maybe this could do the trick ;)
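The tip's point about sorted index order can be seen in a small Python sketch: in a sorted list, everything that begins with the prefix sits in one contiguous run between the prefix itself and the prefix followed by "\ufffd". The sample values are invented, and `prefix_scan` is a hypothetical helper, not an App Engine API.

```python
import bisect


def prefix_scan(sorted_values, prefix):
    """Return the contiguous run of values v with prefix <= v < prefix + U+FFFD."""
    lo = bisect.bisect_left(sorted_values, prefix)
    hi = bisect.bisect_left(sorted_values, prefix + "\ufffd")
    return sorted_values[lo:hi]


# Hypothetical values of the prop property, in index (sorted) order.
props = sorted(["ab", "abc", "abcd", "abc123", "abd", "xabc"])
matches = prefix_scan(props, "abc")
print(matches)
```

Note that "xabc" is not returned: the range only ever finds a prefix, never text in the middle or at the end of a value.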
Although App Engine does not support LIKE queries, have a look at the properties ListProperty and StringListProperty. When an equality test is done on these properties, the test will actually be applied on all list members, e.g., list_property = value tests if the value appears anywhere in the list.
Sometimes this feature might be used as a workaround to the lack of LIKE queries. For instance, it makes it possible to do simple text search, as described on this post.
You need to use search service to perform full text search queries similar to SQL LIKE.
Gaelyk provides a domain-specific language for more user-friendly search queries. For example, the following snippet finds the first ten books, sorted from the latest, with a title containing fern and a genre exactly matching thriller:
def documents = search.search {
    select all from books
    sort desc by published, SearchApiLimits.MINIMUM_DATE_VALUE
    where title =~ 'fern'
    and genre = 'thriller'
    limit 10
}
Like is written as Groovy's match operator =~.
It supports functions such as distance(geopoint(lat, lon), location) as well.
App engine launched a general-purpose full text search service in version 1.7.0 that supports the datastore.
Details in the announcement.
More information on how to use this: https://cloud.google.com/appengine/training/fts_intro/lesson2
Have a look at Objectify, a Datastore access API. Its FAQ addresses this question specifically; here is the answer:
How do I do a like query (LIKE "foo%")
You can do something like a startWith, or endWith if you reverse the order when stored and searched. You do a range query with the starting value you want, and a value just above the one you want.
String start = "foo";
... = ofy.query(MyEntity.class).filter("field >=", start).filter("field <", start + "\uFFFD");
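The "reverse the order when stored and searched" idea for an ends-with match can be sketched as follows in Python. This is an in-memory illustration under the assumption that a second, reversed copy of the field is stored next to the original; the field name `field_reversed` and the sample values are hypothetical.

```python
def store(value):
    """Keep a reversed copy of the field so that suffixes become prefixes."""
    return {"field": value, "field_reversed": value[::-1]}


rows = [store(v) for v in ["foobar", "bar", "barfoo", "food"]]


def ends_with(rows, suffix):
    # Reverse the search term too, then run the usual prefix range on the reversed copy.
    start = suffix[::-1]
    end = start + "\ufffd"
    return sorted(r["field"] for r in rows if start <= r["field_reversed"] < end)


print(ends_with(rows, "foo"))
print(ends_with(rows, "bar"))
```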
Just follow this link:
http://code.google.com/p/googleappengine/source/browse/trunk/python/google/appengine/ext/search/init.py#354
It works!
class Article(search.SearchableModel):
    text = db.TextProperty()
    ...
article = Article(text=...)
article.save()
To search the full text index, use the SearchableModel.all() method to get an
instance of SearchableModel.Query, which subclasses db.Query. Use its search()
method to provide a search query, in addition to any other filters or sort
orders, e.g.:
query = article.all().search('a search query').filter(...).order(...)
I tested this with the GAE Datastore low-level Java API, and it works perfectly for me:
Query q = new Query(Directorio.class.getSimpleName());
Filter filterNombreGreater = new FilterPredicate("nombre", FilterOperator.GREATER_THAN_OR_EQUAL, query);
Filter filterNombreLess = new FilterPredicate("nombre", FilterOperator.LESS_THAN, query+"\uFFFD");
Filter filterNombre = CompositeFilterOperator.and(filterNombreGreater, filterNombreLess);
q.setFilter(filterNombre);
In general, even though this is an old post, a way to produce a 'LIKE' or 'ILIKE' is to gather all results from a '>=' query, then loop results in python (or Java) for elements containing what you're looking for.
Let's say you want to filter users given a q='luigi'
users = []
qry = self.user_model.query(ndb.OR(self.user_model.name >= q.lower(),self.user_model.email >= q.lower(),self.user_model.username >= q.lower()))
for _qry in qry:
    if q.lower() in _qry.name.lower() or q.lower() in _qry.email.lower() or q.lower() in _qry.username.lower():
        users.append(_qry)
It is not possible to do a LIKE search on the App Engine datastore. However, storing the words in an ArrayList does the trick if you need to search for a word in a string.
@Index
public ArrayList<String> searchName;
Then search the index using Objectify:
List<Profiles> list1 = ofy().load().type(Profiles.class).filter("searchName =", search).list();
This gives you a list of all the items that contain the word you searched for.
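How the word list gets populated is left open above. One possible approach, shown here as an in-memory Python sketch and not as Objectify code, is to lowercase the name and split it into words before saving, then lowercase the search term the same way. The helper names and the sample profiles are hypothetical.

```python
import re


def search_terms(name):
    """Split a display name into unique lowercase words for the indexed list."""
    return sorted(set(re.findall(r"\w+", name.lower())))


# Hypothetical profiles: key -> contents of the searchName list property.
profiles = {
    "p1": search_terms("Luigi Rossi"),
    "p2": search_terms("Mario ROSSI"),
    "p3": search_terms("Luigia Bianchi"),
}


def find(profiles, word):
    """Equality on a list property: the whole word must be one of the list elements."""
    return sorted(k for k, terms in profiles.items() if word.lower() in terms)


print(search_terms("Luigi Rossi"))
print(find(profiles, "Rossi"))
print(find(profiles, "luigi"))
```

This matches whole words only: searching for "luigi" does not return the profile whose name contains "Luigia", which is the main difference from a true LIKE '%text%'.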
If the LIKE '%text%' always compares to a word or a few (think permutations) and your data changes slowly (slowly means that it's not prohibitively expensive, both price-wise and performance-wise, to create and update indexes), then a Relation Index Entity (RIE) may be the answer.
Yes, you will have to build an additional datastore entity and populate it appropriately. Yes, there are some constraints that you will have to work around (one is the 5000 limit on the length of a list property in the GAE datastore). But the resulting searches are lightning fast.
For details see my RIE with Java and Objectify and RIE with Python posts.
"Like" is often used as a poor man's substitute for text search. For text search, it is possible to use Whoosh-AppEngine.

Resources