How to select out of a one to many property - google-app-engine

I have an App Engine app which has been running for about a year now. I have mainly been using JDO queries until now, but I am trying to collect stats and the queries are taking too long. I have the following entity (Device):
public class Device implements Serializable {
    ...
    @Persistent
    private Set<Key> feeds; // Keys of the Feed entities
    ...
}
So I want to get a count of how many Devices have a certain Feed. I was doing it in JDOQL before as such (uses javax.jdo.Query):
Query query = pm.newQuery("select from Device where feeds.contains(:feedKey)");
Map<String, Object> paramsf = new HashMap<String, Object>();
paramsf.put("feedKey",feed.getId());
List<Device> results = (List<Device>) query.executeWithMap(paramsf);
This code now times out, though. I tried the low-level Datastore API so that I could set the chunk size, use a cursor, and so on to speed the query up, but I am unsure how to search in a Set field. I tried this (uses com.google.appengine.api.datastore.Query):
Query query = new Query("Device");
query.addFilter("feeds", FilterOperator.IN, feed.getId());
query.setKeysOnly();
final FetchOptions fetchOptions = FetchOptions.Builder.withPrefetchSize(100).chunkSize(100);
QueryResultList<Entity> results = dss.prepare(query).asQueryResultList(fetchOptions);
Essentially, I am unsure how to search the one-to-many field (feeds) for a single key. Is it possible to index it somehow?
I hope that makes sense.

List properties (and other things that are stored as lists, like sets) are indexed per element: each value in the list gets its own index entry. As a result, you can simply use an equality filter in your query, the same as if you were filtering on a single value rather than a list. An entity is returned if any of the items in the list match.
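To make the per-element indexing concrete, here is a minimal in-memory sketch in plain Java. It does not use the Datastore API; the class and method names (ListPropertyIndexSketch, put, whereFeedsEquals) are hypothetical and exist only for illustration. It shows why an equality filter on a multi-valued property matches any entity whose list contains the value. In the low-level query from the question, this would presumably mean FilterOperator.EQUAL instead of FilterOperator.IN.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class ListPropertyIndexSketch {
    // Hypothetical stand-in for a property index: one entry per list element,
    // mapping the element value to the ids of the entities that contain it.
    private final Map<String, Set<String>> index = new HashMap<>();

    // "Saving" an entity writes one index entry for every value in its list.
    public void put(String deviceId, Set<String> feedKeys) {
        for (String feedKey : feedKeys) {
            index.computeIfAbsent(feedKey, k -> new TreeSet<>()).add(deviceId);
        }
    }

    // An equality filter is a single index lookup; it matches every entity
    // that has the value anywhere in its list.
    public Set<String> whereFeedsEquals(String feedKey) {
        return index.getOrDefault(feedKey, Collections.<String>emptySet());
    }

    public static void main(String[] args) {
        ListPropertyIndexSketch store = new ListPropertyIndexSketch();
        store.put("device-1", new HashSet<>(Arrays.asList("feed-a", "feed-b")));
        store.put("device-2", new HashSet<>(Arrays.asList("feed-b")));
        store.put("device-3", new HashSet<>(Arrays.asList("feed-c")));

        System.out.println(store.whereFeedsEquals("feed-b"));        // [device-1, device-2]
        System.out.println(store.whereFeedsEquals("feed-b").size()); // 2
    }
}
```

Because the lookup only needs ids, a keys-only query, as in the question, is enough when all you want is a count.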

Related

Data retrieval is very slow in MongoDB

I'm using MongoDB alongside Spring Boot.
I wonder whether there is any way to retrieve data through a DBRef collection in an object-oriented way.
For example, my Person class has @DBRef private UserAccountType userAccountType, and in userAccount we have an enum called Role. If a method receives a role and I want to find all persons with Role.Driver, I first have to find all UserAccountType documents whose role is driver and then find all persons that have one of those UserAccountType ids.
Something like this:
Query query = new Query();
query.addCriteria(Criteria.where("staticRole").in(StaticRole.DRIVER_BRONZE, StaticRole.DRIVER_SILVER));
query.fields().include("id");
List<UserAccountType> userAccountTypeList = mongoOperations.find(query, UserAccountType.class);
query = new Query();
Criteria criteria = new Criteria();
criteria.orOperator(
Criteria.where("phone").is(phone),
Criteria.where("phone2").is(phone)
);
query.addCriteria(criteria);
query.addCriteria(Criteria.where("userAccount.userAccountType").in(userAccountTypeList));
Person person = mongoOperations.findOne(query, Person.class);
All these lookups make my app very slow. We know our database design is bad, but we have no time to redo it. Is there a simpler way to do these operations?

Controlling NHibernate search query output regarding parameters

When you use NHibernate to "fetch" a mapped object, it outputs a SELECT query to the database. It outputs this using parameters; so if I query a list of cars based on tenant ID and name, I get:
select Name, Location from Car where tenantID=@p0 and Name=@p1
This has the nice benefit of our database creating (and caching) a query plan based on this query and the result, so when it is run again, the query is much faster as it can load the plan from the cache.
The problem with this is that we are a multi-tenant database, and almost all of our indexes are partition aligned. Our tenants have vastly different data sets; one tenant could have 5 cars, while another could have 50,000. And so because NHibernate does this, it has the net effect of our database creating and caching a plan for the FIRST tenant that runs it. This plan is likely not efficient for subsequent tenants who run the query.
What I WANT to do is force NHibernate NOT to parameterize certain parameters; namely, the tenant ID. So I'd want the query to read:
select Name, Location from Car where tenantID=55 and Name=@p0
I can't figure out how to do this in the HBM.XML mapping. How can I dictate to NHibernate how to use parameters? Or can I just turn parameters off altogether?
OK everyone, I figured it out.
The way I did it was overriding the SqlClientDriver with my own custom driver that looks like this:
public class CustomSqlClientDriver : SqlClientDriver
{
    // Matches ".TenantID=@p0" in the generated SQL and captures the parameter name.
    // The field name and pattern are made consistent with the uses below.
    private static Regex _tenantIDReplacer = new Regex(@"\.TenantID=(@p0)", RegexOptions.Compiled);

    public override void AdjustCommand(IDbCommand command)
    {
        var m = _tenantIDReplacer.Match(command.CommandText);
        if (!m.Success)
            return;
        // replace the first parameter with the actual tenant ID
        var parameterName = m.Groups[1].Value;
        // find the parameter value
        var tenantID = (IDbDataParameter) command.Parameters[parameterName];
        var valueOfTenantID = tenantID.Value;
        // now replace the placeholder with the literal value
        command.CommandText = _tenantIDReplacer.Replace(command.CommandText, ".TenantID=" + valueOfTenantID);
    }
}
I override the AdjustCommand method and use a Regex to replace the tenantID. This works; not sure if there's a better way, but I really didn't want to have to open up NHibernate and start messing with core code.
You'll have to register this custom driver in the connection.driver_class property of the SessionFactory upon initialization.
Hope this helps somebody!
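The same text-rewriting idea can be sketched outside NHibernate. The following is a self-contained Java illustration, not the driver code above: the class name, the inlineTenantId helper, and the parameter map are all hypothetical. It also adds one guard that the original does not have, as an assumption about what is safe: the value is only inlined when it is an integer, so that arbitrary strings never end up concatenated into the SQL text.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class InlineTenantIdSketch {
    // Matches ".TenantID=@pN" and captures the parameter name (for example "@p0").
    private static final Pattern TENANT_ID = Pattern.compile("\\.TenantID=(@p\\d+)");

    // Returns the SQL with the tenant ID parameter replaced by its literal value,
    // or the SQL unchanged when there is nothing safe to inline.
    static String inlineTenantId(String sql, Map<String, Object> params) {
        Matcher m = TENANT_ID.matcher(sql);
        if (!m.find()) {
            return sql;
        }
        Object value = params.get(m.group(1));
        // Added guard (not in the original driver): only inline integral values.
        if (!(value instanceof Integer) && !(value instanceof Long)) {
            return sql;
        }
        return m.replaceFirst(Matcher.quoteReplacement(".TenantID=" + value));
    }

    public static void main(String[] args) {
        Map<String, Object> params = new HashMap<>();
        params.put("@p0", 55);
        params.put("@p1", "Civic");

        String sql = "select c.Name from Car c where c.TenantID=@p0 and c.Name=@p1";
        System.out.println(inlineTenantId(sql, params));
        // select c.Name from Car c where c.TenantID=55 and c.Name=@p1
    }
}
```

The trade-off is the one described in the question: each tenant now produces a distinct statement text, so the database caches one plan per tenant instead of sharing a single plan.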

GAE fetch() distinct values ordered by time

I have a model:
class MyModel(db.Model):
    ts = db.DateTimeProperty(auto_now_add=True)
    id_from_other_source = db.StringProperty(default='')
    # some data
Now I have a list of ids which match the id_from_other_source field.
The data for an id changes over time, so there are many entries for each id.
I'd like to run a query that fetches, for each id, only the most recent entry.
Something like:
MyModel.all().filter('id_from_other_source IN', my_id_list).order('-ts').fetch(1000)
But with distinct values of id_from_other_source. I understand that you can't run GQL queries with DISTINCT, but maybe you can see a solution that won't run too many queries?
One solution is to query the ids one by one and fetch each result, but I'd really like to do it with fewer queries.

Objectify query: filter by whether a list in the entity contains the search parameter

In an app I have an entity that contains a list of other entities (let's say an event holding a list of assigned employees).
Using Objectify, I need to find all the events a particular employee is assigned to.
Is there a basic way to filter a query on whether the list contains the parameter, kind of the opposite of the "in" query?
Quick pseudocode:
findAll(Employee employee) {
...
return ofy.query(Event.class).filter("employees.contains", employee).list();
}
Any help would be greatly appreciated.
I tried just doing filter("employees", employee) after seeing this http://groups.google.com/group/objectify-appengine/browse_thread/thread/77ba676192c08e20 but unfortunately this returns an empty list.
Currently I'm doing something really inefficient: going through each event, iterating through its employees, and adding the event to a new list if it contains the given employee, just to have something that works. I know this is not right, though.
Let me add one thing.
The above query is not actually the one I run; I simplified it because I did not think it would make a difference.
The Employee and Event entities are in the same entity group, with Business as the parent.
The actual query I am using is the following:
ofy.query(Event.class).ancestor(businessKey).filter("employees", employee).list();
Unfortunately this still returns an empty list. Does having the ancestor(key) in there mess up the filter?
Solution: the employees field was not indexed correctly.
I added the datastore-indexes file to create a composite index, but I was originally testing on a value that I had added before the employees field was indexed, which was my mistake. Simply having an index on the "business" field and the "employees" field fixed everything. The datastore-indexes file did not appear to be necessary; after deleting it and trying again, everything worked fine.
Generally, you do this one of two ways:
Put a property of Set<Key<Employee>> on the Event
or
Put a property of Set<Key<Event>> on the Employee
You could also create a relationship entity, but if you're just doing filtering on values with relatively low counts, usually it's easier to just put the set property on one entity or the other.
Then filter as you describe:
ofy.query(Event.class).filter("employees", employee).list()
or
ofy.query(Employee.class).filter("events", event).list()
The list property should hold Keys to the target entity. If you pass an entity to the filter() method, Objectify will understand that you want to filter by its key instead.
Example:
/***************************************************/
@Entity
@Cache
public class News {
    @Id Long id;
    String news;
    @Index List<Long> friend_list = new ArrayList<Long>();
    // My friends who can see my news, for example: friend_list.add(id_f1); friend_list.add(id_f2); friend_list.add(id_f3);
    // To run a query on "friend_list", the field must be indexed
}
/*************************************************/
public News(Long id_f){
    List<Long> friend_id = new ArrayList<Long>();
    friend_id.add(id_f);
    Query<News> query = ofy().load().type(News.class).filter("friend_list in", friend_id).limit(limit);
    // To filter on a list field, add "in" right after the name of the field.
    // here ==> .filter("friend_list in", friend_id);
    // if friend_list contains any id in friend_id ==> the query returns the entity
    .........
}

Solr Custom RequestHandler - optimizing results

Yet another potentially embarrassing question. Please feel free to point any obvious solution that may have been overlooked - I have searched for solutions previously and found nothing, but sometimes it's a matter of choosing the wrong keywords to search for.
Here's the situation: coded my own RequestHandler a few months ago for an enterprise-y system, in order to inject a few necessary security parameters as an extra filter in all queries made to the solr core. Everything runs smoothly until the part where the docs resulting from a query to the index are collected and then returned to the user.
Basically, after the filter is created and the query is executed we get a set of document ids (and scores), but then we have to iterate through the ids in order to build the result set, one hit at a time. This is a good 10x slower than querying the standard request handler, and is only bound to get worse as the number of results increases. Even worse, since our schema relies heavily on dynamic fields for flexibility, there is no way (that I know of) to know in advance which fields to retrieve per document, other than testing all possible combinations per doc.
The code below is a simplified version of the one running in production, for querying the SolrIndexSearcher and building the response.
Without further ado, my questions are:
is there any way of retrieving all results at once, instead of building a response document by document?
is there any possibility of getting the list of fields on each result, instead of testing all possible combinations?
any particular WTFs in this code that I should be aware of? Feel free to kick me!
// function that queries the index and handles results
private void searchCore(SolrIndexSearcher searcher, Query query,
        Filter filter, int num, SolrDocumentList results) {
    // Executes the query
    TopDocs col = searcher.search(query, filter, num);
    // results
    ScoreDoc[] docs = col.scoreDocs;
    // iterate & build documents
    for (ScoreDoc hit : docs) {
        Document doc = reader.document(hit.doc);
        SolrDocument sdoc = new SolrDocument();
        for (Object f : doc.getFields()) {
            Field fd = ((Field) f);
            // strings
            if (fd.isStored() && (fd.stringValue() != null))
                sdoc.addField(fd.name(), fd.stringValue());
            else if (fd.isStored()) {
                // Dynamic Longs
                if (fd.name().matches(".*_l")) {
                    ByteBuffer a = ByteBuffer.wrap(fd.getBinaryValue(),
                            fd.getBinaryOffset(), fd.getBinaryLength());
                    // relative read from the buffer position (the binary offset);
                    // getLong(0) would read from index 0 of the backing array
                    long testLong = a.getLong();
                    sdoc.addField(fd.name(), testLong);
                }
                // Dynamic Dates
                else if (fd.name().matches(".*_dt")) {
                    ByteBuffer a = ByteBuffer.wrap(fd.getBinaryValue(),
                            fd.getBinaryOffset(), fd.getBinaryLength());
                    Date dt = new Date(a.getLong());
                    sdoc.addField(fd.name(), dt);
                }
                //...
            }
        }
        results.add(sdoc);
    }
}
Per OP's request:
Although this doesn't answer your specific question, I would suggest another option to solve your problem.
To add a filter to all queries, you can add an "appends" section to the StandardRequestHandler definition in the solrconfig.xml file. Add an "fq" (filter query) entry to that section containing your filter; note that "fl" is the field list, not the filter. Every request piped through the StandardRequestHandler will have the filter appended to it automatically.
This filter is treated like any other, so it is cached in the FilterCache. The result is fairly fast filtering (through docIds) at query time. This may allow you to avoid having to pull the individual documents in your solution to apply the filtering criteria.
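As a rough illustration, an "appends" section in solrconfig.xml could look like the fragment below. The handler name and the field and value in the filter (acl_group:public) are placeholders chosen for this sketch, and the handler class may differ between Solr versions, so adapt them to your own configuration.

```xml
<requestHandler name="standard" class="solr.SearchHandler" default="true">
  <!-- Parameters listed under "appends" are added to every request,
       in addition to whatever the client sends. -->
  <lst name="appends">
    <!-- Placeholder filter query: field name and value are assumptions. -->
    <str name="fq">acl_group:public</str>
  </lst>
</requestHandler>
```

A fixed value like this only covers filters that are the same for every request. If the security parameters vary per user, they would still have to be supplied per request, for example as an fq parameter added by the calling application.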
