How to query a particular object without its embedded objects or collections? - google-app-engine

I have a class, let's say Blarkar. Blarkar has an embedded class, kar. Sometimes when I query for an instance of Blarkar I want the complete object, but other times I don't need all its embedded objects and their embedded objects. How do I load an object without its embedded objects?

You can't. GAE loads an entity whole or not at all. Generally this is not a problem, and you shouldn't try to optimize unless you know you have a real issue. If you do, you can split your entity into multiple parts, e.g. User and UserExtraStuff.
There is a special type of query called a projection query, but it is unlikely to be useful here: it lets you select some data out of an index without doing a full entity lookup. It's only useful in limited types of inequality queries, and the data has to be in the index.
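As a rough illustration of the "split your entity" idea, here is a minimal plain-Java sketch. The in-memory maps stand in for the datastore, and the field names (name, biography) and the SplitEntityStore class are made up for the example; with a real datastore library each class would be its own entity kind sharing the same id.

```java
import java.util.HashMap;
import java.util.Map;

// Small, frequently loaded part of the entity.
class User {
    final long id;
    final String name;

    User(long id, String name) {
        this.id = id;
        this.name = name;
    }
}

// Bulky, rarely needed part, stored as its own entity under the same id.
class UserExtraStuff {
    final long userId;
    final String biography;

    UserExtraStuff(long userId, String biography) {
        this.userId = userId;
        this.biography = biography;
    }
}

// In-memory stand-in for the datastore: each kind is loaded whole, by id.
class SplitEntityStore {
    private final Map<Long, User> users = new HashMap<>();
    private final Map<Long, UserExtraStuff> extras = new HashMap<>();

    void save(User user, UserExtraStuff extra) {
        users.put(user.id, user);
        extras.put(extra.userId, extra);
    }

    // Cheap load: only the small part is fetched.
    User loadUser(long id) {
        return users.get(id);
    }

    // Second load, done only when the extra data is really needed.
    UserExtraStuff loadExtraStuff(long id) {
        return extras.get(id);
    }
}
```

The cost of this layout is a second lookup whenever the complete object is needed, which is why it is only worth doing once the whole-entity load is a measured problem.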

Related

Excluding bad data models during query

We have mongo data models that are written by multiple systems; currently, a bug in a different system can corrupt a single document in a collection such that it can no longer be mapped to the correct Java object (for example, a missing _class attribute in a subdocument will cause an instantiation exception). When we then query for all documents in the collection using Java, the entire query fails due to the single bad document.
We would like to use an approach which is tolerant of instantiation exceptions; the intent is for any bad documents to be discarded, while still returning objects for all the documents that can be mapped.
Could you please advise the best approach to achieve this outcome?
I think you should be able to mark this field as @Transient in the entity so that Spring Data ignores it when communicating with MongoDB.
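The tolerant behaviour asked for in the question could also be sketched in plain Java as below. This assumes you can read the documents in some raw, unmapped form and convert them one at a time; the TolerantMapper class and the converter function are placeholders for illustration, not part of any Spring Data API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

class TolerantMapper {
    // Converts each raw document on its own, so that one bad document
    // cannot fail the whole result set.
    static <D, T> List<T> mapSkippingFailures(Iterable<D> rawDocs, Function<D, T> converter) {
        List<T> mapped = new ArrayList<>();
        for (D doc : rawDocs) {
            try {
                mapped.add(converter.apply(doc));
            } catch (RuntimeException e) {
                // Bad document: discard it. A real system would probably log it here.
            }
        }
        return mapped;
    }
}
```

For example, mapping the strings "1", "x" and "3" with Integer::parseInt as the converter returns only the two values that parse.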

How to filter a parent entity using properties of child entity in datastore

I am using Google App Engine (Java) for my REST backend and google-datastore as the database and using objectify to access the database.
I want to create a unit entity to store Units, where a unit can be a component or an assembled unit. An assembled unit is made up of multiple components and also has some properties of its own. There can be multiple types of assembled units and multiple types of components.
The Entity class would be something like
public class UnitEntity {
    Long unitId;
    String serialNumber;
    String state;
    String unitType;
    // Null for all assembled units. For a component that is part of an assembled unit,
    // this is the id of that assembled unit (added so that I can list all components
    // of a particular assembled unit).
    Long parentId;
    UnitParameters unitParameters;
}
Here UnitParameters would be a polymorphic class to contain properties specific to a unit type, that is, based on the value of "unitType", there would be different classes which extend "UnitParameters".
Let's assume one of the components (let's say component1, that is, unitType=component1) has a property called modelNumber. This property would be stored in the unitParameters of all the entities where unitType=component1.
Now I want to be able to list units where unitType=assembledUnit1 and which have a child component1 whose modelNumber is 2.0.
(I can easily get a list of units of type component1 where modelNumber is 2.0, but I want to be able to get the parent entity also.)
So basically here I am trying to get parent entities by filtering on the properties of children.
I want to know whether this is possible with datastore and objectify? Is there any way to achieve this functionality?
Update - Follow-up question based on the answer by @stickfigure:
If I go with Google Cloud SQL (which is based on MySQL) for my use case, then how should I model my data?
I initially thought of having a table for each unitType. Let's say there are 3 unitTypes: assembledUnit1, assembledUnit2 and component1. Now if I want to have an API which lists the details of each unit, how can I achieve this with Cloud SQL?
This is something I could have done with datastore since all the entities were of the same "kind".
I can obviously have separate APIs to list all units of type assembledUnit1, assembledUnit2 etc., but how can I have a single API which could list all the units?
Also in this approach, if someone calls the REST API GET /units/{unitId}, I suppose I would have to check for the unitId in each of the tables, which doesn't seem correct?
I suppose one way by which this could be solved is to just have one table called "Unit" whose columns would be a superset of the columns of all the unitTypes. However I don't think this is a good way of designing since there would be a lot of empty columns for each row and also the schema would have to be changed if a new unitType is added.
The datastore doesn’t do joins. So you’re left with two options, either 1) do the join yourself via fetching or 2) denormalize some of the child data into the parent and index it there. Which strategy works best will vary depending on the shape of your data and performance/cost considerations.
I should add there is a third option which is “store an external index of some of your data in another kind of database, such as the search api or an RDBMS”.
This is not always a very satisfying answer - the ability to do joins and aggregations in an RDBMS is incredibly useful. If you have highly relational data with modest size/traffic/reliability requirements, you may want to use something like Cloud SQL instead.
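To make option 1 ("do the join yourself") concrete, here is a minimal plain-Java sketch. It is not Objectify code: the list of units stands in for the datastore, the Unit class is a simplified, hypothetical version of the UnitEntity from the question (modelNumber is flattened out of unitParameters), and ManualJoin is a made-up name. Against a real datastore, step 1 would be a query on the indexed child property and step 2 a batch load by key.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Simplified, hypothetical stand-in for the UnitEntity in the question.
class Unit {
    final long unitId;
    final String unitType;
    final Long parentId;      // null for assembled units
    final String modelNumber; // only set for components in this sketch

    Unit(long unitId, String unitType, Long parentId, String modelNumber) {
        this.unitId = unitId;
        this.unitType = unitType;
        this.parentId = parentId;
        this.modelNumber = modelNumber;
    }
}

class ManualJoin {
    static List<Unit> parentsWithChild(List<Unit> allUnits, String parentType,
                                       String childType, String modelNumber) {
        // Step 1: find the matching children and collect their parent ids.
        Set<Long> parentIds = new HashSet<>();
        for (Unit u : allUnits) {
            if (childType.equals(u.unitType)
                    && modelNumber.equals(u.modelNumber)
                    && u.parentId != null) {
                parentIds.add(u.parentId);
            }
        }
        // Step 2: fetch the parents by id and keep those of the wanted type.
        List<Unit> parents = new ArrayList<>();
        for (Unit u : allUnits) {
            if (parentType.equals(u.unitType) && parentIds.contains(u.unitId)) {
                parents.add(u);
            }
        }
        return parents;
    }
}
```

Option 2 (denormalizing) would instead copy something like the children's model numbers into an indexed list on the parent, trading extra writes for a single query.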

Use Objectify to get projection of entity by Id

I have the id of an entity from which I only need a single field. Is there a way to get that projection or must I fetch the whole entity? Here is the code that I thought should do it.
bookKey = OfyService.ofy().load().type(Page.class).id(pageId).project("bookKey").now();
The datastore is a key-value store which loads objects whole, not field-by-field. This is quite different from how you work with a relational database.
There is an exception to this which allows you to load data directly out of an index (projection queries), however it is a performance optimization with very limited and specific use. In general, if you don't understand the fairly exotic detail of how projections work, you should not be using them - it's a premature optimization.

What pattern applies to encapsulating "contextual" queries?

At the moment, my project at work has a very inefficient loop which is suffering the n + 1 problem to a great degree. (6n + 1, I think.) Currently, a number of web services instantiate an object whose constructor builds a canonical representation of one of our ORM objects -- call them Foo and FooView(). There are a number of places where a collection of Foo is built; each instance of Foo is passed to FooView and has its (pseudo-)foreign key fields queried in another database to build a textual representation, so that, for example, we can return <fooColor>Blue</fooColor> rather than <fooColor>5</fooColor>. The sets of these properties--Colors, Shapes, and other similarly general properties--are relatively small, and obviously should be pulled into memory.
There is also another, more complex query, which is contributing to the 6n + 1 problem. This is a set of metadata fields. Each Foo has a Source. Each Source can have one, none, or many metadata fields defined for their subset of Foos. Empty XML tags are required for metadata fields which apply to a given Foo's Source. Currently, the four(!) ORM queries(!) used to build this XML are located inside the FooView constructor, meaning they get executed for each and every Foo.
My goal is as follows:
1. Query for general properties, like Color, Shapes, etc. before anything else.
2. Run the query to generate the collection of Foo. Store the primary keys in a list.
3. Using the list of primary keys, run the heinous multi-join, raw SQL query to generate Foo.Metadata.
4. Call FooView, providing the collection of Foo along with a context object containing the items built in steps 1 and 3. FooView will provide the interleaving logic, using the context object rather than database lookups.
Is this a sound practice? It will certainly solve some of the performance problems in generating the FooView, but where should this thing live? Should I call it FooHelper? FooContext? FooService? Is this a design pattern, or is there one I should be using to make this more logical?
Thanks!
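For reference, the plan described in the question could look roughly like the following plain-Java sketch. The field names, the shape of FooContext and the static render method are guesses for illustration only; the point is that everything FooView needs is looked up in prefetched maps, so no query runs per Foo.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

class Foo {
    final int id;
    final int colorId; // pseudo-foreign key into the colors table

    Foo(int id, int colorId) {
        this.id = id;
        this.colorId = colorId;
    }
}

// Holds everything fetched up front: the general properties and the metadata query result.
class FooContext {
    final Map<Integer, String> colorNames;
    final Map<Integer, List<String>> metadataFieldsByFooId;

    FooContext(Map<Integer, String> colorNames,
               Map<Integer, List<String>> metadataFieldsByFooId) {
        this.colorNames = colorNames;
        this.metadataFieldsByFooId = metadataFieldsByFooId;
    }
}

class FooView {
    // Builds one XML fragment per Foo using only in-memory lookups.
    static List<String> render(List<Foo> foos, FooContext context) {
        List<String> fragments = new ArrayList<>();
        for (Foo foo : foos) {
            StringBuilder xml = new StringBuilder();
            xml.append("<fooColor>")
               .append(context.colorNames.getOrDefault(foo.colorId, ""))
               .append("</fooColor>");
            List<String> fields = context.metadataFieldsByFooId
                    .getOrDefault(foo.id, Collections.<String>emptyList());
            for (String field : fields) {
                xml.append('<').append(field).append("/>"); // empty tag per metadata field
            }
            fragments.add(xml.toString());
        }
        return fragments;
    }
}
```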

Data storage: "grouping" entities by property value? (like a dictionary/map?)

Using AppEngine datastore, but this might be agnostic, no idea.
Assume a database entity called Comment. Each Comment belongs to a User. Every Comment has a date property, pretty standard so far.
I want something that will let me: specify a User and get back a dictionary-ish (coming from a Python background, pardon. Hash table, map, however it should be called in this context) data structure where:
keys: every date appearing in the User's comments
values: Comments that were made on that date.
I guess I could just iterate over a range of dates and build a map like this myself, but I seriously doubt I need to "invent" my own solution here.
Is there a way/tool/technique to do this?
Datastore supports both references and list properties. This lets you build one-to-many relationships in two ways:
1. Parent (User) has a list property containing keys of Child entities (Comment).
2. Child has a key property pointing to Parent.
Since you need to limit Comments by date, you'd best go with option two. Then you could query Comments which have date=somedate (or date range) and where user=someuserkey.
There is no native grouping functionality in Datastore, so to also "group" by date, you can add a sort on date to the query. Then, when you iterate over the result, each change of date gives you a new grouping key to use or store.
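The "watch for the date to change" loop can be sketched like this in plain Java. The Comment class with a string date and the GroupByDate helper are made up for the example; the input list stands in for the result of a query already sorted by date.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class Comment {
    final String date; // e.g. "2012-05-01"; a hypothetical representation
    final String text;

    Comment(String date, String text) {
        this.date = date;
        this.text = text;
    }
}

class GroupByDate {
    // Expects the comments sorted by date, as a query with a sort on date
    // would return them. Unsorted input would overwrite earlier groups.
    static Map<String, List<Comment>> group(List<Comment> sortedByDate) {
        Map<String, List<Comment>> groups = new LinkedHashMap<>();
        String currentDate = null;
        List<Comment> bucket = null;
        for (Comment c : sortedByDate) {
            if (!c.date.equals(currentDate)) {
                // The date changed: start a new group keyed by the new date.
                currentDate = c.date;
                bucket = new ArrayList<>();
                groups.put(currentDate, bucket);
            }
            bucket.add(c);
        }
        return groups;
    }
}
```

The LinkedHashMap keeps the groups in the same date order as the query result.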
Update
Designing no-sql databases should be access-oriented (versus datamodel oriented in sql): for often-used operations you should be getting data out as cheaply (= as few operations) as possible.
So, as a rule of thumb you should, in one operation, only get data that is needed at that moment (= shown on that page to user). I'm not sure about your app's design, but I doubt you need all user's full comments (with text and everything) at one time.
I'd start by saying you shouldn't apologize for having a Python background. App Engine started supporting only Python. Using the db module, you could have a User entity as the parent of several DailyCommentBatch entities each a parent of a couple Comment entities. IIRC, this will keep all related entities stored together (or close).
If you are using NDB (I love it), you may have to employ a StructuredProperty at either the User or DailyCommentBatch level.
