Need to query a Collection attribute that has @IgnoreLoad annotation - google-app-engine

I have a LinkedHashSet<> attribute in my entity that I don't want to load when the entity is retrieved. However, I do need to query against it.
When I tried to construct the IN clause inside the filter() method from the Query class, it requires a Collection as the 2nd parameter. Since the LinkedHashSet attribute is not loaded, the query doesn't work. Is there another way to query the LinkedHashSet<> attribute, or do I have to build out a separate entity (which I would really hate to do)?
Thanks!

The ability to query has nothing to do with what is loaded in your entity. Queries operate on indexes in the datastore. Putting @IgnoreLoad on a field does not affect what is in the datastore. However, if you load and then subsequently save an entity with that annotation, the field is empty at save time, so you will wipe out the indexed data in the datastore.
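To make the point concrete, here is a toy in-memory model in plain Java. It is only a sketch: it is not the Datastore or Objectify API, and ToyStore and all of its method names are made up for illustration. It shows a query matching on stored values while the loaded object holds nothing, and a load-then-save wiping the stored values.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy in-memory model (NOT the Datastore or Objectify API): stored values
// live in the store, independent of what a loaded object holds.
class ToyStore {
    // What is persisted for each entity id: its set of tags.
    private final Map<Long, Set<String>> stored = new HashMap<>();

    // Saving overwrites whatever was stored before for this id.
    void save(long id, Set<String> tags) {
        stored.put(id, new LinkedHashSet<>(tags));
    }

    // Loading with the collection "ignored": the caller gets an empty set,
    // while the stored data stays untouched.
    Set<String> loadIgnoringTags(long id) {
        return new LinkedHashSet<>();
    }

    // IN-style query: matches on stored values only, never on loaded objects.
    Set<Long> queryTagIn(Set<String> wanted) {
        Set<Long> ids = new TreeSet<>();
        for (Map.Entry<Long, Set<String>> e : stored.entrySet()) {
            for (String tag : e.getValue()) {
                if (wanted.contains(tag)) {
                    ids.add(e.getKey());
                    break; // one matching tag is enough for this entity
                }
            }
        }
        return ids;
    }
}
```

In this model, saving tags for an entity, loading it with the tags ignored, and then querying still finds the entity. Saving the loaded (empty) set back is what makes later queries miss it.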

Related

Excluding bad data models during query

We have mongo data models that are written by multiple systems; currently, a bug in a different system can corrupt a single document in a collection such that it can no longer be mapped to the correct Java object (for example, a missing _class attribute in a subdocument will cause an instantiation exception). When we then query for all documents in the collection using Java, the entire query fails due to the single bad document.
We would like to use an approach which is tolerant of instantiation exceptions; the intent is for any bad documents to be discarded, while still returning objects for all the documents that can be mapped.
Could you please advise the best approach to achieve this outcome?
I think you should be able to mark this field as @Transient in the entity so that Spring Data ignores it when communicating with MongoDB.

Google Datastore, "IN" query filter and pagination

We have an application running in Google App Engine and storing data in Google Datastore.
For a given Datastore kind, all our entities have a property type.
We are interested in running a query with an IN query filter to fetch multiple types at once, something like:
type in ['event', 'comment', 'custom']
As there are thousands of entities within this kind, pagination is needed.
The problem we are having is that it is a known limitation of the Datastore that queries with "IN" filters do not support cursors.
Are there sensible ways to get around this limitation?
Using offset would be costly and would not perform well. We also can't fetch all entities and filter in the client: we are building an API, so we don't develop the client ourselves.
Any hint would be really appreciated, thanks!
An IN filter results in an individual EQUAL query for each item in the list. This is why IN queries do not support cursors: in your case, there will be 3 distinct positions in the index after you run the IN query.
Consider instead adding another property to your entity, which will serve as a flag for this type of API call: its value will be "true" if the type is in ['event', 'comment', 'custom'], or "false" otherwise. This flag may also allow you to make the "type" property unindexed, which would be an additional benefit.
With this new indexed property you can use a regular EQUAL filter. It will be faster (1 query instead of 3), and you can use cursors for pagination.
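A minimal sketch of the flag idea in plain Java follows. The class name Item and the field name apiVisible are hypothetical, and the sketch only shows how the flag is derived from the type; how the field is marked as indexed depends on the persistence library you use.

```java
import java.util.Set;

// Hypothetical entity sketch: the names are illustrative, not from the post.
class Item {
    // The types the API call should return, taken from the example filter.
    static final Set<String> API_TYPES = Set.of("event", "comment", "custom");

    final String type;
    // Derived flag, meant to be stored as an indexed property so that a
    // single EQUAL filter (apiVisible == true) can replace the IN filter.
    final boolean apiVisible;

    Item(String type) {
        this.type = type;
        this.apiVisible = API_TYPES.contains(type);
    }
}
```

Because the flag is computed whenever the entity is constructed, it stays in step with the type as long as every write goes through this code path. Entities that already exist would need to be re-saved once to pick up the new property.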

How to query a particular object without its embedded objects or collections?

I have a class, let's say Blarkar. Blarkar has an embedded class kar. Sometimes when I query for an instance of Blarkar I want the complete object, but other times I don't need all its embedded objects and their embedded objects. How do I load an object without its embedded objects?
You can't. GAE loads an entity whole or not at all. Generally this is not a problem, and you shouldn't try to optimize unless you know you have a real issue. If you do, you can split your entity into multiple parts, e.g. User and UserExtraStuff.
There is a special type of query called a projection query, but this is not likely going to be useful - it lets you select some data out of an index without doing a full entity lookup. It's only useful in limited types of inequality queries. The data has to be in the index.

How can I discover if a property of a stored Entity is indexed or unindexed?

I have several entities in datastore, but I don't know if some of their properties are indexed or unindexed.
How can I discover (with the admin console or programmatically) if a property of a stored Entity is indexed or unindexed?
By default every property is indexed (unless it is a TextProperty or BlobProperty). If you don't want a property to be indexed, you need to (and should) set its indexed option to False, which improves performance and reduces entity write costs.
There is no indication in the admin console of whether a property is indexed. You can try to execute "select * from EntityType order by Property" in the GQL box of the datastore viewer and see if it fails.
If you've been flipping between indexed=True and indexed=False on some properties over time, and have a set of entities written under both regimes, then you'll have some properties that are indexed and some that aren't. Is this the situation you're in?
If you don't have reliable history on your code, trying to determine if you're in this situation is a bit tricky, depending on how many entities you have. You can determine if you're in an inconsistent state by noting if a keys-only query on an Entity returns a different number of keys than a query that filters on the suspect property. A filter won't find unindexed properties. If you've got a lot of entities, you'll have to shard the counting somehow (to avoid timing out on a long query that returns lots of entities).
If you determine that you do have inconsistent indexing and want to repair your entities to be consistent, the usual approach is to write a mapreduce that touches all of your unstable entities and issues puts on the necessary properties.
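The count comparison described above can be sketched with a toy model in plain Java. This is not the Datastore API: the class and method names are invented, each list entry simply records whether one stored entity had the suspect property written as indexed, and the "filter" is assumed to be one that would match every indexed value.

```java
import java.util.List;

// Toy model (NOT the Datastore API): each entry records whether one stored
// entity had the suspect property written as indexed.
class IndexConsistencyCheck {
    // Stands in for a keys-only query over the whole kind.
    static int countAll(List<Boolean> indexedFlags) {
        return indexedFlags.size();
    }

    // Stands in for a query that filters on the suspect property:
    // entities whose property is unindexed never show up.
    static int countViaFilter(List<Boolean> indexedFlags) {
        int n = 0;
        for (boolean indexed : indexedFlags) {
            if (indexed) {
                n++;
            }
        }
        return n;
    }

    // Mixed state: some, but not all, entities are reachable via the filter.
    static boolean isMixed(List<Boolean> indexedFlags) {
        int viaFilter = countViaFilter(indexedFlags);
        return viaFilter > 0 && viaFilter < countAll(indexedFlags);
    }
}
```

Note that a filter count of zero means the property is consistently unindexed rather than mixed, which is why the sketch treats that case separately.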
Take a look at the "Datastore Indexes" page, which is linked from the left navigation menu in the App Engine dashboard.
There you'll see the list of indexes and the specific properties each index covers.
For composite indexes (i.e. those defined in datastore-indexes.xml or index.yaml), you can use the low-level API to get the list of indexes that are present in your app's datastore.
In GAE/J, you would need to invoke DatastoreServiceFactory.getDatastoreService().getIndexes(), while in Python, the same function is provided by db.get_indexes().

Different EF Property DataType than Storage Layer Possible?

I am putting together a WCF Data Service for PatientEntities using Entity Framework.
My solution needs to address these requirements:
Property DateOfBirth of entity Patient is stored in SQL Server as a string. It would be ideal if the entity class used a DateTime type rather than "string". (I would expect this to be possible since we're abstracting away from the storage layer.) Where could a conversion mechanism be put in place that would convert between DateTime and string so that the entity and SQL Server stay in sync? I cannot change the storage layer's structure, so I have to work around it.
WCF Data Services (Read-only, so no need for saving changes) need to be used since clients will be able to use LINQ expressions to consume the service. They can generate results based on any given query scenario they need and not be constrained by a single method such as GetPatient(int ID).
I've tried to use DTOs, but ran into the problem of mapping the ObjectContext to a DTO. I don't think that is theoretically possible, or it is too complicated if it is.
I've tried to use Self Tracking Entities but they require the metadata from the .edmx file if I'm correct, and this isn't allowing a different property data type.
I also want to add customizations to my Entity getter methods so that a property "MRN" of type "string" needs to have .Replace("MR~", string.Empty) performed before it is returned. I can add this to the getter methods but the problem with that is Entity Framework will overwrite that next time it refreshes the entity classes. Is there a permanent place I can put these?
Should I use POCO instead? How would that work with WCF Data Services? Where would the service grab the metadata?
This is definitely possible. What you need to use is QueryView, which lets you control how a given column maps to a property on your entity. For instance, here is something you could do on the Patients entity.
<EntitySetMapping Name="Patients">
<QueryView>
select value conceptualnamespace.Patient(p.PatientId,
cast(p.DateOfBirth as Edm.DateTime),
replace(p.Name,'MR~',''))
from entitycontainer.Patients as p
</QueryView>
</EntitySetMapping>
I cover this concept in more detail in my book. The recipe is called:
15-2. Mapping an Entity to Customized Parts of One or More Tables
