Elasticsearch equivalent of NOT IN (SELECT ..) - arrays

We have something like this (we save an item for each episode, show and user. And all items are saved on the same index.):
and we need to list all episodes from a show, but we need to filter the episodes already watched by the user.
In PostgreSQL i would have run something like this:
SELECT * FROM Episode WHERE show_id = 'showID' AND episode_id NOT IN (SELECT episode_id FROM watched WHERE user_id = 'userId')
But since we are using Elasticsearch, we are not sure what would be the alternative.
We also assume that the watched list is going to scale, so we can't just send an array of ids to ES.
Is there a way for Elasticsearch to handle all the logic, and filter from a dynamic array of values on a single query.

We just used joins, and has_child.
We filter all parents that have a watched child item.

Related

Query of records based on relationships

I have an object Contract that has a look-up to another object Indexationtype. I have another object IndexationEntry that has master-detail to Indexationtype. Now I would like to get the value of the percentage field in the IndexationEntry onto Contract based on the yer fields. The year in the IndexationEntry matches Year in Contract. How should I achieve this?
From Contract "up" to IndexationType__c, then "down" to IndexationEntry__c?
If there's no direct link between them it's not going to be pretty. One way would be something like this
SELECT Id, Name,
(SELECT Id, ContractNumber FROM Contracts__r WHERE Year__c = '2021'),
(SELECT Id, Percent__c FROM IndexationEntries__r WHERE Year__c = '2021')
FROM IndexationType__c
You'd have to run it once for each year. Or (since you tagged it Apex) maybe you can prepare the reference data a bit, query Indexation Types + Entries and build something like Map<Id, Map<Integer, IndexationEntry__c>> (1st key is by Indexation Type Id, then by year). Query them, populate the Map, then loop through your contracts and use map.get() to fetch your values.

Query of Arrays in Salesforce

I need to do 1 of two things (I believe):
1- Get a Custom Object ID so I can query it directly
2- Get a list of values of a specific field within the Object entries.
Ultimate End goal:
Add and modify rows in my custom object via external API. However to do this I need to check and make sure my new entry/row does not already exist.
What I have:
I have a custom object (called Customer_Arrays__c). It is a table that I can add new rows to (I will call entrys). Each entry has 6 or 7 fields. 1 of these fields is called (external_ID__c). This is the field I utilize to match to new incoming data to see if the entry already exists, or if it needs to add a new row to my table. This Customer_Arrays__c is a child to my opportunity I believe – it is part of every opportunity and each line item I add has a field defaulted to the opportunity.
Help I need:
1- How do I query the value of my Cutomer_Arrays__c based upon an opportunity ID?
2- How do I query a list of values in my (external_ID__c) based upon an opportunity ID?
Thanks for your help! I have read half a dozen+ posts on similar topics and am missing something. Examples of some Past try's that failed:
Select external_ID__c,FROM Custom_Arrays__c WHERE Opportunity='00...'
Select Id (Select ID, Custom_Arrays__c from Custom_Arrays__c) from Opportunity where id ='00...'
List FROM Custom_Arrays__c WHERE Opportunity='00...'
Select Id, external_ID__c, (Select external_ID__c FROM Custom_Arrays__c) WHERE Opportunity__c='00...'
Thanks again!
Only you know how did you name the lookup field (foreign key) from arrays to Opportunity. You'll need to check in setup, next to where external_ID__c is. Since it's a custom field (gets __c at the end), my guess is you went with default.
Try
SELECT Id, Name, External_Id__c
FROM Customer_Arrays__c
WHERE Opportunity__c = '006...'
Thank you eyescream, that got me almost all the way there. Turns out I also needed a __r for the parent child relationship.
Here is a snip out of my final code that works - I think it covers everything:
SELECT Field1__c, Opportunity__r.Id, Opportunity__r.Opportunity__c,
FROM Customer_Arrays__c
WHERE Opportunity__r.Id = '006...'.
Thank you so very much!!!

GAE fetch() distinct values ordered by time

I have a model:
class MyModel(db.Model):
ts = db.DateTimeProperty(auto_now_add=True)
id_from_other_source = db.StringProperty(default='')
#some data
Now I have a list of some ids which match id_from_other_source field.
Data about id changes in time, so for one id there is a lot of data.
I'd like to run such a query that fetches me for each id only one entry of that id that is the youngest.
Something like:
MyModel.all().filter('id_from_other_source IN', my_id_list).order('-ts').fetch(1000)
But with disctinction in id_from_other_source. I understand that you can't run GQL queries with DISTINCT, but maybe you can see any solution that won't run too much queries?
One solution is to take them one by one and fetch the result, but I'd really like to do it with lesser number of queries.

Datastore fetch on two filters alternative?

I have a datastore entity called Game and two fields in it called playerOne and playerTwo. Either of these fields stores a username.
I need to search on the Game entity and return a MAX of 30 games where the username can be either playerOne OR playerTwo...
So in a relational database you would go:
SELECT * FROM Game WHERE playerOne='username' OR playerTwo='username' LIMIT 30
But in big table you can't filter on more than one field! I can't fetch 10 from one and 10 from the other as the number from each can be variable and in createdDate order.
How would you do this in your datastore?
The quick answer is create a StringListProperty that contains [player_a, player_b] and then simply use the multi-value index made out of that:
games = Game.all().filter("players =", player_find)
You can not do an OR query on the datastore using different fields. If you have to keep your current entity model then you have to do two queries.
1) filtering on playerOne and limiting to 30
2) filtering on playerTwo and limiting to (30 - result size of query one)
Then merge the results in memory to produce the final set of 30.
Now if you also want some ordering by date, then it will get more tricky. However the SQL query you wrote doesn't have any ordering so I omitted it aswell.
However if you can change the entity model then a good way to achive what you want is to have a single field containing a list of both usernames.
Then you can do a simple query in the style of:
SELECT * FROM Game WHERE playerBoth = 'username'

Adding a projection to an NHibernate criteria stops it from performing default entity selection

I'm writing an NHibernate criteria that selects data supporting paging. I'm using the COUNT(*) OVER() expression from SQL Server 2005(+) to get hold of the total number of available rows, as suggested by Ayende Rahien. I need that number to be able to calculate how many pages there are in total. The beauty of this solution is that I don't need to execute a second query to get hold of the row count.
However, I can't seem to manage to write a working criteria (Ayende only provides an HQL query).
Here's an SQL query that shows what I want and it works just fine. Note that I intentionally left out the actual paging logic to focus on the problem:
SELECT Items.*, COUNT(*) OVER() AS rowcount
FROM Items
Here's the HQL:
select
item, rowcount()
from
Item item
Note that the rowcount() function is registered in a custom NHibernate dialect and resolves to COUNT(*) OVER() in SQL.
A requirement is that the query is expressed using a criteria. Unfortunately, I don't know how to get it right:
var query = Session
.CreateCriteria<Item>("item")
.SetProjection(
Projections.SqlFunction("rowcount", NHibernateUtil.Int32));
Whenever I add a projection, NHibernate doesn't select item (like it would without a projection), just the rowcount() while I really need both. Also, I can't seem to project item as a whole, only it's properties and I really don't want to list all of them.
I hope someone has a solution to this. Thanks anyway.
I think it is not possible in Criteria, it has some limits.
You could get the id and load items in a subsequent query:
var query = Session
.CreateCriteria<Item>("item")
.SetProjection(Projections.ProjectionList()
.Add(Projections.SqlFunction("rowcount", NHibernateUtil.Int32))
.Add(Projections.Id()));
If you don't like it, use HQL, you can set the maximal number of results there too:
IList<Item> result = Session
.CreateQuery("select item, rowcount() from item where ..." )
.SetMaxResult(100)
.List<Item>();
Use CreateMultiCriteria.
You can execute 2 simple statements with only one hit to the DB that way.
I am wondering why using Criteria is a requirement. Can't you use session.CreateSQLQuery? If you really must do it in one query, I would have suggested pulling back the Item objects and the count, like:
select {item.*}, count(*) over()
from Item {item}
...this way you can get back Item objects from your query, along with the count. If you experience a problem with Hibernate's caching, you can also configure the query spaces (entity/table caches) associated with a native query so that stale query cache entries will be cleared automatically.
If I understand your question properly, I have a solution. I struggled quite a bit with this same problem.
Let me quickly describe the problem I had, to make sure we're on the same page. My problem came down to paging. I want to display 10 records in the UI, but I also want to know the total number of records that matched the filter criteria. I wanted to accomplish this using the NH criteria API, but when adding a projection for row count, my query no longer worked, and I wouldn't get any results (I don't remember the specific error, but it sounds like what you're getting).
Here's my solution (copy & paste from my current production code). Note that "SessionError" is the name of the business entity I'm retrieving paged data for, according to 3 filter criterion: IsDev, IsRead, and IsResolved.
ICriteria crit = CurrentSession.CreateCriteria(typeof (SessionError))
.Add(Restrictions.Eq("WebApp", this));
if (isDev.HasValue)
crit.Add(Restrictions.Eq("IsDev", isDev.Value));
if (isRead.HasValue)
crit.Add(Restrictions.Eq("IsRead", isRead.Value));
if (isResolved.HasValue)
crit.Add(Restrictions.Eq("IsResolved", isResolved.Value));
// Order by most recent
crit.AddOrder(Order.Desc("DateCreated"));
// Copy the ICriteria query to get a row count as well
ICriteria critCount = CriteriaTransformer.Clone(crit)
.SetProjection(Projections.RowCountInt64());
critCount.Orders.Clear();
// NOW add the paging vars to the original query
crit = crit
.SetMaxResults(pageSize)
.SetFirstResult(pageNum_oneBased * pageSize);
// Set up a multi criteria to get your data in a single trip to the database
IMultiCriteria multCrit = CurrentSession.CreateMultiCriteria()
.Add(crit)
.Add(critCount);
// Get the results
IList results = multCrit.List();
List<SessionError> sessionErrors = new List<SessionError>();
foreach (SessionError sessErr in ((IList)results[0]))
sessionErrors.Add(sessErr);
numResults = (long)((IList)results[1])[0];
So I create my base criteria, with optional restrictions. Then I CLONE it, and add a row count projection to the CLONED criteria. Note that I clone it before I add the paging restrictions. Then I set up an IMultiCriteria to contain the original and cloned ICriteria objects, and use the IMultiCriteria to execute both of them. Now I have my paged data from the original ICriteria (and I only dragged the data I need across the wire), and also a raw count of how many actual records matched my criteria (useful for display or creating paging links, or whatever). This strategy has worked well for me. I hope this is helpful.
I would suggest investigating custom result transformer by calling SetResultTransformer() on your session.
Create a formula property in the class mapping:
<property name="TotalRecords" formula="count(*) over()" type="Int32" not-null="true"/>;
IList<...> result = criteria.SetFirstResult(skip).SetMaxResults(take).List<...>();
totalRecords = (result != null && result.Count > 0) ? result[0].TotalRecords : 0;
return result;

Resources