This is an example of what my data looks like for an Elasticsearch index called video_service_inventory:
{
'video_service': 'netflix',
'movies' : [
{'title': 'Mission Impossible', 'genre': 'action'},
{'title': 'The Hangover', 'genre': 'comedy'},
{'title': 'Zoolander', 'genre': 'comedy'},
{'title': 'The Ring', 'genre': 'horror'}
]
}
I have defined the "movies" field in my index mapping as type "nested".
I want to write a query that says "get me all video_services that contain both of these movies":
{'title': 'Mission Impossible', 'genre': 'action'}
AND
{'title': 'The Ring', 'genre': 'horror'}
where both the title and the genre must match. If one movie exists but not the other, I don't want the query to return that video service.
Ideally, I would like to do this in one query. So far, I haven't been able to find a solution.
Does anyone have suggestions for writing this search query?
The syntax may vary depending on the Elasticsearch version, but in general you should combine multiple nested queries inside a bool/must query. For each nested query you need to specify the path used to "navigate" to the nested documents, and you need to qualify each property with the path plus the field name:
{
"query": {
"bool": {
"must": [
{
"nested": {
"path": "movies",
"query": {
"bool": {
"must": [
{ "term": { "movies.title": "Mission Impossible" } },
{ "term": { "movies.genre": "action" } }
]
}
}
}
},
{
"nested": {
"path": "movies",
"query": {
"bool": {
"must": [
{ "term": { "movies.title": "The Ring" } },
{ "term": { "movies.genre": "horror" } }
]
}
}
}
}
]
}
}
}
This example assumes that the title and genre fields are not analyzed. In newer versions of Elasticsearch you may find them as a .keyword sub-field, in which case you would use "movies.genre.keyword" to query the non-analyzed version of the data.
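When the list of required movies is not fixed, the same pattern (one nested clause per movie, all wrapped in an outer bool/must) can be generated programmatically. The following is only a sketch: the helper name both_movies_query is hypothetical, it builds the request body as a plain Python dict without calling any Elasticsearch client, and it assumes the fields are not analyzed so that term queries (which take a single value) match exactly.

```python
import json


def both_movies_query(movies):
    """Build a bool/must query with one nested clause per required movie.

    Hypothetical helper: it only constructs the request body and does not
    talk to Elasticsearch.
    """
    clauses = []
    for movie in movies:
        clauses.append({
            "nested": {
                "path": "movies",
                "query": {
                    "bool": {
                        "must": [
                            # Both conditions must hold on the SAME nested movie.
                            {"term": {"movies.title": movie["title"]}},
                            {"term": {"movies.genre": movie["genre"]}},
                        ]
                    }
                },
            }
        })
    # Every nested clause must match, so all movies have to be present.
    return {"query": {"bool": {"must": clauses}}}


body = both_movies_query([
    {"title": "Mission Impossible", "genre": "action"},
    {"title": "The Ring", "genre": "horror"},
])
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of a search request against the video_service_inventory index.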
For details on bool queries you can have a look at the documentation on the ES website:
https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-bool-query.html
For nested queries:
https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-nested-query.html
I'm on Elasticsearch 6.8.22.
I have multiple users and each one has multiple papers ("valid" or not):
{"name":"Amy",
"papers":[
{"type":"idcard", "country":"fr", "valid":"no"},
{"type":"idcard", "country":"us", "valid":"yes"}
]}
{"name":"Brittany",
"papers":[
{"type":"idcard", "country":"fr", "valid":"no"},
{"type":"idcard", "country":"us", "valid":"no"}
]}
{"name":"Chloe",
"papers":[
{"type":"idcard", "country":"fr", "valid":"yes"},
{"type":"idcard", "country":"us", "valid":"no"}
]}
I'm trying to find only the users who have a paper that is "valid" for "fr":
{"query": {
"bool": {
"filter": [
{"match":{"papers.valid": "yes"}},
{"match":{"papers.country": "fr"}}
]}}}
It returns Chloe, which is fine (she has a paper which is both "valid" and "fr").
But it also returns Amy, because she has one "valid" paper and another one which is "fr".
This is because ES doesn't treat an array of objects as separate objects; it flattens everything into fields holding arrays of values (as far as I understand).
I've tried using "combined term queries" from this link, but I guess that only works for arrays of "primitive" values (not complex objects).
I've seen that I can turn the array into nested objects to do what I need, but that seems overcomplicated and would slow down the queries (because of hidden joins).
My question is:
Is there any way to check whether a document has, in its array of objects, one object that matches multiple criteria at the same time?
(Originally, I wanted a query that checks whether every "papers" entry in the array matches the criteria, e.g. all papers of type "idcard" must be "valid", but that seems impossible.)
You need to define papers as a nested field in the mapping; then you can run a nested query on it:
https://www.elastic.co/guide/en/elasticsearch/reference/current/nested.html
For example, if your mapping looks like this (on 6.x the mapping body may need to be wrapped in a type name such as "_doc"):
{
"mappings": {
"properties": {
"name": {
"type": "keyword"
},
"papers": {
"type": "nested",
"properties": {
"type": {
"type": "keyword"
},
"country": {
"type": "keyword"
},
"valid": {
"type": "keyword"
}
}
}
}
}
}
then this query will work:
{
"query": {
"nested": {
"path": "papers",
"query": {
"bool": {
"filter": [
{
"term": {
"papers.valid": "yes"
}
},
{
"term": {
"papers.country": "fr"
}
}
]
}
}
}
}
}
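To see why the nested mapping changes the result, the two matching behaviours can be imitated in a few lines of Python. This is a simplified model, not Elasticsearch code: flattened_match and nested_match are hypothetical names, and the model only mimics how an object array is flattened into per-field value lists versus how nested objects are matched one at a time.

```python
users = [
    {"name": "Amy", "papers": [
        {"type": "idcard", "country": "fr", "valid": "no"},
        {"type": "idcard", "country": "us", "valid": "yes"},
    ]},
    {"name": "Brittany", "papers": [
        {"type": "idcard", "country": "fr", "valid": "no"},
        {"type": "idcard", "country": "us", "valid": "no"},
    ]},
    {"name": "Chloe", "papers": [
        {"type": "idcard", "country": "fr", "valid": "yes"},
        {"type": "idcard", "country": "us", "valid": "no"},
    ]},
]


def flattened_match(user, country, valid):
    # Object arrays: each field becomes one bag of values per document,
    # so the link between "country" and "valid" of a single paper is lost.
    countries = {paper["country"] for paper in user["papers"]}
    valids = {paper["valid"] for paper in user["papers"]}
    return country in countries and valid in valids


def nested_match(user, country, valid):
    # Nested objects: both conditions must hold on the same paper.
    return any(
        paper["country"] == country and paper["valid"] == valid
        for paper in user["papers"]
    )


flat_hits = [u["name"] for u in users if flattened_match(u, "fr", "yes")]
nested_hits = [u["name"] for u in users if nested_match(u, "fr", "yes")]
print(flat_hits)    # ['Amy', 'Chloe']
print(nested_hits)  # ['Chloe']
```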
Elasticsearch version: 7.1.1
Hi, I have tried a lot but could not find a solution.
In my index, I have a field that contains an array of strings.
For example, I have two documents with different values in the locations array.
Document 1:
"doc" : {
"locations" : [
"Cloppenburg",
"Berlin"
]
}
Document 2:
"doc" : {
"locations" : [
"Landkreis Cloppenburg",
"Berlin"
]
}
A user requests a search for the term Cloppenburg,
and I want to return only those documents which contain the term Cloppenburg
and not Landkreis Cloppenburg.
The results should contain only Document 1,
but my query is returning both documents.
I am using the following query and getting both documents back.
Can someone please help me out with this?
GET /my_index/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"doc.locations": {
"query": "cloppenburg",
"operator": "and"
}
}
}
]
}
}
}
The issue is that you are using a text field with a match query.
Match queries are analyzed: the search terms go through the same analyzer that was used at index time, which is the standard analyzer for text fields. It breaks text on word boundaries (whitespace here) and lowercases it, so in your case Landkreis Cloppenburg produces two tokens, landkreis and cloppenburg, at index time, and a search for cloppenburg matches that document as well.
Solution: use a keyword field.
Index definition:
{
"mappings": {
"properties": {
"location": {
"type": "keyword"
}
}
}
}
Index both of your docs and then use the same search query:
{
"query": {
"bool": {
"must": [
{
"match": {
"location": {
"query": "Cloppenburg"
}
}
}
]
}
}
}
Result:
"hits": [
{
"_index": "location",
"_type": "_doc",
"_id": "2",
"_score": 0.6931471,
"_source": {
"location": "Cloppenburg"
}
}
]
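The difference between the analyzed text field and the keyword field can be illustrated with a rough simulation. This sketch is an approximation under stated assumptions: standard_like_tokens is a hypothetical stand-in that lowercases and splits on non-word characters, which is much simpler than the real standard analyzer, and the two match functions only model the "and" operator on a text field and exact matching on a keyword field.

```python
import re


def standard_like_tokens(text):
    # Crude approximation of the standard analyzer:
    # lowercase, then split on runs of non-word characters.
    return [token for token in re.split(r"\W+", text.lower()) if token]


def text_field_matches(values, query):
    # Text field: every query token must appear among the indexed tokens.
    indexed = {tok for value in values for tok in standard_like_tokens(value)}
    return all(tok in indexed for tok in standard_like_tokens(query))


def keyword_field_matches(values, query):
    # Keyword field: the whole value is one term, compared exactly.
    return query in values


doc1 = ["Cloppenburg", "Berlin"]
doc2 = ["Landkreis Cloppenburg", "Berlin"]

print(text_field_matches(doc1, "cloppenburg"))     # True
print(text_field_matches(doc2, "cloppenburg"))     # True (unwanted hit)
print(keyword_field_matches(doc1, "Cloppenburg"))  # True
print(keyword_field_matches(doc2, "Cloppenburg"))  # False
```

Note that in this model the keyword comparison is case-sensitive, so a lowercase query "cloppenburg" would not match the stored value "Cloppenburg".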
I'm working with MongoDB and RESTHeart.
In my NoSQL db I have a single document with this structure:
{
"_id": "docID",
"users": [
{
"userID": "12",
"elements": [
{
"elementID": "1492446877599",
"events": [
{
"event1": "one"
},
{
"event2": "two"
}
]
}
]
},
{
"userID": "11",
"elements": [
{
"elementID": "14924",
"events": [
{
"event1": "one"
},
{
"event2": "two"
}
]
}
]
}
]
}
How can I build a URL query in order to get the user with id 11?
Using the mongo shell it would be something like this:
db.getCollection('collection').find({},{'users':{'$elemMatch':{'userID':'12'}}}).pretty()
I cannot find anything similar in RESTHeart.
Could someone help me?
Using this:
http://myHost:port/documents/docID?filter={%27users%27:{%27$elemMatch%27:{%27userID%27:%2712%27}}}
RESTHeart returns both users: userID 11 and 12.
Your request is against a document resource, i.e. the URL is http://myHost:port/documents/docID
The filter query parameter applies to collection requests, i.e. URLs such as http://myHost:port/documents
In any case you need a projection (the keys query parameter) to limit the returned properties.
You should be able to achieve it with the following request (I haven't tried it), using the $elemMatch projection operator:
http://myHost:port/documents?keys={"users":{"$elemMatch":{"userID":"12"}}}
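Because the keys value contains braces, quotes and a dollar sign, it is usually safer to percent-encode it when the URL is built in code rather than typed by hand. The sketch below shows one way to do that with the Python standard library; the function name elem_match_url is hypothetical, and the host and collection are just the placeholders from the request above.

```python
import json
from urllib.parse import quote, unquote


def elem_match_url(base_url, user_id):
    """Build a collection URL whose keys parameter projects one user.

    Hypothetical helper: it only assembles the URL string.
    """
    keys = {"users": {"$elemMatch": {"userID": user_id}}}
    # Compact JSON, then percent-encode it for use in a query string.
    encoded = quote(json.dumps(keys, separators=(",", ":")))
    return f"{base_url}?keys={encoded}"


url = elem_match_url("http://myHost:port/documents", "12")
print(url)
# Decoding the parameter gives back the original projection document.
print(unquote(url.split("?keys=", 1)[1]))
```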
I'm using the following JSON to find results in a Cloudant database:
{
"selector": {
"$and": [
{
"type": {
"$eq": "sensor"
}
},
{
"v": {
"$eq": 2355
}
},
{
"$or": [
{
"p": "#401000103"
},
{
"p": "#401000114"
}
]
},
{
"t_max": {
"$gte": 1459554894
}
},
{
"t_min": {
"$lte": 1459509591
}
}
]
},
"fields": [
"_id",
"p"
],
"limit": 200
}
If I run this against my Cloudant database I get the following error:
{
"error": "unknown_error",
"reason": "function_clause",
"ref": 3379914628
}
If I remove one of the $or elements, I get the results for the query.
(,{"p":"#401000114"})
I also get a result if I replace #401000114 with #401000114.
But when I want to use both elements I get the error above.
Can anybody tell me what this error_reason: function_clause means?
error_reason: function_clause means there was a problem on the server. You should probably reach out to Cloudant Support and see if they can help you with your issue.
I contacted Cloudant support.
This is their answer:
The issue affects Cloudant generally.
It affects both multi-tenant and dedicated clusters.
They are working on the solution.
A workaround: if the array to which the $or operator applies has two elements, you can get the correct result by repeating one of the items in the array.
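The workaround described by support can be applied mechanically before a selector is sent. The following is a minimal sketch of that idea, assuming the selector is held as a Python dict; pad_two_element_or is a hypothetical helper name, and repeating the last item (rather than the first) is an arbitrary choice.

```python
import copy


def pad_two_element_or(selector):
    """Return a copy in which every two-element $or array has one item repeated.

    Hypothetical helper implementing the workaround quoted above;
    the input selector is left unchanged.
    """
    if isinstance(selector, dict):
        result = {}
        for key, value in selector.items():
            value = pad_two_element_or(value)
            if key == "$or" and isinstance(value, list) and len(value) == 2:
                # Repeat the last alternative; the logical result is the same.
                value = value + [copy.deepcopy(value[-1])]
            result[key] = value
        return result
    if isinstance(selector, list):
        return [pad_two_element_or(item) for item in selector]
    return selector


original = {
    "$and": [
        {"type": {"$eq": "sensor"}},
        {"$or": [{"p": "#401000103"}, {"p": "#401000114"}]},
    ]
}
padded = pad_two_element_or(original)
print(padded["$and"][1]["$or"])
```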
In my data, I have two fields that I want to use as an index together. They are sensorid (any string) and timestamp (yyyy-mm-dd hh:mm:ss).
So I made an index for these two using the Cloudant index generator. This was created successfully and it appears as a design document.
{
"index": {
"fields": [
{
"name": "sensorid",
"type": "string"
},
{
"name": "timestamp",
"type": "string"
}
]
},
"type": "text"
}
However, when I try to make the following query to find all documents with a timestamp newer than some value, I am told there is no index available for the selector:
{
"selector": {
"timestamp": {
"$gt": "2015-10-13 16:00:00"
}
},
"fields": [
"_id",
"_rev"
],
"sort": [
{
"_id": "asc"
}
]
}
What have I done wrong?
It seems to me that Cloudant Query only allows sorting on fields that are part of the selector.
Therefore your selector should include the _id field and look like this:
"selector":{
"_id":{
"$gt":0
},
"timestamp":{
"$gt":"2015-10-13 16:00:00"
}
}
I hope this works for you!