Restheart query for an nested array subdocument - arrays

I m working with mongodb and restheart.
In my nosql db i have a unique document with this structure:
{
"_id": "docID",
"users": [
{
"userID": "12",
"elements": [
{
"elementID": "1492446877599",
"events": [
{
"event1": "one"
},
{
"event2": "two",
}
]
}
},
{
"userID": "11",
"elements": [
{
"elementID": "14924",
"events": [
{
"event1": "one"
},
{
"event2": "two",
}
]
}
}
]
}
how can i build an url-query in order to get the user with id 11?
Using mongo shell it should be something like this one:
db.getCollection('collection').find({},{'users':{'$elemMatch':{'userID':'12'}}}).pretty()
I cannot find anything similar on restheart.
Could someone help me?
Using this
http://myHost:port/documents/docID?filter={%27users%27:{%27$elemMatch%27:{%27userID%27:%2712%27}}}
restheart returns me all the documents: userID 11 and 12.

Your request is against a document resource, i.e. the URL is http://myHost:port/documents/docID
The filter query parameter applies for collection requests, i.e. URLs such as http://myHost:port/documents
In any case you need to projection (the keys query parameter) to limit the returned properties.
You should achieve it with the following request (I haven't tried it) using the $elementMatch projection operator:
http://myHost:port/documents?keys={"users":{"$elemMatch":{"userID":"12"}}}

Related

How to project a specific index inside a multilevel nested array in mongodb

I have a particular field in my document which has a multilevel nested array structure. The document looks like something this
{
"_id" : ObjectId("62171b4207476091a17f595f"),
"data" : [
{
"id" : "1",
"content" : [
{
"id" : "1.1",
"content" : []
},
{
"id" : "1.2",
"content" : [
{
"id" : "1.2.1",
"content" : [
{
"id" : "1.2.1.1",
"content" : []
}
]
},
{
"id" : "1.2.2",
"content" : []
}
]
}
]
}
]
}
(The ids in my actual data is a random string, I have added a more defined id here just for readability)
In my application code the nesting level is controlled so it won't go more than 5 levels deep.
I need to project a particular object inside one of the deeply nested arrays.
I have all the information needed to traverse the nested structure. For example if I need to fetch the object with id "1.2.2" my input will look something like this:
[{id: 1, index: 0}, {id: 1.2, index: 1}, {id: 1.2.2, index: 1}]
In the above array, each element represents one level of nesting. I have the Id and the index. So in the above example, I know I first need to travel to index 0 at the top level, then inside that to index 1 , then inside that to index 1 again to find my object.
Is there a way I can only get the inner object that I want directly using a query. Or will I need to get the whole "data" field and do the traversal in my application code. I have been unable to figure out any way to construct a query that would satisfy my need.
Query
if you know the path, you can do it using a series of nested
$getField
$arrayElemAt
you can do it in one stage with nested calls, or with many new fields like i did bellow, or with mongodb variables
*i am not sure what output you need, this goes inside to get the 2 using the indexes (if this is not what you need add if you can the expected output)
Test code here
Data
[
{
"_id": ObjectId( "62171b4207476091a17f595f"),
"data": [
{
"id": "1",
"content": [
{
"id": "1.1",
"content": []
},
{
"id": "1.2",
"content": [
{
"id": "1.2.1",
"content": [
{
"id": "1.2.1.1",
"content": []
}
]
},
{
"id": "1.2.2",
"content": [1,2]
}
]
}
]
}
]
}
]
Query
aggregate(
[{"$set":
{"c1":
{"$getField":
{"field":"content", "input":{"$arrayElemAt":["$data", 0]}}}}},
{"$set":
{"c2":
{"$getField":
{"field":"content", "input":{"$arrayElemAt":["$c1", 1]}}}}},
{"$set":
{"c3":
{"$getField":
{"field":"content", "input":{"$arrayElemAt":["$c2", 1]}}}}},
{"$project":{"_id":0, "c4":{"$arrayElemAt":["$c3", 1]}}}])
Results
[{
"c4": 2
}]

Elastic Search query to find documents where nested field "contains" X objects

This is an example of what my data looks like for an Elastic Search index called video_service_inventory:
{
'video_service': 'netflix',
'movies' : [
{'title': 'Mission Impossible', 'genre: 'action'},
{'title': 'The Hangover', 'genre': 'comedy'},
{'title': 'Zoolander', 'genre': 'comedy'},
{'title': 'The Ring', 'genre': 'horror'}
]
}
I have established in my index that the "movies" field is of type "nested"
I want to write a query that says "get me all video_services that contain both of these movies":
{'title': 'Mission Impossible', 'genre: 'action'}
AND
{'title': 'The Ring', 'genre': 'horror'}
where, the title and genre must match. If one movie exists, but not the other, I don't want the query to return that video service.
Ideally, I would like to do this in 1 query. So far, I haven't been able to find a solution.
Anyone have suggestions for writing this search query?
the syntax may vary depending on elasticsearch version, but in general you should combine multiple nested queries within a bool - must query. For nested queries you need to specify path to "navigate" to the nested documents, and you need to qualify the properties with the part + the field name:
{
"query": {
"bool": {
"must": [
{
"nested": {
"path": "movies",
"query": {
"bool": {
"must": [
{ "terms": { "movies.title": "Mission Impossible" } },
{ "terms": { "movies.genre": "action" } }
]
}
}
}
},
{
"nested": {
"path": "movies",
"query": {
"bool": {
"must": [
{ "terms": { "movies.title": "The Ring" } },
{ "terms": { "movies.genre": "horror" } }
]
}
}
}
}
]
}
}
}
This example assumes that the title and genre fields are not analyzed properties. In newer versions of elasticsearch you may find them as a .keyword field, and you would then use "movies.genre.keyword" to query on the not analyzed version of the data.ยจ
For details on bool queries you can have a look at the documentation on the ES website:
https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-bool-query.html
For nested queries:
https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-nested-query.html

How to prevent Elasticsearch from flattening 2D arrays in "fields"-containing query

Nested arrays get flattened when represented in "fields". I expect that values from the same path to be merged, but that the internal data structure will not be modified.
Could someone explain whether I am doing something incorrectly, or whether this belongs as an Elasticsearch issue?
Steps to reproduce:
Create the 2D data
curl -XPOST localhost:9200/test/5 -d '{ "data": [ [100],[2,3],[6,7] ] }'
Query the data, specifying fields
curl -XGET localhost:9200/test/5/_search -d '{"query":{"query_string":{"query":"*"} }, "fields":["data"] } }'
Result:
{"took":1,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":1,"max_score":1.0,"hits":[{"_index":"test","_type":"5","_id":"AVdsHJrepOClGTFyoGqo","_score":1.0,"fields":{"data":[100,2,3,6,7]}}]}}
Repeat without the use of "fields":
curl -XGET localhost:9200/test/5/_search -d '{"query":{"query_string":{"query":"*"} } } }'
Result:
{"took":1,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":1,"max_score":1.0,"hits":[{"_index":"test","_type":"5","_id":"AVdsHJrepOClGTFyoGqo","_score":1.0,"_source":{ "data": [ [100],[2,3],[6,7] ] }}]}}
Notice that _source and fields differ, in that "fields" decomposes the 2D array into a 1D array.
When you specify nothing else in your request, what you get back foreach hit is the "_source" object, that is, exactly the Json you sent to ES during indexing (even including whitespace!).
When you use source filtering, as Andrey suggests, it's the same except you can include or exclude certain fields.
When you use the "fields" directive in your query, the return values are not taken from the _source, but read directly from the Lucene Index. (see docs) Now the key in your search response will switch from "_source" to "fields" to reflect this change.
As alkis said:
https://www.elastic.co/guide/en/elasticsearch/reference/current/array.html
These docs say up front that, yes, Elasticsearch does flatten arrays.
Instead of specifying "fields" I usually do source filtering
Your query would change to something like:
curl -XGET <IPADDRESS>:9200/test/5/_search -d '{"_source":{"include": ["data"]}, "query":{"query_string":{"query":"*"} }}'
From here https://www.elastic.co/guide/en/elasticsearch/reference/current/array.html
it seems that elasticsearch considers them the same.
In Elasticsearch, there is no dedicated array type. Any field can contain zero or more values by default, however, all values in the array must be of the same datatype. For instance:
an array of strings: [ "one", "two" ]
an array of integers: [ 1, 2 ]
an array of arrays: [ 1, [ 2, 3 ]] which is the equivalent of [ 1, 2, 3 ]
an array of objects: [ { "name": "Mary", "age": 12 }, { "name": "John", "age": 10 }]
You could use an array of json objects and use nested data type with nested query.
Maybe nested data type could be helpful
PUT /my_index
PUT /my_index/_mapping/my_type
{
"properties" : {
"data" : {
"type" : "nested",
"properties": {
"value" : {"type": "long" }
}
}
}
}
POST /my_index/my_type
{
"data": [
{ "value": [1, 2] },
{ "value": [3, 4] }
]
}
POST /my_index/my_type
{
"data": [
{ "value": [1, 5] }
]
}
GET /my_index/my_type/_search
{
"query": {
"nested": {
"path": "data",
"query": {
"bool": {
"must": [
{
"match": {
"data.value": 1
}
},
{
"match": {
"data.value": 2
}
}
]
}
}
}
}
}

Cloudant find Query with $and and $or elements

I'm using the following json to find results in a Cloudant
{
"selector": {
"$and": [
{
"type": {
"$eq": "sensor"
}
},
{
"v": {
"$eq": 2355
}
},
{
"$or": [
{
"p": "#401000103"
},
{
"p": "#401000114"
}
]
},
{
"t_max": {
"$gte": 1459554894
}
},
{
"t_min": {
"$lte": 1459509591
}
}
]
},
"fields": [
"_id",
"p"
],
"limit": 200
}
If I run this againt my cloudant database I get the following error:
{
"error": "unknown_error",
"reason": "function_clause",
"ref": 3379914628
}
If I remove one the $or elements I get the results for query.
(,{"p":"#401000114"})
Also i get a result if I replace #401000114 with #401000114 I get result.
But when I want to use both element I get the error code above.
Can anybody tell what this error_reason: function_clause mean?
error_reason: function_clause means there was a problem on the server, you should probably reach out to Cloudant Support and see if they can help you with your issue.
I had contact with the Cloudant support.
This is there answer:
The issue affects Cloudant generally
It affects both mult-tenant and dedicated clusters.
There are working on the sollution.
A workaround is in the array to which the $or operator applies has two elements, you can get the correct result by repeating one of the items in the array.

"There is no index available for this selector" despite the fact I made one

In my data, I have two fields that I want to use as an index together. They are sensorid (any string) and timestamp (yyyy-mm-dd hh:mm:ss).
So I made an index for these two using the Cloudant index generator. This was created successfully and it appears as a design document.
{
"index": {
"fields": [
{
"name": "sensorid",
"type": "string"
},
{
"name": "timestamp",
"type": "string"
}
]
},
"type": "text"
}
However, when I try to make the following query to find all documents with a timestamp newer than some value, I am told there is no index available for the selector:
{
"selector": {
"timestamp": {
"$gt": "2015-10-13 16:00:00"
}
},
"fields": [
"_id",
"_rev"
],
"sort": [
{
"_id": "asc"
}
]
}
What have I done wrong?
It seems to me like cloudant query only allows sorting on fields that are part of the selector.
Therefore your selector should include the _id field and look like:
"selector":{
"_id":{
"$gt":0
},
"timestamp":{
"$gt":"2015-10-13 16:00:00"
}
}
I hope this works for you!

Resources