Filtering results in MongoDB - arrays

Maybe a simple question for developers experienced with MongoDB, but I'm not finding a solution.
A sample JSON document from my "Stations" collection:
{
  "code": "XX",
  "variables": [
    {
      "code": 1,
      "items": [
        { "value": 81 },
        { "value": 77 }
      ]
    },
    {
      "code": 2,
      "items": [
        { "value": 33 }
      ]
    }
  ]
}
....
I want to filter the "Stations" collection to get only the variable with code 1 and the item with value 81, obtaining something similar to this:
{
  "code": "XX",
  "variables": [
    {
      "code": 1,
      "items": [
        { "value": 81 }
      ]
    }
  ]
}
Since the JSON contains arrays at different levels, my approach (mongo shell) was:
db.stations.find(
  { "code": "XX" },
  {
    "variables": {
      "$elemMatch": {
        "code": 1,
        "items": {
          "$elemMatch": { "value": 81 }
        }
      }
    }
  }
)
But that returns all items at the same level as 'value: 81', not just that one.
Any idea? I also tried something with the aggregation framework and $redact, but with no result. Thanks!

As per the MongoDB $elemMatch documentation:
The $elemMatch operator matches documents that contain an array field with at least one element that matches all the specified query criteria.
Hence, $elemMatch matches on items.value: 81 but returns the whole matching items array, as in the query below:
db.stations.find({
  "code": "XX"
}, {
  "variables": {
    "$elemMatch": {
      "code": 1,
      "items": {
        "$elemMatch": {
          "value": 81
        }
      }
    }
  }
}).pretty()
This returns both items.value: 81 and items.value: 77, because $elemMatch selects a single matching element of the outer array and projects it whole. The same is true if the conditions are put in the query and the positional projection is used, as below; it shows the same result as the query above:
db.stations.find({
  "code": "XX",
  "variables": {
    "$elemMatch": {
      "code": 1,
      "items": {
        "$elemMatch": {
          "value": 81
        }
      }
    }
  }
}, {
  "code": 1,
  "variables.$": 1
}).pretty()
So to get your expected output, you should use the MongoDB aggregation pipeline, as below:
db.stations.aggregate([
  {
    "$match": {
      "code": "XX",
      "variables.code": 1
    }
  },
  { "$unwind": "$variables" },
  { "$unwind": "$variables.items" },
  {
    "$match": {
      "variables.items.value": 81
    }
  },
  {
    "$group": {
      "_id": "$code",
      "data": { "$push": "$variables" }
    }
  },
  {
    "$project": {
      "code": "$_id",
      "variables": "$data",
      "_id": 0
    }
  }
]).pretty()
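On MongoDB 3.2 or later, a similar result can be produced without $unwind/$group by filtering the nested arrays inside a $project stage with $filter; this also keeps items as an array in the output. A minimal sketch, untested, assuming the document structure shown in the question:
db.stations.aggregate([
  { "$match": { "code": "XX", "variables.code": 1 } },
  {
    "$project": {
      "_id": 0,
      "code": 1,
      "variables": {
        "$map": {
          // keep only the variables elements with code 1
          "input": {
            "$filter": {
              "input": "$variables",
              "as": "v",
              "cond": { "$eq": ["$$v.code", 1] }
            }
          },
          "as": "v",
          "in": {
            "code": "$$v.code",
            // inside each kept variable, keep only the items with value 81
            "items": {
              "$filter": {
                "input": "$$v.items",
                "as": "i",
                "cond": { "$eq": ["$$i.value", 81] }
              }
            }
          }
        }
      }
    }
  }
])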

Related

Normalize multiple documents to a single document in MongoDB

{
  "_id": "null",
  "data": [
    {
      "name": "abc",
      "id": "123"
    },
    {
      "name": "xzy",
      "id": "123"
    }
  ]
}
Explanation: each name value becomes an object key. I also want to convert it into one single document that contains all the objects. abc and xzy come in dynamically as parameters.
Expected output:
{
  "data": {
    "abc": {
      "name": "abc",
      "id": "123"
    },
    "xzy": {
      "name": "xzy",
      "id": "123"
    }
  }
}
Try this:
db.testCollection.aggregate([
  {
    $project: {
      // turn each {name, id} element into a {k, v} pair for $arrayToObject
      "array": {
        $map: {
          input: "$data",
          as: "item",
          in: {
            k: "$$item.name",
            v: {
              "name": "$$item.name",
              "id": "$$item.id"
            }
          }
        }
      }
    }
  },
  { $unwind: "$array" },
  {
    $group: {
      // collect the pairs from all documents into a single array
      _id: null,
      "data": { $push: "$array" }
    }
  },
  {
    $project: {
      _id: 0,
      "data": { $arrayToObject: "$data" }
    }
  }
]);
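If all the pairs come from a single document, as in the example above, the $unwind/$group round trip is not strictly needed; $arrayToObject can be applied directly to the mapped array. A sketch, assuming MongoDB 3.4.4+ for $arrayToObject:
db.testCollection.aggregate([
  {
    $project: {
      _id: 0,
      // build the k/v pairs and convert them to an object in one stage
      "data": {
        $arrayToObject: {
          $map: {
            input: "$data",
            as: "item",
            in: { k: "$$item.name", v: { "name": "$$item.name", "id": "$$item.id" } }
          }
        }
      }
    }
  }
]);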

Parsing an array of objects in JSON and converting it to flat JSON using a JOLT transform

My input looks like the below:
{
  "family": [
    {
      "person": {
        "personId": {
          "value": "12345"
        },
        "employeeAuthCd": {
          "code": "AUTH_12345"
        },
        "employeeTypeCd": {
          "code": "cd"
        },
        "status": {
          "code": "New"
        }
      }
    }
  ]
}
Desired output:
{
  "Person_ID": "12345",
  "employeeAuthCd": "AUTH_12345",
  "employeeTypeCd": "cd",
  "status": "New"
}
Can anyone help me out with the Jolt spec? I have tried many possible specs but couldn't reach the desired output shown above. The JSON has multiple arrays of objects that I need to convert into flat JSON.
This spec should work for you:
[
  {
    "operation": "shift",
    "spec": {
      "family": {
        "*": {
          "person": {
            "personId": {
              "value": "Person_ID"
            },
            "employeeAuthCd": {
              "code": "employeeAuthCd"
            },
            "employeeTypeCd": {
              "code": "employeeTypeCd"
            },
            "status": {
              "code": "status"
            }
          }
        }
      }
    }
  }
]
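Note that with this spec, if family contained more than one element, the values shifted to the same output keys would be collected into arrays. If you instead want one flat object per family entry, a variant that keys the output by the matched array index could look like the sketch below; [&3] refers to the position matched by the "*" under family, so adjust the & level if your structure differs:
[
  {
    "operation": "shift",
    "spec": {
      "family": {
        "*": {
          "person": {
            "personId": {
              "value": "[&3].Person_ID"
            },
            "employeeAuthCd": {
              "code": "[&3].employeeAuthCd"
            },
            "employeeTypeCd": {
              "code": "[&3].employeeTypeCd"
            },
            "status": {
              "code": "[&3].status"
            }
          }
        }
      }
    }
  }
]
This would produce a top-level array with one flat object per family element.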

Filter based on values after join operation - MongoDB

I have two collections in the following format -
collection 1
{
"_id": "col1id1",
"name": "col1doc1",
"properties": [ "<_id1>", "<_id2>", "<_id3>"]
}
collection 2
{
"_id": "<_id1>",
"name": "doc1",
"boolean_field": false
}
{
"_id": "<_id2>",
"name": "doc2",
"boolean_field": true
}
{
"_id": "<_id3>",
"name": "doc3",
"boolean_field" : false
}
the desired output is -
{
"_id": "col1id1",
"name": "col1doc1",
"property_names": ["doc1", "doc3"]
}
The properties field of the document in collection 1 holds three IDs of documents in collection 2, but the output after the join should contain only those whose boolean_field value is false. How can I perform this filter together with the join in MongoDB?
$lookup can be used along with $unwind to achieve this.
db.col1.aggregate([
  { "$unwind": "$properties" },
  {
    "$lookup": {
      "from": "col2",
      "localField": "properties",
      "foreignField": "_id",
      "as": "property_names"
    }
  },
  // keep only the joined documents whose boolean_field is false
  {
    "$match": {
      "property_names": {
        "$elemMatch": {
          "boolean_field": false
        }
      }
    }
  },
  { "$unwind": "$property_names" },
  {
    "$group": {
      "_id": "$_id",
      "name": { "$first": "$name" },
      "property_names": { "$push": "$property_names.name" }
    }
  },
  {
    "$project": {
      "_id": 1,
      "name": 1,
      "property_names": 1
    }
  }
]);
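On MongoDB 3.6+ the filtering can also be pushed into the join itself with the pipeline form of $lookup, which avoids the $unwind/$group round trip. A sketch, assuming the collections are named col1 and col2 as above:
db.col1.aggregate([
  {
    "$lookup": {
      "from": "col2",
      "let": { "ids": "$properties" },
      "pipeline": [
        // join on _id and filter on boolean_field in one step
        { "$match": { "$expr": { "$in": ["$_id", "$$ids"] }, "boolean_field": false } },
        { "$project": { "_id": 0, "name": 1 } }
      ],
      "as": "matched"
    }
  },
  {
    "$project": {
      "name": 1,
      "property_names": "$matched.name"
    }
  }
]);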

ElasticSearch-Kibana : filter array by key

I have data with one parameter which is an array. I know that objects in arrays are not well supported in Kibana; however, I would like to know if there is a way to filter that array on a single value of the key. For example, here is a sample JSON document:
{
  "_index": "index",
  "_type": "data",
  "_id": "8",
  "_version": 2,
  "_score": 1,
  "_source": {
    "envelope": {
      "version": "0.0.1",
      "submitter": "VF12RBU1D53087510",
      "MetaData": {
        "SpecificMetaData": [
          {
            "key": "key1",
            "value": "94"
          },
          {
            "key": "key2",
            "value": "0"
          }
        ]
      }
    }
  }
}
I would like to keep only the data whose SpecificMetaData array contains key1, in order to plot those values. For now, when I plot SpecificMetaData.value, it takes all the values in the array (the values of both key1 and key2) and does not offer a separate field per key.
If you need more information, tell me. Thank you.
You may need to change your index mapping so that SpecificMetaData is of nested type; the inner_hits of a nested query can then return only the objects whose key is key1.
PUT envelope_index
{
  "mappings": {
    "document_type": {
      "properties": {
        "envelope": {
          "type": "object",
          "properties": {
            "version": {
              "type": "text"
            },
            "submitter": {
              "type": "text"
            },
            "MetaData": {
              "type": "object",
              "properties": {
                "SpecificMetaData": {
                  "type": "nested"
                }
              }
            }
          }
        }
      }
    }
  }
}
POST envelope_index/document_type
{
  "envelope": {
    "version": "0.0.1",
    "submitter": "VF12RBU1D53087510",
    "MetaData": {
      "SpecificMetaData": [
        {
          "key": "key1",
          "value": "94"
        },
        {
          "key": "key2",
          "value": "0"
        }
      ]
    }
  }
}
POST envelope_index/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "nested": {
            "inner_hits": {},
            "path": "envelope.MetaData.SpecificMetaData",
            "query": {
              "bool": {
                "must": [
                  {
                    "term": {
                      "envelope.MetaData.SpecificMetaData.key": {
                        "value": "key1"
                      }
                    }
                  }
                ]
              }
            }
          }
        }
      ]
    }
  }
}
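For plotting, an aggregation that first narrows the nested documents to key1 may be more directly useful than inner_hits. A minimal sketch, assuming dynamic mapping added .keyword sub-fields for key and value; adjust the field names to your actual mapping:
POST envelope_index/_search
{
  "size": 0,
  "aggs": {
    "meta": {
      "nested": {
        "path": "envelope.MetaData.SpecificMetaData"
      },
      "aggs": {
        "only_key1": {
          "filter": {
            "term": {
              "envelope.MetaData.SpecificMetaData.key.keyword": "key1"
            }
          },
          "aggs": {
            "key1_values": {
              "terms": {
                "field": "envelope.MetaData.SpecificMetaData.value.keyword"
              }
            }
          }
        }
      }
    }
  }
}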

Aggregating array of values in elasticsearch

I need to aggregate an array as follows
Two document examples:
{
  "_index": "log",
  "_type": "travels",
  "_id": "tnQsGy4lS0K6uT3Hwzzo-g",
  "_score": 1,
  "_source": {
    "state": "saopaulo",
    "date": "2014-10-30T17",
    "traveler": "patrick",
    "registry": "123123",
    "cities": {
      "saopaulo": 1,
      "riodejaneiro": 2,
      "total": 2
    },
    "reasons": [
      "Entrega de encomenda"
    ],
    "from": [
      "CompraRapida"
    ]
  }
},
{
  "_index": "log",
  "_type": "travels",
  "_id": "tnQsGy4lS0K6uT3Hwzzo-g",
  "_score": 1,
  "_source": {
    "state": "saopaulo",
    "date": "2014-10-31T17",
    "traveler": "patrick",
    "registry": "123123",
    "cities": {
      "saopaulo": 1,
      "curitiba": 1,
      "total": 2
    },
    "reasons": [
      "Entrega de encomenda"
    ],
    "from": [
      "CompraRapida"
    ]
  }
}
I want to aggregate the cities array, to find out all the cities the traveler has gone to. I want something like this:
{
  "traveler": {
    "name": "patrick"
  },
  "cities": {
    "saopaulo": 2,
    "riodejaneiro": 2,
    "curitiba": 1,
    "total": 3
  }
}
Here total is the length of the cities object minus 1 (i.e., not counting the total key itself). I tried the terms aggregation and the sum, but couldn't produce the desired output.
Changes to the document structure can be made, so if anything like that would help, I'd be pleased to know.
In the documents posted above, "cities" is not a JSON array; it is a JSON object.
If changing the document structure is a possibility, I would change cities to be an array of objects.
example document:
"cities": [
  {
    "city": "saopaulo",
    "count": 2
  },
  {
    "city": "riodejaneiro",
    "count": 1
  }
]
You would then need to set cities to be of type nested in the index mapping
"mappings": {
"<type_name>": {
"properties": {
"cities": {
"type": "nested",
"properties": {
"city": {
"type": "string"
},
"count": {
"type": "integer"
},
"value": {
"type": "long"
}
}
},
"date": {
"type": "date",
"format": "dateOptionalTime"
},
"registry": {
"type": "string"
},
"state": {
"type": "string"
},
"traveler": {
"type": "string"
}
}
}
}
After which you could use nested aggregation to get the city count per user.
The query would look something along these lines:
{
  "query": {
    "match": {
      "traveler": "patrick"
    }
  },
  "aggregations": {
    "city_travelled": {
      "nested": {
        "path": "cities"
      },
      "aggs": {
        "citycount": {
          "cardinality": {
            "field": "cities.city"
          }
        }
      }
    }
  }
}
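The cardinality aggregation gives the number of distinct cities (the total in the desired output). To also get the per-city visit totals, a terms aggregation with a nested sum could be added. A sketch under the restructured documents proposed above, assuming cities.city is indexed as a single not-analyzed token (e.g. a keyword / not_analyzed field):
{
  "query": {
    "match": {
      "traveler": "patrick"
    }
  },
  "aggregations": {
    "city_travelled": {
      "nested": {
        "path": "cities"
      },
      "aggs": {
        "per_city": {
          "terms": {
            "field": "cities.city"
          },
          "aggs": {
            "visits": {
              "sum": {
                "field": "cities.count"
              }
            }
          }
        },
        "citycount": {
          "cardinality": {
            "field": "cities.city"
          }
        }
      }
    }
  }
}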
