I am looking for a logstash filter that can modify array fields.
For example, I would like a modifier that can turn this JSON document
{
  arrayField: [
    {
      subfield: {
        subsubfield: "value1"
      }
    },
    {
      subfield: {
        subsubfield: "value2"
      }
    }
  ]
}
Into this JSON document
{
  arrayField: [
    {
      subfield: "value1"
    },
    {
      subfield: "value2"
    }
  ]
}
I have tried the following filter
filter {
  mutate {
    replace => ["[arrayField][subfield]", "%{[arrayField][subfield][subsubfield]}"]
  }
}
but the filter just rewrites the array field instead of operating on each element of the array. How do you set up a modifier to operate on each element of an array?
Thanks Alain Collins for pointing out the ruby filter. The filter below did the trick.
filter {
  ruby {
    code => "
      event['arrayField'].each { |subdoc| subdoc['subfield'] = subdoc['subfield']['subsubfield'] }
    "
  }
}
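Note that on Logstash 5.0 and later the event is no longer addressable as a plain hash, so the same idea needs the Event API (event.get / event.set). A minimal sketch, assuming the same field names as in the example above:
filter {
  ruby {
    code => "
      # Logstash 5+ Event API: read the array, rewrite each element, write it back
      arr = event.get('arrayField')
      arr.each { |subdoc| subdoc['subfield'] = subdoc['subfield']['subsubfield'] }
      event.set('arrayField', arr)
    "
  }
}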
I have a mongo collection where docs have already been stored. The structure of a single doc is something like this:
{
  "_id": ObjectId("55c3043ab165fa6355ec5c9b"),
  "address": {
    "building": "522",
    "coord": [
      -73.95171,
      40.767461
    ],
    "street": "East 74 Street",
    "zipcode": "10021"
  }
}
Now I want to update the doc by inserting a new field "persons" whose value is a list of objects [{"name":"marcus", "contact":"420"}, {"name":"modiji", "contact":"111"}], so after insertion the above doc should look like this:
{
  "_id": ObjectId("55c3043ab165fa6355ec5c9b"),
  "address": {
    "building": "522",
    "coord": [
      -73.95171,
      40.767461
    ],
    "street": "East 74 Street",
    "zipcode": "10021"
  },
  "persons": [
    {
      "name": "marcus",
      "contact": "420"
    },
    {
      "name": "modiji",
      "contact": "111"
    }
  ]
}
Can anyone please help me with the correct $set syntax? Also, it would be really helpful if anyone could suggest an efficient way to update a key whose value is a list of objects, so that I can push new objects into the existing list.
You can use the updateOne command along with the $set operator to achieve this.
db.<Collection-Name>.updateOne({
  "_id": ObjectId("55c3043ab165fa6355ec5c9b")
}, {
  "$set": {
    "persons": [
      {
        "name": "marcus",
        "contact": "420"
      },
      {
        "name": "modiji",
        "contact": "111"
      }
    ]
  }
})
If you want to push additional data into the array, you can use the below command.
db.<Collection-Name>.updateOne({
  "_id": ObjectId("55c3043ab165fa6355ec5c9b")
}, {
  "$push": {
    "persons": {
      "name": "sample",
      "contact": "1234"
    }
  }
})
To push multiple objects into the array in a single command, use the below query
db.<Collection-Name>.updateOne({
  "_id": ObjectId("55c3043ab165fa6355ec5c9b")
}, {
  "$push": {
    "persons": {
      "$each": [
        {
          "name": "sample1",
          "contact": "5678"
        },
        {
          "name": "sample2",
          "contact": "90123"
        }
      ]
    }
  }
})
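As a side note not covered in the original answer: if the objects being pushed should not appear twice in the array, $addToSet accepts the same $each modifier. A quick sketch against the same sample document:
db.<Collection-Name>.updateOne({
  "_id": ObjectId("55c3043ab165fa6355ec5c9b")
}, {
  "$addToSet": {
    "persons": {
      "$each": [
        { "name": "sample1", "contact": "5678" },
        { "name": "sample2", "contact": "90123" }
      ]
    }
  }
})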
I have a problem with MongoDB.
I have a collection with many documents like this:
{
  key1: [
    { el: "EL1" },
    { el: "EL2" }
  ]
}
I want to update all documents in the collection col adding a new key key2 where the value is key1.0.
In particular, a generic output document will be:
{
  key1: [
    { el: "EL1" },
    { el: "EL2" }
  ],
  key2: { el: "EL1" }
}
How can I do that?
Thanks
You can use $addFields with $let to generate the new field based on the key1.0 value, and then run $out to overwrite the existing collection with the result of the aggregation:
db.col.aggregate([
  {
    $addFields: {
      "key2.el": {
        $let: {
          vars: { fst: { $arrayElemAt: [ "$key1", 0 ] } },
          in: "$$fst.el"
        }
      }
    }
  },
  { $out: "col" }
])
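On MongoDB 4.2+, the same result can be obtained without rewriting the whole collection through $out, by passing an aggregation pipeline directly to updateMany. A minimal sketch using the same collection and key names:
db.col.updateMany(
  {},
  [
    // $set here is the aggregation stage, so it can reference the existing key1 field
    { $set: { key2: { $arrayElemAt: ["$key1", 0] } } }
  ]
)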
I have a js object array like this.
[
{
name:"Japanilainen ravintola Koto",
rating:3.9,
photo:[
{
height:2160,
html_attributions:[
"Hannes Junnila"
],
photo_reference:"CoQBdwAAAMDlivT0nOnYg8jC1txZ3RbfBR59XvKN0WphDbRVUXaUTQclzzaIaXJ8-p7s3x_aG67AUsM_HLNML6pzGl3v_wV2D-eudH_3wy2cB1ROrRgGcGyf4lRuNpE3WwXYbYZu6EK8oEPiJ5B17Lybj-eVbYM2EgVVBgOrUJhsblY1mfxWEhAZ4oHCFakH-hgkbksfGa2uGhQe4aUeOrS2isAir01KUwQ7N3Ce2Q",
width:2269
}
]
},
{
name:"Kin Sushi Helsinki",
rating:4.2,
photo:[
{
height:2988,
html_attributions:[
"Stephan Winter"
],
photo_reference:"CoQBdwAAAN4iMumSbQjtRnJIH1AKRdbSfnI02WGh11r1xaVnZl1ohebKp6zpAS4mmJFqTagrIqUJ39kzulVI0sz2UzzfaVdsAFc5f80PnOCzSLqL5gnpsqv90dVJIqUWD3Bcc9TgYPPs3oGwyekkOsmjQ59o9yqdoF5GzrpaKkojhMNLxpfzEhBKpRkA2CzINpUzAAe3e90TGhQ_KbYCmtJYLfVGIu1kZkzQIAwE4A",
width:5312
}
]
}
]
I get the array above by doing this forEach.
response.results.forEach((entry) => {
  var restaurantName = {
    "name": entry.name,
    "rating": entry.rating,
    "photo_reference": entry.photos
  };
  arr.push(restaurantName);
});
res.send(arr);
And I send the array to my browser so I can see it.
What I am trying to do is to get photo_reference from entry.photos.
I tried entry.photos[0].photo_reference and many other ways, and in all of them I get a "cannot read property" error, and now I am not sure how to get that information out.
I edited some of the variable names to make it easier to simulate here, but just map the objects in the photo arrays to their references, and you'll get an array of photo references.
const data = [
{
name:"Japanilainen ravintola Koto",
rating:3.9,
photo:[
{
height:2160,
html_attributions:[
'Hannes Junnila'
],
photo_reference:"CoQBdwAAAMDlivT0nOnYg8jC1txZ3RbfBR59XvKN0WphDbRVUXaUTQclzzaIaXJ8-p7s3x_aG67AUsM_HLNML6pzGl3v_wV2D-eudH_3wy2cB1ROrRgGcGyf4lRuNpE3WwXYbYZu6EK8oEPiJ5B17Lybj-eVbYM2EgVVBgOrUJhsblY1mfxWEhAZ4oHCFakH-hgkbksfGa2uGhQe4aUeOrS2isAir01KUwQ7N3Ce2Q",
width:2269
}
]
},
{
name:"Kin Sushi Helsinki",
rating:4.2,
photo:[
{
height:2988,
html_attributions:[
'Stephan Winter'
],
photo_reference:"CoQBdwAAAN4iMumSbQjtRnJIH1AKRdbSfnI02WGh11r1xaVnZl1ohebKp6zpAS4mmJFqTagrIqUJ39kzulVI0sz2UzzfaVdsAFc5f80PnOCzSLqL5gnpsqv90dVJIqUWD3Bcc9TgYPPs3oGwyekkOsmjQ59o9yqdoF5GzrpaKkojhMNLxpfzEhBKpRkA2CzINpUzAAe3e90TGhQ_KbYCmtJYLfVGIu1kZkzQIAwE4A",
width:5312
}
]
}
]
const arr = []
data.forEach((entry) => {
  var restaurantName = {
    "name": entry.name,
    "rating": entry.rating,
    "photo_reference": entry.photo.map(x => x.photo_reference)
  };
  arr.push(restaurantName);
});
console.log(arr);
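For what it's worth, the same transformation can also be written as a single map over the data instead of forEach plus push. A short sketch using the same field names as above:
const arr2 = data.map(({ name, rating, photo }) => ({
  name,
  rating,
  // collect every photo_reference from the photo array
  photo_reference: photo.map(p => p.photo_reference)
}));
console.log(arr2);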
entry.photos is not defined in your response.results object array. Did you mean to access it as entry.photo (inside your forEach function)?
I have a simple mapping - one string field and one string[] field.
The array of strings contains duplicate values, and I get those duplicate values in query:
{ "query" : { "term" : {"id" : "579a252585b8c5c428fa0a3c"} } }
Returns a single valid hit:
{
  "id": "579a252585b8c5c428fa0a3c",
  "touches": [ "5639abfb5cba47087e8b4571", "5639abfb5cba47087e8b4571", "5639abfb5cba47087e8b4571", "5639abfb5cba47087e8b457b", "5639abfb5cba47087e8b457b" ]
}
But in a scripted metric aggregation:
"aggs": {
"path": {
"scripted_metric": {
"map_script": "_agg['result'] = doc['touches'].values"
}
}
}
returns
"aggregations": {
  "path": {
    "value": [ { }, {
      "result": [ "5639abfb5cba47087e8b4571", "5639abfb5cba47087e8b457b" ]
    }, { }, { }, { } ]
  }
}
Each "result" element is an org.elasticsearch.index.fielddata.ScriptDocValues$Strings; converting it with toString() returns a JSON-encoded 2-element array.
So, the question:
Why does ScriptDocValues$Strings return only unique array values and how to get the initial array in script aggregation?
Thanks.
UPD
I found that for numerical values (in particular floats) everything works perfectly.
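For context, doc values for a string field store each document's terms as a deduplicated, sorted set, which is why the script only sees the two distinct ids. One possible workaround, assuming your Elasticsearch version exposes _source inside the map_script (slower, and purely a sketch rather than a verified answer), is to read the original array from the source document:
"aggs": {
  "path": {
    "scripted_metric": {
      "map_script": "_agg['result'] = _source['touches']"
    }
  }
}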
I need to rename indentifier in this:
{ "general" :
{ "files" :
{ "file" :
[
{ "version" :
{ "software_program" : "MonkeyPlus",
"indentifier" : "6.0.0"
}
}
]
}
}
}
I've tried
db.nrel.component.update(
  {},
  { $rename: {
    "general.files.file.$.version.indentifier": "general.files.file.$.version.identifier"
  } },
  false, true
)
but it returns: $rename source may not be dynamic array.
For what it's worth, while it sounds awful to have to do, the solution is actually pretty easy. This of course depends on how many records you have. But here's my example:
db.Setting.find({ 'Value.Tiers.0.AssetsUnderManagement': { $exists: 1 } }).snapshot().forEach(function(item)
{
  for(i = 0; i != item.Value.Tiers.length; ++i)
  {
    item.Value.Tiers[i].Aum = item.Value.Tiers[i].AssetsUnderManagement;
    delete item.Value.Tiers[i].AssetsUnderManagement;
  }
  db.Setting.update({_id: item._id}, item);
});
I iterate over my collection wherever the array and the "wrong" name are found. I then iterate over the nested array, set the new value, delete the old one, and update the whole document. It was relatively painless. Granted, I only have a few tens of thousands of rows to search through, of which only a few dozen meet the criteria.
Still, I hope this answer helps someone!
Edit: Added snapshot() to the query. See why in the comments.
You must apply snapshot() to the cursor before retrieving any documents from the database.
You can only use snapshot() with unsharded collections.
The snapshot() function was deprecated in MongoDB 3.6 and removed in MongoDB 4.0, so on those versions the snapshot() call should be dropped from the example above.
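A minimal sketch of the same loop for MongoDB 4.0+, where hinting the _id index is documented as an alternative to snapshot(); the collection and field names are taken from the example above, and this variant is an assumption rather than the original author's code:
db.Setting.find({ 'Value.Tiers.0.AssetsUnderManagement': { $exists: 1 } }).hint({ _id: 1 }).forEach(function(item) {
  // same rename logic as above, just without snapshot()
  item.Value.Tiers.forEach(function(tier) {
    tier.Aum = tier.AssetsUnderManagement;
    delete tier.AssetsUnderManagement;
  });
  db.Setting.update({ _id: item._id }, item);
});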
As mentioned in the documentation there is no way to directly rename fields within arrays with a single command. Your only option is to iterate over your collection documents, read them and update each with $unset old/$set new operations.
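For completeness, a minimal sketch of that per-document $unset old/$set new approach for the document shape in the question; the collection and field names come from the question, and the loop itself is an assumption rather than part of the original answer:
db.nrel.component.find({ "general.files.file.version.indentifier": { $exists: true } }).forEach(function(doc) {
  doc.general.files.file.forEach(function(f, i) {
    // build the positional paths for this array index
    var setOps = {}, unsetOps = {};
    setOps["general.files.file." + i + ".version.identifier"] = f.version.indentifier;
    unsetOps["general.files.file." + i + ".version.indentifier"] = "";
    db.nrel.component.update({ _id: doc._id }, { $set: setOps, $unset: unsetOps });
  });
});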
I had a similar problem. In my situation I found the following was much easier:
I exported the collection to json:
mongoexport --db mydb --collection modules --out modules.json
I did a find and replace on the json using my favoured text editing utility.
I reimported the edited file, dropping the old collection along the way:
mongoimport --db mydb --collection modules --drop --file modules.json
Starting Mongo 4.2, db.collection.update() can accept an aggregation pipeline, finally allowing the update of a field based on its own value:
// { general: { files: { file: [
//   { version: { software_program: "MonkeyPlus", indentifier: "6.0.0" } }
// ] } } }
db.collection.updateMany(
  {},
  [{ $set: { "general.files.file": {
    $map: {
      input: "$general.files.file",
      as: "file",
      in: {
        version: {
          software_program: "$$file.version.software_program",
          identifier: "$$file.version.indentifier" // fixing the typo here
        }
      }
    }
  }}}]
)
// { general: { files: { file: [
//   { version: { software_program: "MonkeyPlus", identifier: "6.0.0" } }
// ] } } }
Literally, this updates documents by (re)$setting the "general.files.file" array, $mapping its "file" elements into a "version" object containing the same "software_program" field and the renamed "identifier" field, which contains what used to be the value of "indentifier".
A couple additional details:
The first part {} is the match query, filtering which documents to update (in this case all documents).
The second part [{ $set: { "general.files.file": { ... }}}] is the update aggregation pipeline (note the square brackets signifying the use of an aggregation pipeline):
$set is a new aggregation operator which in this case replaces the value of the "general.files.file" array.
Using a $map operation, we replace all elements from the "general.files.file" array by basically the same elements, but with an "identifier" field rather than "indentifier":
input is the array to map.
as is the variable name given to the looped elements.
in is the actual transformation applied to each element. In this case, it replaces the element with a "version" object composed of "software_program" and "identifier" fields. These fields are populated by extracting their previous values using the $$file.xxxx notation (where file is the name given to elements in the as part).
I had to face this issue with the same schema, so this query will be helpful for anyone who wants to rename a field in an embedded array.
db.getCollection("sampledocument").updateMany({}, [
{
$set: {
"general.files.file": {
$map: {
input: "$general.files.file",
in: {
version: {
$mergeObjects: [
"$$this.version",
{ identifer: "$$this.version.indentifier" },
],
},
},
},
},
},
},
{ $unset: "general.files.file.version.indentifier" },
]);
Another Solution
I also wanted to rename a property in an array, and I used this:
db.getCollection('YourCollectionName').find({}).snapshot().forEach(function(a){
  a.Array1.forEach(function(b){
    b.Array2.forEach(function(c){
      c.NewPropertyName = c.OldPropertyName;
      delete c["OldPropertyName"];
    });
  });
  db.getCollection('YourCollectionName').save(a)
});
The easiest and shortest solution using aggregate (Mongo 4.0+).
db.myCollection.aggregate([
  {
    $addFields: {
      "myArray.newField": { $arrayElemAt: ["$myArray.oldField", 0] }
    }
  },
  { $project: { "myArray.oldField": false } },
  { $out: { db: "myDb", coll: "myCollection" } }
])
The problem with the forEach loop mentioned above is very bad performance when the collection is huge.
My proposal would be this one:
db.nrel.component.aggregate([
  { $unwind: "$general.files.file" },
  {
    $set: {
      "general.files.file.version.identifier": {
        $ifNull: ["$general.files.file.version.indentifier", "$general.files.file.version.identifier"]
      }
    }
  },
  { $unset: "general.files.file.version.indentifier" },
  { $set: { "general.files.file": ["$general.files.file"] } },
  { $out: "nrel.component" } // careful - this replaces the entire collection
])
However, this works only when the array general.files.file contains a single document. Most likely this will not always be the case; then you can use this one:
db.nrel.component.aggregate([
  { $unwind: "$general.files.file" },
  {
    $set: {
      "general.files.file.version.identifier": {
        $ifNull: ["$general.files.file.version.indentifier", "$general.files.file.version.identifier"]
      }
    }
  },
  { $unset: "general.files.file.version.indentifier" },
  { $group: { _id: "$_id", general_new: { $addToSet: "$general.files.file" } } },
  { $set: { "general.files.file": "$general_new" } },
  { $unset: "general_new" },
  { $out: "nrel.component" } // careful - this replaces the entire collection
])