Merging 2 json files into new json with no duplicates - arrays

My dedicated servers are generating 2 laptime files and I would like to use a script to merge them into a single, new JSON file, with duplicate "steamids" removed (while keeping them grouped together as they are now) and both sets of lap times under a single loggedTimes {} (so I can feed it to an HTML script that produces lap times and a leaderboard). In other words, I want the structure to stay the same.
The first laptime file and the second laptime file go through the following command
jq 'reduce . as $item ({}; . * $item)' laptimes_data_ams.json laptimes_data_kow.json > laptimes.json
to then generate the (badly) merged laptime file.
I can get a reduced file but can't get any further than that. I checked threads by others around here, and whenever I try their suggestions the script just refuses to work. Is anybody available to lend me a hand with a working script that keeps this final structure after the merge?
{
"loggedTimes" : {
steamids" : {
"idnumber1" : "name1",
"idnumber2" : "name2"
},
"vehicles" : {
"vehiclenumber1" : {
"laptimes" : {
"idnumber1" : {
"lapTime" : time1,
"logtime" : log1,
"name" : "name 1",
"rank" : rank1,
"refId" : id1,
"vehicleid" : vehiclenumber1,
"wet" : 0
},
"idnumber2" : {
"lapTime" : time2,
"logtime" : log2,
"name" : "name 2",
"rank" : rank2,
"refId" : id2,
"vehicleid" : vehiclenumber1,
"wet" : 0
}
}
},
"vehiclenumber2" : {
//you get the idea by now
}
}
}
}

You haven't specified how the merge is to be performed, but one option would be to let the key-value pairs in the second file dominate. In that case, you could write:
jq -n '
input as $one
| input as $two
| ($one + $two)
| .loggedTimes.steamids = ($one.loggedTimes.steamids + $two.loggedTimes.steamids)
' 1.json 2.json
With your input, this produces output from which the following is an extract:
{
"loggedTimes": {
"steamids": {
"76561197960277005": "[DECOCO]koker_SZ",
"76561197960436395": "JOJO",
...
},
"vehicles": {
"-1142039519": {
"lapTimes": {}
},
"-1201605905": {
"lapTimes": {
"76561197984026143": {
"lapTime": 609101,
"logtime": 1606516985,
"name": "Night Dick",
"rank": 1,
"refId": 36032,
"vehicleId": -1201605905,
"wet": 0
}
}
}
...
}
}
}
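If you would rather do the merge outside jq, or want the lap times from both files combined per vehicle rather than letting the second file's vehicles replace the first's, here is a minimal Python sketch. It assumes the file names from your command and the key names shown in the output above (steamids, lapTimes), with the second file winning whenever the same steamid or lap-time entry appears in both:
import json

# read both laptime files (file names taken from the jq command above)
with open("laptimes_data_ams.json") as f:
    ams = json.load(f)
with open("laptimes_data_kow.json") as f:
    kow = json.load(f)

merged = {"loggedTimes": {"steamids": {}, "vehicles": {}}}

# union of steamids; duplicates collapse to a single entry, second file wins
merged["loggedTimes"]["steamids"].update(ams["loggedTimes"]["steamids"])
merged["loggedTimes"]["steamids"].update(kow["loggedTimes"]["steamids"])

# combine the per-vehicle lap times from both files instead of replacing them
for src in (ams, kow):
    for vehicle, data in src["loggedTimes"]["vehicles"].items():
        dest = merged["loggedTimes"]["vehicles"].setdefault(vehicle, {"lapTimes": {}})
        dest["lapTimes"].update(data.get("lapTimes", {}))

with open("laptimes.json", "w") as f:
    json.dump(merged, f, indent=2)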

Related

Mongo shell aggregation pipeline error: "unknown operator: $nor"

I'm trying aggregation pipelines for the first time on Mongo Shell following this course. The idea is to create a search query with multiple conditions and use it within a $match aggregation stage.
My approach was to write the conditions individually and combine them in an object like this:
let fullQuery = {
"languagesQuery" : {
"languages" : {
"$all" : [
"English",
"Japanese"
]
}
},
"genresQuery" : {
"$nor" : [
{
"genres" : "Crime"
},
{
"genres" : "Horror"
}
]
},
"imdbRatingQuery" : {
"imdb.rating" : {
"$gte" : 7
}
},
"ratedQuery" : {
"rated" : {
"$in" : [
"PG",
"G"
]
}
}
}
The thing is that, while individually all the queries seem to work fine, when I run the pipeline or even db.movies.find(fullQuery) I get the following error:
Error: error: {
"operationTime" : Timestamp(1618483194, 1),
"ok" : 0,
"errmsg" : "unknown operator: $nor",
"code" : 2,
"codeName" : "BadValue",
"$clusterTime" : {
"clusterTime" : Timestamp(1618483194, 1),
"signature" : {
"hash" : BinData(0,"LPVJIin4JoThWZbFICVnzHOnJKU="),
"keyId" : NumberLong("6902062171803353090")
}
}
}
Any clue as to what may be happening?
$nor (https://docs.mongodb.com/manual/reference/operator/query/nor/) is a top-level query operator: you use it at the top level of a query to combine clauses. It is not defined as a field-level operator.
For your query, use $nin (https://docs.mongodb.com/manual/reference/operator/query/nin/) instead.
The rest of your query is also problematic: the conditions should sit side by side in a single filter document, using dot notation for nested fields such as imdb.rating, rather than being wrapped in extra keys like languagesQuery as you have done.
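Putting that together, a corrected filter keeps every condition at the top level of one document, with $nin at the field level. A minimal sketch with pymongo (the filter document is the same in the shell; the connection and database name are assumptions, so adjust them to your setup):
from pymongo import MongoClient

db = MongoClient().get_database("sample_mflix")  # hypothetical database name

full_query = {
    "languages": {"$all": ["English", "Japanese"]},
    "genres": {"$nin": ["Crime", "Horror"]},  # field-level $nin replaces the top-level $nor
    "imdb.rating": {"$gte": 7},
    "rated": {"$in": ["PG", "G"]},
}

for movie in db.movies.find(full_query):
    print(movie.get("title"))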

bash split array into separate files with dynamic name

I have the following returned to me as a response of a mocking tool I'm using.
{
"mappings" : [
{
"id" : "bcf3559f-7ff7-406b-a4f1-6d3e9ac00e63",
"name" : "Hellow world 2",
"request" : {
"url" : "/hello-world-2",
"method" : "POST"
},
"response" : {
"status" : 200,
"body" : "\nBody content for stub 3\n\n",
"headers" : { }
},
"uuid" : "bcf3559f-7ff7-406b-a4f1-6d3e9ac00e63",
"persistent" : true,
"priority" : 5
},
{
"id" : "9086b24f-4f5e-465a-bbe5-73bbfb82cd5c",
"name": "Hello world",
"request" : {
"url" : "/hello-world",
"method" : "ANY"
},
"response" : {
"status" : 200,
"body" : "Hi!"
},
"uuid" : "9086b24f-4f5e-465a-bbe5-73bbfb82cd5c"
} ]
}
I'd like to know how I can split each object into its own file, with the file named after the id of the object.
E.g:
bcf3559f-7ff7-406b-a4f1-6d3e9ac00e63.json
9086b24f-4f5e-465a-bbe5-73bbfb82cd5c.json
I have got as far as this but can't get it over the line:
jq -c '.mappings = (.mappings[] | [.])' mappings.json |
while read -r json ; do
N=$((N+1))
jq . <<< "$json" > "tmp/file${N}.json"
done
I'd recommend printing the id on one line and the corresponding object on the next, for example as below (the -r flag keeps the id unquoted so it can be used in the file name, and tmp/ must already exist):
jq -cr '.mappings[] | .id, .' mappings.json |
while read -r id ; do
echo "id=$id"
read -r json
jq . <<< "$json" > "tmp/${id}.json"
done
I would write a simple Python script instead (or the equivalent in your favorite general-purpose programming language).
import sys, os, json

d = json.load(sys.stdin)
for o in d['mappings']:
    with open(os.path.join('tmp', o['id'] + '.json'), 'w') as f:
        json.dump(o, f)
This would be more efficient and less error-prone, at least until jq gets some sort of output built-in:
# hypothetical
jq '.mappings[] | output("tmp/\(.id).json")' mappings.json

MongoDB: Updating A specific array element in a sub document

I'm a novice with mongodb so please excuse me if the question is a little basic. I have a mongo collection with a relatively complex document structure. The documents contain sub documents and arrays. I need to add additional data to some of the documents in this collection. A cut down version of the document is:
"date" : ISODate("2018-08-07T08:00:00.000+0000"),
.
. <<-- Other fields
.
"basket" :
[
{
"assetId" : NumberInt(639),
"securityId" : NumberInt(12470),
.
. <<-- Other fields
.
"exGroup" : [
. << -- Fields......
.
. << -- New Data will go here
]
}
.
. << More elements
]
The following (abridged) aggregation query finds the documents that need modifying:
{
"$match" : {
"date" : {
"$gte" : ISODate("2018-08-07T00:00:00.000+0000"),
"$lt" : ISODate("2018-08-08T00:00:00.000+0000")
}
}
},
{
"$unwind" : {
"path" : "$basket"
}
},
{
"$unwind" : {
"path" : "$basket.exGroup"
}
},
{
"$project" : {
"_id" : 1.0,
"date" : 1.0,
"assetId" : "$basket.assetId",
"securityId" : "$basket.securityId",
"exGroup" : "$basket.exGroup"
}
},
{
"$unwind" : {
"path" : "$exGroup"
}
},
{
"$match" : {
"exGroup.order" : {
"$exists" : true
}
}
}
For each document returned by the Mongo query I need to (in Python) retrieve a set of additional data from a SQL database and then append this data to the original Mongo document, as shown above. The set of new fields will be the same; the data will be different. What is not clear to me is how, once I have the data, I go about updating the array values.
Could somebody give me a pointer?
Try this, it works for me!
mySchema.aggregate([
//your aggregation code
], function(err, docList){
//for each doc in docList
async.each(docList, function(doc, callback){
var query = {$and: [{idField: doc.idField}, {"myArray.ArrayId": doc.myArray.ArrayId}]};
//Update or create field in array
var update = {$set: {"myArray.$.FieldNameToCreateOrUpdate": value}};
var projection = {field1: 1, field2: 1, field3: 1};
mySchema.findOneAndUpdate(query, update, projection, function(err, done){
if(err){ return callback(err); }
callback(null, 'done');
});
}, function(err){
//code if error
//code if no error
});
});
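Since the rest of your workflow is in Python, a minimal pymongo sketch of the update step may help. It is only a sketch under assumptions: the collection handle coll stands in for your (unnamed) collection, and each exGroup element is identified by the order field that your $match stage already relies on. arrayFilters requires MongoDB 3.6+:
from pymongo import MongoClient

# hypothetical connection/collection names; adjust to your setup
coll = MongoClient().mydb.mycollection

def append_ex_group_data(doc_id, asset_id, order, new_fields):
    # merge the SQL-derived fields into the matching exGroup element of the matching basket entry
    set_doc = {"basket.$[b].exGroup.$[e]." + key: value for key, value in new_fields.items()}
    coll.update_one(
        {"_id": doc_id},
        {"$set": set_doc},
        array_filters=[{"b.assetId": asset_id}, {"e.order": order}],
    )

# example call; the new field names are made up to stand in for your SQL data
# append_ex_group_data(doc["_id"], doc["assetId"], doc["exGroup"]["order"], {"extraField": 42})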

update nested array element of basis of selecting array element in mongodb

Below is my MongoDB document.
{
"_id" : ObjectId("588f09c8d466d7054114b456"),
"phonebook" : [
{
"pb_name_first" : "Aasu bhai",
"pb_phone_number" : [
{
"ph_id" : 2,
"ph_no" : "+91111111",
"ph_type" : "Mobile"
}
],
"pb_email_id" : [
{
"email_id" : "temp#gmail.com",
"email_type" : "Home",
"em_id" :1
},
{
"email_id" : "test#gmail.com",
"email_type" : "work",
"em_id" :2
}
],
"pb_name_prefix" : "MR."
}
]
}
I want a MongoDB query that will update the email_id data in the pb_email_id array on the basis of em_id. If I select em_id=1 then the record temp#gmail.com should be updated; if I select em_id=2 then test#gmail.com should be updated.
I don't think you can apply if-else logic in an update call, but you can run two separate update calls:
db.collection.update({'pb_email_id.em_id':1},{$set : {'pb_email_id.$.email_id' : 'temp#gmail.com'}},{multi:true});
db.collection.update({'pb_email_id.em_id':2},{$set : {'pb_email_id.$.email_id' : 'test#gmail.com'}},{multi:true});
However, you can run a script on the collection to apply multiple pieces of logic:
db.collection.find({}).forEach(function(doc){
if(doc.pb_email_id && doc.pb_email_id.length>0){
for(var i in doc.pb_email_id){
if(doc.pb_email_id[i].em_id === 1){
doc.pb_email_id[i].email_id = "temp#gmail.com"}
else if(doc.pb_email_id[i].em_id === 2){doc.pb_email_id[i].email_id = "test#gmail.com"}
db.collection.save(doc);
}
}
})
If you have to apply multiple pieces of logic you can run the script; otherwise two update calls are all that's needed.
P.S. - since you didn't mention the collection name, I used db.collection.update; it should be your collection name, e.g. db.phonebook.update.
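Note also that in the document shown, pb_email_id sits inside the phonebook array, so the paths in the two update calls above may not match as written. On MongoDB 3.6+ you can reach the nested element with arrayFilters; a minimal pymongo sketch, assuming the collection is db.phonebook as in the P.S. (the shell syntax is analogous):
from pymongo import MongoClient

db = MongoClient().get_database("mydb")  # hypothetical database name

# update the email whose em_id is 1, across every contact in the phonebook array
db.phonebook.update_many(
    {"phonebook.pb_email_id.em_id": 1},
    {"$set": {"phonebook.$[].pb_email_id.$[e].email_id": "temp#gmail.com"}},
    array_filters=[{"e.em_id": 1}],
)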

Mapreduce split input string into output array

I am dealing with documents like the following one:
> db.productData.find({"upc" : "XXX"}).pretty()
{
"_id" : ObjectId("538dfa3d44e19b2bcf590a77"),
"upc" : "XXX",
"productDescription" : "bla foo bar bla bla fooX barY",
"productSize" : "",
"ingredients" : "foo; bar; foo1; bar1.",
"notes" : "bla bla bla"
}
>
I would like to have a document containing, among the other fields, a list/array of the ingredients split on the ;. I want to split the string from the original collection into an array of strings.
I would like to map only some of the input fields in the output collection.
I would like to use mapreduce on MongoDB.
I've tried many different ways, moving stuff between the map function and the reduce function, but failed to find a proper solution.
From all the attempts I made, I now know I need to check for null values etc., so the following is my latest attempt:
The map function:
var mapperProductData = function () {
var ingredientsSplitted = values.ingredientsString.split(';');
var objToEmit = {barcode : "", description : "", ingredients : []};
// checking for null (is this strictly necessary? why?)
if (
this.hasOwnProperty('ingredients')
&& this.hasOwnProperty('productDescription')
&& this.hasOwnProperty('upc')
) {
for (var i = 0; i < ingredientsSplitted.length; i++) {
// I want to emit a new document only when I have all the splitted strings inside the array
if (i == ingredientsSplitted.length - 1) {
objToEmit.barcode = this.upc;
objToEmit.description = this.productDescription;
objToEmit.ingredients = ingredientsSplitted;
emit(this.upc, objToEmit);
}
}
}
};
The reduce function:
var reducerNewMongoCollection = function(key, values) {
return values;
};
The map-reduce call:
db.productData.mapReduce(
mapperProductData,
reducerNewMongoCollection,
{
out : "newMongoCollection" ,
query: { "values" : {$exists: true} }
}
);
I am getting an empty collection in output (newMongoCollection is empty).
What am I doing wrong?
Let's start from the beginning. Your map function should look like this:
var mapperProductData = function () {
var ingredientsSplitted = this.ingredients.split(';');
var objToEmit = {
barcode : this.upc,
description : this.productDescription,
ingredients : ingredientsSplitted
};
emit(this.upc, objToEmit);
};
Your map-reduce call should be:
db.productData.mapReduce(
mapperProductData,
reducerNewMongoCollection,
{
out : "newMongoCollection",
query : {
upc : { $exists : true },
productDescription : { $exists : true },
ingredients : { $exists : true , $type : 2 }
}
}
);
The query part will filter out documents that lack the relevant fields. The $type condition (BSON type 2) ensures ingredients is a string, so the .split(';') call in the map function is safe. This way you don't need to do complicated checking inside your map function and fewer documents are sent to it.
The result for your test document will look like this:
{
"_id" : "XXX",
"value" : {
"barcode" : "XXX",
"description" : "bla foo bar bla bla fooX barY",
"ingredients" : [
"foo",
" bar",
" foo1",
" bar1."
]
}
}
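As an aside, on MongoDB 3.4+ the same transformation can be done without mapReduce, using the aggregation pipeline's $split and $out stages. A minimal pymongo sketch (the database name is an assumption; the shell syntax is analogous):
from pymongo import MongoClient

db = MongoClient().get_database("mydb")  # hypothetical database name

db.productData.aggregate([
    {"$match": {
        "upc": {"$exists": True},
        "productDescription": {"$exists": True},
        "ingredients": {"$exists": True, "$type": "string"},
    }},
    {"$project": {
        "barcode": "$upc",
        "description": "$productDescription",
        "ingredients": {"$split": ["$ingredients", ";"]},
    }},
    {"$out": "newMongoCollection"},
])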
