mongoDB aggregate find total number of employees group by each state - database

Find total number of employees group by each state using aggregate
I tried the following in the screenshot link below. But the result is 0.
db.research.aggregate({$unwind:'$offices'},{"$match":
{'offices.country_code':"USA"}},{"$project": {'offices.state_code' : 1}},
{"$group" : {"_id":'$of
fices.state_code',"count" : {"$sum":'$number_of_employees'}}})

you need to $project number_of_employees to count in next stage, or you can remove the $project stage
db.research.aggregate([
{$unwind:'$offices'},
{"$match": {'offices.country_code':"USA"}},
{"$project": {'offices.state_code' : 1, 'number_of_employees' : 1}}, //project number_of_employees
{"$group" : {"_id":'$offices.state_code',"count" : {"$sum":'$number_of_employees'}}}
])
or
db.research.aggregate([
{$unwind:'$offices'},
{"$match": {'offices.country_code':"USA"}},
{"$group" : {"_id":'$offices.state_code',"count" : {"$sum":'$number_of_employees'}}}
])

Related

Mongodb TTL Index not expiring documents from collection

I have TTL index in collection fct_in_ussd as following
db.fct_in_ussd.createIndex(
{"xdr_date":1},
{ "background": true, "expireAfterSeconds": 259200}
)
{
"v" : 2,
"key" : {
"xdr_date" : 1
},
"name" : "xdr_date_1",
"ns" : "appdb.fct_in_ussd",
"background" : true,
"expireAfterSeconds" : 259200
}
with expiry of 3 days. Sample document in collection is as following
{
"_id" : ObjectId("5f4808c9b32ewa2f8escb16b"),
"edr_seq_num" : "2043019_10405",
"served_imsi" : "",
"ussd_action_code" : "1",
"event_start_time" : ISODate("2020-08-27T19:06:51Z"),
"event_start_time_slot_key" : ISODate("2020-08-27T18:30:00Z"),
"basic_service_key" : "TopSim",
"rate_event_type" : "",
"event_type_key" : "22",
"event_dir_key" : "-99",
"srv_type_key" : "2",
"population_time" : ISODate("2020-08-27T19:26:00Z"),
"xdr_date" : ISODate("2020-08-27T19:06:51Z"),
"event_date" : "20200827"
}
Problem Statement :- Documents are not getting removed from collection. Collection still contains 15 days old documents.
MongoDB server version: 4.2.3
Block compression strategy is zstd
storage.wiredTiger.collectionConfig.blockCompressor: zstd
Column xdr_date is also part of another compound index.
Observations as on Sep 24
I have 5 collections with TTL index.
It turns out that data is getting removed from one of the collection and rest of the collections remains unaffected.
Daily insertion rate is ~500M records (including 5 collections).
This observation left me confused.
TTL expiration thread run on single. Is it too much data for TTL to expire ?

Error in creating index in Mongo db using mongo console

I am using this query for creating the index:
db.CollectionName.createIndex({result: {$exists:true}, timestamp : {$gte: 1573890921898000}})
What am I trying to do here is, creating indexing on timestamp > last month and for only those data where result exist, but I am getting error
{
"ok" : 0,
"errmsg" : "Values in v:2 index key pattern cannot be of type object. Only numbers > 0, numbers < 0, and strings are allowed.",
"code" : 67,
"codeName" : "CannotCreateIndex"
}
What am I doing wrong here?
thanks to #prasad_ for the suggestion, i needed to use partial_Indexes so the query becomes :
db.CollectionName.createIndex({result: 1, timestamp: 1}, {partialFilterExpression :{result: {$exists:true}, timestamp : {$gte: 1573890921898000}}})

JSONata sort / order by of an array

I want to order an array. The JSONata expression below has an incoming array as follows.
[{"id":"Air-1a",
"Controller":"ESP62",
"Cntr-TaskNo":10,
"Cntr-GPIO":13,
"name":"Air",
"valueName":"Humidity",
"Sensor":"DHT22",
(and many other key pairs)},
{next object}, ...]
I then transform the array with the following JSONata expression:
payload.(
{ "Controller" : $.Controller,
"Cntr-TaskNo": $.CntrDef.TaskNo,
"Cntr-GPIO" : $.CntrDef.GPIO,
"name" : $.name,
"valueName" : $.valueName,
"Sensor" : $.Sensor,
"id" : $.id
}
)
But now I want to - in the same JSONata expression, sort on firstly the Controller, and then the GPIO. To tried with the Controller only first.
I tried:
payload.(
{ $sort("Controller",function($l, $r){$l.Controller > $r.Controller}) : $.Controller ,
"Cntr-TaskNo": $.CntrDef.TaskNo,
"Cntr-GPIO" : $.CntrDef.GPIO,
"name" : $.name,
"valueName" : $.valueName,
"Sensor" : $.Sensor,
"id" : $.id
}
)
As well as trying to add the sort function at the end with the ~> chaining command. I also tried the order-by operator.
Could anyone point me in the right direction?
//----------
The new flow with the changed 'ESP62' to '-' that does not work:
[{"id":"874b0c77.f87418","type":"inject","z":"6f27a311.d135bc","name":"","topic":"","payload":"","payloadType":"date","repeat":"","crontab":"","once":false,"onceDelay":0.1,"x":200,"y":180,"wires":[["8c196590.c20638"]]},{"id":"8c196590.c20638","type":"change","z":"6f27a311.d135bc","name":"Dataset","rules":[{"t":"set","p":"payload","pt":"msg","to":"[{\"id\":\"Air-1a\",\"Controller\":\"ESP62\",\"CntrTaskNo\":10,\"CntrGPIO\":13,\"name\":\"Air\",\"valueName\":\"Humidity\",\"Sensor\":\"DHT22\",\"aaa\":\"111\",\"bbb\":\"222\",\"ccc\":\"333\"},{\"id\":\"Air-2a\",\"Controller\":\"ESP72\",\"CntrTaskNo\":11,\"CntrGPIO\":14,\"name\":\"Air\",\"valueName\":\"Humidity\",\"Sensor\":\"DHT22\",\"aaa\":\"444\",\"bbb\":\"555\",\"ccc\":\"666\"},{\"id\":\"Air-1a\",\"Controller\":\"ESP62\",\"CntrTaskNo\":2,\"CntrGPIO\":9,\"name\":\"Air\",\"valueName\":\"Humidity\",\"Sensor\":\"DHT22\",\"aaa\":\"777\",\"bbb\":\"888\",\"ccc\":\"999\"},{\"id\":\"Air-1a\",\"Controller\":\"-\",\"CntrTaskNo\":10,\"CntrGPIO\":12,\"name\":\"Air\",\"valueName\":\"Humidity\",\"Sensor\":\"DHT22\",\"aaa\":\"777\",\"bbb\":\"888\",\"ccc\":\"999\"}]","tot":"json"}],"action":"","property":"","from":"","to":"","reg":false,"x":360,"y":180,"wires":[["13981162.14e28f"]]},{"id":"c8a256a5.a170c8","type":"debug","z":"6f27a311.d135bc","name":"","active":true,"tosidebar":true,"console":false,"tostatus":false,"complete":"false","x":690,"y":180,"wires":[]},{"id":"13981162.14e28f","type":"change","z":"6f27a311.d135bc","name":"Jsonata $sort","rules":[{"t":"set","p":"payload","pt":"msg","to":"($sort(payload,function($l , $r){$l.Controller > $r.Controller}) ; \t$sort(payload,function($l , $r){$l.CntrGPIO > $r.CntrGPIO}))","tot":"jsonata"}],"action":"","property":"","from":"","to":"","reg":false,"x":520,"y":180,"wires":[["c8a256a5.a170c8"]]}]
I suggest first sorting the dataset and afterward transform the already sorted array of objects. The transformation is trivial and you want to know how to sort, so I show below one possible solution. It uses an expression with two concatenated $sort functions.
Edited after a better understanding of the requirement.
I tested successfully a Node-RED flow using this expression in a change node:
($a := $sort(payload,function($l , $r){$l.Controller > $r.Controller}) ; $sort($a,function($l , $r){(($l.Controller = $r.Controller) and ($l.CntrGPIO > $r.CntrGPIO))}))
Flow (contain dataset set hardcoded):
[{"id":"a7814b7e.3adeb8","type":"tab","label":"Flow 4","disabled":false,"info":""},{"id":"8bf10833.c71748","type":"inject","z":"a7814b7e.3adeb8","name":"","topic":"","payload":"","payloadType":"date","repeat":"","crontab":"","once":false,"onceDelay":0.1,"x":140,"y":140,"wires":[["9e365564.edca08"]]},{"id":"9e365564.edca08","type":"change","z":"a7814b7e.3adeb8","name":"Dataset","rules":[{"t":"set","p":"payload","pt":"msg","to":"[{\"id\":\"Air-1a\",\"Controller\":\"ESP62\",\"CntrTaskNo\":10,\"CntrGPIO\":13,\"name\":\"Air\",\"valueName\":\"Humidity\",\"Sensor\":\"DHT22\",\"aaa\":\"111\",\"bbb\":\"222\",\"ccc\":\"333\"},{\"id\":\"Air-2a\",\"Controller\":\"ESP72\",\"CntrTaskNo\":11,\"CntrGPIO\":14,\"name\":\"Air\",\"valueName\":\"Humidity\",\"Sensor\":\"DHT22\",\"aaa\":\"444\",\"bbb\":\"555\",\"ccc\":\"666\"},{\"id\":\"Air-1a\",\"Controller\":\"ESP62\",\"CntrTaskNo\":2,\"CntrGPIO\":9,\"name\":\"Air\",\"valueName\":\"Humidity\",\"Sensor\":\"DHT22\",\"aaa\":\"777\",\"bbb\":\"888\",\"ccc\":\"999\"},{\"id\":\"Air-1a\",\"Controller\":\"-\",\"CntrTaskNo\":10,\"CntrGPIO\":12,\"name\":\"Air\",\"valueName\":\"Humidity\",\"Sensor\":\"DHT22\",\"aaa\":\"777\",\"bbb\":\"888\",\"ccc\":\"999\"}]","tot":"json"}],"action":"","property":"","from":"","to":"","reg":false,"x":300,"y":140,"wires":[["762f6421.074fec"]]},{"id":"f827bddb.c9acd","type":"debug","z":"a7814b7e.3adeb8","name":"","active":true,"tosidebar":true,"console":false,"tostatus":false,"complete":"false","x":630,"y":140,"wires":[]},{"id":"762f6421.074fec","type":"change","z":"a7814b7e.3adeb8","name":"Jsonata $sort","rules":[{"t":"set","p":"payload","pt":"msg","to":"($a := $sort(payload,function($l , $r){$l.Controller > $r.Controller}) ; $sort($a,function($l , $r){(($l.Controller = $r.Controller) and ($l.CntrGPIO > $r.CntrGPIO))}))","tot":"jsonata"}],"action":"","property":"","from":"","to":"","reg":false,"x":460,"y":140,"wires":[["f827bddb.c9acd"]]}]
also tested in Jsonata exerciser: http://try.jsonata.org/S1IlT3y-E
You can sort the array using the following expression:
payload^(Controller, CntrDef.GPIO)
The order-by operator ^ will sort the array, first by increasing value of Controller, then by increasing value of CntrGPIO. You can then transform each object within that array
payload^(Controller, CntrDef.GPIO).{
"Controller" : Controller,
"Cntr-TaskNo": CntrDef.TaskNo,
"Cntr-GPIO" : CntrDef.GPIO,
"name" : name,
"valueName" : valueName,
"Sensor" : Sensor,
"id" : id
}

Adding child node values

Below is the firebase Database of a child node of a particular user under the "users" node:
"L1Bczun2d5UTZC8g2LXchLJVXsh1" : {
"email" : "orabbz#yahoo.com",
"fullname" : "orabueze yea",
"teamname" : "orabbz team",
"total" : 0,
"userName" : "orabbz#yahoo.com",
"week1" : 0,
"week10" : 0,
"week11" : 0,
"week12" : 0,
"week2" : 0,
"week3" : 17,
"week4" : 0,
"week5" : 20,
"week6" : 0,
"week7" : 0,
"week8" : 0,
"week9" : 10
},
IS there a way to add up all the values of Weeks 1 down to week 12 and have the total sum in the total key?
I am curently thinking of bringing all the values of week 1 - week 12 brought into the angular js scope then adding up the values and then posting the total back in the firebase databse key total. But this sounds too long winded. is there a shorter solution?
As far as I know, Firebase DB doesn't have any DB functions as you'd have in SQL. So the options you have is to get the data and calculate it in Angular as you say. Or update a counter when the weeks are added to the user (when writing to the DB). Then just read the counter later

Extract values from an array in mongoDB to dataframe using rmongodb

I'm querying a database containing entries as displayed in the example. All entries contain the following values:
_id: unique id of overallitem and placed_items
name: the name of te overallitem
loc: location of the overallitem and placed_items
time_id: time the overallitem was stored
placed_items: array containing placed_items (can range from zero: placed_items : [], to unlimited amount.
category_id: the category of the placed_items
full_id: the full id of the placed_items
I want to extract the name, full_id and category_id on a per placed_items level given a time_id and loc constraint
Example data:
{
"_id" : "5040",
"name" : "entry1",
"loc" : 1,
"time_id" : 20121001,
"placed_items" : [],
}
{
"_id" : "5041",
"name" : "entry2",
"loc" : 1,
"time_id" : 20121001,
"placed_items" : [
{
"_id" : "5043",
"category_id" : 101,
"full_id" : 901,
},
{
"_id" : "5044",
"category_id" : 102,
"full_id" : 902,
}
],
}
{
"_id" : "5042",
"name" : "entry3",
"loc" : 1,
"time_id" : 20121001,
"placed_items" : [
{
"_id" : "5045",
"category_id" : 101,
"full_id" : 903,
},
],
}
The expected outcome for this example would be:
"name" "full_id" "category_id"
"entry2" 901 101
"entry2" 902 102
"entry3" 903 101
So if placed_items is empty, do put the entry in the dataframe and if placed_items containts n entries, put n entries in dataframe
I tried to work out an RBlogger example to create the desired dataframe.
#Set up database
mongo <- mongo.create()
#Set up condition
buf <- mongo.bson.buffer.create()
mongo.bson.buffer.append(buf, "loc", 1)
mongo.bson.buffer.start.object(buf, "time_id")
mongo.bson.buffer.append(buf, "$gte", 20120930)
mongo.bson.buffer.append(buf, "$lte", 20121002)
mongo.bson.buffer.finish.object(buf)
query <- mongo.bson.from.buffer(buf)
#Count
count <- mongo.count(mongo, "items_test.overallitem", query)
#Note that these counts don't work, since the count should be based on
#the number of placed_items in the array, and not the number of entries.
#Setup Cursor
cursor <- mongo.find(mongo, "items_test.overallitem", query)
#Create vectors, which will be filled by the while loop
name <- vector("character", count)
full_id<- vector("character", count)
category_id<- vector("character", count)
i <- 1
#Fill vectors
while (mongo.cursor.next(cursor)) {
b <- mongo.cursor.value(cursor)
order_id[i] <- mongo.bson.value(b, "name")
product_id[i] <- mongo.bson.value(b, "placed_items.full_id")
category_id[i] <- mongo.bson.value(b, "placed_items.category_id")
i <- i + 1
}
#Convert to dataframe
results <- as.data.frame(list(name=name, full_id=full_uid, category_id=category_id))
The conditions work and the code works if I would want to extract values on an overallitem level (i.e. _id or name) but fails to gather the information on a placed_items level. Furthermore, the dotted call for extracting full_id and category_id does not seem to work. Can anyone help?

Resources