How to find a max value of a specific key in a dictionary in a list in a document in MongoDB?

I want to use an aggregation to get the highest value of a specified key that's in a dict field that's in a list field that's in a document that's in a mongodb collection.
Here's some example data
[
{
"name": "hi",
"hist": [
{
"username": "bill"
},
{
"username": "jack",
"changed_from": 127
}
]
},
{
"name": "member1",
"hist": [
{
"username": "asdf",
"changed_from": 123
},
{
"username": "duhby",
"changed_from": 126
}
]
},
{
"name": "member5",
"hist": [
{
"username": "duhby",
"changed_from": 150
},
{
"username": "test",
"changed_from": 123
},
{
"username": "duhby",
"changed_from": 125
}
]
}
]
I want to be able to put in duhby as the username, for example, and get at least a list of results I can then easily get the maximum value of, with the maximum value in this case being 150.
I tried using an aggregate group but got stuck when trying to only get the data from the specific username, and not just all documents that had that username in the hist field.
db.collection.aggregate([
{
$group: {
"_id": "$hist.duhby",
update_time: {
$max: "$hist.changed_from"
}
}
}
])
With the example data shown earlier, this returns:
[
{
"_id": [],
"update_time": [
150,
123,
125
]
}
]
However, this isn't useful because it shows every changed_from value when I want it to only show (and sort by) the ones with the username specified. Expected result:
[
{
"_id": [],
"update_time": [
150,
125
]
}
]
I also want to be able to get the original document and maybe have the name field as the id in the aggregation, but the id is currently returned as an empty list.

I realized that what I needed was just a find, not an aggregation, because I only wanted to check whether any document contains an embedded document with a given key-value pair. The following does what I was looking for:
db.collection.find({
"hist": {
$elemMatch: {
username: "duhby",
changed_from: {
"$lte": 123
}
}
}
})
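For readers who do need the maximum itself, as the original question asked, the following is a minimal sketch and not part of the accepted approach above. It assumes the sample documents from the question. The `pipeline` constant is meant to be passed to `db.collection.aggregate(pipeline)` and is not executed here; `maxChangedFromByName` is a hypothetical plain JavaScript helper that mirrors the same steps in memory so the logic can be checked without a database.

```javascript
// Pipeline intended for db.collection.aggregate(pipeline); not executed in this sketch.
const username = "duhby";
const pipeline = [
  { $unwind: "$hist" },                       // one document per hist entry
  { $match: { "hist.username": username } },  // keep only this user's entries
  { $group: { _id: "$name", update_time: { $max: "$hist.changed_from" } } },
  { $sort: { update_time: -1 } },             // highest value first
];

// Hypothetical in-memory mirror of the pipeline above.
// Entries without a numeric changed_from are skipped in this mirror.
function maxChangedFromByName(docs, user) {
  const best = new Map();
  for (const doc of docs) {
    for (const entry of doc.hist) {
      if (entry.username !== user || typeof entry.changed_from !== "number") continue;
      const prev = best.get(doc.name);
      if (prev === undefined || entry.changed_from > prev) {
        best.set(doc.name, entry.changed_from);
      }
    }
  }
  return [...best]
    .map(([name, update_time]) => ({ _id: name, update_time }))
    .sort((a, b) => b.update_time - a.update_time);
}

// Sample data copied from the question.
const docs = [
  { name: "hi", hist: [{ username: "bill" }, { username: "jack", changed_from: 127 }] },
  { name: "member1", hist: [
    { username: "asdf", changed_from: 123 },
    { username: "duhby", changed_from: 126 },
  ] },
  { name: "member5", hist: [
    { username: "duhby", changed_from: 150 },
    { username: "test", changed_from: 123 },
    { username: "duhby", changed_from: 125 },
  ] },
];

console.log(maxChangedFromByName(docs, username));
```

Grouping by `$name` keeps the document name as the `_id`, which is what the question asked for. Replacing `_id: "$name"` with `_id: null` should collapse everything into one overall maximum.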

Related

Is it possible to get key value pairs from snowflake api instead rowType?

I'm working with the Snowflake API and, to handle the JSON data, I need to receive rows as key-value pairs instead of rowType.
I've been searching but haven't found anything.
e.g. A table user with name and email attributes
Name | Email
Kelly | kelly#email.com
Fisher | fisher#email.com
I would request this body:
{
"statement": "SELECT * FROM user",
"timeout": 60,
"database": "DEV",
"schema": "PLACE",
"warehouse": "WH",
"role": "DEV_READER",
"bindings": {
"1": {
"type": "FIXED",
"value": "123"
}
}
}
The results would come like:
{
"resultSetMetaData": {
...
"rowType": [
{ "name": "Name",
...},
{ "name": "Email",
...}
],
},
"data": [
[
"Kelly",
"kelly#email.com"
],
[
"Fisher",
"fisher#email.com"
]
]
}
And the results needed would be:
{
"resultSetMetaData": {
...
"data": [
[
"Name":"Kelly",
"Email":"kelly#email.com"
],
[
"Name":"Fisher",
"Email":"fisher#email.com"
]
]
}
Thank you for any inputs
The output you describe is not valid JSON, but the result can be returned in a slightly different format:
{
"resultSetMetaData": {
...
"data":
[
{
"Name": "Kelly",
"Email": "kelly#email.com"
},
{
"Name": "Fisher",
"Email": "fisher#email.com"
}
]
}
}
To get the API to send it that way, you can change the SQL from select * to:
select object_construct(*) as KVP from "USER";
You can also specify the names of the keys using:
select object_construct('NAME', "NAME", 'EMAIL', EMAIL) from "USER";
The object_construct function takes an arbitrary number of parameters, as long as there is an even number of them (alternating keys and values), so:
object_construct('KEY1', VALUE1, 'KEY2', VALUE2, <'KEY_N'>, <VALUE_N>)
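If changing the SQL is not an option, the pairing can also be done on the client. The sketch below is an assumption-laden illustration, not a Snowflake API feature: `rowsToObjects` is a hypothetical helper, and the response shape (`resultSetMetaData.rowType[].name` plus a `data` array of rows) is taken from the example response shown in the question.

```javascript
// Hypothetical helper: pair each value in a row with the column name
// found at the same position in resultSetMetaData.rowType.
function rowsToObjects(response) {
  const names = response.resultSetMetaData.rowType.map((col) => col.name);
  return response.data.map((row) =>
    Object.fromEntries(row.map((value, i) => [names[i], value]))
  );
}

// Response trimmed to the fields used here, following the question's example.
const response = {
  resultSetMetaData: {
    rowType: [{ name: "Name" }, { name: "Email" }],
  },
  data: [
    ["Kelly", "kelly#email.com"],
    ["Fisher", "fisher#email.com"],
  ],
};

console.log(rowsToObjects(response));
```

This keeps the query untouched, at the cost of doing the conversion for every response on the client side.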

MongoDB - Pipeline $lookup with $group losing fields

I only have 2 years of experience with SQL databases and none with NoSQL databases. I am trying to write a pipeline using the MongoDB Compass aggregation pipeline tool that performs a lookup, group, sum, and sort. Also, please share any resources that make learning this easier; I've not had much luck finding good, easy-to-understand examples online that use Compass to accomplish these tasks. Thank you.
An example question I am trying to solve is:
What customer placed the highest number of orders?
Example Data is:
Customer Collection:
[
{ "_id": { "$oid": "6276ba2dd1dfd6f5bf4b4f53" },
"Id": "1",
"FirstName": "Maria",
"LastName": "Anders",
"City": "Berlin",
"Country": "Germany",
"Phone": "030-0074321"},
{ "_id": { "$oid": "6276ba2dd1dfd6f5bf4b4f54" },
"Id": "2",
"FirstName": "Ana",
"LastName": "Trujillo",
"City": "México D.F.",
"Country": "Mexico",
"Phone": "(5) 555-4729" }
]
Order Collection:
[
{ "_id": { "$oid": "6276ba9dd1dfd6f5bf4b501f" },
"Id": "1",
"OrderDate": "2012-07-04 00:00:00.000",
"OrderNumber": "542378",
"CustomerId": "85",
"TotalAmount": "440.00" },
{ "_id": { "$oid": "6276ba9dd1dfd6f5bf4b5020" },
"Id": "2",
"OrderDate": "2012-07-05 00:00:00.000",
"OrderNumber": "542379",
"CustomerId": "79",
"TotalAmount": "1863.40" }
]
I have spent all day looking at YouTube videos and the MongoDB documentation, but I am failing to comprehend a few things. First, when I apply a $group stage I lose all the fields not associated with the group, and I would like to keep a few of them. I would like it to return the name of the customer with the highest number of orders.
The pipeline I was using that gets me part of the way is the following:
[{
$lookup: {
from: 'Customer',
localField: 'CustomerId',
foreignField: 'Id',
as: 'CustomerInfo'
}}, {
$project: {
CustomerId: 1,
CustomerInfo: 1
}}, {
$group: {
_id: '$CustomerInfo.Id',
CustomerOrderNumber: {
$sum: 1
}
}}, {
$sort: {
CustomerOrderNumber: -1
}}]
Example data this returns in order:
Apologies for the bad formatting, still trying to get the hang of posting questions that are easy to understand and useful.
The $group stage returns only the fields it defines, here _id and CustomerOrderNumber, so the CustomerInfo field is lost.
$lookup - Join each order with its matching Customer document as CustomerInfo.
$project - $lookup returns CustomerInfo as an array, so take its first element to turn it into a document field instead of an array field.
$group - Group by the customer Id, count the documents as CustomerOrderNumber, and keep the first CustomerInfo document.
$project - Decorate the output documents.
$setWindowFields - Use $denseRank to rank the documents by CustomerOrderNumber (descending). Documents with the same CustomerOrderNumber share the same rank/position.
$match - Select the documents whose denseRankHighestOrder is 1 (the highest).
db.Order.aggregate([
{
$lookup: {
from: "Customer",
localField: "CustomerId",
foreignField: "Id",
as: "CustomerInfo"
}
},
{
$project: {
CustomerId: 1,
CustomerInfo: {
$first: "$CustomerInfo"
}
}
},
{
$group: {
_id: "$CustomerInfo.Id",
CustomerOrderNumber: {
$sum: 1
},
CustomerInfo: {
$first: "$CustomerInfo"
}
}
},
{
$project: {
_id: 0,
CustomerId: "$_id",
CustomerOrderNumber: 1,
CustomerName: {
$concat: [
"$CustomerInfo.FirstName",
" ",
"$CustomerInfo.LastName"
]
}
}
},
{
$setWindowFields: {
sortBy: {
CustomerOrderNumber: -1
},
output: {
denseRankHighestOrder: {
$denseRank: {}
}
}
}
},
{
$match: {
denseRankHighestOrder: 1
}
}
])
Sample Mongo Playground
Note:
The $sort stage can sort the documents by CustomerOrderNumber. But if you then limit the output, as with "SELECT TOP n", the result may be incomplete when several documents share the same CustomerOrderNumber/rank.
Example: SELECT TOP 1 returns a single customer with the highest CustomerOrderNumber even when 3 customers share that highest value.
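The difference between limiting to one row and filtering on dense rank can be seen in a small in-memory sketch. This is plain JavaScript, not MongoDB code: `denseRank` is a hypothetical helper that imitates what $denseRank computes, and the customer ids and counts are made-up illustration values.

```javascript
// Hypothetical helper: sort descending by `key` and assign a dense rank,
// so rows with equal values share a rank and no rank numbers are skipped.
function denseRank(rows, key) {
  const sorted = [...rows].sort((a, b) => b[key] - a[key]);
  let rank = 0;
  let previous;
  return sorted.map((row) => {
    if (row[key] !== previous) {
      rank += 1;
      previous = row[key];
    }
    return { ...row, denseRankHighestOrder: rank };
  });
}

// Made-up counts: customers B and C tie for the highest number of orders.
const counts = [
  { CustomerId: "A", CustomerOrderNumber: 3 },
  { CustomerId: "B", CustomerOrderNumber: 5 },
  { CustomerId: "C", CustomerOrderNumber: 5 },
];

const ranked = denseRank(counts, "CustomerOrderNumber");

// Equivalent of { $match: { denseRankHighestOrder: 1 } }: keeps both tied customers.
const topByRank = ranked.filter((row) => row.denseRankHighestOrder === 1);

// Equivalent of sorting and then taking one row: silently drops one of the ties.
const topByLimit = ranked.slice(0, 1);

console.log(topByRank.map((row) => row.CustomerId));
console.log(topByLimit.map((row) => row.CustomerId));
```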

MongoDB Track data changes

I want to track changes on MongoDB Documents. The big Challenge is that MongoDB has nested Documents.
Example
[
{
"_id": "60f7a86c0e979362a25245eb",
"email": "walltownsend#delphide.com",
"friends": [
{
"name": "Hancock Nelson"
},
{
"name": "Owen Dotson"
},
{
"name": "Cathy Jarvis"
}
]
}
]
after the update/change
[
{
"_id": "60f7a86c0e979362a25245eb",
"email": "walltownsend#delphide.com",
"friends": [
{
"name": "Daphne Kline" //<------
},
{
"name": "Owen Dotson"
},
{
"name": "Cathy Jarvis"
}
]
}
]
This is a very basic example of a highly expandable real-world use case.
On a SQL-based database, I would suggest something like the following solution.
The SQL way
users
_id | email
60f7a8b28db7c78b57bbc217 | cathyjarvis#delphide.com

friends
_id | user_id | name
0 | 60f7a8b28db7c78b57bbc217 | Hancock Nelson
1 | 60f7a8b28db7c78b57bbc217 | Suarez Burt
2 | 60f7a8b28db7c78b57bbc217 | Mejia Elliott

after the update/change

users
_id | email
60f7a8b28db7c78b57bbc217 | cathyjarvis#delphide.com

friends
_id | user_id | name
0 | 60f7a8b28db7c78b57bbc217 | Daphne Kline
1 | 60f7a8b28db7c78b57bbc217 | Suarez Burt
2 | 60f7a8b28db7c78b57bbc217 | Mejia Elliott

history
_id | friends_id | field | preUpdate | postUpdate
0 | 0 | name | Hancock Nelson | Daphne Kline
If there is an update and the change has to be tracked before the next update, this would work for NoSQL as well. If there is a second update, we get a second row in the SQL history table and it's very clear. On NoSQL, you can keep a list/array of full document versions and compare changes between the indexes, but that stores a lot of redundant information that hasn't changed.
Have a look at Set Expression Operators
$setDifference
$setEquals
$setIntersection
Beware: these operators perform set operations on arrays, treating arrays as sets. If an array contains duplicate entries, they ignore the duplicates. They also ignore the order of the elements.
In your example the update would result in
removed: [ {name: "Hancock Nelson" } ],
added: [ {name: "Daphne Kline" } ]
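As a rough sketch of how that removed/added result could be produced, assume the old and new arrays are stored side by side in one document as `friends` and `updated_friends` (field names chosen for illustration). The `diffStage` constant is meant to be used as a stage in `db.collection.aggregate([diffStage])` and is not executed here; `setDifference` is a hypothetical in-memory mirror of $setDifference for checking the idea.

```javascript
// Stage intended for db.collection.aggregate([diffStage]); not executed in this sketch.
const diffStage = {
  $set: {
    removed: { $setDifference: ["$friends", "$updated_friends"] },
    added: { $setDifference: ["$updated_friends", "$friends"] },
  },
};

// Hypothetical in-memory mirror: elements of `a` that do not appear in `b`,
// with duplicates dropped. Elements are compared through JSON.stringify,
// so objects with the same fields in a different order count as different.
function setDifference(a, b) {
  const inB = new Set(b.map((item) => JSON.stringify(item)));
  const seen = new Set();
  const result = [];
  for (const item of a) {
    const key = JSON.stringify(item);
    if (!inB.has(key) && !seen.has(key)) {
      seen.add(key);
      result.push(item);
    }
  }
  return result;
}

const friends = [
  { name: "Hancock Nelson" },
  { name: "Owen Dotson" },
  { name: "Cathy Jarvis" },
];
const updatedFriends = [
  { name: "Daphne Kline" },
  { name: "Owen Dotson" },
  { name: "Cathy Jarvis" },
];

console.log({
  removed: setDifference(friends, updatedFriends),
  added: setDifference(updatedFriends, friends),
});
```

Because sets carry no positions, this reports what changed but not where in the array it changed.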
If the number of elements is always the same before and after the update, then you could use this one:
db.collection.insertOne({
friends: [
{ "name": "Hancock Nelson" },
{ "name": "Owen Dotson" },
{ "name": "Cathy Jarvis" }
],
updated_friends: [
{ "name": "Daphne Kline" },
{ "name": "Owen Dotson" },
{ "name": "Cathy Jarvis" }
]
})
db.collection.aggregate([
{
$set: {
difference: {
$map: {
input: { $range: [0, { $size: "$friends" }] },
as: "i",
in: {
$cond: {
if: {
$eq: [
{ $arrayElemAt: ["$friends", "$$i"] },
{ $arrayElemAt: ["$updated_friends", "$$i"] }
]
},
then: null,
else: {
old: { $arrayElemAt: ["$friends", "$$i"] },
new: { $arrayElemAt: ["$updated_friends", "$$i"] }
}
}
}
}
}
}
},
{
$set: {
difference: {
$filter: {
input: "$difference",
cond: { $ne: ["$$this", null] }
}
}
}
}
])

MongoDB: How to count number of values in key

I'm very new to MongoDB and I need help figuring out how to perform aggregation on a key in MongoDB and use that result to return matches.
For example, if I have a collection called Fruits with the following documents:
{
"id": 1,
"name": "apple",
"type": [
"Granny smith",
"Fuji"
]
}, {
"id": 2,
"name": "grape",
"type": [
"green",
"black"
]
}, {
"id": 3,
"name": "orange",
"type": [
"navel"
]
}
How do I write a query that will return the names of the fruits with 2 types, i.e. apple and grape?
Thanks!
Demo - https://mongoplayground.net/p/ke3VJIErhvb
Use $size to get the documents whose type array has exactly 2 elements.
https://docs.mongodb.com/manual/reference/method/db.collection.find/#mongodb-method-db.collection.find
The $size operator matches any array with the number of elements specified by the argument. For example:
db.collection.find({
type: { "$size": 2 } // match document with type having size 2
},
{ name: 1 } // projection to get name and _id only
)
To get the length of the array, use the $size operator in a $project pipeline stage.
The $project stage would look like this:
{
"$project": {
"name": "$name",
type: {
"$size": "$type"
}
}
}
Here is a working example of the same ⇒ https://mongoplayground.net/p/BmS9BGhqsFg
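The $project stage alone only reports the array length; to answer the original question it still needs a filter. A possible continuation is sketched below under stated assumptions: `typeCount` is a field name invented for this example, `pipeline` is meant for `db.collection.aggregate(pipeline)` and is not executed here, and `namesWithTypeCount` is a hypothetical in-memory mirror used to check the logic against the question's sample data.

```javascript
// Pipeline intended for db.collection.aggregate(pipeline); not executed in this sketch.
const pipeline = [
  { $project: { name: 1, typeCount: { $size: "$type" } } }, // compute the array length
  { $match: { typeCount: 2 } },                             // keep fruits with exactly 2 types
];

// Hypothetical in-memory mirror of the pipeline above.
function namesWithTypeCount(docs, count) {
  return docs
    .filter((doc) => Array.isArray(doc.type) && doc.type.length === count)
    .map((doc) => doc.name);
}

// Sample data copied from the question.
const fruits = [
  { id: 1, name: "apple", type: ["Granny smith", "Fuji"] },
  { id: 2, name: "grape", type: ["green", "black"] },
  { id: 3, name: "orange", type: ["navel"] },
];

console.log(namesWithTypeCount(fruits, 2));
```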

Return one array of data in sub-document of Mongodb

I'm using Nodejs with Mongoose package.
Given I've something like this:-
let people = [
{
"_id": 1,
"name": "Person 1",
"pets": [
{
"_id": 1,
"name": "Tom",
"category": "cat"
},
{
"_id": 2,
"name": "Jerry",
"category": "mouse"
}
]
}
]
I want to get only the data of Jerry in the pets array using its _id (result shown below)
{
"_id": 2,
"name": "Jerry",
"category": "mouse"
}
Can I get it without needing to specify the _id of person 1 when using $elemMatch? Right now I code like this:-
const pet = People.find(
{ "_id": "1"}, // specifying 'person 1 _id' first
{ pets: { $elemMatch: { _id: 2 } } } // using 'elemMatch' to get 'pet' with '_id' of '2'
)
And it gave me what I wanted, as shown above. But is there any other way to do this without specifying the _id of its parent first (in this case, the _id of the person)?
Assuming the _id values of the nested array elements are unique, you can filter by nested array elements directly:
const pet = People.find(
{ "pets._id": 2 },
{ pets: { $elemMatch: { _id: 2 } } }
)
