MongoDB unique index for all array elements

I'm trying to create a unique index on an array field in a document. The index should work like this: if one document has an array containing two elements, then adding a new document whose array contains the same two elements should raise a duplicate error, but not when only one of the elements is duplicated in another array.
An example may show what I mean.
First I create a simple document:
{
"name" : "Just a name",
"users" : [
"user1",
"user2"
]
}
And I want to create a unique index on the 'users' array field. What I want is for it to stay possible to create other documents like this:
{
"name" : "Just a name",
"users" : [
"user1",
"user3"
]
}
or
{
"name" : "Just a name",
"users" : [
"user2",
"user5"
]
}
BUT it should be impossible to create a second one like this:
{
"name" : "Just a name",
"users" : [
"user1",
"user2"
]
}
Or reversed:
{
"name" : "Just a name",
"users" : [
"user2",
"user1"
]
}
But this is impossible because Mongo gives me an error that "user1" is duplicated.
Is it possible to create unique index on all array elements as shown above?

As per the official MongoDB documentation:
For unique indexes, the unique constraint applies across separate documents in the collection rather than within a single document.
Because the unique constraint applies to separate documents, for a unique multikey index, a document may have array elements that result in repeating index key values as long as the index key values for that document do not duplicate those of another document.
So you can't insert the second document:
{
"name" : "Just a name",
"users" : [
"user1",
"user3"
]
}
You will get a duplicate key error from the unique constraint:
> db.st1.insert({"name":"Just a name","users":["user1","user3"]})
WriteResult({
"nInserted" : 0,
"writeError" : {
"code" : 11000,
"errmsg" : "E11000 duplicate key error collection: test.st1 index: users_1 dup key: { : \"user1\" }"
}
})
This is because user1 already exists in the users index for the first document.
Currently the only solution is to manage it in the code that inserts into the collection. Before each save or update, run validation logic that enforces the constraint you want.
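One way to sketch that validation is to derive an order-independent key from the array and enforce uniqueness on the key rather than on the array itself. The helper names and the extra usersKey field below are assumptions for illustration, not something from the question; the shell commands in the comments only indicate where a unique index on that scalar field would go.

```javascript
// Hypothetical helper: build an order-independent key from the users array.
function usersKey(users) {
  // Copy before sorting so the caller's array is left untouched, then
  // serialize so that ["user1", "user2"] and ["user2", "user1"] coincide.
  return JSON.stringify([...users].sort());
}

// Attach the derived key to a document before it is saved.
function withUsersKey(doc) {
  return { ...doc, usersKey: usersKey(doc.users) };
}

// In the mongo shell, the derived field (an assumed name) would carry the
// unique index instead of the array itself:
//   db.st1.createIndex({ usersKey: 1 }, { unique: true })
//   db.st1.insert(withUsersKey({ name: "Just a name", users: ["user1", "user2"] }))

const first = withUsersKey({ name: "Just a name", users: ["user1", "user2"] });
const reversed = withUsersKey({ name: "Just a name", users: ["user2", "user1"] });
const overlapping = withUsersKey({ name: "Just a name", users: ["user1", "user3"] });

console.log(first.usersKey === reversed.usersKey);    // true: same set, would be rejected
console.log(first.usersKey === overlapping.usersKey); // false: would be allowed
```

With this approach, any update that changes users also has to recompute usersKey, otherwise the index and the array drift apart.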

I have a very similar problem and sadly it seems it's not possible. Not because the unique constraint applies only across separate documents, but because:
To index a field that holds an array value, MongoDB creates an index key for each element in the array
i.e. each of the individual array elements has to be unique across all other documents.
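A toy model may make this concrete. The class below is only an illustration of "one index key per array element" with a uniqueness check across documents; it is not how MongoDB implements its indexes, and the class name and error text are made up.

```javascript
// Toy model of a unique multikey index; not MongoDB's implementation.
class ToyUniqueMultikeyIndex {
  constructor() {
    this.ownerByKey = new Map(); // index key -> id of the document holding it
  }

  insert(docId, arrayValue) {
    // One index key per distinct array element; repeats inside the same
    // document do not conflict with each other.
    const keys = [...new Set(arrayValue)];
    for (const key of keys) {
      if (this.ownerByKey.has(key)) {
        throw new Error(`duplicate key: ${key}`);
      }
    }
    // Register the keys only once the whole document is conflict free.
    for (const key of keys) {
      this.ownerByKey.set(key, docId);
    }
  }
}

const index = new ToyUniqueMultikeyIndex();
index.insert(1, ["user1", "user2"]);
try {
  index.insert(2, ["user1", "user3"]);
} catch (err) {
  console.log(err.message); // duplicate key: user1
}
```

Sharing a single element with an earlier document is enough to trigger the conflict, which is why the index cannot express "unique as a whole set".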

Related

Is MongoDB's Index Alternative 1 or Alternative 2 or Alternative 3?

Use the common definition that:
Alternative 1 Index = Index stores "Whole data record with key value k"
Alternative 2 Index = Index stores "<k, _id of a data record with search key value k>"
Alternative 3 Index = Index stores "<k, list of _ids of data records with search key value k>"
I checked the MongoDB index documentation at https://docs.mongodb.com/manual/indexes/ and it looks like Alternative 2, but I wanted to confirm.
By default, MongoDB creates a unique index on the _id field when a collection is created. You can see the default index (_id) and any others you have created with the mongo shell.
db.collection.getIndexes() returns an array of documents that hold index information for the collection.
[
{
"v" : 2,
"key" : {
"_id" : 1
},
"name" : "_id_"
},
...
]
v: The version of the index.
key: The indexed field and its sort order; here, the _id field in ascending order. The index on _id is unique.
name: The name of the index.

How does MongoDB sort the documents when an object is used as an index?

If every document has an array of objects, let's say:
hobbies:[
{
"title": "Swimming",
"frequency": 4
},
{
"title": "Playing",
"frequency": 3
}
]
and I use hobbies as an index, how will the documents in my db be stored in sorted order? Which field will be used to sort the documents in the index?
You can create a compound index on the fields of hobbies like this:
db.collection.createIndex( { "hobbies.title": 1, "hobbies.frequency": 1 } )
Because hobbies is an array, hobbies.title and hobbies.frequency also resolve to arrays. When MongoDB finds an array in an indexed field, it creates a multikey index on that field. In the scenario above, the document is effectively unwound into two index entries, one for the first object in the array and one for the second, each keyed on title and then frequency in ascending order.
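As a rough sketch of that unwinding, the plain JavaScript below builds one (title, frequency) entry per array element and sorts the entries the way an ascending compound index would order them. It is a simplified model, not MongoDB internals, and the second document is made-up sample data.

```javascript
// Simplified model: one index entry per element of the hobbies array.
function indexEntries(doc) {
  return doc.hobbies.map((hobby) => ({
    key: [hobby.title, hobby.frequency],
    docId: doc._id,
  }));
}

// Ascending order on title first, then on frequency.
function compareEntries(a, b) {
  if (a.key[0] !== b.key[0]) {
    return a.key[0] < b.key[0] ? -1 : 1;
  }
  return a.key[1] - b.key[1];
}

const docs = [
  {
    _id: 1,
    hobbies: [
      { title: "Swimming", frequency: 4 },
      { title: "Playing", frequency: 3 },
    ],
  },
  // Hypothetical second document, added only to show interleaving.
  { _id: 2, hobbies: [{ title: "Reading", frequency: 7 }] },
];

const entries = docs.flatMap(indexEntries).sort(compareEntries);
entries.forEach((e) => console.log(`${e.key[0]}, ${e.key[1]} -> doc ${e.docId}`));
// Playing, 3 -> doc 1
// Reading, 7 -> doc 2
// Swimming, 4 -> doc 1
```

In this model it is the index entries that are sorted, not the documents: document 1 appears twice, once per hobby.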

Querying composite primary key in mongodb

I have a mongodb collection in which document is stored in the below format
{
"_id": {
"host_ip": "192.41.15.161",
"date": "2020-02-02T08:18:09.207Z"
},
"path": "/apache_pb.gif",
"request": "GET /apache_pb.gif HTTP/1.0",
"status": 200
}
where "host_ip" and "date" should form a composite primary key (i.e. be unique together), and there is an _id index, which I think is built from these two fields.
So, how can I query on host_ip and date together so that the "_id" index can be used?
I tried
db.collection.find({_id: {host_ip: "192.41.15.161", date: {$gte: ISODate('2020-02-02T08:00:00.000Z')}}}), but it does not work; it does not even return the record that should match. Is this not the correct way to query?
A query like
db.collection.find({"_id.host_ip": "192.41.15.161", "_id.date": {$gte: ISODate('2020-02-02T08:00:00.000Z')}}) worked, but it does not use the index on "_id".
When querying on a composite _id, mongo seems to look only for exact matches of the whole embedded document (essentially treating _id like a string), so when you query for _id: {a: x, b: {$gte: y}} it looks for the literal value "{a: x, b: {$gte: y}}" and doesn't allow querying by any portion of the object.
> db.test.insert({_id: {a: 1, b: 2}});
WriteResult({ "nInserted" : 1 })
> db.test.find({_id: {a: 1}}); // no results
> db.test.find({_id: {a: 1, b: {$eq: 2}}}); // no results
As far as solving your problem goes, I'd probably set up a new unique compound index on the fields you care about & ignore the _id field entirely.
The question "MongoDB and composite primary keys" has a few more details.
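A minimal sketch of that suggestion, assuming the two parts of the key are moved to top-level fields and _id is left for MongoDB to generate. The helper name is hypothetical, and the shell commands in the comments assume date is stored as a date value rather than a string.

```javascript
// Hypothetical reshaping step: lift the parts of a composite _id to top level.
function liftCompositeId(doc) {
  const { _id, ...rest } = doc;
  // _id is dropped on purpose so the server can assign its own value.
  return { host_ip: _id.host_ip, date: _id.date, ...rest };
}

// In the mongo shell, the lifted fields would then get their own index:
//   db.collection.createIndex({ host_ip: 1, date: 1 }, { unique: true })
//   db.collection.find({
//     host_ip: "192.41.15.161",
//     date: { $gte: ISODate("2020-02-02T08:00:00.000Z") }
//   })

const stored = {
  _id: { host_ip: "192.41.15.161", date: "2020-02-02T08:18:09.207Z" },
  path: "/apache_pb.gif",
  request: "GET /apache_pb.gif HTTP/1.0",
  status: 200,
};

console.log(liftCompositeId(stored));
```

An equality match on host_ip combined with a range on date follows the field order of such a compound index, which is what the dotted "_id.host_ip" query could not get from the _id index.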

In MongoDB, how to reach the value of the field "name" in an array of objects

I need to find out how many people have a specific friend (by name), where each JSON record has an array of friends. How can I reach the value of the field "name" in the array of objects in the MongoDB shell? Maybe this field needs to be indexed before performing searches on it?
"friends" : [{ "id" : 0, "name" : "Baird Fitzpatrick" }, { "id" : 1, "name" : "Karyn Benjamin" }, { "id" : 2, "name" : "Stacey Fuentes" }]
You can search a nested object in MongoDB by providing the path to the nested property:
db.YOURCOLLECTION.find({"friends.name": "ABC"});
Read more about nested object query here
Maybe this field needs to be indexed before performing searches on it?
Of course, you can create an index on a nested field. Just use the path to the nested field as the index key, as above:
db.YOURCOLLECTION.createIndex( { "friends.name": 1 } )
For advanced usage, you can read about the $regex query operator.
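To get the actual count the question asks for, here is a small plain-JavaScript model of what the dotted-path match does: a document matches when at least one element of friends has the given name. The function names and the extra sample people are made up for illustration; the shell call in the comment is the rough equivalent on recent shells.

```javascript
// Model of the match { "friends.name": name }: true when any array element
// has a name field equal to the given value.
function hasFriendNamed(person, name) {
  return Array.isArray(person.friends) &&
    person.friends.some((friend) => friend.name === name);
}

function countPeopleWithFriend(people, name) {
  return people.filter((person) => hasFriendNamed(person, name)).length;
}

// Rough shell equivalent (not run here):
//   db.YOURCOLLECTION.countDocuments({ "friends.name": "Karyn Benjamin" })

const people = [
  {
    friends: [
      { id: 0, name: "Baird Fitzpatrick" },
      { id: 1, name: "Karyn Benjamin" },
      { id: 2, name: "Stacey Fuentes" },
    ],
  },
  // Hypothetical extra records.
  { friends: [{ id: 0, name: "Karyn Benjamin" }] },
  { friends: [] },
];

console.log(countPeopleWithFriend(people, "Karyn Benjamin"));    // 2
console.log(countPeopleWithFriend(people, "Baird Fitzpatrick")); // 1
```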

How do I find documents with array field that contains as many of the keywords as possible?

Let's say I want to find the documents whose "tags" field contains the tags "a", "b", "c".
If I use the $and operator, it will only return the documents where "tags" contains all three tags.
Such a strict search is not what I want. If I use the $or operator, it will find docs that contain at least one tag from the list, but it won't first check whether there are docs that contain several or all of them.
What I want is to search for docs that contain "as many tags as possible, but at least one", or in other words, find all the docs that contain at least one tag, but show the ones with the most matches first.
I know that I could do this with a series of queries (e.g., an $and query and then an $or query), but if there are more than 2 tags, I'll have to run lots of queries with different combinations of tags to get good results.
You can aggregate the results as below:
$match all the documents which have at least one match.
$project a variable weight which holds the number of matching tags that the document contains. To find the matching tags, use the $setIntersection operator.
$sort by the weight in descending order.
$project the required fields.
sample data:
db.t.insert([{"tags":["a","b","c"]},
{"tags":["a"]},
{"tags":["a","b"]},
{"tags":["a","b","c","d"]}])
search criteria:
var search = ["a","b"];
code:
db.t.aggregate([
{$match:{"tags":{$in:search}}},
{$project:{"weight":{$size:{$setIntersection:["$tags",search]}},
"tags":"$tags"}},
{$sort:{"weight":-1}},
{$project:{"tags":1}}
])
output:
{
"_id" : ObjectId("54e23b74c6185de718484948"),
"tags" : [
"a",
"b",
"c"
]
}
{ "_id" : ObjectId("54e23b74c6185de71848494a"), "tags" : [ "a", "b" ] }
{
"_id" : ObjectId("54e23b74c6185de71848494b"),
"tags" : [
"a",
"b",
"c",
"d"
]
}
{ "_id" : ObjectId("54e23b74c6185de718484949"), "tags" : [ "a" ] }
