My goal is to find songs that match a user's choices. The user can exclude genres they don't like.
This is basically a duplicate of this question, except that I'd like to do it in an Aggregate operation instead of a Find, because I need to add other stages.
Right now, I am trying to exclude, within the aggregation, songs that belong to genres the user doesn't like.
My song JSON looks like this (keep in mind this is dummy data, but this song is really good though):
{
"_id": {
"$oid": "5890aa3b0a9f110011698fac"
},
"artist": "Beach House",
"songName": "Master Of None",
"genres": [
{
"$oid": "58624b4298fba881a46663a01"
},
{
"$oid": "58624b9d98fba772a46663a05"
}
]
}
A song can have multiple genres, stored in an array of objects as references to Genre documents in a different collection.
The user's disliked genres are stored as an array of genre $oid values, for example:
dislikedGenres = [ "786761gg1G176ga1", "78676187g1G176hsj3", "78676187g1G1761sj4" ]
What I'm trying to do is say "if you find a song with any of these genres, exclude it".
Any idea how to achieve this? I feel like I'm missing something dramatically obvious here...
Thanks in advance for the help! Much appreciated.
Please leave a comment if you need extra info.
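One possible approach, sketched here in Python: a `$match` stage using `$nin` on the `genres` array keeps only songs in which none of the listed ids appears, and any further stages can follow it in the same pipeline. The function name `build_exclusion_pipeline` and the placeholder ids are hypothetical. With a real driver the ids would need to be ObjectId values rather than plain strings, so that they compare equal to the stored references.

```python
def build_exclusion_pipeline(disliked_genres, extra_stages=()):
    """Return an aggregation pipeline that drops songs tagged with any disliked genre.

    Assumption: disliked_genres holds values of the same type as the entries
    stored in the songs' "genres" array (ObjectId in the question's data).
    """
    match_stage = {"$match": {"genres": {"$nin": list(disliked_genres)}}}
    # Later stages ($sort, $limit, ...) simply follow the filter.
    return [match_stage, *extra_stages]


pipeline = build_exclusion_pipeline(
    ["genre-a", "genre-b"],          # placeholder ids, for illustration only
    extra_stages=[{"$limit": 20}],
)
print(pipeline)
```

With pymongo, a list like this would be passed to `collection.aggregate(pipeline)`.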
Below is my MongoDb collection structure:
{
"_id": {
"$oid": "61efa44933eabb748152a250"
},
"title": "my first blog",
"body": "Hello everyone,wazuzzzzzzzzzzzzzzzzzup",
"comments": [{
"comment": "fabulous work bruhv"
}]
}
Is there a way to auto generate ids for comments without using something like this:
db.messages.insert({messages:[{_id:ObjectId(), message:"Message 1."}]});
I found the above method from the SO question:
mongoDB : Creating An ObjectId For Each New Child Added To The Array Field
But someone in the comments pointed out that:
"I have been looking at how to generate a JSON insert using ObjectID() and in my travels have found that this solution is not very good for indexing. The proposed solution has _id values that are strings rather than real object IDs - i.e. "56a970405ba22d16a8d9c30e" is different from ObjectId("56a970405ba22d16a8d9c30e") and indexing will be slower using the string. The ObjectId is actually represented internally as 16 bytes."
So is there a better way to do this?
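A side note on the quoted comment: an ObjectId is written as 24 hexadecimal characters, which decode to 12 bytes rather than 16. The small check below uses only the Python standard library and the id from the quote; it also shows that the string form is twice the size of the binary form.

```python
hex_id = "56a970405ba22d16a8d9c30e"   # id taken from the quoted comment

raw = bytes.fromhex(hex_id)           # binary form of the same id
print(len(hex_id), len(raw))          # number of characters vs number of bytes
```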
I am learning ArangoDB at work.
I model the "follow" relation between users as edges, and when a user is deleted I want the corresponding follow edges to be deleted as well.
users collection
{
"_key": "test4",
"_id": "users/test4",
"_rev": "_V8yGRra---"
},
{
"_key": "test2",
"_id": "users/test2",
"_rev": "_V8whISG---"
},
{
"_key": "test1",
"_id": "users/test1",
"_rev": "_V8whFQa---"
},
{
"_key": "test3",
"_id": "users/test3",
"_rev": "_V8yoFWO---",
"userKey": "test3"
}
follow collection (edge)
{
"_key": "48754",
"_id": "follow/48754",
"_from": "users/test1",
"_to": "users/test2",
"_rev": "_V8wh4Xe---"
}
{
"_key": "57447",
"_id": "follow/57447",
"_from": "users/test2",
"_to": "users/test3",
"_rev": "_V8yHGQq---"
}
If used properly, the ArangoDB system ensures the integrity of named graphs (GRAPHs).
To delete a specific user (say "users/test4") and the corresponding edges in follow manually, an AQL query along the following lines should suffice to delete the edges:
for v,e IN 1..1 ANY "users/test4" follow
REMOVE e IN follow
COLLECT WITH COUNT INTO counter
RETURN counter
Assuming "users/test4" is not referenced elsewhere, the node can then safely be deleted, e.g. by
REMOVE "test4" in users
The important point here is that when manually deleting nodes, all the relevant edge collections must be identified and managed explicitly.
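To make the bookkeeping concrete, here is a small in-memory Python sketch of the same idea. It does not talk to ArangoDB; the lists stand in for the collections from the question, and `remove_user` is a hypothetical helper. Every edge collection that may reference the vertex has to be passed in explicitly.

```python
users = [{"_id": "users/test1"}, {"_id": "users/test2"},
         {"_id": "users/test3"}, {"_id": "users/test4"}]
follow = [
    {"_key": "48754", "_from": "users/test1", "_to": "users/test2"},
    {"_key": "57447", "_from": "users/test2", "_to": "users/test3"},
]


def remove_user(user_id, users, edge_collections):
    """Delete a vertex and every edge touching it, in either direction."""
    removed_edges = 0
    for edges in edge_collections:      # each relevant edge collection is listed by hand
        kept = [e for e in edges if user_id not in (e["_from"], e["_to"])]
        removed_edges += len(edges) - len(kept)
        edges[:] = kept                 # update the list in place
    users[:] = [u for u in users if u["_id"] != user_id]
    return removed_edges


removed = remove_user("users/test2", users, [follow])
print(removed, follow, [u["_id"] for u in users])
```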
First you should create a named graph from your vertex and edge collections. When working with a graph, you can use the REST API to remove a vertex; this removes the vertex itself together with all edges connected to it.
DELETE /_api/gharial/{graph-name}/vertex/{collection-name}/{vertex-key}
You can find the documentation including an example under https://docs.arangodb.com/3.2/HTTP/Gharial/Vertices.html#remove-a-vertex.
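As a rough illustration of how the placeholders in that route are filled in, the Python sketch below builds the request path. The helper name `gharial_vertex_path` and the graph name "social" are assumptions made for the example; sending the request (host, port, authentication) is left out.

```python
from urllib.parse import quote


def gharial_vertex_path(graph_name, collection_name, vertex_key):
    """Build the path of the 'remove a vertex' route quoted above."""
    parts = (graph_name, collection_name, vertex_key)
    graph, collection, key = (quote(p, safe="") for p in parts)
    return f"/_api/gharial/{graph}/vertex/{collection}/{key}"


# "social" is a hypothetical graph name; users/test1 comes from the question.
path = gharial_vertex_path("social", "users", "test1")
print("DELETE", path)
```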
It is also possible to achieve this with an AQL query, for example deleting test1 from the users collection:
LET keys = (
FOR v, e IN 1..1 ANY 'users/test1' GRAPH 'your-graph-name' RETURN e._key)
LET r = (FOR key IN keys REMOVE key IN follow) REMOVE 'test1' IN users
A graph traversal collects the _key attributes of all edges connected to test1 (ANY follows edges in both directions); these edges are then removed from the follow collection, and finally test1 is removed from the users collection.
I recently found jsonschema and I've been loving it. However, I've come across something I want to do that I just haven't been able to figure out.
What I want to do is to validate that an array must contain an element that matches a schema, but I don't want to have validation fail on other elements that would be in the list.
Say that I have an array like the following:
arr = [
{"some object": True},
False,
{"AnotherObj": "a string this time"},
"test"
]
I want to be able to do something like "validate that arr contains an object that has a property 'some object' that is a boolean, and error if it doesn't, but don't care about other elements."
I don't want it to validate the other items in the list. I just want to make sure that the list contains an element that matches the schema at least once. I also do not know the order which the elements will arrive in the array.
I've tried this already with a schema like:
{"type": "array",
"items": {
"type": "object",
"properties": {
"tool": {
# A schema here to validate tool
}
},
"required": ["tool"]
}
}
The problem is that it requires every item in the array to have the property "tool", which is not what I want.
Any help anyone can give me with this would be much appreciated! I've been stumped on this for a really long time with no forward progress.
Thanks!
I've gotten an answer to this question:
The schema used is (where ... B ... is the schema to require):
{
"type": "array",
"not": {
"items": {
"not": {... B ...}
}
}
}
It basically works out to be something like "Ensure that not (items don't match B)". I'm not 100% clear on why this works the way it does, but it does so I figured I'd share it for posterity.
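The reason this works: `items` passes only when every element matches its subschema, so `"items": {"not": B}` means "no element matches B", and the outer `not` flips that into "at least one element matches B". The Python sketch below mirrors that logic with `all`. Here `matches_b` is a hypothetical stand-in for schema B (an object whose "some object" property is a boolean), not a call into the jsonschema library.

```python
def matches_b(item):
    """Stand-in for schema B: an object with a boolean 'some object' property."""
    return isinstance(item, dict) and isinstance(item.get("some object"), bool)


def not_items_not_b(arr):
    # "items": {"not": B} holds when every element fails B ...
    every_item_fails_b = all(not matches_b(x) for x in arr)
    # ... and the outer "not" negates that.
    return not every_item_fails_b


arr = [{"some object": True}, False, {"AnotherObj": "a string this time"}, "test"]

print(not_items_not_b(arr))               # True: one element matches B
print(not_items_not_b([False, "test"]))   # False: nothing matches B
print(not_items_not_b([]))                # False: an empty array has no match
```

Later drafts of JSON Schema (draft-06 onward) add a `contains` keyword that expresses the same requirement directly.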
I'm a newbie with MongoDB.
I made a document with nested arrays like this:
data = {
"title": "mongo community",
"description": "I am a new bee",
"topics": [{
"title": "how to find object in array",
"comments": [{
"description": "desc1"
}]
},
{
"title": "the case to use ensureIndex",
"comments": [{
"description": "before query"
},
{
"description": "If you want"
}
]
}
]
}
After that, I inserted it into the "community" collection:
db.community.insert(data)
Now I would like to retrieve only the comments of the topic whose title is "how to find object in array".
So I tried:
data = db.community.find_one({"title":"mongo community","topics.title":"how to find object in array" } )
The result is:
>>> print data
{
u'topics': [{
u'comments': [{
u'description': u'desc1'
}],
u'title': u'how to find object in array'
},
{
u'comments': [{
u'description': u'before query'
},
{
u'description': u'If you want'
}],
u'title': u'the case to use ensureIndex'
}],
u'_id': ObjectId('4e6ce188d4baa71250000002'),
u'description': u'I am a new bee',
u'title': u'mongo community'
}
I don't need the topic "the case to use ensureIndex" in the result.
Could you give me any advice?
Thanks.
It looks like you're embedding topics as an array all in a single document. You should try to avoid returning partial documents frequently from MongoDB. You can do it with the "fields" argument of the find method, but it isn't very easy to work with if you're doing it frequently.
So to solve this you could try to make each topic a separate document. I think that would be easier for you too. If you want to store information about the "community" forum itself, put it in a separate collection. For example, you could use the following in the mongo shell:
// add a forum:
var forum = {
title:"mongo community",
description:"I am a new bee"
};
db.forums.save(forum);
// add first topic:
var topic = {
title: "how to find object in array",
comments: [ {description:"desc1"} ],
forum:"mongo community"
};
db.topics.save(topic);
// add second topic:
var topic = {
title: "the case to use ensureIndex",
comments: [
{description:"before query"},
{description:"If you want"}
],
forum:"mongo community"
};
db.topics.save(topic);
print("All topics:");
printjson(db.topics.find().toArray());
print("just the 'how to find object in array' topic:")
printjson(db.topics.find({title:"how to find object in array"}).toArray());
Also, see the document Trees In MongoDB about schema design in MongoDB. It happens to be using a similar schema to what you are working with and expands on it for more advanced use cases.
MongoDB operates on documents, that is, the top level documents (the things you save, update, insert, find, and find_one on). Mongo's query language lets you search within embedded objects, but will always return, update, or manipulate one (or more) of these top-level documents.
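Because the whole top-level document comes back, one option with the embedded schema is to pick out the wanted topic in application code after the query. The Python sketch below works on a plain dict shaped like the result of `find_one` in the question; `comments_for_topic` is a hypothetical helper, and no database connection is involved.

```python
def comments_for_topic(community, topic_title):
    """Collect the comments of every embedded topic with the given title."""
    return [
        comment
        for topic in community.get("topics", [])
        if topic.get("title") == topic_title
        for comment in topic.get("comments", [])
    ]


# Same shape as the document from the question.
data = {
    "title": "mongo community",
    "topics": [
        {"title": "how to find object in array",
         "comments": [{"description": "desc1"}]},
        {"title": "the case to use ensureIndex",
         "comments": [{"description": "before query"},
                      {"description": "If you want"}]},
    ],
}

print(comments_for_topic(data, "how to find object in array"))
```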
MongoDB is often called "schema-less," but something more like "(has) flexible schemas" or "(has) per-document schemas" would be a more accurate description. This is a case where your schema design -- having topics embedded directly within a community -- is not working for this particular query. However there are probably other queries that this schema supports more efficiently, like listing the topics within a community in a single query. You might want to consider the queries you want to make and re-design your schema accordingly.
A few notes on MongoDB limitations:
top-level documents are always returned (optionally with only a subset of fields, as @scott noted -- see the mongodb docs on this topic)
each document is limited to 16 megabytes of data (as of version 1.8+), so this schema will not work well if the communities have a long list of topics
For help with schema design, see the mongodb docs on schema design, Kyle Banker's video "Schema Design Basics", and Eliot Horowitz's video "Schema Design at Scale" for an introduction, tips, and considerations.
Hello everyone and thanks in advance for any ideas, suggestions or answers.
First, the environment: I am using CouchDB (currently developing on 1.0.2) and couchdb-lucene 0.7. Obviously, I am using couchdb-lucene ("c-l" hereafter) to provide full-text searching within couchdb.
Second, let me provide everyone with an example couchdb document:
{
"_id": "5580c781345e4c65b0e75a220232acf5",
"_rev": "2-bf2921c3173163a18dc1797d9a0c8364",
"$type": "resource",
"$versionids": [
"5580c781345e4c65b0e75a220232acf5-0",
"5580c781345e4c65b0e75a220232acf5-1"
],
"$usagerights": [
{
"group-administrators": 31
},
{
"group-users": 3
}
],
"$currentversionid": "5580c781345e4c65b0e75a220232acf5-1",
"$tags": [
"Tag1",
"Tag2"
],
"$created": "/Date(1314973405895-0500)/",
"$creator": "administrator",
"$modified": "/Date(1314973405895-0500)/",
"$modifier": "administrator",
"$checkedoutat": "/Date(1314975155766-0500)/",
"$checkedoutto": "administrator",
"$lastcommit": "/Date(1314973405895-0500)/",
"$lastcommitter": "administrator",
"$title": "Test resource"
}
Third, let me explain what I want to do. I am trying to figure out how to index the '$usagerights' property. I am using the word index very loosely because I really do not care about being able to search it, I simply want to 'store' it so that it is returned with the search results. Anyway, the property is an array of json objects. Now, these json objects that compose the array will always have a single json property.
Based on my understanding of couchdb-lucene, I need to reduce this array to a comma separated string. I would expect something like "group-administrators:31,group-users:3" to be a final output.
Thus, my question is essentially: How can I reduce the $usagerights json array above to a comma separated string of key:value pairs within the couchdb design document as used by couchdb-lucene?
A previous question I posted regarding indexing of tagging in a similar situation, provided for reference: How-to index arrays (tags) in CouchDB using couchdb-lucene
Finally, if you need any additional details, please just post a comment and I will provide it.
Maybe I am missing something, but the only difference I see from your previous question, is that you should iterate on the objects. Then the code should be:
function(doc) {
var result = new Document(), usage, right;
for(var i in doc.$usagerights) {
usage = doc.$usagerights[i];
for(right in usage) {
result.add(right + ":" + usage[right]);
}
}
return result;
}
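For reference, the string format asked for in the question can be sketched in Python as below. This only illustrates the transformation on the sample data; it is not couchdb-lucene code, and `usage_rights_to_string` is a hypothetical helper.

```python
def usage_rights_to_string(usage_rights):
    """Flatten [{"name": value}, ...] into "name:value,name:value"."""
    pairs = []
    for entry in usage_rights:
        for name, value in entry.items():
            pairs.append(f"{name}:{value}")
    return ",".join(pairs)


rights = [{"group-administrators": 31}, {"group-users": 3}]
print(usage_rights_to_string(rights))
```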
There's no requirement to convert to a comma-separated list of values (I'd be intrigued to know where you picked up that idea).
If you simply want the $usagerights item returned with your results, do this:
ret.add(JSON.stringify(doc.$usagerights),
{"index":"no", "store":"yes", "field":"usagerights"});
Lucene stores strings, not JSON, so you'll need to JSON.parse the string on query.
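The round trip described here (serialize when indexing, parse when reading a hit) looks like this in Python with the standard `json` module. The variable names are illustrative, and the sample value is the $usagerights array from the question.

```python
import json

usage_rights = [{"group-administrators": 31}, {"group-users": 3}]

stored = json.dumps(usage_rights)   # what goes into the stored field: a plain string
restored = json.loads(stored)       # what the client does with a search hit

print(type(stored).__name__, restored == usage_rights)
```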