This is a fairly generic issue, and after just getting started with MongoDB I am trying to find which is the better option for this schema design.
Suppose we have these entities: Product, Category and Order.
In an SQL scenario, this would be the traditional approach for the schema:
Category -> Product -> OrderDetails <- Order
(A Category has many Products and each Product has only one Category. A Product is in many Orders and an Order has many Products, so we create an OrderDetails join table.)
How should the same design be expressed in a MongoDB schema?
Suppose a category has at most 10 to 12 products.
Scenario A: Embedding Products into Categories:
Category:
{
"_id" : Category_ID,
"name" : "Category description",
"products" : [ { "product_name": "First product", "price": 500 }, { "product_name": "Second product", "price": 150 } ]
}
Order:
{
"_id" : Order_ID,
"order_total" : 500,
"products" : [ { "product_name": "First product", "price": 500 }]
}
Scenario B: Embedding Category into Product AND having an arrays of product references in Order
Product
{
"_id" : First product,
"price" : 500,
"category" : { "category_name": "First category name", "description": "some random description" }
}
Order
{
"_id" : Order_ID,
"order_total" : 500,
"products" : [ product id1, product id2]
}
Depending on how many products each category has, embedding the products inside the category might be ideal (bear in mind the 16MB limit on documents). It also depends on the usage pattern: do you always fetch a category and display the products inside it, or do you usually fetch a single product at a time?
Another question to ask: does the category need its own identity? Do you have a page that lists more information about a category, or is a category just a description? If it is just a description, I'd personally treat it more as a tag, as in the example below:
{
"_id" : id,
"name" : "Product name",
"description" : "Product name",
"price" : 500,
"categories" : [ "Category name 1", "Category name 2" ]
}
As for the order document, I'd copy the whole product into the order. You often see a similar pattern in SQL schemas, because if the product later changes (its price or name, say), you want the order to hold a snapshot of the product as it was at the time of the order. You don't want the price of an order to change after the customer has paid and it has shipped.
{
"_id" : Order_ID,
// "order_total" : 500 * this would be calculated
"products" : [
{
"_id" : id,
"name" : "Product name 1",
"description" : "Product name 1",
"price" : 500,
"categories" : [ "Category name 1", "Category name 2" ]
},
{
"_id" : id,
"name" : "Product name 2",
"description" : "Product name 2",
"price" : 500,
"categories" : [ "Category name 1", "Category name 2" ]
}
]
}
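To illustrate the snapshot idea, here is a minimal sketch in plain JavaScript. The helper name buildOrder and the sample IDs "p1" and "o1" are made up for this example, and it assumes the product documents contain only plain JSON values.

```javascript
// Hypothetical helper: builds an order document that embeds a snapshot
// of each product as it looked when the order was placed.
function buildOrder(orderId, products) {
  // Deep-copy so later edits to the catalog objects cannot leak into the order.
  // Assumption: products hold only plain JSON values (no Date or ObjectId).
  const snapshot = products.map((p) => JSON.parse(JSON.stringify(p)));
  // The total is calculated from the snapshot prices.
  const orderTotal = snapshot.reduce((sum, p) => sum + p.price, 0);
  return { _id: orderId, order_total: orderTotal, products: snapshot };
}

const catalogProduct = {
  _id: "p1",
  name: "Product name 1",
  price: 500,
  categories: ["Category name 1", "Category name 2"],
};

const order = buildOrder("o1", [catalogProduct]);

// A later price change in the catalog does not touch the stored order.
catalogProduct.price = 650;
console.log(order.products[0].price, order.order_total);
```

The order keeps the price the customer actually paid, while the catalog entry is free to change.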
I know this isn't a pure answer, more a bunch of further questions, but I hope it gives some direction for your schema design.
I am having trouble understanding why an index is not able to cover a certain query, when my interpretation of documentation suggests it should... :)
The document I am referring to is: https://docs.mongodb.com/manual/core/index-multikey/
I am creating an index on a property that is part of an array of objects. The indexed value is also present in other documents. The query looks up the value of that property in the array directly, but when I look at the plan in the profiler, it scans the entire collection.
The structure of the document is as follows:
{
"userEmail": "string",
"basicInformation": {
"name" : "string"
},
"events": {
"live" : [
{"eventId": "id of event 1", // <--- field indexed : "events.live.eventId"
"date" : "date of event",
"duration": n},
{"eventId": "id of event 2",
"date" : "date of event",
"duration": n},
...
],
"onDemand" : [
{"eventId": "id of event 1", // <--- field indexed : "events.onDemand.eventId"
"date" : "date of event",
"duration": n},
{"eventId": "id of event 2",
"date" : "date of event",
"duration": n},
...
]
}
}
QUERY:
{
$facet: {
"liveUsers": [
{$match: {"events.live.eventId": "id of event 1"}},
{ $project: { .... }}
],
"onDemandUsers": [
{$match: {"events.onDemand.eventId": "id of event 1"}},
{ $project: { .... }}
]
}
}
The plan does not seem to use the index and scans the collection. The collection currently holds over 63K documents, which triggers alerts. Can you help me understand how the indexes should be built, or the query restructured, so that we can avoid the full collection scan?
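One thing worth checking: the MongoDB documentation states that the $facet stage and its sub-pipelines cannot use indexes, even when a sub-pipeline starts with $match. A possible restructuring, sketched below, puts a regular $match in front of $facet so that the index lookup happens before the facets run. The helper name buildEventUsersPipeline is made up for this example, the sketch assumes the onDemandUsers facet is meant to filter on "events.onDemand.eventId", and the elided $project stages are left out.

```javascript
// Hypothetical helper that builds the aggregation pipeline for one event ID.
function buildEventUsersPipeline(eventId) {
  return [
    // Leading $match: unlike the $match stages inside $facet, this stage
    // is eligible to use the indexes on the two eventId fields.
    {
      $match: {
        $or: [
          { "events.live.eventId": eventId },
          { "events.onDemand.eventId": eventId },
        ],
      },
    },
    // $facet now only sees the documents that passed the first stage.
    {
      $facet: {
        liveUsers: [{ $match: { "events.live.eventId": eventId } }],
        onDemandUsers: [{ $match: { "events.onDemand.eventId": eventId } }],
      },
    },
  ];
}

const pipeline = buildEventUsersPipeline("id of event 1");
console.log(JSON.stringify(pipeline, null, 2));
```

The resulting array can be passed to aggregate() on the collection, and running it with explain should show whether the first stage uses an index scan instead of a collection scan.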
"-Kj9Penv_LMRUIPSet0b" : {
"categories" : [ "food", "fashion"],
"contact" : "profile/contact/eieiiieie888x7ww28288_x22",
"location" : "New York, United States",
"name" : "Billybob Smith",
"social" : {
"twitter" : {
"followers" : "1,002",
"nickname" : "#billybob"
}
},
"state" : "0"
},
"eieiiieie888x7ww28288_x22" : {
"categories" : [ "food", "fashion" ],
"contact" : "profile/contact/eieiiieie888x7ww28288_x22",
"location" : "New York, United States",
"name" : "Billybob Smith.",
"social" : {
"twitter" : {
"followers" : "1,002",
"nickname" : "#billybob"
}
},
"socialID" : "twitter_id|558969977",
"state" : "0",
"uniqueID" : "eieiiieie888x7ww2828"
},
This is one JSON example of a duplicate in my database, and I have a lot of them. The only piece of data I capture that uniquely identifies each user is their contact link. What is my best course of action to find and remove duplicates? I'm totally stuck. The second entry in the example is the more accurate and complete one, so ideally I would remove the first and keep the second.
Could really use some help here! Thank you so much!
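One possible approach, sketched here in plain JavaScript: group the entries by their contact link and, within each group, keep the entry with the most fields. The helper name findDuplicateKeys is made up, the sample data is a trimmed copy of the two entries above, and "more top-level fields means more complete" is an assumption that may not hold for every record.

```javascript
// Hypothetical helper: `records` is the parsed JSON object, keyed by record ID.
// Returns the keys of the entries that should be removed.
function findDuplicateKeys(records) {
  const keptByContact = new Map(); // contact link -> key of the entry to keep
  const toRemove = [];
  for (const [key, entry] of Object.entries(records)) {
    if (!entry.contact) continue; // cannot be matched without a contact link
    const keptKey = keptByContact.get(entry.contact);
    if (keptKey === undefined) {
      keptByContact.set(entry.contact, key);
      continue;
    }
    // Assumption: the entry with more top-level fields is the more complete one.
    const keptSize = Object.keys(records[keptKey]).length;
    const thisSize = Object.keys(entry).length;
    if (thisSize > keptSize) {
      toRemove.push(keptKey);
      keptByContact.set(entry.contact, key);
    } else {
      toRemove.push(key);
    }
  }
  return toRemove;
}

const records = {
  "-Kj9Penv_LMRUIPSet0b": {
    contact: "profile/contact/eieiiieie888x7ww28288_x22",
    name: "Billybob Smith",
    state: "0",
  },
  "eieiiieie888x7ww28288_x22": {
    contact: "profile/contact/eieiiieie888x7ww28288_x22",
    name: "Billybob Smith.",
    socialID: "twitter_id|558969977",
    state: "0",
    uniqueID: "eieiiieie888x7ww2828",
  },
};

const duplicateKeys = findDuplicateKeys(records);
console.log(duplicateKeys);
```

It is probably safest to review the returned keys before deleting anything, since the completeness rule is only a heuristic.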
Let's say I have a collection of documents in the following format:
{
// some fields
"name" : "some name",
"specs" : [
{
"key" : {
"en" : "English key name",
"xx" : "Other key name",
},
"value" : {
"en" : "English value",
"xx" : "Other value",
}
},
{
"key2" : {
"en" : "English key name2",
"xx" : "Other key name2",
},
"value2" : {
"en" : "English value2",
"xx" : "Other value2",
}
},
//and some more sub-documents
],
}
I'm trying to query it from the database to get it in the following format:
{
"name" : "some name",
"specs" : [
{
"key" : "English key name",
"value" : "English value",
},
{
"key2" : "English key name2",
"value2" : "English value2",
},
//and some more sub-documents
],
}
How can it be done, if it is possible at all?
Background
I'm building software that must be available in multiple languages, and I think the current document schema is the most suitable for this (if you have better ideas for the schema, I'd like to see them).
To minimize the amount of data queried from the database, I'm trying to select the data in only one language. I also want to minimize the nesting of structures in the code, so I'm looking for a way to select a value out of a sub-document and replace the sub-document with it.
I've tried many ways of writing such a query. Here is one, but it doesn't work as I expect:
db.software.aggregate({
$project : {
"name" : true,
"specs" : {
"key" : "$specs.key.en",
"value" : "$specs.value.en"
}
}
});
It transforms key into an array of all "key.en" fields within the specs field. Is there a way to reference the current array element inside "specs" instead of the whole specs array?
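One operator that addresses the current element is $map (available since MongoDB 2.6), which evaluates an expression once per array element. The sketch below only builds the pipeline. The helper name localizedSpecsPipeline is made up, and it assumes every element of specs uses the field names key and value, so the "key2"/"value2" variant in the sample above would need separate handling.

```javascript
// Hypothetical helper: returns a pipeline that keeps one language per spec.
// Assumption: every element of "specs" has the fields "key" and "value".
function localizedSpecsPipeline(lang) {
  return [
    {
      $project: {
        name: true,
        specs: {
          // $map runs the "in" expression once for each element of specs;
          // "$$spec" refers to the current element, not the whole array.
          $map: {
            input: "$specs",
            as: "spec",
            in: {
              key: "$$spec.key." + lang,
              value: "$$spec.value." + lang,
            },
          },
        },
      },
    },
  ];
}

const specsPipeline = localizedSpecsPipeline("en");
console.log(JSON.stringify(specsPipeline, null, 2));
```

In the shell this would be used as db.software.aggregate(localizedSpecsPipeline("en")).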
I have a problem that, in my view, is quite complicated, and I would like to hear other people's thoughts on it.
The fields are created with angular-formly and rendered by a directive. The problem is saving the data: it seems I would need two ng-repeats, and I have not found a way to avoid that. The code below is one more unsuccessful attempt:
Repeat:
<md-card-content>
<h2 class="md-title">{{g.title}}</h2>
<div ng-repeat="f in g.fields and i in item.grupos.fields">
<field-create field="f" ng-model="i.model"></field-create>
<p>result: {{f | json}}</p>
</div>
</md-card-content>
</md-card>
If anyone is interested in helping, I can provide more details from the rest of the code.
I hope for some ideas :)
UPDATE: new idea, no success.
The problem is this: my resource stores field definitions, which are used to create form fields dynamically. What I cannot work out is how to save the data entered into those fields.
A group of fields is created in the resource.
This field group is loaded in another view, where a directive checks whether each field is an input, textarea or checkbox and renders it.
When the fields are saved from that view, new entries are created in the resource containing the data entered in the fields.
Now my problem: I am not able to show this saved data in the view, because I already have one repeat iterating over the field definitions in the resource, and I need another repeat to populate those fields.
I hope I managed to explain it; it is complicated.
I created a JSBin to help: http://jsbin.com/faquhizupo/1/edit?html,js,console,output
It is not 100% clear what you intend with this large portion of code, but I assume you are having difficulty with nested loops.
If you want a nested ng-repeat, use the syntax below.
JSON Object
$scope.myDataSet = [
{
"id" : 1234,
"desc" : "My Description",
"data": [
{
"sub_id" : "sub id 1",
"field1" : "Value 1",
"field2" : "Value 2"
},
{
"sub_id" : "sub id 2",
"field1" : "Value 3",
"field2" : "Value 4"
},
{
"sub_id" : "sub id 3",
"field1" : "Value 5",
"field2" : "Value 6"
}
]
},
{
"id" : 4567,
"desc" : "My Description2",
"data": [
{
"sub_id" : "sub id 3",
"field1" : "Value 1",
"field2" : "Value 2"
},
{
"sub_id" : "sub id 4",
"field1" : "Value 3",
"field2" : "Value 4"
},
{
"sub_id" : "sub id 5",
"field1" : "Value 5",
"field2" : "Value 6"
}
]
}
];
This can be displayed in HTML as follows:
<div ng-repeat="data in myDataSet track by $index">
{{data.desc}}
<ul>
<li ng-repeat="record in data.data track by $index">
<div>
<h3>{{record.sub_id}}</h3>
{{record.field1}}
</div>
</li>
</ul>
</div>
If you want to access the outer loop's index from inside the inner loop, use $parent.$index.
I have data for customers with more than one address, with a JSON representation like this:
{
"firstName" : "Max",
"lastName" : "Mustermann",
"addresses" : [{
"city" : "München",
"houseNumber" : "1",
"postalCode" : "87654",
"street" : "Leopoldstraße"
}, {
"city" : "Berlin",
"houseNumber" : "2a",
"postalCode" : "12345",
"street" : "Kurfürstendamm"
}
]
}
This JSON is stored in a column named json, of data type json, in a table named customer.
I want to query like this:
SELECT *
FROM customer cust,
json_array_elements(cust.json#>'{addresses}') as adr
WHERE adr->>'city' like '%erlin'
and adr->>'street' like '%urf%';
The query works fine, but I can't create an index that PostgreSQL 9.3.4 can use.
Any ideas?