Tag-based search model in MongoDB - database

I am creating a tag-based search engine for various kinds of things in MongoDB.
I have blog, testimonial, comment, book and image documents, and each of them has a tags array field.
When I fetch a book that has certain tags associated with it, I would also like to fetch the blogs, testimonials and comments with those tags.
I would like to do the same when I fetch a blog: fetch everything else that shares the blog's tags.
I am designing my database model. What is the best way to handle this kind of tag-based search?
Currently what I am thinking is:
add tags to each document
on fetch, take the tags and search through all the other collections
take the matches and send them along with the result
Is this the best way? How should I design the model?
Update:
I will perform searches more frequently.

If you need to repeat tags across multiple collections, I would rather create a separate tags collection.
Why would I move tags into their own collection?
Suppose you need to rename a tag in the future, perhaps to fix a typo: you would have to iterate over all your collections searching for that tag. Wouldn't it be easier if you only had to change it in one place?
Embedding arrays and objects in a document is a powerful tool, but there are times when it is not the best solution. This case is one of them, and you should avoid repetition as much as you can.
The official documentation also discusses avoiding repetition.
Collection Structure
Create a tags collection and store each tag's ObjectId, instead of the tag itself, in the tags array of the other documents, like below.
// tags collection
{
  _id: <ObjectId1>,
  title: "trending"
}
// all other documents (blogs, testimonials...)
{
  _id: <ObjectId2>,
  tags: [
    <ObjectId1>
  ],
  // other stuff...
}
Fetching tag related documents in one hit
When you fetch a document, you can take all of its tags and look for other documents with related tags using the $in operator, like this:
db.blogs.find({
  tags: {
    $in: [
      <ObjectId1>,
      <ObjectIdX>,
      // other tags ids
    ]
  }
})
This returns, in a single query, all the documents that match one or more of the tags.
More about $in operator.
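To cover the original goal (fetching the related blogs, testimonials and comments for a book), the same $in filter can be run against each collection in turn. Below is a minimal sketch, assuming the Node.js MongoDB driver's db.collection(name).find(filter).toArray() chain. The function name findRelated and the collection names are hypothetical and not part of the question.

```javascript
// Sketch: run the same $in filter against several collections.
// `db` is assumed to be a connected Db handle from the Node.js MongoDB driver.
// `findRelated` and the collection names passed in are illustrative only.
async function findRelated(db, collectionNames, tagIds) {
  const filter = { tags: { $in: tagIds } };
  // Query all collections concurrently; each query can use that collection's tags index.
  const results = await Promise.all(
    collectionNames.map((name) => db.collection(name).find(filter).toArray())
  );
  // Return an object keyed by collection name.
  return Object.fromEntries(collectionNames.map((name, i) => [name, results[i]]));
}
```

A call might look like findRelated(db, ["blogs", "testimonials", "comments"], book.tags), where book is the document you just fetched.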
Other tips
Well-chosen indexes have a great impact on performance. This isn't the place to explain how they work, but MongoDB has multikey indexes, and in your case the field to index is obvious: tags.
Example:
db.blogs.createIndex( { "tags": 1 } )

Related

Best way to store a single independent document in mongodb

I have a MERN application with a bunch of collections in my db, and now I want to store an object that represents an order:
Say I have "item1, item2, item3" in an "items" collection; each one can be anything, really.
I just need the ids (to reference them). I want the user to choose their order so that I know the correct way of displaying the items (not the order in the db, an order for a separate purpose).
I think the best way of doing it is having a single document with the order data in it, but each document has to live in a collection. So the question is: is it right to create a collection only to store a single document in it, or is there a better way?
This is an example of the document I want to store (the array index is their order):
{
items: [{itemId:xxx, otherprops...}, {itemId: yyy, otherprops...}]
}
The items collection can have 100s of items, so changing the order in that collection is not the correct option for my needs.
"is it right to create a collection only to store a single document in it"
If you look at the admin database in mongodb, you'll see that it does something similar to what I think you're trying to do. There's a system.version collection. In that collection, I've seen documents that contain settings-like information. For example, the featureCompatibility property is actually stored as a document with _id: "featureCompatibility". Shard identity information is also stored as a single document in this collection:
{
  "_id" : "shardIdentity",
  "clusterId" : ObjectId("2bba123c6eeedcd192b19024"),
  "shardName" : "shard2",
  "configsvrConnectionString" : "configDbRepl/alpha.example.net:28100,beta.example.net:28100,charlie.example.net:28100"
}
There is only one such document in the system.version collection. You can very well create your own settings collection where you store these bespoke documents. It's certainly not unheard of.
Take a look at the "Shard Aware" section from the official mongodb documentation to see this type of practice in action:
https://docs.mongodb.com/manual/release-notes/3.6-upgrade-sharded-cluster/#prerequisites
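Applied to the question, a settings collection like the one described could hold the display order as one well-known document. The following is only a sketch, assuming a Collection object from the Node.js MongoDB driver. The function name saveItemOrder, the collection name and the "itemOrder" _id are hypothetical.

```javascript
// Sketch: keep the display order in a single document of a settings collection.
// `settings` is assumed to be a Collection from the Node.js MongoDB driver.
// The "itemOrder" _id is a made-up convention, similar in spirit to "shardIdentity".
async function saveItemOrder(settings, itemIds) {
  // The array index is the display position, as in the question.
  const items = itemIds.map((itemId) => ({ itemId }));
  return settings.updateOne(
    { _id: "itemOrder" },
    { $set: { items } },
    { upsert: true } // create the document on the first save
  );
}
```

Reading it back is then a single findOne({ _id: "itemOrder" }) on the same collection.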

Can i merge/combine graphql API arrays into one?

Hey, I'm using Sanity and have created both an allSanityPost (for my blog posts) and an allSanityCases (for all my cases). Does anyone know how to combine these categories into one array and order them by their "published-date"? I want to display both on one blog page, but not in separate lists.
I'm building the page with Gatsby, so React answers would be preferable :)
Cheers
The current GraphQL endpoint has no way of querying across documents like GROQ does. Therefore, you will have to do this after querying the GraphQL endpoint for data. I suggest querying for all posts and all cases separately, like this:
query {
  allSanityPost {
    title
  },
  allSanityCases {
    title
  }
}
Given that both of these types have some sort of date, you should be able to combine them into one single array, and then do your sorting thereafter.
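The combine-then-sort step can be sketched as follows. The field name publishedAt is an assumption (an ISO 8601 date string); substitute whatever date field your Sanity schemas define, and remember to request it in the query alongside title.

```javascript
// Sketch: merge two result lists and sort them newest first.
// `publishedAt` is an assumed field name holding an ISO 8601 date string.
function mergeByDate(posts, cases) {
  // Spread into a new array so the original query results are not mutated by sort().
  return [...posts, ...cases].sort(
    (a, b) => new Date(b.publishedAt) - new Date(a.publishedAt)
  );
}
```

In a Gatsby page component you would pass in the two arrays returned by your page query and map over the merged result to render a single list.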

SuiteCommerce Advanced - Show a custom record on the PDP

I am looking to create a feature whereby a User can download any available documents related to the item from a tab on the PDP.
So far I have created a custom record called Documentation (customrecord_documentation) containing the following fields:
Related item : custrecord_documentation_related_item
Type : custrecord_documentation_type
Document : custrecord_documentation_document
Description : custrecord_documentation_description
Related Item ID : custrecord_documentation_related_item_id
The functionality works fine on the backend of NetSuite where I can assign documents to an Inventory item. The stumbling block is trying to fetch the data to the front end of the SCA webstore.
Any help on the above would be much appreciated.
I've come at this a number of ways.
One way is to create a Suitelet that returns JSON of the document names and urls. The urls can be the real Netsuite urls or they can be the urls of your suitelet where you set up the suitelet to return the doc when accessed with action=doc&id=_docid_ query params.
Add a target <div id="relatedDocs"></div> to the item_details.tpl
In your ItemDetailsView's init_Plugins add
$.getJSON('app/site/hosting/scriptlet.nl...?action=availabledoc').
then(function(data){
var asHtml = format(data); //however you like
$("#relatedDocs").html(asHtml);
});
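The format call above is left open on purpose. One possible version is sketched here, assuming the Suitelet returns an array of objects with name and url properties. That shape and the escapeHtml helper are assumptions, so adjust them to whatever your Suitelet actually emits.

```javascript
// Sketch of a `format` helper for the snippet above.
// Assumes data looks like [{ name: "Manual", url: "/path/to/file.pdf" }, ...].
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

function format(docs) {
  // Build one list item per document, escaping values before injecting them as HTML.
  var items = docs.map(function (doc) {
    return '<li><a href="' + escapeHtml(doc.url) + '">' + escapeHtml(doc.name) + "</a></li>";
  });
  return "<ul>" + items.join("") + "</ul>";
}
```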
You can also go the whole module route. If you created a third party module DocsView then you would add DocsView as a child view to ItemDetailsView.
That's a little more involved, so try the option above first to see if it fits your needs. The nice thing is that you can just about ignore Backbone with this approach. You can make this a little more portable by using a service.ss instead of the Suitelet. You can also create your own SSP app for the function so you don't have to deal with SCA's URL structure.
It's been a while, but you should be able to access the JSON data from within the related Backbone View class. From there, within the return context, output the value you're wanting to the PDP. Hopefully you're extending the original class and not overwriting / altering the core code :P.
The model associated with the PDP should hold all the JSON data you're looking for. Model.get('...') sort of syntax.
I'd recommend against Suitelets for this, as that's extra execution time, and is a bit slower.
I'm sure you know, but you need to set the documents to be available as public as well.
Hope this helps, thanks.

How do I apply Solr 'More Like This' component as part of my search

I'm trying to perform a solr query that includes 'More Like This' component. Can't find this scenario in the documentation. Here's a hypothetical sample Product entity with two fields -
{
  product_name: "name of the product in 3 or 4 words",
  product_description: "this is a long english verbose text, may be 10 sentences"
}
If I'm given a newProduct, I want to search for similar products in my Solr Index. The search should use the following logic -
newProduct.product_name - I want to do a simple token based search against 'product_name' field in my index.
newProduct.product_description - I want to use this field to perform 'More Like This' search against 'product_description' field in my index.
How can this be accomplished with a single query to Solr?
Let me know if the scenario is not clear.
You can use the More Like This request handler (mlt) to perform a "More Like This" search with a given document text without adding the document to the index first.
Include the document as the POST data or in the stream.body parameter in the request. There's an example given in the old wiki - see "Using content streams":
If you post text in the body, that will be used for similarity. Alternatively, you can put the posted content in the URL using something like:
http://localhost:8983/solr/mlt?stream.body=electronics%20memory&mlt.fl=manu,cat&mlt.interestingTerms=list&mlt.mintf=0
If remoteStreaming is enabled, you can find documents similar to the text on a webpage:
http://localhost:8983/solr/mlt?stream.url=http://lucene.apache.org/solr/&mlt.fl=manu,cat&mlt.interestingTerms=list&mlt.mintf=0
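For the product example, a URL of the same form can be assembled from newProduct.product_description. This is only a sketch: the base URL and the /mlt handler path mirror the wiki example above, buildMltUrl is a hypothetical helper, and depending on your Solr version and configuration stream.body may need to be enabled.

```javascript
// Sketch: build a MoreLikeThis handler URL for a free-text description.
// The parameters are the ones from the wiki example; only the values differ.
function buildMltUrl(baseUrl, text, fields) {
  const params = new URLSearchParams({
    "stream.body": text,          // the text to find similar documents for
    "mlt.fl": fields.join(","),   // index fields used for the similarity comparison
    "mlt.interestingTerms": "list",
    "mlt.mintf": "0",
  });
  return baseUrl + "/mlt?" + params.toString();
}
```

For instance, buildMltUrl("http://localhost:8983/solr", newProduct.product_description, ["product_description"]) yields a URL you can request with any HTTP client.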
I think this is what you are looking for:
https://wiki.apache.org/solr/MoreLikeThis#MoreLikeThisComponent
It uses the MoreLikeThis Component in search rather than having a separate handler.

Find CouchDB docs missing an arbitrary field

I need a CouchDB view where I can get back all the documents that don't have an arbitrary field. This is easy to do if you know in advance what fields a document might not have. For example, this lets you send view/my_view/?key="foo" to easily retrieve docs without the "foo" field:
function (doc) {
var fields = [ "foo", "bar", "etc" ];
for (var idx in fields) {
if (!doc.hasOwnProperty(fields[idx])) {
emit(fields[idx], 1);
}
}
}
However, you're limited to asking about the three fields set in the view; something like view/my_view/?key="baz" won't get you anything, even if you have many docs missing that field. I need a view where it will--where I don't need to specify possible missing fields in advance. Any thoughts?
This technique is called the Thai massage. Use it to efficiently find documents not in a view if (and only if) the view is keyed on the document id.
function(doc) {
// _view/fields map, showing all fields of all docs
// In principle you could emit e.g. "foo.bar.baz"
// for nested objects. Obviously I do not.
for (var field in doc)
emit(field, doc._id);
}
function(keys, vals, is_rerun) {
  // _view/fields reduce; could also be the string "_count"
  // On a rereduce, vals holds partial counts, so they must be summed.
  return is_rerun ? sum(vals) : vals.length;
}
To find documents not having that field,
GET /db/_all_docs and remember all the ids
GET /db/_design/ex/_view/fields?reduce=false&key="some_field"
Compare the ids from _all_docs vs the ids from the query.
The ids in _all_docs but not in the view are those missing that field.
It sounds bad to keep the ids in memory, but you don't have to! You can use a merge sort strategy, iterating through both queries simultaneously. You start with the first id of the has list (from the view) and the first id of the full list (from _all_docs).
If full < has, it is missing the field, redo with the next full element
If full = has, it has the field, redo with the next full element
If full > has, redo with the next has element
Depending on your language, that might be difficult. But it is pretty easy in Javascript, for example, or other event-driven programming frameworks.
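The three comparison rules above can be sketched as a single loop. For brevity this version works on two arrays rather than two streams, and it assumes both lists are sorted in the same order and that plain string comparison matches that order. The function name findMissing and the optional compare parameter are hypothetical; pass your own comparator if your ids collate differently.

```javascript
// Sketch of the merge walk described above.
// `fullIds` are the ids from _all_docs, `hasIds` the ids from the view query.
function findMissing(fullIds, hasIds, compare) {
  compare = compare || function (a, b) { return a < b ? -1 : a > b ? 1 : 0; };
  var missing = [];
  var f = 0, h = 0;
  while (f < fullIds.length) {
    if (h >= hasIds.length) {
      // View rows exhausted: every remaining doc lacks the field.
      missing.push(fullIds[f++]);
      continue;
    }
    var order = compare(fullIds[f], hasIds[h]);
    if (order < 0) {
      missing.push(fullIds[f++]); // full < has: the field is missing
    } else if (order === 0) {
      f++;                        // full = has: the field is present
    } else {
      h++;                        // full > has: advance the view list
    }
  }
  return missing;
}
```

To avoid holding the ids in memory, the two array indexes would be replaced by cursors over the two HTTP responses; the comparison logic stays the same.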
Without knowing the possible fields in advance, the answer is easy. You must create a new view to find the missing fields. The view will scan every document, one-by-one.
To avoid disturbing your existing views and design documents, you can use a brand new design document. That way, searching for the missing fields will not impact existing views you may be already using.
