MongoDB Nested - Search TrackingValues - arrays

We need to perform a phrase-based search (like Google's quoted "" search) over a nested array of keywords, matching them in order.
For instance, let us suppose the data looks like:
{
Name: "question",
body: [
"We",
"need",
"to",
"perform",
"a",
"search",
"like",
"google's"
]
}
Searching for "we search" should return no result, but the document should be returned when searching for any of the following: "we need", "to perform a search", "we", etc.
I do need to tokenize the words for encryption, so saving them as a single string will not work for me here.
Is that possible at all?

Folks, I tried to solve this with MongoDB technical support. Apparently, there is no out-of-the-box solution.
I was able to "solve" it by keeping an additional field that concatenates all the tokenized, encrypted words into one string, and running a regular expression over it.
Not ideal, and it requires duplicating some data, but it works for our needs.
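The workaround can be sketched in plain JavaScript. This is a minimal illustration, not the actual implementation: the function names (joinTokens, phraseRegex, matchesPhrase), the space delimiter, and the lowercasing step are all assumptions. With encrypted tokens, the delimiter would have to be a character that cannot occur inside a token.

```javascript
// Sketch of the "extra concatenated field + regex" workaround.
// Assumptions: tokens contain no spaces, and matching is case-insensitive.

// Build the value of the extra field: tokens joined by a delimiter,
// with a delimiter on both ends so every token is fully enclosed.
function joinTokens(tokens) {
  return " " + tokens.map((t) => t.toLowerCase()).join(" ") + " ";
}

// Escape characters that have a special meaning in a regular expression.
function escapeRegex(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Build a regex that matches the phrase only on whole-token boundaries.
function phraseRegex(phrase) {
  const tokens = phrase.trim().toLowerCase().split(/\s+/).map(escapeRegex);
  return new RegExp(" " + tokens.join(" ") + " ");
}

function matchesPhrase(joined, phrase) {
  return phraseRegex(phrase).test(joined);
}

const doc = {
  Name: "question",
  body: ["We", "need", "to", "perform", "a", "search", "like", "google's"],
};
const joined = joinTokens(doc.body);

console.log(matchesPhrase(joined, "we search"));           // false
console.log(matchesPhrase(joined, "we need"));             // true
console.log(matchesPhrase(joined, "to perform a search")); // true
console.log(matchesPhrase(joined, "we"));                  // true
```

In MongoDB, the same pattern would be applied to the extra field with a $regex query. The leading and trailing delimiters keep a partial token such as "nee" from matching "need".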

Related

How to respect Solr conditions in order

I need to send a query to Solr with two conditions in OR, instead of sending the query twice:
{!complexphrase inOrder=true}title:"some tests*" || title:(some tests*)
In the first condition, I want the exact phrase match. If nothing is found, the OR branch should retrieve any result that contains at least one word of the search phrase. But when I run the query, I still get the results of the right-hand condition first.
Here is my data:
{
"title": "some values"
},
{
"title": "data tests"
},
{
"title": "some tests"
}
The response I need is:
{
"title": "some tests"
},
{
"title": "data tests"
},
{
"title": "some values"
}
I already tried boosting, like so: {!complexphrase inOrder=true}title:"some tests*"^2 || title:(some tests*)^1, but it didn't work. I am NOT able to change the Solr configuration, since it is software that is already in production and not managed by me. I cannot even sort by rating; in fact, I don't receive the best matches first. The Solr version is 7.3.1. Any help is appreciated, thanks in advance!
I solved it with a work-around. Instead of putting two OR conditions, I managed to apply a working boost on the title field, using edismax.
What I had to change in my Java application was:
From
SolrQuery q = new SolrQuery("*");
To
SolrQuery q = new SolrQuery("(" + query + "*)");
and added:
q.set("defType", "edismax");
q.set("qf", "title^100");
Now I'm not making a precise query, but I'm retrieving the documents with the best match first, without changing any configuration! The equivalent in the Solr frontend is similar; the query should look like this:
http://localhost:8983/solr/mycollection/select?defType=edismax&q=(some%20test*)&qf=title^100
Hope it helps someone
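As a rough sketch, the request URL above can also be assembled programmatically so that the query is encoded consistently. The function name buildSelectUrl is hypothetical; the host, collection name, and parameters are the ones from the example. Special Solr query characters inside the phrase are not escaped here, which a real application would have to handle.

```javascript
// Sketch: build the edismax select URL from a user-supplied phrase.
// The base URL and collection name are taken from the example in the answer.
function buildSelectUrl(baseUrl, phrase) {
  const url = new URL(baseUrl);
  url.searchParams.set("defType", "edismax");
  // Wrap the phrase in parentheses and add a trailing wildcard,
  // mirroring "(" + query + "*)" in the Java code.
  url.searchParams.set("q", "(" + phrase + "*)");
  // Boost matches on the title field.
  url.searchParams.set("qf", "title^100");
  return url.toString();
}

const selectUrl = buildSelectUrl(
  "http://localhost:8983/solr/mycollection/select",
  "some test"
);
console.log(selectUrl);
```

URLSearchParams percent-encodes the parentheses and the caret, so the printed URL looks different from the hand-written one but decodes to the same parameters.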

Is there a way to auto generate ObjectIds inside arrays of MongoDB?

Below is my MongoDb collection structure:
{
"_id": {
"$oid": "61efa44933eabb748152a250"
},
"title": "my first blog",
"body": "Hello everyone,wazuzzzzzzzzzzzzzzzzzup",
"comments": [{
"comment": "fabulous work bruhv"
}]
}
Is there a way to auto generate ids for comments without using something like this:
db.messages.insert({messages:[{_id:ObjectId(), message:"Message 1."}]});
I found the above method from the SO question:
mongoDB : Creating An ObjectId For Each New Child Added To The Array Field
But someone in the comments pointed out that:
"I have been looking at how to generate a JSON insert using ObjectID() and in my travels have found that this solution is not very good for indexing. The proposed solution has _id values that are strings rather than real object IDs - i.e. "56a970405ba22d16a8d9c30e" is different from ObjectId("56a970405ba22d16a8d9c30e") and indexing will be slower using the string. The ObjectId is actually represented internally as 16 bytes."
So is there a better way to do this?

Logic Apps: Data not parsed on the second query inside "foreach" loop

Hi Logic Apps Experts,
I'd like to check some foreach loop behavior with you: whether it is expected, and whether there are any workarounds.
The Logic App first runs a "Run query and list results" step that searches the SecurityIncident table. Then, for each SecurityIncident record, it finds the corresponding SecurityAlert record in the "Using IncidentId-Query Details of the Alert" step.
For the first query, the data is parsed properly and each field can be used.
However, after the second query I can only use 'Body' and 'value' in later steps, and these contain unparsed values.
Questions:
Is this behavior expected?
Is there a better way to ensure the second query is parsed?
Any other advice on room for improvement is greatly appreciated.
Thank you!
The selection list is affected by the type/format required by the input box in the action, so I think the behavior is expected.
If you want to get a parsed field from the query action, you can use an expression. I don't know the details of your query result body, so here is just a sample for reference.
For example, if the query result looks like:
{
"body": [
{
"TenantId": "111",
"xxxx": "xxx"
},
{
"TenantId": "222",
"xxxx": "xxx"
}
]
}
Then you can use the expression body('Run_query_and_list_results')[0]?['TenantId'] to get the value of the first TenantId. In short, use [index] to access an element of an array and ?['key'] to access a property of an object.
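For readers more familiar with JavaScript, the same access pattern can be mirrored there. This is only an analogy and not Logic Apps expression syntax: the variable names are made up for illustration, and JavaScript yields undefined where Logic Apps would yield null.

```javascript
// Plain-JavaScript analogue of body('Run_query_and_list_results')[0]?['TenantId'].
// This is NOT Logic Apps syntax; it only mirrors the access pattern.
const queryResult = {
  body: [
    { TenantId: "111", xxxx: "xxx" },
    { TenantId: "222", xxxx: "xxx" },
  ],
};

// [index] selects an element of the array; ?.[key] reads a property and
// yields undefined instead of throwing when the element is missing.
const firstTenantId = queryResult.body[0]?.["TenantId"];
const missingTenantId = queryResult.body[5]?.["TenantId"];

// Collect the same field from every row, as a foreach loop would.
const allTenantIds = queryResult.body.map((row) => row?.["TenantId"]);

console.log(firstTenantId, missingTenantId, allTenantIds);
```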

MongoDB: Query and retrieve objects inside embedded array?

Let's say I have the following document schema in a collection called 'users':
{
name: 'John',
items: [ {}, {}, {}, ... ]
}
The 'items' array contains objects in the following format:
{
item_id: "1234",
name: "some item"
}
Each user can have multiple items embedded in the 'items' array.
Now, I want to be able to fetch an item by an item_id for a given user.
For example, I want to get the item with id "1234" that belong to the user with name "John".
Can I do this with mongoDB? I'd like to utilize its powerful array indexing, but I'm not sure if you can run queries on embedded arrays and return objects from the array instead of the document that contains it.
I know I can fetch users that have a certain item using db.users.find({"items.item_id": "1234"}). But I want to fetch the actual item from the array, not the user.
Alternatively, is there maybe a better way to organize this data so that I can easily get what I want? I'm still fairly new to mongodb.
Thanks for any help or advice you can provide.
The question is old, but the answer has changed since then. With MongoDB >= 2.2, you can use the $elemMatch projection operator:
db.users.find( { name: "John" }, { name: 1, items: { $elemMatch: { item_id: "1234" } } } )
This returns (along with _id, which is included by default):
{
name: "John",
items:
[
{
item_id: "1234",
name: "some item"
}
]
}
See Documentation of $elemMatch
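To make the semantics concrete, here is an in-memory emulation of that result shape in plain JavaScript. It is only a sketch: the users array stands in for the collection, findUserItem is a hypothetical helper, and no database is involved. It reflects two documented properties of the $elemMatch projection: only the first matching array element is returned, and the array field is omitted when nothing matches.

```javascript
// In-memory emulation of the $elemMatch projection, for illustration only.
// `users` stands in for the collection; the data is made up.
const users = [
  {
    name: "John",
    items: [
      { item_id: "1234", name: "some item" },
      { item_id: "5678", name: "another item" },
    ],
  },
  { name: "Jane", items: [{ item_id: "1234", name: "some item" }] },
];

// Select the user by name, then keep only the FIRST item that matches.
function findUserItem(collection, userName, itemId) {
  const user = collection.find((u) => u.name === userName);
  if (!user) return null;
  const match = user.items.find((item) => item.item_id === itemId);
  const result = { name: user.name };
  // The array field is left out entirely when no element matches.
  if (match) result.items = [match];
  return result;
}

console.log(JSON.stringify(findUserItem(users, "John", "1234")));
```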
There are a couple of things to note about this:
1) I find that the hardest thing for folks learning MongoDB is UN-learning the relational thinking that they're used to. Your data model looks to be the right one.
2) Normally, what you do with MongoDB is return the entire document into the client program, and then search for the portion of the document that you want on the client side using your client programming language.
In your example, you'd fetch the entire 'user' document and then iterate through the 'items[]' array on the client side.
3) If you want to return just the 'items[]' array, you can do so by using the 'Field Selection' syntax. See http://www.mongodb.org/display/DOCS/Querying#Querying-FieldSelection for details. Unfortunately, it will return the entire 'items[]' array, and not just one element of the array.
4) There is an existing Jira ticket to add this functionality: SERVER-828 (https://jira.mongodb.org/browse/SERVER-828). It looks like it has been added to the latest 2.1 (development) branch, which means it will be available for production use when release 2.2 ships.
If this is an embedded array, then you can't retrieve its elements directly. The retrieved document will have the form of a user (the root document), although not all fields may be filled in (depending on your query).
If you want to retrieve just that element, then you have to store it as a separate document in a separate collection. It will have one additional field, user_id (can be part of _id). Then it's trivial to do what you want.
A sample document might look like this:
{
_id: {user_id: ObjectId, item_id: "1234"},
name: "some item"
}
Note that this structure ensures uniqueness of item_id per user (I'm not sure whether you want this).
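A minimal sketch of that restructuring, in plain JavaScript with no database: toItemDocuments and insertAll are hypothetical helpers, the user id "u1" is a placeholder string where MongoDB would use an ObjectId, and the Map only imitates the uniqueness that the _id index would enforce.

```javascript
// Sketch: turn one embedded user document into one document per item,
// using a compound _id of { user_id, item_id }.
function toItemDocuments(userId, user) {
  return user.items.map((item) => ({
    _id: { user_id: userId, item_id: item.item_id },
    name: item.name,
  }));
}

// A Map keyed by the serialized _id imitates the unique _id index:
// a second item with the same (user_id, item_id) pair is rejected.
// Key order matters here, as it does when comparing embedded documents.
function insertAll(store, docs) {
  for (const d of docs) {
    const key = JSON.stringify(d._id);
    if (store.has(key)) throw new Error("duplicate _id: " + key);
    store.set(key, d);
  }
  return store;
}

const john = {
  name: "John",
  items: [
    { item_id: "1234", name: "some item" },
    { item_id: "5678", name: "another item" },
  ],
};
const store = insertAll(new Map(), toItemDocuments("u1", john));

// Fetching one item of one user is now a direct lookup by _id.
const item = store.get(JSON.stringify({ user_id: "u1", item_id: "1234" }));
console.log(item.name);
```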

Using CouchDB-lucene how can I index an array of objects (not values)

Hello everyone and thanks in advance for any ideas, suggestions or answers.
First, the environment: I am using CouchDB (currently developing on 1.0.2) and couchdb-lucene 0.7. Obviously, I am using couchdb-lucene ("c-l" hereafter) to provide full-text searching within couchdb.
Second, let me provide everyone with an example couchdb document:
{
"_id": "5580c781345e4c65b0e75a220232acf5",
"_rev": "2-bf2921c3173163a18dc1797d9a0c8364",
"$type": "resource",
"$versionids": [
"5580c781345e4c65b0e75a220232acf5-0",
"5580c781345e4c65b0e75a220232acf5-1"
],
"$usagerights": [
{
"group-administrators": 31
},
{
"group-users": 3
}
],
"$currentversionid": "5580c781345e4c65b0e75a220232acf5-1",
"$tags": [
"Tag1",
"Tag2"
],
"$created": "/Date(1314973405895-0500)/",
"$creator": "administrator",
"$modified": "/Date(1314973405895-0500)/",
"$modifier": "administrator",
"$checkedoutat": "/Date(1314975155766-0500)/",
"$checkedoutto": "administrator",
"$lastcommit": "/Date(1314973405895-0500)/",
"$lastcommitter": "administrator",
"$title": "Test resource"
}
Third, let me explain what I want to do. I am trying to figure out how to index the '$usagerights' property. I am using the word index very loosely because I really do not care about being able to search it, I simply want to 'store' it so that it is returned with the search results. Anyway, the property is an array of json objects. Now, these json objects that compose the array will always have a single json property.
Based on my understanding of couchdb-lucene, I need to reduce this array to a comma separated string. I would expect something like "group-administrators:31,group-users:3" to be a final output.
Thus, my question is essentially: How can I reduce the $usagerights json array above to a comma separated string of key:value pairs within the couchdb design document as used by couchdb-lucene?
A previous question I posted regarding indexing of tagging in a similar situation, provided for reference: How-to index arrays (tags) in CouchDB using couchdb-lucene
Finally, if you need any additional details, please just post a comment and I will provide it.
Maybe I am missing something, but the only difference I see from your previous question is that you need to iterate over the objects. The code should then be:
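function(doc) {
  // Document is provided by the couchdb-lucene indexing environment.
  var result = new Document(), usage, right;
  // $usagerights is an array of single-property objects.
  for (var i = 0; i < doc.$usagerights.length; i++) {
    usage = doc.$usagerights[i];
    for (right in usage) {
      // Each pair is added as its own value, e.g. "group-administrators:31".
      result.add(right + ":" + usage[right]);
    }
  }
  return result;
}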
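For comparison, the single comma-separated string the question asked for can be produced as follows. This is a standalone sketch in modern JavaScript, and usageRightsToString is a hypothetical helper name. Inside a couchdb-lucene index function, which may run on an older JavaScript engine, the same logic would probably need var and classic loops.

```javascript
// Reduce an array of single-property objects to "key:value,key:value".
function usageRightsToString(usageRights) {
  const pairs = [];
  for (const usage of usageRights) {
    for (const right of Object.keys(usage)) {
      pairs.push(right + ":" + usage[right]);
    }
  }
  return pairs.join(",");
}

const usageRights = [{ "group-administrators": 31 }, { "group-users": 3 }];
console.log(usageRightsToString(usageRights));
// prints: group-administrators:31,group-users:3
```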
There's no requirement to convert to a comma-separated list of values (I'd be intrigued to know where you picked up that idea).
If you simply want the $usagerights item returned with your results, do this:
ret.add(JSON.stringify(doc.$usagerights),
{"index":"no", "store":"yes", "field":"usagerights"});
Lucene stores strings, not JSON, so you'll need to JSON.parse the string on query.
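The round trip can be sketched as follows. The row object is a hypothetical search-result row used only for illustration; its exact shape is an assumption and may differ from what couchdb-lucene returns.

```javascript
// Sketch of the store-then-parse round trip.
const rights = [{ "group-administrators": 31 }, { "group-users": 3 }];

// At index time: the array is serialized to a string before being stored.
const storedValue = JSON.stringify(rights);

// At query time: the stored field comes back as a string
// (hypothetical row shape, assumed for this example)...
const row = {
  id: "5580c781345e4c65b0e75a220232acf5",
  fields: { usagerights: storedValue },
};

// ...and has to be parsed to get the original array of objects back.
const restored = JSON.parse(row.fields.usagerights);
console.log(restored[0]["group-administrators"]); // prints 31
```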
