Spring-Data/MongoDB/QueryDSL Searching nested _id of ObjectId type - spring-data-mongodb

When a document from one collection is nested inside a document of another collection, it's pretty standard to copy the nested document verbatim instead of creating another document type just for the sake of nesting. Example:
category {"_id": ObjectId("c1"), "name": "Category 1"}
question {"_id": ObjectId("q1"), category: {"_id": ObjectId("c1"), "name": "Category 1"}}
When using queryDSL as follows:
question.category.id = "c1"
queryDSL generates a query like this:
"question.category._id":"c1"
where I expect:
"question.category._id":ObjectId("c1")
This works for top-level documents but not for nested ones. I think this is a valid case, and Spring should apply the same translation it does for a top-level search. Is there a workaround for that?

Related

Is there a way to auto generate ObjectIds inside arrays of MongoDB?

Below is my MongoDb collection structure:
{
"_id": {
"$oid": "61efa44933eabb748152a250"
},
"title": "my first blog",
"body": "Hello everyone,wazuzzzzzzzzzzzzzzzzzup",
"comments": [{
"comment": "fabulous work bruhv"
}]
}
Is there a way to auto generate ids for comments without using something like this:
db.messages.insert({messages:[{_id:ObjectId(), message:"Message 1."}]});
I found the above method from the SO question:
mongoDB : Creating An ObjectId For Each New Child Added To The Array Field
But someone in the comments pointed out that:
"I have been looking at how to generate a JSON insert using ObjectID() and in my travels have found that this solution is not very good for indexing. The proposed solution has _id values that are strings rather than real object IDs - i.e. "56a970405ba22d16a8d9c30e" is different from ObjectId("56a970405ba22d16a8d9c30e") and indexing will be slower using the string. The ObjectId is actually represented internally as 12 bytes."
So is there a better way to do this?

Search for multiple IDs in Cloudant

I am using Node-RED to communicate with Cloudant, and each time my flow runs I might have a different number of IDs coming in msg.payload. Later I want to use these IDs to display all the relevant objects. Is it possible to search for multiple IDs in some way? Or do you have any other solution? I can't find anything about this online at the moment.
It looks like Node-RED supports querying by _id, a search index, or all documents. When you use _id there does not seem to be a way to specify more than one ID. You can use a search index, however, to query for multiple IDs.
Create a search index in Cloudant similar to the following:
{
"_id": "_design/allDocSearch",
"views": {},
"language": "javascript",
"indexes": {
"byId": {
"analyzer": "standard",
"index": "function (doc) {\n index(\"id\", doc._id);\n}"
}
}
}
This corresponds to the following when using the Cloudant dashboard:
design doc = allDocSearch
index name = byId
index function =
function (doc) {
index("id", doc._id);
}
To search for multiple IDs your query would look something like this:
id:"1" OR id:"2"
In Node-RED, set up your Cloudant node to point to the appropriate database, specify a "Search by" of search index, and configure your design document and index name (in this case it would be allDocSearch/byId).
You can test with a simple inject node with a payload similar to the search query above: id:"1" OR id:"2"
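When the number of IDs varies per run, the query string has to be assembled from the incoming list. Below is a minimal sketch of that string-building step, shown in Python purely for illustration (inside Node-RED the same logic would live in whatever node prepares msg.payload). The function name build_id_query and the default field name "id" are assumptions chosen to match the byId index above.

```python
def build_id_query(ids, field="id"):
    """Join document IDs into a Lucene OR query such as id:"1" OR id:"2"."""
    clauses = []
    for doc_id in ids:
        # Escape backslashes and double quotes so each ID stays inside its quoted phrase.
        escaped = str(doc_id).replace("\\", "\\\\").replace('"', '\\"')
        clauses.append(f'{field}:"{escaped}"')
    return " OR ".join(clauses)


print(build_id_query(["1", "2"]))
```

An empty list yields an empty string, so the caller would probably want to skip the search entirely in that case.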

Solr Indexing multiple json objects

I am trying to learn solr but some of the technicalities are confusing me.
I have a large file whose lines are basically structured like this:
url -> {Some giant json object}
url -> {another giant json object}
...
url -> {another giant json object}
and there are close to 30,000 of them.
I want to index them into Solr, so I created a schema.xml that lists every possible field and whether it is indexed, multivalued, and so on.
I am wondering what the general next steps are. I understand that I have to index the file, but do I use a separate curl command for each line?
Just looking for a higher level understanding of things because the online sources are a little confusing to me.
Thank you!
EDIT--
Are terminal commands the fastest way of indexing this particular type of file? I updated the example to show what the JSON file looks like.
Curl request:
curl 'http://localhost:8983/solr/collection1/update/json/docs'\
'?split=/exams'\
'&f=first:/first'\
'&f=last:/last'\
'&f=grade:/grade'\
'&f=subject:/exams/subject'\
'&f=test:/exams/test'\
'&f=marks:/exams/marks' \
-H 'Content-type:application/json' -d '
{
"first": "John",
"last": "Doe",
"grade": 8,
"exams": [
{
"subject": "Maths",
"test" : "term1",
"marks":90},
{
"subject": "Biology",
"test" : "term1",
"marks":86}
]
}'
Because of split=/exams, this payload is indexed as two Solr documents, one per entry in exams, each carrying first, last and grade alongside its own subject, test and marks.
To learn more, follow this link:
https://lucidworks.com/blog/2014/08/12/indexing-custom-json-data/
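On the question of running one curl command per line: a common alternative is to parse the file once and send the documents in batches. The sketch below covers only the parsing and batching; it assumes each line really has the form "url -> {json}" with " -> " as the separator, and that the URL can serve as the document's unique id. The names parse_lines and batches, the batch size and the example URLs are all hypothetical.

```python
import json


def parse_lines(lines, separator=" -> "):
    """Turn 'url -> {json}' lines into dicts keyed by the URL."""
    docs = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # Split only on the first separator; the JSON itself may contain ' -> '.
        url, raw_json = line.split(separator, 1)
        doc = json.loads(raw_json)
        doc["id"] = url  # assumption: the URL is used as the unique key
        docs.append(doc)
    return docs


def batches(docs, size=1000):
    """Yield successive slices of at most `size` documents."""
    for start in range(0, len(docs), size):
        yield docs[start:start + size]


sample = [
    'http://example.com/a -> {"title": "A"}',
    'http://example.com/b -> {"title": "B"}',
    'http://example.com/c -> {"title": "C"}',
]
for batch in batches(parse_lines(sample), size=2):
    print(json.dumps(batch))
```

Each printed JSON array could then be sent as the body of a single update request, which for roughly 30,000 lines means a few dozen requests instead of one per line.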

How can you retrieve a full nested document in Solr?

In my instance of Solr 4.10.3 I would like to index JSONs with a nested structure.
Example:
{
"id": "myDoc",
"title": "myTitle",
"nestedDoc": {
"name": "test name",
"nestedAttribute": {
"attr1": "attr1Val"
}
}
}
I am able to store it correctly through the admin interface:
/solr/#/mySchema/documents
and I'm also able to search and retrieve the document.
The problem I'm facing is that when I get the response document from my Solr search, I cannot see the nested attributes. I only see:
{
"id": "myDoc",
"title": "myTitle"
}
Is there a way to include ALL the nested fields in the returned documents?
I tried with : "fl=[child parentFilter=title:myTitle]" but it's not working (ChildDocTransformerFactory from:https://cwiki.apache.org/confluence/display/solr/Transforming+Result+Documents). Is that the right way to do it or is there any other way?
I'm using: Solr 4.10.3!!!!!!
To get the whole nested structure returned, you do need to use ChildDocTransformerFactory. However, you first need to index your documents properly.
If you just pass your structure as it is, Solr will index the parts as separate documents and won't know that they're actually connected. If you want to be able to query nested documents correctly, you'll have to pre-process your data structure as described in this post or try using (modifying as needed) a pre-processing script. Unfortunately, up to and including the latest Solr 6.0, there's no nice and smooth solution for indexing and returning nested document structures, so everything is done through "workarounds".
Particularly in your case, you'll need to transform your document structure into this:
{
"type": "parentDoc",
"id": "myDoc",
"title": "myTitle",
"_childDocuments_": [
{
"type": "nestedDoc",
"name": "test name",
"_childDocuments_" :[
{
"type": "nestedAttribute",
"attr1": "attr1Val"
}]
}]
}
Then, the following ChildDocTransformerFactory query will return all subdocuments (by the way, although the documentation says it's available since Solr 4.9, I've actually only seen it in Solr 5.3, so you need to test):
q=title:myTitle&fl=*,[child parentFilter=type:parentDoc limit=50]
Note that although it returns all nested documents, the returned document structure will be flattened (alas!), i.e., you'll get:
{
"type": "parentDoc",
"id": "myDoc",
"title": "myTitle",
"_childDocuments_": [
{
"type": "nestedDoc",
"name": "test name"
},
{
"type": "nestedAttribute",
"attr1": "attr1Val"
}]
}
That's probably not what you expected, but this is unfortunately how Solr behaves for now; it should be fixed in an upcoming release.
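The pre-processing step mentioned above (turning nested objects into _childDocuments_ entries with a type field) could be sketched roughly as follows. This is an illustration, not the script the answer links to: the function name to_child_documents is hypothetical, using the JSON key as the child's type is an assumption modelled on the example, and only nested objects are handled (arrays of objects are left untouched).

```python
import json


def to_child_documents(doc, doc_type="parentDoc"):
    """Rewrite nested dicts as Solr-style _childDocuments_ lists."""
    flat = {"type": doc_type}
    children = []
    for key, value in doc.items():
        if isinstance(value, dict):
            # The nested object's key becomes the child's type marker.
            children.append(to_child_documents(value, doc_type=key))
        else:
            flat[key] = value
    if children:
        flat["_childDocuments_"] = children
    return flat


original = {
    "id": "myDoc",
    "title": "myTitle",
    "nestedDoc": {
        "name": "test name",
        "nestedAttribute": {"attr1": "attr1Val"},
    },
}
print(json.dumps(to_child_documents(original), indent=2))
```

Depending on the schema, child documents may also need their own unique id values, which this sketch does not generate.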
You can put
q={!parent which=}
in the q field, and in the fl field: fl=*,[child parentFilter=title:myTitle]
It will give you all the parent and child fields for title:myTitle.

Batch node relationship creation in cypher/neo4j

What is the most efficient way to break down this CREATE cypher query?
The end pattern is the following:
(newTerm:term)-[:HAS_META]->(metaNode:termMeta)
In this pattern there is a single newTerm node and about 25 termMeta nodes. The HAS_META relationship will have a single property (languageCode) that will be different for each termMeta node.
In the application, all of these nodes and relationships will be created at the same time. I'm trying to determine the best way to add them.
Is there any way to add these without having to perform an individual query for each termMeta node?
I know you can add multiple instances of a node using the following query format:
"metaProps" : [
{"languageCode" : "en", "name" : "1", "dateAdded": "someDate1"},
{"languageCode" : "de", "name" : "2", "dateAdded": "someDate2"},
{"languageCode" : "es", "name" : "3", "dateAdded": "someDate3"},
{"languageCode" : "fr", "name" : "3", "dateAdded": "someDate4"}
]
But you can only do that for one type of node at a time and there (as far as I can tell) is no way to dynamically add the relationship properties that are needed.
Any insight would be appreciated.
There's no really elegant way to do it, as far as I can tell. From your example, I'm assuming you're using parameters. You can use FOREACH to loop through the parameter list and run a CREATE for each entry, but it's pretty ugly and requires you to explicitly spell out literal maps of your properties. Here's what it would look like for your example:
CREATE (newTerm:term)
FOREACH ( props IN {metaProps} |
CREATE (newTerm)-[:HAS_META {languageCode: props.languageCode}]->
(:termMeta {name: props.name, dateAdded: props.dateAdded})
)
WITH newTerm
MATCH (newTerm)-[rel:HAS_META]->(metaNode:termMeta)
RETURN newTerm, rel, metaNode
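If you don't need to return the results, you can delete everything after the FOREACH. Note that Neo4j 4.0 and later only accept the $metaProps parameter syntax; the older {metaProps} form used here was removed.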
Match each node under its own variable name and then create the relationships using those variables.
For example:
match (n:Tag), (m:Account), (l:FOO) CREATE (n)-[r:mn]->(m),(m)-[x:ml]->(l)
match (n:Tag{a:"a"}), (m:Account{b:"x"}), (l:FOO) CREATE (n)-[r:mn]->(m),(m)-[x:ml]->(l)

Resources