Fetch partial documents from CouchDB - database

I'm using CouchDB to store large documents, which causes trouble when fetching them into memory. I realize the database is not meant to be used this way. As a fallback, is it possible to fetch partial documents from the database without creating a view?
For example, if a document has the fields id, content and extra_content, I would like to retrieve only the first two.
Thank you in advance.

If you are using CouchDB 2.x, you can use the /db/_find endpoint to retrieve part of a document.
POST /db/_find
{
"selector": {
"_id": "a-doc-id"
},
"fields": [
"_id",
"content"
]
}
You'll get only the fields you specified in the query.
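
As a rough illustration of the request above, here is a minimal Python sketch using only the standard library. The function names, the base URL and the database name are assumptions made for the example. It also assumes the server accepts unauthenticated requests, which a real deployment usually won't.

```python
import json
import urllib.request


def build_find_request(base_url, db, doc_id, fields):
    """Build (but do not send) a POST /{db}/_find request limited to `fields`."""
    body = {"selector": {"_id": doc_id}, "fields": list(fields)}
    return urllib.request.Request(
        url=f"{base_url}/{db}/_find",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_partial(base_url, db, doc_id, fields):
    """Send the request; needs a reachable CouchDB 2.x instance."""
    request = build_find_request(base_url, db, doc_id, fields)
    with urllib.request.urlopen(request) as response:
        # _find wraps the matching documents in a "docs" array.
        return json.load(response)["docs"]
```

Calling fetch_partial("http://localhost:5984", "db", "a-doc-id", ["_id", "content"]) should then return a list with one document containing only those two fields, provided the document exists.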

This is not possible prior to CouchDB 2.x. For CouchDB 2.x or greater, see JuanjoRodriguez's answer.
But one possible workaround for any version of CouchDB would be to take advantage of file attachments, which are excluded from a fetch by default. If some of your data isn't always needed, and doesn't need to be included in indexes, you could potentially store it as (JSON) attachments, rather than as part of the document directly:
{
"id": "foo",
"content": "stuff",
"extra_content": "other stuff"
}
becomes:
{
"id": "foo",
"content": "stuff",
"_attachments": {
"extra_content": {
"content_type": "application/json",
"data": "ZXh0cmEgc3R1ZmYK"
}
}
}
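
The transformation above could be sketched in Python roughly as follows. The helper name move_to_attachment is made up for this example. The sketch assumes the field value is JSON-serializable and encodes it as an inline, base64-encoded attachment.

```python
import base64
import json


def move_to_attachment(doc, field):
    """Return a copy of `doc` with `field` stored as an inline JSON attachment."""
    result = {key: value for key, value in doc.items() if key != field}
    payload = json.dumps(doc[field]).encode("utf-8")
    # Keep any attachments the document already has.
    attachments = dict(result.get("_attachments", {}))
    attachments[field] = {
        "content_type": "application/json",
        # Inline attachments carry their content base64-encoded.
        "data": base64.b64encode(payload).decode("ascii"),
    }
    result["_attachments"] = attachments
    return result
```

The original document is left untouched, so the caller decides whether to save the converted copy.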

Related

CakePHP 4 - custom request and response format

For a new project I want to use a CakePHP 4 as REST backend with a Vue.js frontend.
Now Cake uses a nested data structure while vue.js uses a flat data structure.
My plan now is to convert the data in the backend.
Example Format:
CakePHP
{
"user": {
"id": 1,
"name": "Peter Maus",
"articles" : [
{
"id": 15,
"title": "First Post"
}
]
}
}
Vue.js
{
"user": {
"id": 1,
"name": "Peter Maus",
"articles" : [ 15 ]
},
"articles": [
{
"id": 15,
"title": "First Post"
}
]
}
So basically instead of just sending json with
$this->viewBuilder()->setOption('serialize', ['user']);
I want to first "convert the datastructure" and then send as json.
I have now found the following possibilities for the conversion based on the documentation:
Request - convert from vue to cake
I have seen that you can use Body Parser Middleware with your own parser.
But I still have json as response format and I don't want to override the standard json formatter.
Response - convert from cake to vue
ideas:
I have seen "Data Views", but I'm not sure if it is suitable for this purpose.
extend the ViewBuilder and write my own serialize() function.
How would I have to include my own ViewBuilder, is that even possible?
write a parser function in a parent entity from which all my entities inherit. And call that parse function before serializing the data.
I will probably need access to the Entity Relations to dynamically restructure the data, for both: request and response.
What would be a reasonable approach?
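
Whichever hook ends up doing the work, the response-side conversion itself is small. The following is a language-neutral sketch in Python rather than PHP. The function name flatten and the explicit relations argument are assumptions for illustration; in CakePHP the relation names would presumably come from the table associations.

```python
def flatten(payload, root, relations):
    """Replace nested related records under `root` with ID lists and hoist them.

    `relations` names the keys of `payload[root]` that hold lists of records.
    """
    entity = dict(payload[root])
    result = {root: entity}
    for relation in relations:
        records = entity.get(relation, [])
        # The root entity keeps only the IDs of its related records.
        entity[relation] = [record["id"] for record in records]
        # The full records move to a top-level key of the same name.
        result[relation] = records
    return result
```

Applied to the CakePHP example with root "user" and relations ["articles"], this produces the Vue.js structure shown above. The request-side conversion would be the inverse lookup.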

How can you retrieve a full nested document in Solr?

In my instance of Solr 4.10.3 I would like to index JSONs with a nested structure.
Example:
{
"id": "myDoc",
"title": "myTitle",
"nestedDoc": {
"name": "test name",
"nestedAttribute": {
"attr1": "attr1Val"
}
}
}
I am able to store it correctly through the admin interface:
/solr/#/mySchema/documents
and I'm also able to search and retrieve the document.
The problem I'm facing is that when I get the response document from my Solr search, I cannot see the nested attributes. I only see:
{
"id": "myDoc",
"title": "myTitle"
}
Is there a way to include ALL the nested fields in the returned documents?
I tried with : "fl=[child parentFilter=title:myTitle]" but it's not working (ChildDocTransformerFactory from:https://cwiki.apache.org/confluence/display/solr/Transforming+Result+Documents). Is that the right way to do it or is there any other way?
I'm using: Solr 4.10.3!!!!!!
To get the whole nested structure returned, you do indeed need to use ChildDocTransformerFactory. However, you first need to index your documents properly.
If you pass your structure as it is, Solr will index the parts as separate documents and won't know that they're actually connected. If you want to be able to query nested documents correctly, you'll have to pre-process your data structure as described in this post, or try using (and modifying as needed) a pre-processing script. Unfortunately, up to and including the latest Solr 6.0, there's no smooth solution for indexing and returning nested document structures, so everything is done through workarounds.
In your case in particular, you'll need to transform your document structure into this:
{
"type": "parentDoc",
"id": "myDoc",
"title": "myTitle",
"_childDocuments_": [
{
"type": "nestedDoc",
"name": "test name",
"_childDocuments_" :[
{
"type": "nestedAttribute",
"attr1": "attr1Val"
}]
}]
}
Then the following ChildDocTransformerFactory query will return all subdocuments (by the way, although the documentation says it's available since Solr 4.9, I've only actually seen it in Solr 5.3, so you need to test):
q=title:myTitle&fl=*,[child parentFilter=type:parentDoc limit=50]
Note that although it returns all nested documents, the returned document structure will be flattened (alas!), i.e., you'll get:
{
"type": "parentDoc",
"id": "myDoc",
"title": "myTitle",
"_childDocuments_": [
{
"type": "nestedDoc",
"name": "test name"
},
{
"type": "nestedAttribute",
"attr1": "attr1Val"
}]
}
Probably not what you expected, but this is Solr's unfortunate behavior, which should be fixed in an upcoming release.
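
The pre-processing step described in this answer could be sketched in Python as follows. This is not the script the answer refers to, only a minimal illustration. The function name is made up, and using the field name as the child's "type" value simply mirrors the example above. Nested lists are not handled.

```python
def to_child_documents(doc, doc_type="parentDoc"):
    """Convert nested dicts into the `_childDocuments_` representation."""
    converted = {"type": doc_type}
    children = []
    for key, value in doc.items():
        if isinstance(value, dict):
            # A nested object becomes a child document typed by its field name.
            children.append(to_child_documents(value, doc_type=key))
        else:
            converted[key] = value
    if children:
        converted["_childDocuments_"] = children
    return converted
```

Applied to the document from the question, this yields the parentDoc / nestedDoc / nestedAttribute structure shown earlier in this answer.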
You can put
q={!parent which=}
and in the fl field: fl=*,[child parentFilter=title:myTitle]
It will give you all parent fields and child fields for title:myTitle.

Highlight matches in MongoDB full text search

Is it possible to define which part of the text in which of the indexed text fields matches the query?
No, as far as I know and can tell from the Jira, no such feature exists currently. You can, of course, attempt to highlight the parts of the text yourself, but that requires implementing both the highlighting and the stemming according to the rules applied by MongoDB.
The whole feature is somewhat complicated - even consuming it - as can be seen from the respective elasticsearch documentation.
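
To make the "do it yourself" route concrete, here is a deliberately naive Python sketch. It only wraps exact whole-word, case-insensitive matches in markers and does no stemming at all, so it will miss matches that MongoDB's text search finds through stemming. The function name and the default markers are assumptions for the example.

```python
import re


def highlight(text, terms, start="<em>", end="</em>"):
    """Wrap whole-word, case-insensitive matches of `terms` in markers."""
    if not terms:
        return text
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(term) for term in terms) + r")\b",
        re.IGNORECASE,
    )
    # Keep the original casing of each match inside the markers.
    return pattern.sub(lambda match: f"{start}{match.group(0)}{end}", text)
```

A query for "bunch" would be highlighted in "a bunch of grapes" but not in "bunches", which is exactly the gap the stemming rules would have to close.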
Refer to the MongoDB documentation on highlighting:
db.fruit.aggregate([
{
$searchBeta: {
"search": {
"path": "description",
"query": ["variety", "bunch"]
},
"highlight": {
"path": "description"
}
}
},
{
$project: {
"description": 1,
"_id": 0,
"highlights": { "$meta": "searchHighlights" }
}
}
])
I'm afraid that solution applies only to MongoDB Atlas at the moment, @LF00.

URL with reference to object from HATEOAS REST response in AngularJS

I am using the @RepositoryRestResource annotation to expose Spring Data JPA as a RESTful service. It works great. However, I am struggling with referencing a specific entity within an Angular app.
As is well known, Spring Data REST doesn't serialise the @Id of the entity, but the HAL response contains links to entities (_links.self, _embedded.projects[]._links.self), as in the following example:
{
"_links": {
"self": {
"href": "http://localhost:8080/api/projects{?page,size,sort}",
"templated": true
}
},
"_embedded": {
"projects": [
{
"name": "Sample Project",
"description": "lorem ipsum",
"_links": {
"self": {
"href": "http://localhost:8080/api/projects/1f888ada-2c90-48bc-abbe-762d27842124"
}
}
},
...
My Angular application needs to put some kind of reference to a specific project entity in the URL, like http://localhost/angular-app/#/projects/{id}. I don't think using the href is a good idea. The UUID (@Id) seems better, but it is not explicitly listed as a field. This is the point where I got stuck. After reading tons of articles I came up with two ideas, but I don't consider either of them perfect:
Idea 1:
Explicitly enable serialisation of the @Id field and just use it to reference the object.
Caveat: exposing database-specific innards to the front end.
Idea 2:
Keep the @Id field internal and create an extra "business identifier" field which can be used to identify a specific object.
Caveat: Extra field in table (wasting space).
I would appreciate your comments on this. Maybe I am just unnecessarily hesitant to implement either of the presented ideas, or maybe there is a better one.
To give you another option, there is a special wrapper for Angular+Spring Data Rest that could probably help you out:
https://github.com/guylabs/angular-spring-data-rest
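
Another possibility, not taken from the wrapper above, is to derive the identifier from the self link on the client. The idea is sketched here in Python for brevity; the same few lines translate directly to JavaScript. It assumes the last path segment of a non-templated _links.self.href is the entity ID, which holds for the sample response in the question but couples the client to the server's URL layout.

```python
from urllib.parse import urlparse


def id_from_self_link(resource):
    """Return the last path segment of a HAL resource's `_links.self.href`."""
    href = resource["_links"]["self"]["href"]
    # Drop a trailing slash so the final segment is never empty.
    path = urlparse(href).path.rstrip("/")
    return path.rsplit("/", 1)[-1]
```

The extracted value can then be used in the client-side route, and the full href rebuilt or looked up when the entity has to be fetched again.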

Solr document Submission

I am new to Solr. Can you please tell me how a document can be submitted to Solr using the user interface? Is it necessary to create an XML version of the document first? I am looking for the simplest way to index a document.
Please help.
The default Solr RequestHandler (from 4.0) supports four formats: XML, JSON, CSV and javabin. There's a page under the Admin interface to submit documents to the index (select the core and Documents).
There are examples of each of the formats available in the Solr reference guide. If you're using a client library, the library will usually handle this for you anyways, and use an appropriate format depending on which language it's written in and what built-in libraries are available.
The simplest format for manually adding documents is probably JSON:
[
{
"id": "1",
"title": "Doc 1"
},
{
"id": "2",
"title": "Doc 2"
}
]
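
Outside the Admin interface, the same JSON can be posted to the core's update handler. Below is a minimal Python sketch using only the standard library. The Solr base URL and the core name are placeholders, the function names are made up, and commit=true is added so the documents become searchable immediately.

```python
import json
import urllib.request


def build_update_request(solr_url, core, docs):
    """Build (but do not send) a JSON update request for a Solr core."""
    return urllib.request.Request(
        url=f"{solr_url}/{core}/update?commit=true",
        data=json.dumps(docs).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def submit_documents(solr_url, core, docs):
    """Send the documents; needs a running Solr instance."""
    with urllib.request.urlopen(build_update_request(solr_url, core, docs)) as response:
        return response.status
```

For example, submit_documents("http://localhost:8983/solr", "mycore", docs) would send the two documents above, assuming a core named mycore exists.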
You can also use the DataImportHandler to import data locally at the server, such as from an SQL-database. In that case you don't submit the actual rows to the server, but you tell the handler to fetch the rows and create documents for you.
