Solr document Submission - solr

I am new in the solr technology.Can you please tell me how a document can submit to solr using user interface.Is it necessary to create xml of the document first?I expect a simplest way of document indexing..
Please Help.

The default Solr RequestHandler (from 4.0) supports four formats: XML, JSON, CSV and javabin. There's a page under the Admin interface to submit documents to the index (select the core and Documents).
There are examples of each of the formats available in the Solr reference guide. If you're using a client library, the library will usually handle this for you anyways, and use an appropriate format depending on which language it's written in and what built-in libraries are available.
The simplest format for manually adding documents is probably JSON:
[
{
"id": "1",
"title": "Doc 1"
},
{
"id": "2",
"title": "Doc 2"
}
]
You can also use the DataImportHandler to import data locally at the server, such as from an SQL-database. In that case you don't submit the actual rows to the server, but you tell the handler to fetch the rows and create documents for you.

Related

Alfresco 7: Search both in the version store and the content store

I have Alfresco 7.0 community edition, with search services version 2.0.2.
My need is to search for all documents having some metadata values, both in the version store and the content store.
I've tryied to search against the versionStore from public api with the following body
{
"query": {
"language": "afts",
"query": "TYPE:\"myc:projects\" AND myc:prop:\"pippo\""
},
"paging": {
"maxItems": 100,
"skipCount": 0
},
"include": [
"allowableOperations",
"properties"
],
"scope": {
"locations": "versions"
}
}
but I get http status 500.
If I try to search from Node Browser I receive this error message No solr query support for store workspace://version2Store.
I also tried the ligthweigth store (which is the difference?)
Is it possible to search with lucene, AFTS against the versions? Do I need to enable some property in alfresco-global.properties?
I've also seen this question and it's seems to me they can do searches.
Thanks a lot.
I've been able to handle both types of searches doing a CMIS query, but I had to write a module that handle that type of reseach (the form, results to be joined and displayed...).
Versioned nodes have a property which point to the noreRef of the current node.

Using search.in with all

Follwing statement find all profiles that has Facebook or twitter and this works:
$filter=SocialAccounts/any(x: search.in(x, 'Facebook,Twitter'))
But I cant find any samples for finding all that has both Facebook and twitter. I tried:
$filter=SocialAccounts/all(x: search.in(x, 'Facebook,Twitter'))
But this is not valid query.
Azure Search does not support the type of ‘all’ filter that you’re looking for. Using search.in with ‘all’ would be equivalent to using OR, but Azure Search can only handle AND in the body of an ‘all’ lambda (which is equivalent to OR in the body of an ‘any’ lambda).
You might try a workaround like this:
$filter=tags/any(t: t eq 'Facebook') and tags/any(t: t eq 'Twitter')
However, this isn't actually equivalent to using all with search.in. The query as expressed using all is matching documents where every social account is strictly either Facebook or Twitter. If any other social account is present, the document won’t match. The workaround doesn’t have this property. A document must have at least Facebook and Twitter in order to match, but not exclusively those. This is certainly a valid scenario; it just isn't the same as using all with search.in, which was the original question.
No matter how you try to rewrite the query, you won’t be able to express an equivalent to the all query. This is a limitation due to the way Azure Search stores collections of strings and other primitive types in the inverted index.
Please vote on user voice to help prioritize:
https://feedback.azure.com/forums/263029-azure-search/suggestions/37166749-efficient-way-to-express-a-true-all
A possible workaround is to use the new Complex Types feature, which does allow more expressive filters inside lambda expressions. For example, if you model tags as objects with a single value property instead of as a collection of strings, you should be able to execute a filter like this:
$filter=tags/all(t: search.in(t/value, 'Facebook,Twitter'))
In the REST API, you'd define tags like this:
{
"name": "myindex",
"fields": [
...
{
"name": "tags",
"type": "Collection(Edm.ComplexType)",
"fields": [
{ "name": "value", "type": "Edm.String", "filterable": true }
]
}
]
}
Note that this feature is in preview at the time of this writing, but will be generally available (and publicly documented) soon.

Fetch partial documents from couchdb

I'm using couchdb to store large documents, which is causing some trouble when fetching them to memory. I do realize the database is not meant to be used this way. As a fallback solution, is it possible to fetch partial documents from the database, without creating a view?
In example, if a document has the fields id, content and extra_content, I would like to retrieve only the first two.
Thank you in advance.
If you are using CouchDB 2.x, you can use /db/_find endpoint as a mechanism to retrieve part of the doc.
POST /db/_find
{
"selector": {
"_id": "a-doc-id"
},
"fields": [
"_id",
"content"
]
}
You'll get only the set of fields you have specified in the query
This is not possible prior to CouchDB 2.x. For CouchDB 2.x or greater, see JuanjoRodriguez's answer.
But one possible work around for any version of CouchDB would be to take advantage of file attachments, which by default are excluded from a fetch. If some of your data isn't always needed, and doesn't need to be included in indexes, you could potentially store it as (JSON) attachments, rather than as part of the document directly:
{
"id": "foo",
"content": "stuff",
"extra_content": "other stuff"
}
becomes:
{
"id": "foo",
"content": "stuff",
"_attachments": {
"extra_content": {
"content_type": "application/json",
"data": "ZXh0cmEgc3R1ZmYK"
}
}
}

Append sensor data into document in Couchbase database

We are a group of people writing a bachelor-project about storing sensor data into a noSQL-database, and we have chosen couchbase for this.
We want to store quite a few data in the same document, one document per day, per sensor, and we want to append new sensor data witch comes in every minute.
But unforunatly, we are not able to append new data into existing document without overwriting the existing data.
The structure for the documents is:
DocumentID: Sensor + date, ie: KitchenTemperature20180227
{
"topic": "Kitchen/Temp",
"type": "temperature",
"unit": "DegC"
"20180227130400": [
{
"data": "24"
}
],
..............
"20180227130500": [
{
"data": "25"
}
],
}
We are all new to couchbase and NoSql-databases, but eager to learn and understand how we the best way should implemet this.
We've tried upsert, insert and update commands, but they all overwrite the existing document or won't execute because the document already exists. As you can see, we have some top-level information, like topic, type, unit. The rest should be data coming in every minute and appended to the existing document.
Help on how to proceed would be very appriciated.
Best regards, Kenneth
In this case you can use the subdocument API. This allows you to modify portions of a document based on a "path". This image gives the idea for getting a subdocument.
You can mutate subdocuments as well. Look at the subdocument API documentation for Couchbase. There are also blog posts that go through examples in Java and Go on the Couchbase blog site.

How to fetch data from couchDB using couch api?

Instead keys and IDs alone, I want to get all the docs via couch api. I have tried with GET "http://localhost:5984/db-name/_all_docs" but it returned
{
"total_rows":4,
"offset":0,
"rows":[
{"id":"11","key":"11","value":{"rev":"1-a0206631250822b37640085c490a1b9f"}},
{"id":"18","key":"18","value":{"rev":"30-f0798ed72ceb3db86501c69ed4efa39b"}},
{"id":"3","key":"3","value":{"rev":"15-0dcb22bab2b640b4dc0b19e07c945f39"}},
{"id":"6","key":"6","value":{"rev":"4-d76008cc44109bd31dd32d26ba03125d"}}
]
}
From the documentation
for the below request it will send the data as we expected but it requires set of keys in request.
POST /db/_all_docs HTTP/1.1
{
"keys" : [
"11",
"18"
]
}
Thanks in advance.
The _all_docs endpoint is actually just a system-level view that uses the _id field as the index. Thus, any parameters that you can use for views also apply here.
If you read the documentation further, you'll find that adding the parameter include_docs=true to your view will include the original documents in the results. The documents will be added as the doc field alongside id, value and rev.

Resources