Solr; read specific data - solr

Is there any way to read specific data in Solr? For example, using Nutch I crawled http://www.amazon.com/Jessica-Simpson-Womens-Asymmetrical-X-Small/dp/B018MRT16Q/ref=lp_13906149011_1_3?m=ATVPDKIKX0DER&s=apparel&ie=UTF8&qid=1455828781&sr=1-3&nodeID=13906149011;
then with Solr I want to search through it and display just the price of the jacket.

Yes. If you only want to search and return a particular field, you can search on that field explicitly using <fieldname>:<query>, and to return only particular fields use the fl parameter, e.g. fl=Price.
So your Solr query should look something like this:
http://localhost:8983/solr/collection1/select?q=Price:500&fl=Price
Additionally, if you want to search over and return multiple fields, use the qf parameter along with the edismax parser. Note that qf takes a whitespace-separated field list (encoded as + in a URL), while fl accepts commas:
http://localhost:8983/solr/collection1/select?q=500&fl=Price,Name&defType=edismax&qf=Price+Name
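If you build these URLs from code, it is safer to let a library do the escaping. Here is a minimal Python sketch, assuming the same host and core name as in the URLs above; the helper name field_query_url is made up for illustration, and the value is assumed to contain no special query syntax characters.

```python
from urllib.parse import urlencode

# Assumed base URL, copied from the example above; adjust host and core name.
SOLR_SELECT = "http://localhost:8983/solr/collection1/select"


def field_query_url(field, value, return_fields):
    """Build a select URL that searches one field and returns only some fields."""
    params = {
        "q": f"{field}:{value}",        # fielded search, e.g. Price:500
        "fl": ",".join(return_fields),  # fl accepts a comma-separated list
    }
    # urlencode percent-encodes ":" and "," so the URL is safe to send as-is.
    return f"{SOLR_SELECT}?{urlencode(params)}"


print(field_query_url("Price", "500", ["Price"]))
```

The printed URL is equivalent to the hand-written one; only the colon is percent-encoded.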

Related

Synonyms in Solr Query Results

Whenever I query a string in Solr, I want to get the synonyms of the field value, if any exist, as part of the query result. Is it possible to do that in Solr?
There is no direct way to fetch the synonyms used in the search results. You can get close by looking at how Solr parsed your query: add the debugQuery=true parameter and look at the parsedquery value in the response, but it would not be straightforward. For example, if you search for "tv" on a text field that uses synonyms you will get something like this:
$ curl "localhost:8983/solr/your-core/select?q=tv&debugQuery=true"
{
...
"parsedquery":"SynonymQuery(Synonym(_text_:television _text_:televisions _text_:tv _text_:tvs))",
...
Another approach would be to load the synonyms.txt file that Solr uses into your application and do the mapping yourself. Again, not straightforward.
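As a rough illustration of the second approach, here is a simplified Python sketch that reads lines in the synonyms.txt format (comma-separated equivalent terms, or explicit "a, b => c" mappings). It assumes the default expand behaviour and ignores backslash escaping, case handling and multi-word matching, so treat it as a starting point rather than a faithful copy of what Solr's synonym filter does. The function name parse_synonyms is made up.

```python
def parse_synonyms(lines):
    """Map each source term to the set of terms it expands to (simplified)."""
    mapping = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if "=>" in line:
            # Explicit mapping: terms on the left are replaced by those on the right.
            left, right = line.split("=>", 1)
            sources = [t.strip() for t in left.split(",") if t.strip()]
            targets = {t.strip() for t in right.split(",") if t.strip()}
        else:
            # Equivalent terms: each term expands to the whole group.
            sources = [t.strip() for t in line.split(",") if t.strip()]
            targets = set(sources)
        for term in sources:
            mapping.setdefault(term, set()).update(targets)
    return mapping


# Sample lines stand in for the contents of a real synonyms.txt file.
sample = ["# televisions", "tv, television, televisions, tvs", "i-pod, i pod => ipod"]
synonyms = parse_synonyms(sample)
print(sorted(synonyms["tv"]))
```

In a real application you would pass an open file object instead of the sample list, then look up each term of a returned field value in the mapping.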

How to create a solr query that searches by multiple keywords in all fields

I want to perform a solr query on all fields for multiple keywords. For example, I want to search for the word "dog" AND the word "cat".
So far, I've tried to do something like this:
q=dog cat
or something like:
q=dog,cat
However, I think my queries are actually doing an OR instead of an AND.
Your question is about the default operator (AND/OR) and you want to search in "all fields".
For most parsers you can use the q.op parameter to change the default operator (e.g. for the Standard Query Parser and the DisMax Query Parser), or, in older Solr versions, you can set defaultOperator in schema.xml or through the Schema API.
Be aware that, without further configuration, you will search only in the default field.
If you want to search in "all fields", you have to either copy all your fields into one field (and use it as the default field) or list all your fields in the DisMax qf parameter.
The results will not be the same. In the second case each term is matched against the listed fields using each field's own analysis (tokenizer and filters); in the first case all terms end up in the single default field, so terms that come from different source fields can match together.
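A sketch of the qf variant in Python, for illustration only: the core name your-core and the field names title and content are placeholders, and the example assumes the edismax parser, where q.op=AND requires all terms unless an mm parameter overrides it.

```python
from urllib.parse import urlencode

# Placeholder URL; the question does not name a host or core.
SOLR_SELECT = "http://localhost:8983/solr/your-core/select"


def all_terms_url(terms, fields):
    """Build a query that requires every term, searched across the given fields."""
    params = {
        "q": " ".join(terms),     # "dog cat", no explicit operator needed
        "q.op": "AND",            # make AND the default operator for this request
        "defType": "edismax",
        "qf": " ".join(fields),   # qf is whitespace-separated
    }
    return f"{SOLR_SELECT}?{urlencode(params)}"


print(all_terms_url(["dog", "cat"], ["title", "content"]))
```

Writing q=dog AND cat (or q=+dog +cat) has the same effect for a single request without changing the default operator.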

Why does Dismax not work in simple query?

All:
I am pretty new to Solr. I uploaded some documents which have "season" in the content field (stored but not indexed, copied to the text field) and in the title field (stored and indexed, copied to the text field).
When I use a basic query without dismax, like:
http://localhost:8983/solr/collection1/select?q=season&rows=5&wt=json&indent=true
It works very well and returns correct results. But when I wanted to boost documents that have more occurrences of "season" in content rather than in title, I used dismax like this (I guess the way I use it is totally wrong, because content is not indexed, but I at least expected some results, even incorrect ones):
http://localhost:8983/solr/collection1/select?q=season&rows=5&wt=json&indent=true&defType=dismax&qf=content%5E100+title%5E1
No matching results are returned. Could anyone help me with this, or show me how to use dismax correctly?
Thanks
In your second query you give the "content" field the dominant boost in qf, but earlier you write that this field is stored but not indexed. If a field is not indexed, you cannot search against it.
I faced the same problem. I tracked it down to the schema definition: for dismax to work, the field type should be a text type and not string,
e.g. text_general, text_en_splitting, text_en.
It's because of the tokenizers used for these field types.
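Putting both answers together, the qf list should name only indexed, tokenized fields. Here is a hedged Python sketch that builds such a dismax URL; it assumes the text field (the copy target mentioned in the question) is indexed with a text type, and the boost values are simply carried over from the question.

```python
from urllib.parse import urlencode

# Base URL taken from the question; adjust for your setup.
SOLR_SELECT = "http://localhost:8983/solr/collection1/select"


def dismax_url(query, boosts, rows=5):
    """Build a dismax query; boosts maps an indexed field name to its boost."""
    # qf entries are separated by whitespace, each written as field^boost.
    qf = " ".join(f"{field}^{boost}" for field, boost in boosts.items())
    params = {
        "q": query,
        "rows": rows,
        "wt": "json",
        "defType": "dismax",
        "qf": qf,
    }
    return f"{SOLR_SELECT}?{urlencode(params)}"


# "text" replaces the non-indexed "content" field (assumption, see above).
print(dismax_url("season", {"text": 100, "title": 1}))
```

To boost on content specifically, that field would have to be re-indexed with indexed="true" first.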

solr query not returning results

When I enter the search URL
http://localhost:8983/solr/select?qt=standard&rows=10&q=*:*
I get a response with 10 documents.
But when I test a specific query, nothing comes up. For example:
http://localhost:8983/solr/select?qt=standard&rows=10&q=white
Why is that happening? I can clearly see in the results that there is a document with the word "White" in it. So why doesn't Solr return that document as a result?
q=*:* matches all documents, hence you get results back.
q=white will search for white in the default search field, which is usually text if you have not modified schema.xml.
<defaultSearchField>text</defaultSearchField>
You can change the default field to the field you want to search on.
Or name the field explicitly in the query, e.g. q=title:white to search the title field.
If you want to search on multiple fields, you can combine them into one field using copyField, or use the dismax request handler.

Limiting the output from MoreLikeThis in Solr

I'm trying to use MoreLikeThis to get all similar documents but not documents with a specific contenttype.
So the first query needs to find the one document that I want to get "More Like This" for, and the second query needs to limit the similar documents to ones that are not PDFs (-contenttype:pdf).
Does anyone know if this is possible?
Thanks
When using the MoreLikeThisHandler, all the common parameters are applied to the mlt result set. So you can use the fq parameter to exclude your PDF documents from the mlt results:
http://localhost:8983/solr/mlt?q=test&mlt.fl=text&fq=-contenttype:pdf
The q parameter selects the document used to generate the mlt results (actually, the first document matching the initial query is used).
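The same request can be assembled in code. This Python sketch assumes the /mlt handler path from the URL above; the helper name mlt_url and the exclude argument are made up for illustration.

```python
from urllib.parse import urlencode

# Handler URL from the answer above; assumes an /mlt handler is configured.
MLT_URL = "http://localhost:8983/solr/mlt"


def mlt_url(query, mlt_fields, exclude=None):
    """Build a MoreLikeThis URL, filtering out documents by field value."""
    params = [("q", query), ("mlt.fl", ",".join(mlt_fields))]
    # Each fq is an independent filter; a leading "-" negates the clause.
    for field, value in (exclude or {}).items():
        params.append(("fq", f"-{field}:{value}"))
    return f"{MLT_URL}?{urlencode(params)}"


print(mlt_url("test", ["text"], exclude={"contenttype": "pdf"}))
```

Passing several entries in exclude adds one fq parameter per entry, so a similar document is dropped if it matches any of them.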

Resources