I have Solr 5.1.3 (running on Jetty) installed and have indexed more than 15,000 documents using Tika. I index and store each document's published date and content in Solr. I have enabled highlighting in solrconfig.xml. Here is the request handler XML for the highlighted terms:
<requestHandler name="/select" class="solr.SearchHandler">
<!-- default values for query parameters can be specified, these
will be overridden by parameters in the request
-->
<lst name="defaults">
<str name="echoParams">explicit</str>
<int name="rows">10</int>
<str name="hl">on</str>
<str name="hl.fl">content</str>
<str name="hl.simple.pre">&lt;b&gt;</str>
<str name="hl.simple.post">&lt;/b&gt;</str>
<str name="f.content.hl.snippets">3</str>
<str name="f.content.hl.fragsize">200</str>
<str name="f.content.hl.maxAnalyzedChars">200000</str>
<str name="f.content.hl.alternateField">content</str>
<str name="f.content.hl.maxAlternateFieldLength">750</str>
</lst>
</requestHandler>
<!-- A request handler that returns indented JSON by default -->
<requestHandler name="/query" class="solr.SearchHandler">
<lst name="defaults">
<str name="echoParams">explicit</str>
<str name="wt">json</str>
<str name="indent">true</str>
<str name="df">content</str>
<str name="hl">on</str>
<str name="hl.fl">content</str>
<str name="hl.simple.pre">&lt;b&gt;</str>
<str name="hl.simple.post">&lt;/b&gt;</str>
<str name="f.content.hl.snippets">3</str>
<str name="f.content.hl.fragsize">200</str>
<str name="f.content.hl.maxAnalyzedChars">200000</str>
<str name="f.content.hl.alternateField">content</str>
<str name="f.content.hl.maxAlternateFieldLength">750</str>
</lst>
</requestHandler>
It returns up to three highlights, with the search text in bold. For example, if I search for "Lorem" as the query term, it returns a highlight like this:
<b>Lorem</b> ipsum dolor sit amet 2016, consectetur adipiscing elit. Sed volutpat metus lorem, a placerat nibh sodales in. Cras in mauris tempus, vulputate felis eu, tincidunt erat.
But when I search for documents whose publish date falls between one year ago and now, it highlights two terms. For example, if I search for "Lorem" and docPublishDate:[2015-01-20 TO 2016-01-20], it returns a highlight like this:
<b>Lorem</b> ipsum dolor sit amet <b>2016</b>, consectetur adipiscing elit. Sed volutpat metus lorem, a placerat nibh sodales in. Cras in mauris tempus, vulputate felis eu, tincidunt erat.
I don't want Solr to highlight the 2016 text as well. I want only Lorem to be bold. What should I do to achieve this?
Use a filter query to limit the set of documents to be returned instead - filters given as fq parameters are not used for highlighting.
You can also use the hl.q parameter to use a specific query for highlighting, so you could also submit the query to the highlighter without the date part - but this case seems to be better suited to using a filter query.
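As a rough sketch of the two options, here is how the request parameters could be built. Python is used only to assemble the query string; the helper names `with_filter_query` and `with_highlight_query` are made up for illustration, and the field name and date range are copied from the question.

```python
from urllib.parse import urlencode, parse_qs

# Date restriction copied from the question.
DATE_RANGE = "docPublishDate:[2015-01-20 TO 2016-01-20]"

def with_filter_query(term):
    # The date range goes into fq: it restricts the result set
    # but is not passed to the highlighter.
    return {"q": term, "fq": DATE_RANGE, "hl": "on", "hl.fl": "content"}

def with_highlight_query(term):
    # The date range stays in q (joined with the uppercase AND operator);
    # hl.q tells the highlighter to use the search term only.
    return {"q": f"{term} AND {DATE_RANGE}", "hl.q": term,
            "hl": "on", "hl.fl": "content"}

query_string = urlencode(with_filter_query("Lorem"))
print("/select?" + query_string)
print("/select?" + urlencode(with_highlight_query("Lorem")))
```

The first variant is usually preferable here, since filter queries are also cached separately from the main query.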
Related
I was taking a look at the solrconfig.xml for the dismax parser and found a bunch of values such as sku, manu and cat. What are these?
<requestHandler name="dismax" class="solr.SearchHandler" >
<lst name="defaults">
<str name="defType">dismax</str>
<str name="echoParams">explicit</str>
<float name="tie">0.01</float>
<str name="qf">
text^0.5 features^1.0 name^1.2 sku^1.5 id^10.0 manu^1.1 cat^1.4
</str>
<str name="pf">
text^0.2 features^1.1 name^1.5 manu^1.4 manu_exact^1.9
</str>
<str name="bf">
popularity^0.5 recip(price,1,1000,1000)^0.3
</str>
<str name="fl">
id,name,price,score
</str>
<str name="mm">
2&lt;-1 5&lt;-2 6&lt;90%
</str>
<int name="ps">100</int>
<str name="q.alt">*:*</str>
<!-- example highlighter config, enable per-query with hl=true -->
<str name="hl.fl">text features name</str>
<!-- for this field, we want no fragmenting, just highlighting -->
<str name="f.name.hl.fragsize">0</str>
<!-- instructs Solr to return the field itself if no query terms are
found -->
<str name="f.name.hl.alternateField">name</str>
<str name="f.text.hl.fragmenter">regex</str> <!-- defined below -->
</lst>
</requestHandler>
Those are fields being searched for: SKU (stock keeping unit), manufacturer and categories.
You are probably looking at the solrconfig.xml that is provided as a sample, meant to be used with the documents in the exampledocs/ directory.
These are the field names that the sample docs (and schema) contain. It is just a sample installation of Solr.
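To make the `field^boost` notation in qf and pf concrete, here is a small sketch that splits such a string into field names and boost values. The function name `parse_boosts` is hypothetical; this only illustrates how the string reads and is not how Solr parses it internally.

```python
def parse_boosts(spec):
    """Split a qf/pf string such as 'name^1.2 sku^1.5' into {field: boost}."""
    boosts = {}
    for token in spec.split():
        field, _, boost = token.partition("^")
        # A field listed without ^boost is treated as boost 1.0.
        boosts[field] = float(boost) if boost else 1.0
    return boosts

qf = "text^0.5 features^1.0 name^1.2 sku^1.5 id^10.0 manu^1.1 cat^1.4"
boosts = parse_boosts(qf)

# List the fields from the most to the least heavily weighted.
for field, boost in sorted(boosts.items(), key=lambda kv: -kv[1]):
    print(f"{field}: {boost}")
```

Each key is just a field defined in the sample schema, so for your own index you would replace them with your own field names.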
Say I have a field in Solr which has the value, "The rain in Spain falls mainly in the plain."
And I want a highlighted result for the phrase, "falls mainly".
I pass these parameters to the select...
<lst name="params">
<str name="hl.fragsize">-1</str>
<str name="q">"falls mainly"</str>
<str name="hl.q">"falls mainly"</str>
<str name="hl.simple.pre">##pre##</str>
<str name="hl.simple.post">##post##</str>
<str name="hl.fl">note</str>
<str name="hl.maxAnalyzedChars">-1</str>
<str name="hl">true</str>
<str name="rows">2147483647</str>
</lst>
And the response comes back with each phrase word individually highlighted...
<lst name="highlighting">
<lst name="test">
<arr name="note">
<str>
The rain in Spain ##pre##falls##post## ##pre##mainly##post## in the plain.
</str>
</arr>
</lst>
</lst>
What I would have expected was the phrase highlighted...
<lst name="highlighting">
<lst name="test">
<arr name="note">
<str>
The rain in Spain ##pre##falls mainly##post## in the plain.
</str>
</arr>
</lst>
</lst>
I am using Solr version 4.0.
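One possible client-side workaround, sketched under the assumption that post-processing the returned snippet is acceptable: join highlighted terms that are separated only by whitespace. This is not a Solr feature, `merge_adjacent` is a made-up helper name, and it will also merge neighbouring highlighted terms that were not part of a phrase.

```python
import re

def merge_adjacent(snippet, pre="##pre##", post="##post##"):
    """Join highlighted terms that are separated only by whitespace."""
    # Match a closing marker, some whitespace, then an opening marker,
    # and keep only the whitespace.
    pattern = re.escape(post) + r"(\s+)" + re.escape(pre)
    return re.sub(pattern, r"\1", snippet)

snippet = "The rain in Spain ##pre##falls##post## ##pre##mainly##post## in the plain."
merged = merge_adjacent(snippet)
print(merged)
```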
For Solr autowarming, is there any way to autowarm the filter queries that were executed previously?
Yes. Create firstSearcher and newSearcher event listeners as documented here on the Solr wiki: http://wiki.apache.org/solr/SolrCaching#newSearcher_and_firstSearcher_Event_Listeners
It will look like this in your solrconfig.xml
<listener event="firstSearcher" class="solr.QuerySenderListener">
<arr name="queries">
<!-- seed common sort fields -->
<lst> <str name="q">anything</str> <str name="sort">name desc, price desc, popularity desc</str> </lst>
<!-- seed common facets and filter queries -->
<lst> <str name="q">anything</str>
<str name="facet.field">category</str>
<str name="fq">inStock:true</str>
<str name="fq">price:[0 TO 100]</str>
</lst>
</arr>
</listener>
I have imported my XML documents from an Oracle database and indexed them. When I search for *:* in the admin console, I do get results. My XML format is not close to what Solr expects, but when I search for any word that is part of one of my XML documents, Solr still displays the whole document. For example, if I search for the word "voicemail", Solr displays the XML documents that contain the word "voicemail".
Now, when I go to solr/browse and enter *:*, I do see something, but each result looks like the one below (no data). I get the same output even if I search for the same word, "voicemail". Can somebody please advise?
Price:
Features:
In Stock
There are only two things I can think of; one is the settings in solrconfig.xml (shown below).
<requestHandler name="/browse" class="solr.SearchHandler">
<lst name="defaults">
<str name="echoParams">explicit</str>
<str name="wt">velocity</str>
<str name="v.template">browse</str>
<str name="v.layout">layout</str>
<str name="title">Solritas</str>
<str name="df">text</str>
<str name="defType">edismax</str>
<str name="q.alt">*:*</str>
<str name="rows">10</str>
<str name="fl">*,score</str>
<str name="mlt.qf">
text^0.5 features^1.0 name^1.2 sku^1.5 id^10.0 manu^1.1 cat^1.4
</str>
<str name="mlt.fl">text,features,name,sku,id,manu,cat</str>
<int name="mlt.count">3</int>
<str name="qf">
text^0.5 features^1.0 name^1.2 sku^1.5 id^10.0 manu^1.1 cat^1.4
</str>
You will find the browse.vm and product.vm files in the solr/collection/conf/velocity folder.
By default, Solr ships with one example based on books data, which is why the UI shows Price:, Features: and In Stock.
If you want to show the metadata of your index fields on the browse page, modify the product-doc.vm file in the location mentioned above and add your metadata field names to it.
I need to get snippets from documents where the query terms are matched to be able to output results similar to Google's snippet beneath the website URL. For example:
Snippet - Wikipedia, the free encyclopedia
en.wikipedia.org/wiki/Snippet
A snippet is defined as a small piece of something, it may in more specific contexts refer to: Sampling (music), the use of a short phrase of a recording as an ...
I have set hl=true and even hl.fl='*' in the query URL, but no summaries are being output.
Solr FAQs say:
For a field to be summarizable it must be both stored and indexed.
I'm using Nutch and Solr and have set them up using this tutorial. What additional steps do I need to take to be able to do this?
Adding sample query and output:
http://localhost:8983/solr/select/?q=test&version=2.2&start=0&rows=10&indent=on&hl=true
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">57</int>
<lst name="params">
<str name="indent">on</str>
<str name="start">0</str>
<str name="q">test</str>
<str name="hl">true</str>
<str name="version">2.2</str>
<str name="rows">10</str>
</lst>
</lst>
<result name="response" numFound="94" start="0">
<doc>
<arr name="anchor">
<str>User:Sir Lestaty de Lioncourt</str>
</arr>
<float name="boost">0.0</float>
<str name="digest">6c27160d0b08068f3873bb2c063508b3</str>
<str name="id">
http://aa.wikibooks.org/wiki/User:Sir_Lestaty_de_Lioncourt
</str>
<str name="segment">20111029223245</str>
<str name="title">User:Sir Lestaty de Lioncourt - Wikibooks</str>
<date name="tstamp">2011-10-29T21:34:27.055Z</date>
<str name="url">
http://aa.wikibooks.org/wiki/User:Sir_Lestaty_de_Lioncourt
</str>
</doc>
...
</result>
<lst name="highlighting">
<lst name="http://aa.wikibooks.org/wiki/User:Sir_Lestaty_de_Lioncourt"/>
<lst name="http://aa.wikipedia.org/wiki/User:PipepBot"/>
<lst name="http://aa.wikipedia.org/wiki/User:Purodha"/>
...
</lst>
</response>
Looks like you aren't specifying the field to highlight (hl.fl). You should create a text field to use for highlighting (don't use string type) and have it stored/indexed.
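A minimal sketch of what the request could look like once such a field exists. The field name `content` is an assumption (use whatever stored, indexed text field your schema defines), and the helper names `highlight_url` and `snippets` are made up for illustration. Python is used only to build the URL and to read the `highlighting` section of a JSON response.

```python
from urllib.parse import urlencode

def highlight_url(base, query, field):
    # hl.fl names the field to build snippets from; "content" below is an
    # assumed field name, not one taken from the question's schema.
    params = {"q": query, "hl": "true", "hl.fl": field, "wt": "json"}
    return base + "?" + urlencode(params)

def snippets(response, field):
    """Map each document id to its highlighted fragments for one field."""
    highlighting = response.get("highlighting", {})
    return {doc_id: fields.get(field, [])
            for doc_id, fields in highlighting.items()}

url = highlight_url("http://localhost:8983/solr/select/", "test", "content")
print(url)
```

An empty list for a document, as in the output shown in the question, means no fragment could be produced for that field, which is what happens when the field is not stored.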