I'm using the Solr example server for an investigation. After feeding it all my cached documents, mostly HTML files, everything works fine except highlighting.
The request URL I'm using is as follows:
http://localhost:8983/solr/collection1/select?q=keyword&wt=xml&hl=true
And the XML response is as follows:
<response>
<lst name="responseHeader">...</lst>
<result name="response" numFound="371" start="0">
<doc>
<arr name="links">
<str>rect</str>
<str>FJU_KDJFJJ_DJ_13</str>
</arr>
<str name="id">
F:\SkyDrive\funproj\cache\adfadf\asdff.htm
</str>
<arr name="title">
<str>asdff.htm</str>
</arr>
<arr name="content_type">
<str>text/html; charset=ISO-8859-1</str>
</arr>
<str name="resourcename">
F:\SkyDrive\funproj\cache\adfadf\asdff.htm
</str>
<arr name="content">
<str>...</str>
</arr>
<long name="_version_">1418589758873927680</long>
</doc>
<doc>...</doc>
</result>
<lst name="highlighting">
<lst name="F:\SkyDrive\funproj\cache\adfadf\asdff.htm"/>
<lst name="F:\SkyDrive\funproj\cache\cvzcv\c58053e10vq.htm"/>
<lst name="F:\SkyDrive\funproj\cache\hgdfhdfgh\c00302e10vq.htm"/>
<lst name="F:\SkyDrive\funproj\cache\asdfasdf\c00945e10vq.htm"/>
<lst name="F:\SkyDrive\funproj\cache\hjmyukt\asfdf06113002_03312010.htm"/>
<lst name="F:\SkyDrive\funproj\cache\nmvbmnm\saf0q033111.htm"/>
<lst name="F:\SkyDrive\funproj\cache\lkiullkl\a10-5974_110q.htm"/>
<lst name="F:\SkyDrive\funproj\cache\jhlhjkl\fdfinal.htm"/>
<lst name="F:\SkyDrive\funproj\cache\vcbxcbvcx\zynex10q33110_5132010.htm"/>
<lst name="F:\SkyDrive\funproj\cache\yuiuiou\v185403_10q.htm"/>
</lst>
</response>
The response, whether JSON or XML, contains no highlighted snippets at all. I've checked solrconfig.xml both on the local file system and in the admin page of the example server. Highlighting is on by default, and pre/post are set to ""/"". The example search portal itself shows highlights in its results, but since it doesn't use AJAX, I have no way to inspect its requests in Chrome.
What did I do wrong?
You have to specify which fields should be highlighted using hl.fl. For example, to search and highlight hits in the content field, you can use the query below:
http://localhost:8983/solr/collection1/select?q=content:keyword&wt=xml&hl=true&hl.q=content:keyword&hl.fl=content
By default, the highlighting response returns only one snippet per field, even if the field has multiple hits. The snippet length (fragsize) also defaults to 100 characters.
You can use hl.snippets and hl.fragsize to change these.
For example, to change fragsize:
http://localhost:8983/solr/collection1/select?q=content:keyword&wt=xml&hl=true&hl.q=content:keyword&hl.fl=content&hl.fragsize=5000
Passing hl.fragsize=0 will make fragsize unlimited.
To change the number of snippets:
http://localhost:8983/solr/collection1/select?q=content:keyword&wt=xml&hl=true&hl.q=content:keyword&hl.fl=content&hl.snippets=10
Refer to the Solr wiki for more parameters.
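As a rough sketch, the request URLs above could also be assembled in code so that the highlighting parameters are spelled out in one place. The helper name build_highlight_url and the SOLR_SELECT constant are made up for this illustration; the host and core name are simply the ones from the question.

```python
from urllib.parse import urlencode

# Assumed base URL, copied from the question; adjust host and core as needed.
SOLR_SELECT = "http://localhost:8983/solr/collection1/select"

def build_highlight_url(keyword, field="content", snippets=1, fragsize=100):
    """Return a select URL that asks Solr to highlight `keyword` in `field`."""
    params = {
        "q": f"{field}:{keyword}",
        "wt": "xml",
        "hl": "true",
        "hl.fl": field,           # fields to highlight; blank by default
        "hl.snippets": snippets,  # maximum number of snippets per field
        "hl.fragsize": fragsize,  # snippet length in characters, 0 = unlimited
    }
    return f"{SOLR_SELECT}?{urlencode(params)}"

url = build_highlight_url("keyword", snippets=10, fragsize=0)
print(url)
```

Note that urlencode percent-encodes the colon in the q value, which Solr decodes again on its side.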
You need to add the hl.fl parameter, naming the field(s) on which highlighting should be enabled.
The default value for this parameter is blank.
I followed this to set up sharding in Solr. Following the topic "Testing Index Sharding on Two Local Servers", I was able to query the shards and get a result (somehost:port1/solr/select?shards=somehost:port1/solr,somehost:port2/solr&indent=true&q=helloworld).
That page also mentions that "Rather than require users to include the shards parameter explicitly, it is usually preferred to configure this parameter as a default in the RequestHandler section of solrconfig.xml."
So I made the following changes in solrconfig.xml of the Solr instance running on port1:
<requestHandler name="/select" class="solr.SearchHandler">
<lst name="defaults">
<str name="echoParams">explicit</str>
<int name="rows">10</int>
<str name="df">text</str>
</lst>
<lst name="shards.info">
<lst name="localhost:port2/solr">
<long name="numFound">1333</long>
<float name="maxScore">1.0</float>
<str name="shardAddress">http://localhost:port2/solr</str>
<long name="time">686</long>
</lst>
<lst name="localhost:port1/solr">
<long name="numFound">342</long>
<float name="maxScore">1.0</float>
<str name="shardAddress">http://localhost:port1/solr</str>
<long name="time">602</long>
</lst>
</lst>
Now I'm trying to hit somehost:port1/solr/collection1/select?q=helloworld&wt=json&indent=true
but I'm not getting the desired response. Please let me know what I'm missing here.
You can't just copy the content from the response into your configuration file; those two formats are completely different. What the documentation refers to is that each entry in the defaults section is added to the query string (unless it is already provided there; there are also options if you want to force a certain value that can't be overridden).
<requestHandler name="/selectdistributed" class="solr.SearchHandler">
<lst name="defaults">
[...]
<str name="shards">somehost:port1/solr,somehost:port2/solr</str>
</lst>
</requestHandler>
This should do what you want: it adds shards=somehost:port1/solr,somehost:port2/solr to the query string of every request that goes through that handler.
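The way defaults combine with the query string can be modelled in a few lines. This is only a toy model of the behaviour described in this answer, not Solr code; the function name effective_params is invented for the sketch.

```python
def effective_params(request_params, defaults):
    """Merge handler defaults into a request; explicit request values win."""
    merged = dict(defaults)        # start from the handler's defaults
    merged.update(request_params)  # anything sent explicitly overrides them
    return merged

# The shards value is copied from the handler definition in this answer.
handler_defaults = {"shards": "somehost:port1/solr,somehost:port2/solr"}

# A request without shards picks up the default.
plain = effective_params({"q": "helloworld"}, handler_defaults)

# A request that passes shards itself keeps its own value.
overridden = effective_params(
    {"q": "helloworld", "shards": "somehost:port1/solr"}, handler_defaults
)

print(plain["shards"])
print(overridden["shards"])
```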
I am attempting to set up a request handler that will boost certain fields by different amounts. I have the following request handler.
<requestHandler name="/select" class="solr.SearchHandler" default="true">
<lst name="defaults">
<str name="echoParams">explicit</str>
<str name="start">0</str>
<int name="rows">10</int>
<str name="defType">edismax</str>
<str name="qf">
title^50.0 searchTitle^7.0 keywords^5.0 content^1.0 text^1.0
</str>
<str name="pf">
title^50.0 searchTitle^7.0 keywords^5.0 content^1.0 text^1.0
</str>
<str name="df">text</str>
</lst>
</requestHandler>
However, the fields aren't being boosted correctly, if at all. I noticed that documents with the search term in the title field aren't appearing any higher than documents with the search term in the text field. Arbitrarily re-arranging the weights produces the same document order each time.
When I go into the Solr admin UI and do a search I get the same results. However, if I explicitly check the edismax checkbox and enter the field-boost data in the qf and pf boxes, I get the results and the weighting I would expect.
In fact, I also just tried changing the rows value to 5 and still received the same result. It looks like my queries aren't being handled by the /select handler, even though that is what I choose both in the Solr admin UI and when I create the HttpSolrServer object to run the queries from the server.
I am using Solr v4.8.0.
Any help would be appreciated.
Check the requestDispatcher setting in solrconfig.xml:
<requestDispatcher handleSelect="false" >
If you want to use select as a request handler, this needs to be
<requestDispatcher handleSelect="true" >
I am trying to crawl data using Nutch and index that data in Solr.
I have followed the steps from "Using Nutch with Solr" and the Nutch Wiki tutorial.
I've successfully indexed data using the solrindex command:
bin/nutch solrindex http://127.0.0.1:8983/solr/ crawl/crawldb -linkdb crawl/linkdb crawl/segments/*
But I can't find the indexed data in the results.
I want a result like the image below.
But I can't see any result data on the right side.
If you want some data to be returned with the search response, check that the targeted fields are stored by Solr; then you can set the list of fields to return in your query using the fl param (with stored field names as the value). You can also set default fl values in solrconfig.xml.
For example, let's say you want the content field to be returned. In your schema.xml, in the <fields> declaration, this field should have the option stored="true", like so:
<field name="content" type="text" indexed="true" stored="true"/>
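One way to check which fields are stored is to read the schema programmatically. The sketch below uses a made-up schema fragment (only the content line comes from this answer) and a hypothetical helper named stored_fields; it only looks at explicit stored attributes on field elements.

```python
import xml.etree.ElementTree as ET

# Hypothetical schema fragment; only the "content" field is from the answer.
SCHEMA = """
<schema name="example" version="1.5">
  <fields>
    <field name="content" type="text" indexed="true" stored="true"/>
    <field name="anchor" type="text" indexed="true" stored="false"/>
  </fields>
</schema>
"""

def stored_fields(schema_xml):
    """Return the names of <field> elements declared with stored="true"."""
    root = ET.fromstring(schema_xml)
    return [
        field.get("name")
        for field in root.iter("field")
        # Fields with no explicit stored attribute are skipped in this sketch.
        if field.get("stored", "").lower() == "true"
    ]

print(stored_fields(SCHEMA))
```

To run it against a real installation, read the contents of your own schema.xml into a string and pass that in instead of SCHEMA.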
Then in solrconfig.xml, declare default fl params in the requestHandler definition; you can list specific fields (space-separated field names). The XML sample (taken from the tutorial) should look like this if we only want the data stored in the url and content fields to be returned.
<requestHandler name="/nutch" class="solr.SearchHandler" >
<lst name="defaults">
<str name="defType">dismax</str>
<str name="echoParams">explicit</str>
<float name="tie">0.01</float>
<str name="qf">
content^0.5 anchor^1.0 title^1.2
</str>
<str name="pf">
content^0.5 anchor^1.5 title^1.2 site^1.5
</str>
<str name="fl">
url content
</str>
<str name="mm">
2&lt;-1 5&lt;-2 6&lt;90%
</str>
<int name="ps">100</int>
<bool name="hl">true</bool>
<str name="q.alt">*:*</str>
<str name="hl.fl">title url content</str>
<str name="f.title.hl.fragsize">0</str>
<str name="f.title.hl.alternateField">title</str>
<str name="f.url.hl.fragsize">0</str>
<str name="f.url.hl.alternateField">url</str>
<str name="f.content.hl.fragmenter">regex</str>
</lst>
</requestHandler>
You can override these defaults directly in the query. A common use case is to put "*,score" in the fl box of the Solr query interface so that you can see all stored fields (using the wildcard character *) along with the score in the results. You might also want to set the query type parameter (qt) to match the targeted request handler (here "/nutch").
Helpful links:
http://wiki.apache.org/solr/SchemaXml#Common_field_options
http://wiki.apache.org/solr/CommonQueryParameters#fl
I am wondering if it is possible with Solr 3.4 to boost a search result, if the query is found in a special field without using the "fieldname:query"-syntax.
Let me explain:
I have several fields in my index. One of them is named "abbreviation" and is filled with text like AVZ, JSP, DECT, ...
To be able to find results when searching purely for "AVZ" I added a
<copyField source="abbreviation" dest="text"/>
in my schema.xml. The field text is my defaultSearchField.
In my opinion this is not the best solution. So I am trying to find out whether it is possible to search for "AVZ" in all fields and, if the string is found in the abbreviation field, boost that result (increase its score) so that it is listed first in the result list. This would be the same as using abbreviation:AVZ AVZ as the query.
The other possibility I can think of is to analyze the query: if a substring like "AVZ" is found, abbreviation:AVZ is appended to the query. But in that case I must be able to find out which abbreviations are indexed. Is it possible to retrieve all terms of a field from the Solr index using SolrJ?
Best Regards
Tobias
Without the fieldname:term syntax, you can define a request handler:
<requestHandler name="search" class="solr.SearchHandler" default="true">
<lst name="defaults">
<str name="echoParams">explicit</str>
<str name="defType">dismax</str>
<str name="qf">
abbreviation^2 text
</str>
<str name="q.alt">*:*</str>
<str name="rows">10</str>
<str name="fl">*,score</str>
</lst>
</requestHandler>
This uses the dismax query parser. You can use edismax as well.
This boosts matches on abbreviation, and the query stays a simple q=AVZ.
If you can only work through the URL, you can boost matches on a specific field in the query itself,
e.g.
q=abbreviation:AVZ^2 text:AVZ
This boosts results with a match on abbreviation, so those documents appear at the top.
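If such a query has to be assembled on the client side, a small helper can generate the per-field clauses. This is a minimal sketch: the function name boosted_query is invented here, and the term is inserted as-is, so a real application would still need to escape special query characters.

```python
def boosted_query(term, boosts):
    """Build one `field:term^boost` clause per field, joined by spaces."""
    clauses = []
    for field, boost in boosts.items():
        clause = f"{field}:{term}"
        if boost != 1:  # a boost of 1 is the default, so leave it out
            clause += f"^{boost}"
        clauses.append(clause)
    return " ".join(clauses)

q = boosted_query("AVZ", {"abbreviation": 2, "text": 1})
print(q)  # abbreviation:AVZ^2 text:AVZ
```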
It is not possible to get all results with dismax using the *:* query.
However, to get all docs, simply do not pass any q param; q.alt=*:* will then return all the docs.
Otherwise, change the defType to edismax:
<requestHandler name="search" class="solr.SearchHandler" default="true">
<lst name="defaults">
<str name="echoParams">explicit</str>
<str name="defType">edismax</str>
<str name="qf">
abbreviation^2 text
</str>
<str name="q.alt">*:*</str>
<str name="rows">10</str>
<str name="fl">*,score</str>
</lst>
</requestHandler>
Apache Solr 6.4.2:
Boosting Exact phrase search not working:
Solrconfig.xml:
<lst name="defaults">
<str name="echoParams">explicit</str>
<int name="rows">10</int>
<str name="defType">edismax</str>
<str name="qf">names^50</str>
<!-- <str name="df">text</str> -->
</lst>
Solr query used to test: q=(names:alex%20pandian)&wt=json&debugQuery=on
In debug mode it shows
"parsedquery_toString":"+((names:alex ((names:pandian)^50.0))) ()"
Only the terms from the second word onward are boosted. In this case only pandian is boosted, but alex is not.
I am developing a search engine app using ASP.NET, C# and SolrNet. I use the standard request handler. Is there a way to boost fields at query time from inside the solrconfig.xml file itself, just like the "qf" field for the Dismax handler?
Right now I am searching like "field1:value^1.5 field2:value^1.2 field3:value^0.8", and this is done in the middle tier. I want Solr itself to do this using the standard request handler.
Can I write something similar inside the standard request handler?
Here is my solrconfig file.
<requestHandler name="standard" class="solr.SearchHandler" default="true">
<lst name="defaults">
<str name="echoParams">explicit</str>
<str name="hl">true</str>
<str name="hl.snippets">3</str>
<str name="hl.fragsize">25</str>
<str name="qf">file_description^100.0 file_content^6.0 file_name^10.0 file_comments^4.0
</str>
</lst>
<arr name="last-components">
<str>spellcheck</str>
</arr>
</requestHandler>
Regards
Vignesh
Inside a 'requestHandler' element in solrconfig.xml you can add
<requestHandler>
<lst name="defaults">
<str name="qf">
field1^3.0 field2^2.0 field3^1.0
</str>
</lst>
</requestHandler>
to provide a predefined field bias. I hope I have understood your question correctly. :-)