Difference between Search result from Admin console and solr/browse

I have imported my XML documents from an Oracle database and indexed them. When I search for *:* in the admin console, I do get results. My XML format is not close to what Solr expects, but when I search for any word that is part of one of my XML documents, Solr still displays the whole document. For example, if I search for the word "voicemail", Solr displays the XML documents that contain "voicemail".
Now when I go to solr/browse and enter *:*, I do see something, but each result looks like the one below (no data). Even if I search for the same word "voicemail", I get the output below. Can somebody please advise?
Price:
Features:
In Stock
There are only two things I can think of; one is the settings in solrconfig.xml (like below).
<requestHandler name="/browse" class="solr.SearchHandler">
<lst name="defaults">
<str name="echoParams">explicit</str>
<str name="wt">velocity</str>
<str name="v.template">browse</str>
<str name="v.layout">layout</str>
<str name="title">Solritas</str>
<str name="df">text</str>
<str name="defType">edismax</str>
<str name="q.alt">*:*</str>
<str name="rows">10</str>
<str name="fl">*,score</str>
<str name="mlt.qf">
text^0.5 features^1.0 name^1.2 sku^1.5 id^10.0 manu^1.1 cat^1.4
</str>
<str name="mlt.fl">text,features,name,sku,id,manu,cat</str>
<int name="mlt.count">3</int>
<str name="qf">
text^0.5 features^1.0 name^1.2 sku^1.5 id^10.0 manu^1.1 cat^1.4
</str>

You will find the browse.vm and product.vm files in the solr/collection/conf/velocity folder.
By default, Solr ships with one example based on books data, which is why the UI shows Price:, Features:, and In Stock.
If you want to show your own index fields on the browse page, you have to modify the product-doc.vm file in the location mentioned above and add your field names to it.

Related

Getting SolrException : Boosting query defined twice for query

I have created two query entries named 'makeup' and 'make up' in elevate.xml.
When I execute the elevate Solr query, I get the exception "Boosting query defined twice for query",
whereas when I save two entries named 'ChildCare' and 'Child Care', Solr returns the results.
Below is my Solr query:
http://localhost:8983/solr/oneweb-collection/elevate?
q=*:*&defType=edismax&fl=id&fl=title&fl=subtitle&fl=course_code&
fl=cricos_code&fl=course_introduction&fl=outcome&fl=page_url&
fl=score&fl=%5Btafe_elevated%5D&rows=3&wt=json
When I save the document nodes, the system internally replaces the spaces and stores the documents under the same name.
What is the resolution for this issue?
Config for elevator:
<searchComponent name="elevator" class="solr.QueryElevationComponent" >
<str name="queryFieldType">text_general</str>
<str name="config-file">elevate.xml</str>
<str name="forceElevation">true</str>
<str name="exclusive">true</str>
<str name="editorialMarkerFieldName">test_elevated</str>
</searchComponent>
<requestHandler name="/elevate" class="solr.SearchHandler" startup="lazy">
<lst name="defaults">
<str name="echoParams">explicit</str>
<str name="defType">edismax</str>
<int name="rows">3</int>
<str name="fl">id,title,subtitle,course_code,cricos_code,course_introduction,outcome,page_url,[test_elevated],score</str>
<str name="q.alt">*:*</str>
</lst>
<arr name="last-components">
<str>elevator</str>
</arr>
</requestHandler>

Implement k means clustering in solr

How can I implement k-means clustering in Solr 6.5?
Requirements:
1) I want to cluster the docs at query time on the basis of their score.
2) I have written my own handler and I want to add the clustering function to that handler only, such that it does not change the ordering of the docs.
I tried to write the clustering search component as below:
<searchComponent name="clustering" enable="${solr.clustering.enabled:true}" class="solr.clustering.ClusteringComponent">
<lst name="engine">
<str name="name">kmeans</str>
<str name="carrot.algorithm">org.carrot2.clustering.kmeans.BisectingKMeansClusteringAlgorithm</str>
<str name="BisectingKMeansClusteringAlgorithm.clusterCount">4</str>
<str name="documents">100</str>
<str name="BisectingKMeansClusteringAlgorithm.maxIterations">4</str>
</lst>
</searchComponent>
My request handler is as follows:
<requestHandler name="abc" class="solr.SearchHandler">
<lst name="invariants">
<str name="defType">synonym_edismax</str>
<str name="synonyms">true</str>
<str name="indent">on</str>
</lst>
<lst name="appends">
<str name="fq">search_term</str>
</lst>
<lst name="defaults">
<str name="echoParams">none</str>
<str name="wt">json</str>
<str name="timeAllowed">15000</str>
<str name="qf">Field1</str>
<str name="qf">Field2^0.5</str>
<str name="pf">Field3</str>
<float name="tie">0.2</float>
<str name="fl">Field5,Field6</str>
<str name="facet">false</str>
<str name="mm">2<-1 4<70%</str>
<!-- spellcheck -->
<str name="spellcheck.dictionary">default</str>
<str name="spellcheck">on</str>
<str name="spellcheck.extendedResults">true</str>
<str name="spellcheck.count">1</str>
<str name="spellcheck.onlyMorePopular">true</str>
<str name="spellcheck.collate">true</str>
</lst>
<arr name="last-components">
<str>spellcheck</str>
</arr>
</requestHandler>
How can I add clustering to this request handler so that the number of clusters is 4 and the number of iterations is also 4?
Also, what is the difference between
carrot.url
carrot.snippet
carrot.title
I read the definitions in the docs but I am unable to understand them.

To add the clustering component to a request handler, list it in last-components:
<arr name="last-components">
<str>spellcheck</str>
<str>clustering</str>
</arr>
Then:
<str name="carrot.url">id</str> -> the unique key of your document
This is the unique identifier for your document.
<str name="carrot.title">doctitle</str> -> the title(s)/label(s) of your document
This is the field, or list of fields, whose values are short and tend to matter most when grouping your documents together.
<str name="carrot.snippet">content</str> -> the content/text/body of your document
From the wiki:
carrot.title
The field (alternatively comma- or space-separated list of fields) that should be mapped to the logical document’s title. The clustering algorithms typically give more weight to the content of the title field compared to the content (snippet). For best results, the field should contain concise, noise-free content. If there is no clear title in your data, you can leave this parameter blank.
carrot.snippet
The field (alternatively comma- or space-separated list of fields) that should be mapped to the logical document’s main content. If this mapping points to very large content fields the performance of clustering may drop significantly. An alternative then is to use query-context snippets for clustering instead of full field content. See the description of the carrot.produceSummary parameter for details.
carrot.url
The field that should be mapped to the logical document’s content URL. Leave blank if not required.
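Putting the pieces together, a minimal sketch of the handler might look like the fragment below. It assumes the engine name kmeans from the question's searchComponent and the placeholder field names id, doctitle, and content used above; the remaining parameters of the original handler are omitted for brevity and would stay as they are. The cluster count and iteration limit remain on the engine definition, not in the handler.

```xml
<requestHandler name="abc" class="solr.SearchHandler">
  <lst name="defaults">
    <!-- switch the clustering component on for this handler -->
    <bool name="clustering">true</bool>
    <!-- cluster the search results returned by the query -->
    <bool name="clustering.results">true</bool>
    <!-- must match the <str name="name"> of the engine in the searchComponent -->
    <str name="clustering.engine">kmeans</str>
    <!-- placeholder field names; replace with fields from your own schema -->
    <str name="carrot.url">id</str>
    <str name="carrot.title">doctitle</str>
    <str name="carrot.snippet">content</str>
  </lst>
  <arr name="last-components">
    <str>spellcheck</str>
    <str>clustering</str>
  </arr>
</requestHandler>
```

As far as the documented behavior goes, the component appends a clusters section to the response and leaves the result list itself alone, so the ordering of the docs should be unaffected.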

How to design solr searchhandler in core mode and how to use it?

I'm using Solr 6.1.0, not in cloud mode.
I have added a search handler to solrconfig.xml and it works; I can see the search results.
But when I use this search handler and add a query to the URL, it fails with an error.
Like this:
http://localhost:8983/solr/testcorea/contentsearch?indent=on&q=%22test%22&wt=json&shards=localhost:8983/solr/testcorea,localhost:8983/solr/testcoreb,localhost:8983/solr/testcorec,localhost:8983/solr/testcored
This is my search handler:
<requestHandler name="/contentsearch" class="solr.SearchHandler">
<lst name="defaults">
<str name="echoParams">explicit</str>
<str name="wt">json</str>
<str name="indent">true</str>
<str name="defType">edismax</str>
<str name="qf">
title^100.0 content^80.0 text^5.0
</str>
<str name="q">*:*</str>
<str name="indent">true</str>
<str name="rows">10</str>
<!-- Facet settings -->
<str name="facet">on</str>
<str name="facet.field">content_type</str>
<str name="facet.field">category</str>
<str name="facet.field">author</str>
<str name="facet.field">editor</str>
<str name="facet.field">source_type</str>
<str name="hl">on</str>
<str name="hl.fl">title content</str>
<str name="hl.preserveMulti">true</str>
</lst>
<arr name="last-components">
<str>elevator</str>
</arr>
</requestHandler>
Error message :
=========================================================================
{"responseHeader":{"status":404,"QTime":10,"params":{"q":"\"test\"","shards":"localhost:8983/solr/testcorea,localhost:8983/solr/testcoreb,localhost:8983/solr/testcorec,localhost:8983/solr/testcored","indent":"on","wt":"json"}},"error":{"metadata":["error-class","org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException","root-error-class","org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException"],"msg":"Error from server at http://localhost:8983/solr/testcorec: Expected mime type application/octet-stream but got text/html. \n\n\nError 404 Not Found\n\nHTTP ERROR 404\nProblem accessing /solr/testcorec/contentsearch. Reason:\n Not Found\n\n\n","code":404}}
When I use Solr's default search handler with the same query URL, it works.
Like this:
http://localhost:8983/solr/testcorea/browse?indent=on&q=%22test%22&wt=json&shards=localhost:8983/solr/testcorea,localhost:8983/solr/testcoreb,localhost:8983/solr/testcorec,localhost:8983/solr/testcored
Does anyone know what the difference is,
and why it does not work?
Thanks

Add the highlighting parameters to the URL:
add hl=on and hl.fl=field_name.
For example:
hl.fl=title&hl=on&indent=on&q=test

What are manu, sku and cat in the qf parameter in SOLR?

I was taking a look at the solrconfig.xml for the dismax parser and found a bunch of values such as sku, manu and cat. What are these?
<requestHandler name="dismax" class="solr.SearchHandler" >
<lst name="defaults">
<str name="defType">dismax</str>
<str name="echoParams">explicit</str>
<float name="tie">0.01</float>
<str name="qf">
text^0.5 features^1.0 name^1.2 sku^1.5 id^10.0 manu^1.1 cat^1.4
</str>
<str name="pf">
text^0.2 features^1.1 name^1.5 manu^1.4 manu_exact^1.9
</str>
<str name="bf">
popularity^0.5 recip(price,1,1000,1000)^0.3
</str>
<str name="fl">
id,name,price,score
</str>
<str name="mm">
2<-1 5<-2 6<90%
</str>
<int name="ps">100</int>
<str name="q.alt">*:*</str>
<!-- example highlighter config, enable per-query with hl=true -->
<str name="hl.fl">text features name</str>
<!-- for this field, we want no fragmenting, just highlighting -->
<str name="f.name.hl.fragsize">0</str>
<!-- instructs Solr to return the field itself if no query terms are
found -->
<str name="f.name.hl.alternateField">name</str>
<str name="f.text.hl.fragmenter">regex</str> <!-- defined below -->
</lst>
</requestHandler>

Those are the fields being searched: SKU (stock keeping unit), manufacturer, and category.

You are probably looking at the solrconfig.xml that is provided as a sample, meant to be used with the docs to index in the exampledocs/ directory.
These are the field names that the sample docs (and schema) contain. It's just a sample installation of Solr.
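For illustration, a document in such a sample set looks roughly like the sketch below, written in Solr's XML update format. The field names match the ones in the qf, pf, and bf parameters of the question; the values are invented for this example and are not copied from exampledocs/.

```xml
<add>
  <doc>
    <!-- all values below are made up for illustration -->
    <field name="id">EXAMPLE-001</field>
    <field name="sku">EXAMPLE-001</field>                  <!-- stock keeping unit -->
    <field name="name">Example 500GB Hard Drive</field>
    <field name="manu">Example Corp.</field>               <!-- manufacturer -->
    <field name="cat">electronics</field>                  <!-- category -->
    <field name="features">7200RPM, 8MB cache</field>
    <field name="price">92.0</field>
    <field name="popularity">6</field>
  </doc>
</add>
```

If your own schema has no such fields, the qf, pf, and bf entries that mention them should be replaced with fields you actually index.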

Solr: I have set `hl=true` but no summaries are being output

I need to get snippets from documents where the query terms are matched to be able to output results similar to Google's snippet beneath the website URL. For example:
Snippet - Wikipedia, the free encyclopedia
en.wikipedia.org/wiki/Snippet
A snippet is defined as a small piece of something, it may in more specific contexts refer to: Sampling (music), the use of a short phrase of a recording as an ...
I have set hl=true and even hl.fl='*' in the query URL, but no summaries are being output.
Solr FAQs say:
For a field to be summarizable it must be both stored and indexed.
I'm using Nutch and Solr and have set them up using this tutorial. What additional steps do I need to take to be able to do this?
Adding sample query and output:
http://localhost:8983/solr/select/?q=test&version=2.2&start=0&rows=10&indent=on&hl=true
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">57</int>
<lst name="params">
<str name="indent">on</str>
<str name="start">0</str>
<str name="q">test</str>
<str name="hl">true</str>
<str name="version">2.2</str>
<str name="rows">10</str>
</lst>
</lst>
<result name="response" numFound="94" start="0">
<doc>
<arr name="anchor">
<str>User:Sir Lestaty de Lioncourt</str>
</arr>
<float name="boost">0.0</float>
<str name="digest">6c27160d0b08068f3873bb2c063508b3</str>
<str name="id">
http://aa.wikibooks.org/wiki/User:Sir_Lestaty_de_Lioncourt
</str>
<str name="segment">20111029223245</str>
<str name="title">User:Sir Lestaty de Lioncourt - Wikibooks</str>
<date name="tstamp">2011-10-29T21:34:27.055Z</date>
<str name="url">
http://aa.wikibooks.org/wiki/User:Sir_Lestaty_de_Lioncourt
</str>
</doc>
...
</result>
<lst name="highlighting">
<lst name="http://aa.wikibooks.org/wiki/User:Sir_Lestaty_de_Lioncourt"/>
<lst name="http://aa.wikipedia.org/wiki/User:PipepBot"/>
<lst name="http://aa.wikipedia.org/wiki/User:Purodha"/>
...
</lst>
</response>

It looks like you aren't specifying the field to highlight (hl.fl). You should create a text field to use for highlighting (don't use the string type) and make sure it is both stored and indexed.
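A minimal sketch of that setup, assuming a field named content and a tokenized field type named text_general; both names are assumptions here and need to match whatever your schema (for example the one shipped with Nutch) actually defines:

```xml
<!-- schema.xml: the highlighted field must be tokenized text, stored and indexed -->
<field name="content" type="text_general" indexed="true" stored="true"/>

<!-- solrconfig.xml: optional handler defaults, so hl.fl need not be passed on every request -->
<lst name="defaults">
  <str name="hl">true</str>
  <str name="hl.fl">content</str>
</lst>
```

The same parameters can instead be passed per request by appending hl=true&hl.fl=content to the query URL. Documents indexed before the field was made stored need to be reindexed before snippets appear for them.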