solr not showing all indexed doc - solr

I have created two cores for creating index for two different purpose.
The first core is running fine but when I try to created index with second core using DIH, it showed 5 doc created
<response>
−
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">0</int>
</lst>
−
<lst name="initArgs">
−
<lst name="defaults">
<str name="config">data-config.xml</str>
</lst>
</lst>
<str name="command">full-import</str>
<str name="status">idle</str>
<str name="importResponse"/>
−
<lst name="statusMessages">
<str name="Total Requests made to DataSource">1</str>
<str name="Total Rows Fetched">5</str>
<str name="Total Documents Skipped">0</str>
<str name="Full Dump Started">2011-12-26 12:24:45</str>
−
<str name="">
Indexing completed. Added/Updated: 5 documents. Deleted 0 documents.
</str>
<str name="Committed">2011-12-26 12:24:45</str>
<str name="Optimized">2011-12-26 12:24:45</str>
<str name="Total Documents Processed">5</str>
<str name="Time taken ">0:0:0.52</str>
</lst>
−
<str name="WARNING">
This response format is experimental. It is likely to change in the future.
</str>
</response>
But when I try to display all results with the given below url
http://localhost:8983/solr/core1/select/?q=*:*&version=2.2&start=0&rows=10&indent=on
but it is showing 1 result only.
Any help will be appreciated...
Thanks

You probably are using multivalue=false for the field in question. If you want all 5 documents to be created you have to change this in the schema.

Related

Spellchecker giving suggestions for correct words

Querying an alias with 5 collections and getting suggestions for correct words as well.
Ex:- Collection1 has "tire policy" in it
Collection2 has a word "polite" in it.
When I query "tire policy" it checks and returns "polite" as a suggestion for "policy".
P.S. - During query time i am passing
spellcheck.maxResultsForSuggest=0
As without it spellchecker does not correct the wrong spellings.
I am using DirectSolrSpellchecker
Adjusted the
<float name="maxQueryFrequency">0.01</float> to
<float name="maxQueryFrequency">1</float>
but getting the same issue.
Direct solr spellchecker code-
<lst name="spellchecker">
<str name="name">default</str>
<str name="field">text</str>
<str name="classname">solr.DirectSolrSpellChecker</str>
<str name="distanceMeasure">org.apache.lucene.search.spell.LevenshteinDistance</str>
<float name="accuracy">0.5</float>
<int name="maxEdits">2</int>
<int name="minPrefix">1</int>
<int name="maxInspections">5</int>
<int name="minQueryLength">3</int>
<float name="maxQueryFrequency">1</float>
</lst>
spellchecker inside handler
<str name="spellcheck.dictionary">default</str>
<str name="spellcheck.dictionary">wordbreak</str>
<str name="spellcheck.dictionary">file</str>
<str name="spellcheck">on</str>
<str name="spellcheck.extendedResults">true</str>
<str name="spellcheck.count">10</str>
<str name="spellcheck.alternativeTermCount">5</str>
<str name="spellcheck.maxResultsForSuggest">5</str>
<str name="spellcheck.collate">true</str>
<str name="spellcheck.collateExtendedResults">true</str>
<str name="spellcheck.maxCollationTries">10</str>
<str name="spellcheck.maxCollations">5</str>
There should not be any suggestions for policy as "maxqueryfreqency" is set to "1".
Have you tried to adjust the maxResultsForSuggest parameter?
With your current value of 5, if your query
returns 5 or fewer results, the spellchecker will report
"correctlySpelled=false" and also offer suggestions
solr 8.1 documentation

Getting SolrException : Boosting query defined twice for query

I have created two query documents with names 'makeup', and 'make up' in elevate.xml.
When I execute the elevate solr query, I am getting exception "Boosting query defined twice for query".
whereas when I save two documents with names 'ChildCare', and 'Child Care', Solr is returning the results.
Below is my Solr query:
http://localhost:8983/solr/oneweb-collection/elevate?
q=*:*&defType=edismax&fl=id&fl=title&fl=subtitle&fl=course_code&
fl=cricos_code&fl=course_introduction&fl=outcome&fl=page_url&
fl=score&fl=%5Btafe_elevated%5D&rows=3&wt=json
When I save the document nodes, system internally replacing the spaces and storing the documents with same name.
What is the resolution for this issue?
Config for elevator:
<searchComponent name="elevator" class="solr.QueryElevationComponent" >
<str name="queryFieldType">text_general</str>
<str name="config-file">elevate.xml</str>
<str name="forceElevation">true</str>
<str name="exclusive">true</str>
<str name="editorialMarkerFieldName">test_elevated</str>
</searchComponent>
<requestHandler name="/elevate" class="solr.SearchHandler" startup="lazy">
<lst name="defaults">
<str name="echoParams">explicit</str>
<str name="defType">edismax</str>
<int name="rows">3</int>
<str name="fl">id,title,subtitle,course_code,cricos_code,course_introduction,outcome,page_url,[test_elevated],score</str>
<str name="q.alt">*:*</str>
</lst>
<arr name="last-components">
<str>elevator</str>
</arr>
</requestHandler>

Solr suggester returning terms from deleted documents

I have a SolrCloud setup and I'm testing the suggestion component. I have several hundred documents in the index. I did not want some of the documents in the index because they contain gibberish (they were binary files that got improperly converted to text). I've removed them from the index, but the gibberish words from them are still showing up in the suggestions.
My suggest configuration looks like this:
<searchComponent name="suggest" class="solr.SuggestComponent">
<lst name="suggester">
<str name="name">fuzzySuggester</str>
<str name="lookupImpl">FuzzyLookupFactory</str>
<str name="dictionaryImpl">HighFrequencyDictionaryFactory</str>
<str name="storeDir">suggester_fuzzy_dir</str>
<str name="field">dictionary_text</str>
<str name="suggestAnalyzerFieldType">phrase_suggest</str>
<str name="exactMatchFirst">true</str>
<float name="threshold">0.001</float>
<str name="buildOnStartup">false</str>
<str name="buildOnCommit">true</str>
</lst>
</searchComponent>
<requestHandler name="/suggest" class="solr.SearchHandler" startup="lazy">
<lst name="defaults">
<str name="suggest">true</str>
<str name="suggest.dictionary">fuzzySuggester</str>
<str name="suggest.onlyMorePopular">true</str>
<str name="suggest.count">5</str>
<str name="suggest.collate">true</str>
</lst>
<arr name="components">
<str>suggest</str>
</arr>
</requestHandler>
Note that buildOnCommit is set to true. I also tried to remove them using a /suggest query with the suggest.build=true parameter, but that had no effect.
Is there something else required to remove terms from the dictionary?
Despite using expungeDeletes=true in the update, the deleted documents were still hanging around. Optimizing removed them and appears to have removed all the gibberish terms from suggestions.

solr suggester not working with shard for multiple core

I'm trying to use the suggest component (solr 4.6) with multiple cores. I have added a search component and a request handler in my solrconfig. That works fine for 1 core but querying my solr instance with the shards parameter does not work.
But did you mean' (spell check ) is working fine with multiple cores using shard.
Here is the configuration part of solrconfig file :
<searchComponent class="solr.SpellCheckComponent" name="suggest">
<lst name="spellchecker">
<str name="name">suggestDictionary</str>
<str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
<str name="lookupImpl">org.apache.solr.spelling.suggest.fst.FSTLookupFactory</str>
<str name="field">suggest</str>
<float name="threshold">0.0005</float>
<str name="buildOnCommit">true</str>
</lst>
</searchComponent>
<requestHandler name="/suggest" class="org.apache.solr.handler.component.SearchHandler">
<lst name="defaults">
<str name="echoParams">none</str>
<str name="wt">xml</str>
<str name="indent">false</str>
<str name="spellcheck">true</str>
<str name="spellcheck.dictionary">suggestDictionary</str>
<str name="spellcheck.onlyMorePopular">true</str>
<str name="spellcheck.count">5</str>
<str name="spellcheck.collate">false</str>
<str name="qt">/suggest</str>
<str name="shards.qt">/suggest</str>
<str name="shards">localhost:8080/cores/core1,localhost:8080/cores/core2</str>
<bool name="distrib">false</bool>
</lst>
<arr name="components">
<str>suggest</str>
</arr>
<shardHandlerFactory class="HttpShardHandlerFactory">
<int name="socketTimeOut">1000</int>
<int name="connTimeOut">5000</int>
</shardHandlerFactory>
</requestHandler>
It works for me..
You can get the suggestions using this RestURL
http://localhost:8983/solr/demo/spell?q=howoo&wt=json&indent=true&qt=spell&shards.qt=/spell&shards=localhost:8983/solr/demo_shard2_replica1,localhost:8983/solr/demo_shard1_replica2
OR Simply use this :
http://localhost:8983/solr/demo/spell?q=hoo&wt=json&indent=true&shards.qt=/spell
shards.qt=/spell : Need to add that allows suggestion on shards
Here, you have make changes and apply for things which requires.
Collection = demo
Shards = demo_shard2_replica1, demo_shard1_replica2
Replace collection and shards names with your names of collection and shards.

Solr: I have set `hl=true` but no summaries are being output

I need to get snippets from documents where the query terms are matched to be able to output results similar to Google's snippet beneath the website URL. For example:
Snippet - Wikipedia, the free encyclopedia
en.wikipedia.org/wiki/Snippet
A snippet is defined as a small piece of something, it may in more specific contexts refer to: Sampling (music), the use of a short phrase of a recording as an ...
I have set hl=true and even hl.fl='*' in the query URL and but no summaries are being output.
Solr FAQs say:
For a field to be summarizable it must be both stored and indexed.
I'm using Nutch and Solr and have set them up using this tutorial. What additional steps to I need to take to be able to do this?
Adding sample query and output:
http://localhost:8983/solr/select/?q=test&version=2.2&start=0&rows=10&indent=on&hl=true
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">57</int>
<lst name="params">
<str name="indent">on</str>
<str name="start">0</str>
<str name="q">test</str>
<str name="hl">true</str>
<str name="version">2.2</str>
<str name="rows">10</str>
</lst>
</lst>
<result name="response" numFound="94" start="0">
<doc>
<arr name="anchor">
<str>User:Sir Lestaty de Lioncourt</str>
</arr>
<float name="boost">0.0</float>
<str name="digest">6c27160d0b08068f3873bb2c063508b3</str>
<str name="id">
http://aa.wikibooks.org/wiki/User:Sir_Lestaty_de_Lioncourt
</str>
<str name="segment">20111029223245</str>
<str name="title">User:Sir Lestaty de Lioncourt - Wikibooks</str>
<date name="tstamp">2011-10-29T21:34:27.055Z</date>
<str name="url">
http://aa.wikibooks.org/wiki/User:Sir_Lestaty_de_Lioncourt
</str>
</doc>
...
</result>
<lst name="highlighting">
<lst name="http://aa.wikibooks.org/wiki/User:Sir_Lestaty_de_Lioncourt"/>
<lst name="http://aa.wikipedia.org/wiki/User:PipepBot"/>
<lst name="http://aa.wikipedia.org/wiki/User:Purodha"/>
...
</lst>
</response>
Looks like you aren't specifying the field to highlight (hl.fl). You should create a text field to use for highlighting (don't use string type) and have it stored/indexed.

Resources