I'm using Solr 9.
I'd like the same results to be returned for the term
Lowe's
as for
Lowes
I can't seem to find the correct combination of filters in this field type:
<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
<analyzer type="index">
<tokenizer class="solr.StandardTokenizerFactory" />
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
<filter class="solr.KStemFilterFactory"/>
<filter class="solr.EnglishPossessiveFilterFactory"/>
<filter class="solr.LowerCaseFilterFactory" />
<charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.StandardTokenizerFactory" />
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
<filter class="solr.KStemFilterFactory"/>
<filter class="solr.EnglishPossessiveFilterFactory"/>
<filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true" />
<filter class="solr.LowerCaseFilterFactory" />
<charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
</analyzer>
</fieldType>
When testing on Solr's Analysis screen, I would expect that
<filter class="solr.KStemFilterFactory"/>
would remove the s from the Lowes example in the query, thus matching Lowe from the index step.
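One way to make both spellings analyze to the same token, sketched here under stated assumptions, is to strip apostrophes before tokenization, so that Lowe's and Lowes are already identical text by the time the stemmer sees them. KStem is dictionary-based and works on lowercased input, so whether it reduces lowes to lowe depends on its lexicon; the sketch moves LowerCaseFilterFactory ahead of it and does not rely on a particular stemming outcome. The PatternReplaceCharFilterFactory line and its pattern are additions that are not in the original configuration, and EnglishPossessiveFilterFactory is left out because no apostrophes reach it any more.

```xml
<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <!-- char filters act on the raw text, before the tokenizer -->
    <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
    <!-- assumption: remove straight and curly apostrophes, so "Lowe's" becomes "Lowes" -->
    <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="['\u2019]" replacement=""/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- lowercase first, so stop words and the stemmer see lowercased tokens -->
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.KStemFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
    <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="['\u2019]" replacement=""/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.KStemFilterFactory"/>
    <!-- kept after the stemmer, as in the original query chain -->
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
  </analyzer>
</fieldType>
```

The trade-off is that every possessive loses its apostrophe (John's becomes Johns before stemming), which may or may not be acceptable for other terms. The index analyzer changes, so existing documents would need to be reindexed before the two queries can be compared.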
I configured synonyms as suggested in http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#SynonymFilter, but synonym searching does not work when the synonyms contain whitespace. For example, the indexed word is "marketing" and the synonyms added are as follows:
abc, abc xyz, marketing
My schema is as follows:
<fieldType name="String" class="solr.TextField" positionIncrementGap="100">
<analyzer type="index">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.StandardTokenizerFactory"/>
<charFilter class="solr.HTMLStripCharFilterFactory"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
</fieldType>
I also tried adding <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/> to <analyzer type="query">, but it's still not working.
Please suggest a solution.
Thanks & Many Regards,
Lalit Joshi
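For synonyms that contain whitespace, such as abc xyz, a sketch along the following lines may help. It assumes Solr 6.5 or later, which is not stated in the question. Synonyms stay at index time, as in the original, but use SynonymGraphFilterFactory followed by FlattenGraphFilterFactory, which the Solr Reference Guide asks for whenever the graph filter is used in an index analyzer. HTMLStripCharFilterFactory is moved in front of the tokenizer, where char filters conventionally go, and enablePositionIncrements is omitted on the assumption that a recent version no longer accepts it.

```xml
<fieldType name="String" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- assumption: graph-aware synonym filter in place of SynonymFilterFactory -->
    <filter class="solr.SynonymGraphFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <!-- flattens the token graph so it can be written to the index -->
    <filter class="solr.FlattenGraphFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

With expand="true", a document containing marketing would also be indexed under abc and abc xyz, so a query for any of the three forms has something to match. The change only takes effect for documents indexed after it, so a full reindex is needed, and the field being searched has to be declared with this field type.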
My schema is below. I have added PorterStemFilterFactory to schema.xml, restarted Solr, and reimported the data, but stemming is still not working:
<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100" multiValued="true">
<analyzer type="index">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
<!-- in this example, we will only use synonyms at query time
<filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
-->
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.PorterStemFilterFactory"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
<filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" tokenizerFactory="solr.StandardTokenizerFactory" ignoreCase="true" expand="true"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.PorterStemFilterFactory"/>
</analyzer>
</fieldType>
I have configured a Solr 4.10.3 installation whose schema.xml contains the following lines:
<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
<analyzer type="index">
<tokenizer class="solr.KeywordTokenizerFactory"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
<!-- in this example, we will only use synonyms at query time
<filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
-->
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.KeywordTokenizerFactory"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
<filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
</fieldType>
I have indexed the text field and added some stopwords to the stopword.txt file, but the configuration gives an error when I restart Solr.
How can I get the stopwords to work?
Thank you
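KeywordTokenizerFactory emits the entire field value as one token, so StopFilterFactory can only ever drop a value that consists of a single stop word and nothing else. A minimal sketch of a chain in which stop words are removed word by word follows. It assumes a word-splitting tokenizer is acceptable for this field, and it does not by itself explain the startup error, which would need the message from the Solr log. The schema refers to stopwords.txt, so the file in the conf directory has to carry exactly that name.

```xml
<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <!-- assumption: split the text into words so each one can be checked against the stop list -->
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

Documents indexed with the old chain keep their old tokens, so the field would need to be reindexed after the change.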
I have added the following to schema.xml:
<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
<analyzer type="index">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
<filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
<!-- in this example, we will only use synonyms at query time
<filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>-->
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
<!--<filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>-->
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
</fieldType>
Synonym.txt:
Attention deficit hyperactivity disorder,ADHD
I then re-indexed Solr.
But the number of results and their ordering differ when I search for 'ADHD' and for 'Attention deficit hyperactivity disorder'. Is there any further configuration needed for Solr to pick up synonym.txt and return the same results for both searches?
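A likely reason for the different counts, inferred from the configuration rather than confirmed here, is that synonyms are applied only at index time. ADHD is then a single-term query, while the unquoted four-word query is parsed into four separate terms that, under the default OR operator, also match documents containing just one of those words. Searching for the long form as a quoted phrase is a quick way to check this. The sketch below is an alternative that moves synonyms to query time with SynonymGraphFilterFactory. It assumes Solr 6.5 or later (the version is not stated) and a synonyms file whose name matches the synonyms attribute.

```xml
<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- assumption: graph-aware synonym filter, so a multi-word entry is matched as one unit -->
    <filter class="solr.SynonymGraphFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
  </analyzer>
</fieldType>
```

This relies on the query parser passing the whole query string to the analyzer, which is what the sow=false request parameter does (the default from Solr 7.0). On older versions the parser splits on whitespace first, so the four-word form never reaches the synonym filter as one entry, and the quoted phrase remains the more dependable option.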
#dwhelan - the query for the search of ms looks like:
((drug_facet_auto:((ms*)))OR(company_facet_auto:((ms*)))OR(disease_facet_auto:((ms*))))
and the QueryResponse is:
{responseHeader={status=0,QTime=15,params={facet=true,q=((drug_facet_auto:((ms*)))OR(company_facet_auto:((ms*)))OR(disease_facet_auto:((ms*)))),facet.limit=100,facet.field=[drug_facet, company_facet, disease_facet],wt=javabin,rows=0,version=2}},response={numFound=0,start=0,docs=[]},facet_counts={facet_queries={{!label='Last 24 hours'}publishdate:[NOW/HOUR-24HOURS TO NOW/HOUR+1HOUR]=0,{!label='Last 7 days'}publishdate:[NOW/DAY-7DAYS TO NOW/DAY+1DAY]=0,{!label='Last 30 days'}publishdate:[NOW/DAY-1MONTH TO NOW/DAY+1DAY]=0,{!label='Last year'}publishdate:[NOW/DAY-1YEAR TO NOW/DAY+1DAY]=0},facet_fields={drug_facet={},company_facet={},disease_facet={}},facet_dates={},facet_ranges={}},highlighting={},spellcheck={suggestions={correctlySpelled=false}}}