I am a newbie to Solr and I am trying to build a simple search solution for my website. I am trying to build a feature similar to Swiftype's result ranking feature.
For example, let's say "ice" returns:
Ice
Iceberg
Ice cream
......
so on....
If I want to rank "Ice cream" higher only for the query "ice", but keep the default ranking when I search for other terms such as "Iceland", how can I achieve that?
I am using Solr v7.5.
Thanks in advance...
The Query Elevation Component lets you configure the top results for a given query regardless of the normal Lucene scoring.
To use it, you will need to configure the elevate.xml file, which looks like this:
<elevate>
<query text="ice">
<doc id="1" /> <!-- where id=1 is your ice cream document -->
<doc id="2" />
<doc id="3" />
</query>
</elevate>
Then, at search time, you only need to enable it by specifying the HTTP parameter enableElevation=true.
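As a rough illustration of the request side, here is a minimal Python sketch that builds such a search URL. The host, port, and core name (`mycore`) are placeholders, and it assumes the request handler you query has the elevation search component registered in solrconfig.xml.

```python
from urllib.parse import urlencode

# Assumed endpoint: host, port, and core name are placeholders for your setup.
SOLR_SELECT_URL = "http://localhost:8983/solr/mycore/select"


def build_elevated_query_url(user_query: str) -> str:
    """Return a select URL that asks Solr to apply elevate.xml rules."""
    params = {
        "q": user_query,
        # Turns the Query Elevation Component on for this request.
        "enableElevation": "true",
    }
    return f"{SOLR_SELECT_URL}?{urlencode(params)}"


url = build_elevated_query_url("ice")
print(url)
```

Only queries whose text matches a `<query text="...">` entry are affected, so a search for "Iceland" keeps the default ranking.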
Related
We are using the elevate.xml below in our Solr configuration to return results in the desired order.
<query text="hotels">
<doc id="14421"/>
</query>
<query text="Hotels">
<doc id="14421"/>
</query>
Now we have a requirement with a list of keywords (more than 50 words). If I hardcode all of these in elevate.xml I can fulfill the requirement, but I want to know whether there is a better approach, such as configuring a regular expression or some other way.
You could try two options:
Put the elevated ids in the request at query time, see here.
Someone created a plugin that generates the elevate.xml from a database that you maintain, see here.
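The first option can be sketched roughly as follows. It relies on the `elevateIds` request parameter of the Query Elevation Component; the keyword-to-id mapping (`ELEVATED_IDS`) is a hypothetical structure that your application would maintain, for example loaded from a database. Lowercasing the query is an assumption that removes the need for separate "hotels" and "Hotels" entries.

```python
# Hypothetical mapping maintained by the application (e.g. loaded from a DB).
# Keys are lowercased keywords, values are the document ids to elevate.
ELEVATED_IDS = {
    "hotels": ["14421"],
}


def build_params(user_query: str) -> dict:
    """Build Solr request parameters, adding elevateIds when a keyword matches."""
    params = {"q": user_query}
    ids = ELEVATED_IDS.get(user_query.strip().lower())
    if ids:
        params["enableElevation"] = "true"
        # Comma-separated unique keys to elevate for this request only.
        params["elevateIds"] = ",".join(ids)
    return params


print(build_params("Hotels"))
print(build_params("flights"))
```

With this approach the keyword list lives outside elevate.xml, so adding a keyword does not require touching the Solr configuration.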
Let's say we have a query "Homepage Content".
And there are two records whose title fields are (1) "Homepage content" and (2) "content in homepage", respectively.
How can I configure Solr so that (1) has a higher matching score than (2) for the given query?
(I know it does not make sense to use edismax in this simplified example. But I would like the problem solvers to be aware of the fact that I am using edismax in the real situation.)
Here is my current (extremely simplified) configuration:
schema.xml:
<field name="title" type="text_general" indexed="true" stored="true"/>
solrconfig.xml:
defType='dismax'
qf="title^2"
An alternative is to use the pf (phrase fields) parameter. It ranks a document higher when the query terms appear close together in the given fields.
You could also use the ps (phrase slop) parameter to specify the number of positions by which two terms may be apart and still match the phrase.
Here is the link to the documentation.
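To make the pf/ps suggestion concrete, here is a minimal sketch of the request parameters. The boost value `title^10` and the default slop of 0 are assumptions chosen for illustration, not values from the question, and would need tuning against real data.

```python
from urllib.parse import urlencode


def build_phrase_boost_params(user_query: str, phrase_slop: int = 0) -> dict:
    """Build edismax parameters that boost documents matching the query as a phrase."""
    return {
        "defType": "edismax",
        "q": user_query,
        "qf": "title^2",         # fields used for normal term matching (from the question)
        "pf": "title^10",        # assumed boost for phrase matches in title
        "ps": str(phrase_slop),  # how far apart the terms may be for the pf match
    }


params = build_phrase_boost_params("Homepage Content")
query_string = urlencode(params)
print(query_string)
```

With a slop of 0, the pf clause should only reward titles where the terms are adjacent and in query order, which is the case for "Homepage content" but not for "content in homepage".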
Did you try the elevation component?
For a given query string, you can specify which doc should appear at the top of the results by listing its docID in the elevate.xml file, which lives inside the conf folder.
Example:
<query text="Homepage">
<doc id="docID" /> <!-- put the first doc ID-->
<doc id="docID" exclude="true" /> <!-- exclude this doc -->
</query>
Even if a doc matches the query, specifying exclude="true" removes it from the results.
Is it possible to specify an additive boost for a set of documents (given a set of document ids) at query time for the extended dismax handler?
I was planning to use bq with an OR'ed list of ids but apparently bq's boost is not additive.
I think query elevation may help: http://wiki.apache.org/solr/QueryElevationComponent
Look into:
forceElevation
By default, this component respects the requested 'sort' parameter -- that is, if the request asks to sort by date, it will order the results by date. If forceElevation=true, results will first return the boosted docs, then order by date.
Have an elevate.xml like this:
<elevate>
<query text="AAA">
<doc id="A" />
<doc id="B" />
</query>
<query text="ipod">
<doc id="A" />
<!-- you can optionally exclude documents from a query result -->
<doc id="B" exclude="true" />
</query>
</elevate>
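A request that combines a sort with forced elevation might be built like this. This is only a sketch: the sort field `date` is an assumed field name, and the query text "AAA" is taken from the example configuration.

```python
from urllib.parse import urlencode


def build_forced_elevation_params(user_query: str, sort_spec: str) -> dict:
    """Build parameters that keep elevated docs on top even when a sort is requested."""
    return {
        "q": user_query,
        "sort": sort_spec,
        "enableElevation": "true",
        # Without this flag the requested sort takes precedence over elevation.
        "forceElevation": "true",
    }


params = build_forced_elevation_params("AAA", "date desc")
query_string = urlencode(params)
print(query_string)
```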
I have a scenario where I have a product database in Solr and a brand database in MySQL. In the Solr product database I have a field named brandid that refers to the MySQL primary key from the brand database. Now I would like to join the brand database for each Solr search query and group the result separately from the product results. I thought about a second Solr database where I save the brand data and then join it on every query, but I would like to have each brand only once and not merged with the product results in the same result set. A facet-style result for the brands is my goal. Does anyone have a pointer on how I could achieve this kind of result in my XML/JSON?
The result set I would like to have, in pseudo-Solr code:
<results>
<products>
<product>
...
</product>
<product>
...
</product>
<product>
...
</product>
<product>
...
</product>
</products>
<brands>
<brand>
...
</brand>
<brand>
...
</brand>
</brands>
</results>
If you only need to serve additional fields from the brand database and do not need to search or filter on them, you could apply simple faceting on brandid and populate the presentation fields in a post-processing step, directly from the DB, a memory cache, or a key-value store.
Use facet.mincount=1 to eliminate brands without any products in the current query.
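The faceting and post-processing steps could look roughly like the sketch below. It assumes the default JSON response format, in which a field facet is a flat list alternating value and count. The `brand_lookup` dict is a hypothetical stand-in for the MySQL or cache lookup, and the sample response is hand-written for illustration.

```python
def facet_params(user_query: str) -> dict:
    """Request parameters that facet on brandid and skip brands with no matches."""
    return {
        "q": user_query,
        "facet": "true",
        "facet.field": "brandid",
        "facet.mincount": "1",
        "wt": "json",
    }


def brands_from_response(response: dict, brand_lookup: dict) -> list:
    """Turn the flat facet list [id, count, id, count, ...] into enriched brand records."""
    flat = response["facet_counts"]["facet_fields"]["brandid"]
    pairs = zip(flat[0::2], flat[1::2])
    return [
        # Merge presentation fields from the brand store, if the id is known.
        {"brandid": brand_id, "count": count, **brand_lookup.get(brand_id, {})}
        for brand_id, count in pairs
    ]


# Hand-written sample data, for illustration only.
sample_response = {"facet_counts": {"facet_fields": {"brandid": ["12", 5, "7", 2]}}}
brand_lookup = {"12": {"name": "Brand A"}, "7": {"name": "Brand B"}}

brands = brands_from_response(sample_response, brand_lookup)
print(brands)
```

The products come from the normal document list of the same response, so the application can emit separate products and brands sections from a single Solr query.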
Can you use a higher-level language?
I currently do something similar, using Java as the glue. The Java application takes in requests, queries Solr using SolrJ, and retrieves all the results, including the facets. It then takes that response and queries MySQL for more information, merges all the data in the Java layer, and constructs the XML/JSON response.
SolrJ is the Java client. Clients for other higher-level languages are also offered:
Ruby, PHP, Java, Scala, Python, .NET, Perl, JavaScript
Consider this Solr pseudo-doc:
<doc>
<field name="title"/>
<field name="name"/>
<field name="keywords"/>
</doc>
Some docs will have the keyword "up", which means that they should appear first (regardless of their initial position) when, and only when, they are part of the search results.
So let's say I have:
doc1('title1','Bob, Alice','people, up, couple')
doc2('title2','Smart Phone, Laptop, Bob','devices, electronics')
If I query with "title:title2 name:Bob" then I should get doc1 first (it has the 'up' keyword).
If I query with "name:Bob" I still get doc1 first for the same reason.
If I query with "name:Laptop" then I should only get doc2 in my results. doc1 should not be included since it doesn't match my search query.
Any suggestions on how to do this?
You have several options for something like that:
Use a function query or boost query (in the dismax handler).
Boost the documents at index time.
Extract the 'up' keyword into an additional field and sort by this field first, then by score.
For example (with the dismax handler):
/select?defType=dismax&q=...&bq=keywords:"up"^1000
This can be solved with Solr's query time boosting. So following the guidance from the Solr Relevancy FAQ - you could add an additional boosted search term to all queries, e.g. title:title2 name:Bob keywords:up^2
You could also determine at index time, for each document, whether the up keyword is present, store that in an additional field in your schema (a boolean, for example), and boost the query results based on that field.
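The two approaches above can be sketched as request parameters. The boolean field name `is_up` is hypothetical and would have to exist in your schema and be populated at index time. The sort variant assumes that sorting a boolean field in descending order places true values first.

```python
def with_up_boost(params: dict) -> dict:
    """Boost-query variant: add a large boost for docs carrying the 'up' keyword."""
    boosted = dict(params)
    boosted["defType"] = "dismax"
    boosted["bq"] = 'keywords:"up"^1000'
    return boosted


def with_up_sort(params: dict, flag_field: str = "is_up") -> dict:
    """Sort variant: order by the (hypothetical) boolean flag first, then by score."""
    sorted_params = dict(params)
    sorted_params["sort"] = f"{flag_field} desc, score desc"
    return sorted_params


boost_request = with_up_boost({"q": "Bob", "qf": "name"})
sort_request = with_up_sort({"q": "name:Bob"})
print(boost_request)
print(sort_request)
```

Neither variant changes which documents match, so a query for "name:Laptop" still returns only doc2. The sort variant guarantees flagged docs come first, while the boost variant only makes it very likely.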