How to set up SOLR parameter substitution in solrconfig.xml

This is my first question at stackoverflow so apologies in advance if I break any rules but I did study them and also made sure this isn't a duplicate question.
So, according to http://yonik.com/solr-query-parameter-substitution/, one can set up a search handler in solrconfig.xml in such a way that "the request handler defaults, appends, and invariants configured for the handler may reference request parameters".
I have the following query, which works just fine with curl:
curl http://localhost:7997/solr/vb_popbio/select -d 'q=*:*&fq=bundle:pop_sample_phenotype AND phenotype_type_s:"insecticide%20resistance"
&rows=0&wt=json&json.nl=map&indent=true
&fq=phenotype_value_type_s:${PFIELD}&
&PGAP=5&PSTART=0&PEND=101&PFIELD="mortality rate"&
json.facet = {
pmean: "avg(phenotype_value_f)",
pperc: "percentile(phenotype_value_f,5,25,50,75,95)",
pmin: "min(phenotype_value_f)",
pmax: "max(phenotype_value_f)",
denplot : {
type : range,
field : phenotype_value_f,
gap : ${PGAP:0.1},
start: ${PSTART:0},
end: ${PEND:1}
}
}'
I have translated this query into a search handler configuration in solrconfig.xml so that a user only has to provide the PFIELD, PGAP, PSTART and PEND parameters. Here is the configuration for the handler:
<!--A request handler to serve data for violin plots (limited to IR assays)-->
<requestHandler name="/irViolin" class="solr.SearchHandler">
<!-- default values for query parameters can be specified, these
will be overridden by parameters in the request
-->
<lst name="defaults">
<str name="echoParams">explicit</str>
<int name="rows">0</int>
<str name="df">text</str>
<str name="wt">json</str>
<str name="json.nl">map</str>
<str name="json.facet">{
pmean: "avg(phenotype_value_f)",
pperc: "percentile(phenotype_value_f,5,25,50,75,95)",
pmin: "min(phenotype_value_f)",
pmax: "max(phenotype_value_f)",
denplot : {
type : range,
field : phenotype_value_f,
gap: ${PGAP:0.1},
start: ${PSTART:0},
end: ${PEND:1}
}
}
</str>
</lst>
<lst name="appends">
<str name="fq">bundle:pop_sample_phenotype</str>
<str name="fq">phenotype_type_s:"insecticide resistance"</str>
<str name="fq">has_geodata:true</str>
<str name="fq">phenotype_value_type_s:${PFIELD:"mortality rate"}</str>
</lst>
<lst name="invariants">
</lst>
</requestHandler>
Notice that I provided default values for all the parameters; otherwise SOLR fails to load the configuration. The problem is that using a query like this
curl http://localhost:7997/solr/vb_popbio/irViolin?q=*:*&
&PGAP=5&PSTART=0&PEND=101&PFIELD="mortality rate"
is not working. SOLR reads the request parameters fine (I can see them in the debug output) but ignores them and uses the default values from the configuration instead.
SOLR version is 5.2.1.
I tried moving the configuration parameters to either defaults, appends or invariants but nothing is working. After researching this for the past 2 days I'm almost ready to give up and just build the whole query on-the-fly instead.
Any help will be greatly appreciated.
Many thanks

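This post is old, but I arrived at this page through a search engine. A simple solution is to escape the dollar symbol as $$. Solr resolves ${...} in solrconfig.xml as property substitution when the configuration is loaded, so an unescaped macro never reaches request time; $${...} leaves a literal ${...} in place for request parameter substitution. After that, you should get the desired result.
Example: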
<str name="json.facet">{
pmean: "avg(phenotype_value_f)",
pperc: "percentile(phenotype_value_f,5,25,50,75,95)",
pmin: "min(phenotype_value_f)",
pmax: "max(phenotype_value_f)",
denplot : {
type : range,
field : phenotype_value_f,
gap: $${PGAP:0.1},
start: $${PSTART:0},
end: $${PEND:1}
}
}
</str>
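The appends section of the handler in the question also contains a ${...} macro, so it would presumably need the same escaping. A minimal sketch, assuming the $$ escape behaves in appends as it does inside json.facet (not verified against Solr 5.2.1):

```xml
<lst name="appends">
  <str name="fq">bundle:pop_sample_phenotype</str>
  <str name="fq">phenotype_type_s:"insecticide resistance"</str>
  <str name="fq">has_geodata:true</str>
  <!-- $${...} keeps a literal ${PFIELD:...} in the loaded config, so it can be
       filled from the PFIELD request parameter at query time -->
  <str name="fq">phenotype_value_type_s:$${PFIELD:"mortality rate"}</str>
</lst>
```

If PFIELD is not sent with the request, the value after the colon ("mortality rate") should be used as the fallback.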

I'm not sure when the Config API came to Solr, but query parameter substitution does work when the handler is added to configoverlay.json:
{
"requestHandler": {
"/myHandler": {
"name": "/myHandler",
"class": "solr.SearchHandler",
"defaults": {
"fl": "id,name,color,size"
},
"invariants": {
"rows": 10
},
"appends": {
"json": "{filter:[\"color:${color:red}\",\"size:${size:M}\"]}"
}
}
}
}
Now you can pass the URL parameters &color=green&size=XXL to the /myHandler query.
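For comparison, roughly the same handler could be declared directly in solrconfig.xml. This is an untested sketch; it assumes the dollar sign has to be escaped as $$ there so that the macros survive config loading:

```xml
<requestHandler name="/myHandler" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="fl">id,name,color,size</str>
  </lst>
  <lst name="invariants">
    <int name="rows">10</int>
  </lst>
  <lst name="appends">
    <!-- $${color:red} is kept as ${color:red}; "red" is the fallback when
         no color parameter is sent with the request -->
    <str name="json">{filter:["color:$${color:red}","size:$${size:M}"]}</str>
  </lst>
</requestHandler>
```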

Related

Dynamic query fields in Solr / Macro substitution in solrconfig.xml

We've got a multilingual search index with the "field-per-language" configuration with a lot of similar aliases in the search handler like this:
<str name="f.content_en.qf">Title_en^10 Text_en^1 ...</str>
<str name="f.content_de.qf">Title_de^10 Text_de^1 ...</str>
...
They are used in the q parameter:
<str name="q">{!edismax qf=$searchField pf=$searchField v=$searchText}</str>
The client knows which language should be used and calls Solr like this, e.g.: /solr/core/search?searchText=TEXT&searchField=content_en
That works fine, but the configuration contains a lot of similar stuff.
I'd like to optimize the config to something like this:
<str name="df">content</str>
<str name="f.content.qf">Title_${lang}^10 Text_${lang}^1...</str>
Then the client would need to provide the lang parameter only.
I tried to use the concat function like this:
paramLang=en
searchFields=concat("Title", "_", "${paramLang}", " ", "Text", "_", "${paramLang}")
and use it as the qf:
q={!edismax qf=$searchFields v=$searchText}
But it seems the qf local param does not support Solr functions.
Is this possible with Solr at all?
Actually, parameter substitution / macro expansion works fine.
The issue was with those macros in solrconfig.xml: there is a conflict with Solr's system property substitution, so Solr could not resolve the query parameter macros.
I could not find a proper way to escape the query parameter macros, so I used the following workaround:
<lst name="invariants">
<str name="defType">edismax</str>
<str name="searchFields">
Title_${lang:${lang}}^10
Text_${lang:${lang}}^1
...
<lst name="defaults">
<str name="q">*</str>
<str name="qf">${searchFields:${searchFields}}</str>
<str name="pf">${searchFields:${searchFields}}</str>
<str name="lang">en</str>
...
Query URL: /search?q=TEXT&lang=en
Update: the proper way to deal with variable substitution in solrconfig.xml is to escape the dollar character as $$:
<str name="searchFields">
Title_$${lang}^10
Text_$${lang}^1
...
Update #2: do NOT define macros in the invariants or appends sections when using SolrCloud! Otherwise, you'll get a weird exception, e.g.:
"undefined field: \"Text_$\"
or
"msg": "Error from server at null: org.apache.solr.search.SyntaxError: Query Field '${searchFields}' is not a valid field name"
P.S. wt=json as an "invariant" is also NOT compatible with SolrCloud; it gives an "unexpected" content-type error.
So many "surprises" :(

Which Analyzer is used for Solr-Query

I've added a field type to my schema whose analyzer uses the EdgeNGramFilter.
{
"add-field-type" : {
"name":"text_edge_ngram",
"class":"solr.TextField",
"positionIncrementGap":"100",
"analyzer" : {
"tokenizer":{
"class":"solr.WhitespaceTokenizerFactory" },
"filters":[
{"class":"solr.LowerCaseFilterFactory"},
{ "class":"solr.EdgeNGramFilterFactory",
"minGramSize":"1",
"maxGramSize":"4"}
]
}
}
}
Further, I've assigned this type to one field:
{
"replace-field" : {
"name":"MyField",
"type":"text_edge_ngram",
"uninvertible":true,
"indexed":true,
"stored":true
}
}
I've reindexed my documents but now the following query does not return the expected results. Here comes an example:
.../select?q=aweso has no results
But, if I query that single field
.../select?q=MyField:aweso, I get 'awesome' as result.
It would be very nice if anybody could explain what is going on or give me a hint on how to troubleshoot.
You need to set the default search field (df) in solrconfig.xml, either under the /select request handler or via initParams.
Default parameter values are specified in solrconfig.xml, or overridden by query-time values in the request.
<requestHandler name="/select" class="solr.SearchHandler">
<lst name="defaults">
<str name="df">MyField</str>
</lst>
</requestHandler>
I prefer not to use the Config API to update the Solr config settings, and instead save the default configuration directly in the solrconfig.xml file.
The configuration below applies the same defaults to several of the request handlers you have defined:
<initParams path="/select,/get,standard">
<lst name="defaults">
<str name="df">term1 term2</str>
<str name="q.op">AND</str>
</lst>
</initParams>
This example SearchHandler declaration shows off usage of the
SearchHandler with many defaults declared.
Note that multiple instances of the same Request Handler
(SearchHandler) can be registered multiple times with different
names (and different init parameters)

How to highlight multiple words using different formatters in Solr?

I need to highlight multiple words in the same field, but with a specific formatter (prefix and postfix) for each one.
Let's say that I have the description field and for a document it has the value: Einstein always excelled at math and physics from a young age. How can I highlight math with one specific prefix-postfix pair AND ALSO physics with a different prefix-postfix pair? So, in the end I would like to obtain:
Einstein always excelled at <em class="hl-red">math</em> and <em class="hl-green">physics</em> from a young age
The reason is that in the frontend I have different CSS classes, for example with background-color: red; for hl-red and background-color: green; for hl-green.
However, I only managed to highlight multiple words in the same field with the same prefix-postfix pair everywhere, which is not what I want. In addition, I tried to add multiple HtmlFormatter entries in solrconfig.xml:
<highlighting>
..............
<formatter name="html" default="true" class="solr.highlight.HtmlFormatter">
<lst name="defaults">
<str name="hl.simple.pre"><![CDATA[<em>]]></str>
<str name="hl.simple.post"><![CDATA[</em>]]></str>
</lst>
<lst name="hl-red">
<str name="hl.simple.pre"><![CDATA[<em class="hl-red">]]></str>
<str name="hl.simple.post"><![CDATA[</em>]]></str>
</lst>
<lst name="hl-green">
<str name="hl.simple.pre"><![CDATA[<em class="hl-green">]]></str>
<str name="hl.simple.post"><![CDATA[</em>]]></str>
</lst>
</formatter>
..............
</highlighting>
but I got HttpSolrClient$RemoteSolrException: Error from server at http://localhost:8983/solr: Unknown formatter: hl-green. Also, I didn't find a way to specify an array of prefixes in the Solr Admin UI or in spring-data-solr, just a simple query like this:
SimpleHighlightQuery query = new SimpleHighlightQuery(Objects.requireNonNull(criteria));
HighlightOptions highlightOptions = new HighlightOptions()
.addFields(fields)
.setSimplePrefix(prefix)
.setSimplePostfix(postfix);
query.setHighlightOptions(highlightOptions);
query.setPageRequest(pageable);
return solrTemplate.queryForHighlightPage(MY_CORE, query, MyModel.class);
My assumption is that this is a limitation of Solr itself.
I was thinking about writing a custom fragmentsBuilder, but I don't know whether that is the right approach or how to do it. As another workaround, I was thinking of executing a highlight query for each word: run it for the first word, store the result, then run another highlight query for the second word, store the result, and so on. But I don't think that is a good or elegant solution, because I would have problems with the second query if the second word is "em", "class", or "red"/"green" (undesired nested highlighting would occur).
I am using spring-data-solr in a Spring Boot application and Solr 6.6.5 as an (HTTP) service.
Does anyone know how to solve this? Any advice or ideas would be much appreciated!

SOLR mm and phrase queries not working after upgrading from SOLR 4 to SOLR 6

I'm working on testing a new SOLR 6 server (6.2.0), as we have been running 4.3.1 for some time, and it was time for an upgrade.
One thing I've noticed is that the mm (minMatch) term does not seem to work the way it used to (or it's being ignored), and phrase searches are not working properly either.
For example, searching for "tabletop scanning electron microscope" (including quotes) in our index should return two matching documents, but I get zero matches.
The search is set to use edismax.
Here's some of the debug output, in case this is helpful:
"responseHeader": {
"status": 0,
"QTime": 1,
"params": {
"mm": "4<-1 6<80%",
"q": "\"tabletop scanning electron microscope\"",
"qt": "dismaxsearch",
"indent": "on",
"pf": "headline^3.0 adtextintro^2.0 adtext^1.5",
"q.op": "OR",
"wt": "json",
"debugQuery": "true"
}
},
"response": {
"numFound": 0,
"start": 0,
"docs": []
},
"debug": {
"rawquerystring": "\"tabletop scanning electron microscope\"",
"querystring": "\"tabletop scanning electron microscope\"",
"parsedquery": "PhraseQuery(adtext:\"tabletop scan electron microscop\")",
"parsedquery_toString": "adtext:\"tabletop scan electron microscop\"",
"explain": {},
"QParser": "LuceneQParser",
The same search, but without the quotes, returns either far too many results when using q.op=OR, or zero results when using q.op=AND. Again, it seems that mm is ignored when using OR. When using AND, there should be two matches.
Any suggestions? From what I've read so far, it seems there is a change in the way q.op works, but I have not been able to get things to work regardless of how I adjust this.
Please let me know if more details are required.
After more testing, I'm finding that the "qf" defined in my config is being ignored. Or, possibly the entire searcher config is ignored.
Here's the config in my solrconfig.xml file:
<requestHandler name="dismaxsearch" class="solr.SearchHandler" default="true">
<lst name="defaults">
<str name="echoParams">explicit</str>
<str name="defType">edismax</str>
<float name="tie">0.01</float>
<str name="qf">
headline^3.0 manufacturer^1.0 model^1.0 adtextintro^2.0 adtext^1.5 companyname^0.2 clientnumber^20
</str>
<str name="bq">islvad:0^1.8</str>
<!-- <str name="bf">recip(lvqualityrank,1,1000000,500)</str>-->
<str name="bf">recip(lvqualityrankadjusted,1,5000000,50)</str>
<!-- <str name="bf">product(lvqualityrank,-1)</str>-->
<str name="q.alt">*:*</str>
<str name="fl">*,score</str>
<str name="rows">200</str>
<str name="boost">recip(ms(NOW/HOUR,addate),3.16e-11,0.08,0.03)</str>
</lst>
</requestHandler>
This all worked in SOLR4, but possibly I've done something incorrect when migrating the config to SOLR6...
Thanks,
Bill
Your request dispatcher almost certainly has the param handleSelect=true (default and recommended for backward compatibility). In this case you need to ensure that no request handler named "/select" is defined in solrconfig.xml; otherwise /select will actually handle the request regardless of the qt param.
handleSelect is a legacy option that affects the behavior of requests such as /select?qt=XXX
handleSelect="true" will cause the SolrDispatchFilter to process the request and dispatch the query to a handler specified by the "qt" param, assuming "/select" isn't already registered.
But if handleSelect is set to false, then the invoked request handler is determined by the request path (not qt). In this case, a request like http://.../select?q=... is dispatched to /select handler.
So there are two options:
With handleSelect="true": just comment out the /select handler definition and use the qt param to specify which request handler to use.
With handleSelect="false": specify the request handler in the request path (.../dismaxsearch?q=...). This requires renaming your request handler to "/dismaxsearch".
The first option is usually preferred and is probably the most appropriate for those who want to be able to change the request handler on the fly; using a param is more intuitive than changing the path.
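As a rough sketch of the handleSelect="false" route, the relevant parts of solrconfig.xml might look like the following. The handler body is abbreviated from the question and the dispatcher's child elements are left as a placeholder comment, so treat this as an outline rather than a drop-in config:

```xml
<requestDispatcher handleSelect="false">
  <!-- keep your existing requestParsers / httpCaching children here -->
</requestDispatcher>

<!-- renamed from "dismaxsearch" so that it can be addressed by request path -->
<requestHandler name="/dismaxsearch" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="echoParams">explicit</str>
    <str name="defType">edismax</str>
    <!-- remaining defaults (qf, bq, bf, ...) unchanged from the question -->
  </lst>
</requestHandler>
```

Requests would then go to .../dismaxsearch?q=... instead of .../select?qt=dismaxsearch.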

Calling a custom request handler from another one in Solr

Is there a way to call one custom request handler from another in Solr? E.g., I have /myhandler1 and /myhandler2 defined as custom request handlers in solrconfig.xml, like this:
<requestHandler name="/myhandler1" class="solr.CSVRequestHandler">
<lst name="defaults">
<str name="update.chain">mylogupdate</str>
<str name="stream.contentType">application/csv</str>
</lst>
</requestHandler>
and
<requestHandler name="/myhandler2" class="solr.CSVRequestHandler">
<lst name="defaults">
<str name="update.chain">mylogupdate</str>
<str name="stream.contentType">application/csv</str>
</lst>
</requestHandler>
Is there a way to call /myhandler2 from /myhandler1? Basically, I want to use handler 1 to do some processing and then redirect to another handler to do a second task.
The larger problem is this:
Given a line like this:
2012-01-04 23:11:41,450 AltQ:RCR-TRP: 101863261
I can split this on the comma separator and get two fields. I further want the second field to be split on the space separator, and I want to store these values in different fields,
like
val1:450
val2: altQ:RCR-TRP:
val3:101863261
and so on...
For the benefit of others: I still haven't found a way to redirect between request handlers, but the other problem was solved. I found a workaround by defining my own custom processor (one that extends UpdateRequestProcessor):
http://wiki.apache.org/solr/UpdateRequestProcessor#Processor_Customization_Examples
and I used Java to manipulate the document!
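For reference, a custom processor like that is typically wired into the update chain in solrconfig.xml. The sketch below reuses the mylogupdate chain name from the question; the factory class com.example.LogLineSplitProcessorFactory is a hypothetical name standing in for whatever custom class was written:

```xml
<updateRequestProcessorChain name="mylogupdate">
  <!-- hypothetical custom factory: splits the second CSV field on spaces
       and writes the parts to val1, val2, val3 -->
  <processor class="com.example.LogLineSplitProcessorFactory"/>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <!-- RunUpdateProcessorFactory must come last so the document is actually indexed -->
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
```

Because both handlers in the question already set update.chain to mylogupdate, they would pick up the custom step without any handler-to-handler redirect.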

Resources