Copy-transform a numeric field in Solr? - solr

I have a dynamic numeric multivalued field in my solr schema -
<dynamicField name="*_nm" type="float" indexed="true" stored="true" multiValued="true" omitNorms="false"/>
I'd like to run a function score on said field -
_val_:"if(exists(features.width_nm),mul(exp(div(pow(max(0,sub(abs(sub(features.width_nm,12.00000)),0.00000)),2),-51.93702)),10.00000),0.000000)"
but function queries on multivalued fields aren't properly supported in my version of Solr (5.2.1). Trying the above gives the error -
"can not use FieldCache on multivalued field"
My current work-around for this is during indexing to create another field, numeric single-valued, which contains a "reduced" form of the multivalues.
Currently I do this in Java code.
Is there any way for me to do this directly in Solr? for example using a "copy-field"?
Just for completeness - In solr 6.3 I am able to calculate a function-score on a multivalued field by using the field function with a min/max parameter described here.
Thank you very much!

Related

Could Solr search contains wildcard in key?

I have a json block saved as one document in solr,
{
"internal":...
"internet":...
"interface":...
"noise":...
"noise":...
}
Could I seach as " inter*:* "? I want to find out all content with key start with "inter"
Unfortunately, I got parser error, is there any way that I could the search with a wildcard in the key?
No, not really. You'll have to do that as a copyField if providing a wildcard is important to you, in effect copying everything into a single field and then querying that field.
You can supply multiple fields through qf without specifying each field in the q parameter as long as you're using the edismax query handler - that's usually more flexible, but it will still require each field to be specified.
There's also a little known feature named "Field aliasing using per-field qf overrides" (I'm wasn't aware with it, at least). If I've parsed what I've been able to find from a few web searches correctly, you should be able to do f.i_fields.qf=internal internet interface&qf=i_fields. In effect creating an i_fields alias that refers to those three fields. You'll still have to give them explicitly.
You can use Dynamic fields. It allow Solr to index fields that you did not explicitly define in your schema.
This is useful if you discover you have forgotten to define one or more fields. Dynamic fields can make your application less brittle by providing some flexibility in the documents you can add to Solr.
A dynamic field can be defined like
<dynamicField name="*_i" type="int" indexed="true" stored="true"/>
Please refer solr documentation for more on Dynamic Fields.
Dynamic Fields
After this create a copy field. Copy the dynamic fields into the copy field.
Once done with this, query can be done on the copyField.
<dynamicField name="inter_*" type="string" indexed="true" stored="true"/>
<field name="internal_static" type="string" indexed="true" stored="true" multiValued="true"/>
<copyField source="inter_*" dest="emp_static"/>

Apache solr date field in views

I have a custom date field in one of my content type field_last_archived_date.
There is a corresponding entry in the Apache solr field list called dm_field_last_archived_date.
Now there are two problems that I am facing
When I try to use this field in a solr view to sort the same, it gives me error "cannot sort on multivalued field."
When I try to use this field as an exposed filter to provide a date range, I'm not sure what date format should be given. I have tried formats like "2011-10-01T23:59:59Z", "2011-10-01 23:59:59", plain unix timestamp, etc. But all of them throws error "Invalid Date String:'OctoberAMCECESTAM+02:001_SunAMCESTE_1nd+02008601'".
Any idea what I am doing wrong here?
Thanks...
dm_field_last_archived_date field is multi value field and solr is not provide sorting on multi value field.
To confirm behavior apply sort on single value field.
You can check multi value in schema file in solr it looks like
<field name="yourFieldName" type="tint" indexed="true" stored="true" omitNorms="true" multiValued="true" default="defaultValue"/>

Sort on field completeness of Solr Documents

I have this Solr field
<field name="listing_thumbnail" type="string" indexed="false" stored="true"/>
Now when the results are shown the fields without the field value should be shown at the last. Is this possible in SOLR? To generalise is it possible to sort documents on field completeness?
You can make use of bq (Boost Query) Parameter of the dismax/edismax query handler. This allows to query if a field is empty or not and then affect the score, but to do so the field needs to be indexed=true.
If you had your field indexed you could add bq=(listing_thumbnail:*) - this would give a push to all documents with a value in that field.

Solr highlighting for external fields

I would like to use Solr highlighting, but our documents are only indexed and not stored. The field values are found in a separate database. Is there a way to pass in the text to be highlighted without Solr needing to pull that text from its own stored fields? Or is there an interface that would allow me to pass in a query, a field name, a field value and get back snippets?
I'm on Solr 5.1.
Lucene supports highlighting (returns offsets) also for non-stored content by using docValues.
Enabling a field for docValues only requires adding docValues="true" to the field (or field type) definition, e.g.:
<field name="manu_exact" type="string" indexed="true" stored="false" docValues="true" />
(introduced in Lucene 8.5, SOLR-14194)
You could reindex the resultset (read from database) in an embedded solr instance and run the query with same set of keywords with highlighting turned on and get the highlighted text back.
You could read the schema and solrconfig as resources from local jar and extract to temporary solr core directory to get this setup working.

solr fq; integer comparison on a substring

That is probably a bad title...
But let's say I have a bunch of strings in a multivalue field
<field name="stringweights" type="text_nostem" indexed="true" stored="true" multiValued="true"/>
Sample data might be:
history:10
geography:33
math:29
Now I want to write a fq where I select all records in solr where:
stringweights starts with "geography:"
and where the integer value after "geography:" is >= 10.
Is it possible to write a solr query like that?
(It's not possible to create an integer field in the solr schema named "geography", another called "math" etc because these string portions of the field are unknown at design time and can be many hundreds / thousands of different values.)
You may want to look into dynamic fields. Declare a dynamic field in your schema like:
<dynamicField name="stringweight_*" type="integer" indexed="true" stored="true"/>
Then you can have your docs like:
stringweight_history: 10
stringweight_geography: 33
stringweight_math: 29
Your filter query is then simply:
fq=stringweight_geography:[10 TO *]
You may need to build a custom indexer for doing this. Or use a script transformer with data import handler as mentioned here: Dynamic column names using DIH (DataImportHandler).

Resources