Sort on field completeness of Solr Documents - solr

I have this Solr field
<field name="listing_thumbnail" type="string" indexed="false" stored="true"/>
Now, when the results are shown, documents without a value in this field should appear last. Is this possible in Solr? More generally, is it possible to sort documents by field completeness?

You can use the bq (Boost Query) parameter of the dismax/edismax query parser. It lets you match on whether a field has a value and adjust the score accordingly, but for that the field needs indexed="true".
If your field were indexed, you could add bq=(listing_thumbnail:*), which would give a push to all documents with a value in that field.
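As a rough illustration, the request parameters for such a boosted query could be assembled as below. The query term "apartment" and the helper name build_boosted_query are made up for the example; only bq, the edismax parser, and the field name come from the answer above. Keep in mind that bq adds to the relevance score rather than imposing a strict order, so it pushes documents with a thumbnail up but does not by itself guarantee that the others come last.

```python
from urllib.parse import urlencode, parse_qs


def build_boosted_query(user_query, field):
    """Build a URL query string that boosts documents having any value in `field`.

    Assumes `field` is indexed; otherwise the boost query cannot match.
    """
    params = {
        "q": user_query,
        "defType": "edismax",      # bq is a dismax/edismax parameter
        "bq": f"({field}:*)",      # same boost query as in the answer
    }
    return urlencode(params)


# "apartment" is a hypothetical user query.
query_string = build_boosted_query("apartment", "listing_thumbnail")
print(query_string)
```

The resulting string can be appended to the select URL of the core being queried.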

Related

Solrj indexing mechanism

I have a question about the indexing mechanism when using Solr from Java. If I create documents and only want to search on the field "name", will Solr index all fields, or only the field "name" in each document?
If you tell Solr to only store the field name in your schema, then only the field name will be stored.
If you instruct Solr to store everything you send to it (like in the schemaless mode) and you send 400 fields, each of those fields will be stored.
Only the fields you are going to query need to be indexed; fields whose content you want returned but never search on can be limited to stored (stored="true", indexed="false"). Conversely, if you don't need the content of a field back but just want to search on it, you can set stored to false and indexed to true.
In schema.xml, where you define your fields, set indexed="true" on every field you want to search on.
In your case it would look something like this:
<field name="name" type="string" indexed="true" stored="true" />
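To make the three combinations described above concrete, here is a small sketch that generates the corresponding field definitions. Only the "name" field comes from the question; "description" and "keywords" are hypothetical field names chosen for illustration, and the helper is not part of any Solr API.

```python
import xml.etree.ElementTree as ET


def field(name, field_type, indexed, stored):
    """Create a schema <field> element with the given indexed/stored flags."""
    return ET.Element("field", {
        "name": name,
        "type": field_type,
        "indexed": str(indexed).lower(),
        "stored": str(stored).lower(),
    })


fields = [
    field("name", "string", indexed=True, stored=True),          # searchable and returned
    field("description", "string", indexed=False, stored=True),  # returned only (hypothetical)
    field("keywords", "string", indexed=True, stored=False),     # searchable only (hypothetical)
]

lines = [ET.tostring(f, encoding="unicode") for f in fields]
for line in lines:
    print(line)
```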

Copy-transform a numeric field in Solr?

I have a dynamic numeric multivalued field in my solr schema -
<dynamicField name="*_nm" type="float" indexed="true" stored="true" multiValued="true" omitNorms="false"/>
I'd like to run a function score on said field -
_val_:"if(exists(features.width_nm),mul(exp(div(pow(max(0,sub(abs(sub(features.width_nm,12.00000)),0.00000)),2),-51.93702)),10.00000),0.000000)"
but function queries on multivalued fields aren't properly supported in my version of Solr (5.2.1). Trying the above gives the error -
"can not use FieldCache on multivalued field"
My current work-around is to create another numeric, single-valued field during indexing, which contains a "reduced" form of the multiple values.
Currently I do this in Java code.
Is there any way for me to do this directly in Solr, for example using a "copy-field"?
Just for completeness: in Solr 6.3 I am able to calculate a function score on a multivalued field by using the field function with a min/max parameter, described here.
Thank you very much!
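The index-time work-around described in this question, reducing the multivalued field to a single value before sending the document to Solr, might look roughly like the sketch below. It is written in Python rather than the asker's Java, the destination field name "features.width_min_n" is hypothetical, and min is just one possible reducer.

```python
def add_reduced_field(doc, source, dest, reducer=min):
    """Return a copy of `doc` with `dest` set to reducer(doc[source]).

    If the source field is missing or empty, the copy is returned unchanged.
    """
    result = dict(doc)
    values = doc.get(source) or []
    if values:
        result[dest] = reducer(values)
    return result


# Hypothetical document with a multivalued *_nm field.
doc = {"id": "1", "features.width_nm": [12.5, 11.0, 14.0]}
indexed_doc = add_reduced_field(doc, "features.width_nm", "features.width_min_n")
print(indexed_doc)
```

The function query can then reference the single-valued field instead of the multivalued one.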

Solr field not visible in query results

I have added a new field in the schemas:
<field indexed="false" stored="true" docValues="true" sortMissingLast="true" name="RankScoreXXX" type="int" />
After all the indexing operations are done, in the solr admin panel while performing queries I do not see that field in any results where the value is actually 0. Results that contain a > 0 value in this specific field are shown.
Using the following filter, I can see that no result is missing this value:
fq: -RankScoreXXX:[* TO *]
Also, I can sort results by this specific field.
I just do not understand why results with RankScoreXXX = 0 are not visible in the solr panel admin for given results.
Am I missing something?
Thanks.
I have run into this scenario a few times. Let me tell you what each one was:
The field was added but reindexing did not take place for all documents, only new ones. This is not your case, as you reindexed.
The request handler was not updated in solrconfig.xml. In this case the person added the field and had configured the request handler to return a specific set of fields using fl. The field was not in the list.
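A quick way to check the second case is to request the field explicitly with fl in the query itself. The sketch below only builds the query string; the helper name is made up, and it assumes that fl is configured under the handler's defaults (which request parameters override) rather than its invariants.

```python
from urllib.parse import urlencode, parse_qs


def build_query(q, fields):
    """Build a query string that asks Solr to return exactly `fields`."""
    return urlencode({"q": q, "fl": ",".join(fields)})


# Ask for the new field by name instead of relying on the handler's configured fl.
query_string = build_query("*:*", ["id", "RankScoreXXX"])
print(query_string)
```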

solr fq; integer comparison on a substring

That is probably a bad title...
But let's say I have a bunch of strings in a multivalue field
<field name="stringweights" type="text_nostem" indexed="true" stored="true" multiValued="true"/>
Sample data might be:
history:10
geography:33
math:29
Now I want to write a fq where I select all records in solr where:
stringweights starts with "geography:"
and where the integer value after "geography:" is >= 10.
Is it possible to write a solr query like that?
(It's not possible to create an integer field in the solr schema named "geography", another called "math" etc because these string portions of the field are unknown at design time and can be many hundreds / thousands of different values.)
You may want to look into dynamic fields. Declare a dynamic field in your schema like:
<dynamicField name="stringweight_*" type="integer" indexed="true" stored="true"/>
Then you can have your docs like:
stringweight_history: 10
stringweight_geography: 33
stringweight_math: 29
Your filter query is then simply:
fq=stringweight_geography:[10 TO *]
You may need to build a custom indexer for doing this. Or use a script transformer with data import handler as mentioned here: Dynamic column names using DIH (DataImportHandler).
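As a sketch of what such a custom indexing step could do, the snippet below turns the "label:weight" strings from the question into dynamic field names and builds the matching filter query. The function names are invented for the example, and it assumes labels contain only characters that are valid in Solr field names; real data may need sanitizing first.

```python
def to_dynamic_fields(stringweights, prefix="stringweight_"):
    """Map entries like 'geography:33' to {'stringweight_geography': 33}."""
    fields = {}
    for entry in stringweights:
        # Split on the last colon so a label may itself contain colons.
        label, _, weight = entry.rpartition(":")
        fields[prefix + label] = int(weight)
    return fields


def min_weight_filter(label, minimum, prefix="stringweight_"):
    """Build an fq value selecting documents whose weight for `label` is >= minimum."""
    return f"{prefix}{label}:[{minimum} TO *]"


doc_fields = to_dynamic_fields(["history:10", "geography:33", "math:29"])
fq = min_weight_filter("geography", 10)
print(doc_fields)
print(fq)
```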

SOLR - Use single text field in schema for full text search

I am getting familiar with SOLR.
I would like to use SOLR for full text search across many kinds of entities. I don't want to create a Document for every different type of entity, and I don't need to search on specific fields. I am only interested in whether a specified string appears anywhere in any item.
In database terms: if I have a table News and a table Employee and I search for the word 'apple', I don't mind which field it is in; I only want to get back the database IDs of the records that contain it.
Could it be a solution, that I use a SOLR schema something like this:
<fields>
<field name="id" type="string" indexed="true" stored="true"/>
<field name="content" type="text" indexed="true" stored="false"/>
</fields>
So, I only need an ID and the contents. I put all the data that I want to be able to search into one 'content' field. When I search for some words, it looks for them in the 'id' and in the 'content'.
Is this a good idea? Any performance or design problem?
Thanks,
Tamas
See https://wiki.apache.org/solr/SchemaXml#Copy_Fields. It says:
A common requirement is to copy or merge all input fields into a single solr field. This can be done as follows:-
<copyField source="*" dest="text"/>
That's typically what is done to search across multiple fields.
But if you don't even want your original fields, just concatenate all your fields into one big field content and index in Solr. There should be no problems with that.
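The concatenation approach just described could be sketched as follows, producing documents that match the two-field schema from the question. Prefixing the ID with the entity type is an assumption added here so that IDs from the News and Employee tables cannot collide; the sample record and the function name are made up.

```python
def to_search_doc(entity_type, record, id_field="id"):
    """Flatten a database record into a Solr document with 'id' and 'content'."""
    content = " ".join(
        str(value)
        for key, value in record.items()
        if key != id_field and value is not None
    )
    return {"id": f"{entity_type}-{record[id_field]}", "content": content}


# Hypothetical row from the News table.
news = {"id": 7, "title": "Apple harvest", "body": "A record apple crop this year."}
doc = to_search_doc("News", news)
print(doc)
```

Because content is declared with stored="false", a search returns only the id, from which the entity type and database ID can be recovered.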
You can either use copyField to copy everything into text (see the example in the distribution) and set that as the default field (the "df" parameter in solrconfig.xml for the select handler).
Or, if you anticipate more complex requirements down the line and/or non-text searches, I would recommend looking at eDismax with the qf parameter, which will handle searching all those fields itself.
