Solr docs must match one field - solr

I have two fields:
text field: all important fields like category, product name, and brand are copied into it.
attributes field: all attributes are copied into this field.
I have a single search query e.g. "50 mm diameter drill"
I want to search this string in both fields. I am assuming that this will match all products that have drill in the text field.
I want to narrow down the results when any attributes match any of "50 mm diameter".
If nothing matches in the attributes field, I want to return all documents that match the text field.
Edit: I don't want any docs that don't match the text field.
I only want that, if the search matches the attributes field and docs are found, we return only those docs.
If none are found, we return all docs that match the text field.

This gets a bit tricky, and a lot depends on how your fields are processed.
You will need to combine field weighting, to rank the attributes field higher, with the edismax minimum match parameter (mm).
Minimum match lets you configure how many terms in the query must match before a document is returned. This helps weed out documents that only hit on one term in one field.
Lastly, if you really want your own logic in here, you can prefix a field clause with + to make it mandatory. For example, +attributes:drill will only return items that have drill in the attributes field.

Whether "drill" will match depends on how your fields are processed, but probably, yes. The easiest way to do this is to not limit by "if not matched here, do this ..", but to score matches in the attributes field higher. You can do this by using qf (if using (e)dismax) together with their weights, such as attributes^20 text which will score any match in attributes 20 times more than a match in text. Any search matching documents with the correct term in attributes will then be scored higher than those just matching in text.
You can also do something similar in the q parameter, where you can weight each term separately: text:drill OR attributes:drill^20.
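As a rough illustration of the two answers above, the request parameters could be assembled like this. It is only a sketch: the mm value of "2" and the helper name build_edismax_params are assumptions made for the example, and the boost of 20 is simply the figure used in the answer.

```python
from urllib.parse import urlencode


def build_edismax_params(user_query, attributes_boost=20, min_match="2"):
    """Build edismax parameters that search both fields and favour attributes.

    min_match="2" is an arbitrary example value, not a recommendation.
    """
    return {
        "q": user_query,
        "defType": "edismax",
        # A hit in attributes is weighted higher than a hit in text.
        "qf": f"attributes^{attributes_boost} text",
        # mm: how many of the query terms have to match.
        "mm": min_match,
    }


params = build_edismax_params("50 mm diameter drill")
query_string = urlencode(params)  # ready to append to a /select URL
print(query_string)
```

This ranks documents that also match in attributes first; it does not drop the text-only matches, which is the trade-off the answers describe.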

Related

Sorting of solr documents based on search term in solr

I would like to sort solr documents based on searched term. For example the search term is "stringABC"
Then the order of the results should be
stringABC,
stringABCxxxx,
xxxxstringABCxxxx
The Solr document will contain a lot of fields, e.g. title, description, path, article no., product code, etc.
The default field will contain more than one field, e.g. title, description and path.
So the Solr doc will only be returned when the search term matches any field copied into the default field.
Use three fields: one with the exact string, one with an EdgeNGramTokenizer and one with an NGramTokenizer. You can then use qf=field1^10 field2^5 field3 to score hits in these fields according to how you want to prioritize them relative to each other.
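To see why three fields produce the requested order, here is a toy model in plain Python. It is not how Solr scores documents: it assumes the n-gram sizes are configured wide enough to cover the whole search term, and it just adds up the qf weights of the fields that would match. The function names are made up for the example.

```python
def edge_ngrams(text, min_len=1):
    """Prefixes of text, roughly what an edge n-gram tokenizer emits."""
    return {text[:i] for i in range(min_len, len(text) + 1)}


def ngrams(text, min_len=1):
    """All substrings of text with at least min_len characters."""
    return {
        text[i:j]
        for i in range(len(text))
        for j in range(i + min_len, len(text) + 1)
    }


def matching_fields(value, term):
    """Which of the three hypothetical fields would match term."""
    fields = []
    if value == term:
        fields.append("exact")
    if term in edge_ngrams(value):
        fields.append("edge")
    if term in ngrams(value):
        fields.append("ngram")
    return fields


# Mirrors qf=field1^10 field2^5 field3 (simplified additive scoring).
WEIGHTS = {"exact": 10, "edge": 5, "ngram": 1}


def toy_score(value, term):
    return sum(WEIGHTS[f] for f in matching_fields(value, term))


docs = ["xxxxstringABCxxxx", "stringABCxxxx", "stringABC"]
ranked = sorted(docs, key=lambda v: toy_score(v, "stringABC"), reverse=True)
print(ranked)
```

The exact value matches all three fields, the prefix match only two, and the infix match only the n-gram field, so the weights stack up in the desired order.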

Solr exact match field boosting

I have this requirement: if the query text matches a particular field value exactly (the title field), the result must come first or at least be boosted.
So I need to boost the results with the exact match.
My solution is to create the title as an untokenized field, so it will only match exactly, and boost the title with an edismax query.
Is there any other way?
How can I index a field untokenized, i.e. without tokenizing on spaces?
Use a KeywordTokenizer - this will index the field as a single value, but still allow you to attach filters - for example to lowercase the text before storing the token.
If you don't want to perform lowercasing either, you can use a string (StrField) field - a string field will only give a hit if the value is exactly the same.
This is usually what you'll do to give exact hits a larger boost than other hits, and you can use the qf parameter to dismax (which you are probably using already) to list the fields with their boosts. Use copyField to index the content into separate fields with different definitions.
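One way to set this up, assuming a managed schema so that the Schema API is available, is sketched below as a Python dict that would be POSTed as JSON to the collection's /schema endpoint. The names lowercase_exact and title_exact are hypothetical, and the boost of 10 is only an example; with a classic schema.xml the same field type, field and copyField would be declared in XML instead.

```python
import json

# Payload for Solr's Schema API; field and type names are made up.
schema_update = {
    "add-field-type": {
        "name": "lowercase_exact",
        "class": "solr.TextField",
        "analyzer": {
            # Keeps the whole value as one token ...
            "tokenizer": {"class": "solr.KeywordTokenizerFactory"},
            # ... but still lowercases it, so matching ignores case.
            "filters": [{"class": "solr.LowerCaseFilterFactory"}],
        },
    },
    "add-field": {
        "name": "title_exact",
        "type": "lowercase_exact",
        "indexed": True,
        "stored": False,
    },
    # Index the title a second time under the exact-match definition.
    "add-copy-field": {"source": "title", "dest": ["title_exact"]},
}

payload = json.dumps(schema_update, indent=2)

# Exact hits then get the larger weight in the (e)dismax field list.
qf = "title_exact^10 title"
print(payload)
```

After changing the schema the documents have to be reindexed before title_exact contains anything.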

I would order my search according some field

I would like to order my search results according to some fields.
For example:
- title
- description
- some field
I would like to order by title and description. I tried qf, but this does not work in all cases because it searches only in the specified fields. I would like to specify a list of fields, but I don't want to exclude the other fields.
Sorting by score is the default if you don't specify anything. Perhaps you are looking to boost matches against specific fields. This can be done using eDisMax and specifying boosts in the list of query fields (qf).
For example qf=title^10 description^3 otherfield1 otherfield2

Boost evenly across field of varying length

I've got a text field that can potentially have multiple values.
doc 1:
field a:"X Y"
doc 2:
field a:"X"
I want to be able to do :
a:X^5
And have both doc 1 and 2 get an identical score.
I've been messing around with all the field options, but I always end up with doc 2 getting double the score of doc 1.
I've tried setting multiValued="true", but get the same result.
Is there some way to set up my search or the field definition so that it boosts based only on the existence of the search term and is not affected by the rest of the field's contents?
Disable norms by setting omitNorms=true in your schema and reindex - it should disable the length normalization for the field and give you the desired results.
For more details of what omitNorms does, see this.
The field a of doc 2 has only one term as compared to doc 1 which has two.
Solr's DefaultSimilarity implementation takes the length norm, which is based on the number of terms in the field, into account when calculating the score.
LengthNorm is 1.0 / Math.sqrt(numTerms)
LengthNorm allows you to make shorter documents score higher.
You can provide your own implementation of Similarity class which doesn't take into account the lengthNorm.
Check computeNorm method implementation.
You can turn off the norms using omitNorms=true.
Norms allow for index time boosts and field length normalization. This allows you to add boosts to fields at index time and makes shorter documents score higher.
So you would lose both of the above if you use it.
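Plugging the two example documents into the formula quoted above shows where the difference comes from. This is only the raw formula; the value Solr actually stores and the final score depend on the similarity implementation and version.

```python
import math


def length_norm(num_terms):
    """The length norm as quoted above: 1.0 / sqrt(numTerms)."""
    return 1.0 / math.sqrt(num_terms)


doc1_norm = length_norm(2)  # field a: "X Y"
doc2_norm = length_norm(1)  # field a: "X"

# The shorter field gets the larger norm, so doc 2 scores higher
# for a:X until norms are omitted for the field.
print(doc1_norm, doc2_norm)
```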

What are the advantages of the multivalued option in Solr

What are the advantages of the multivalued field option in Solr?
I have a field with comma separated keywords.
I can do 2 things
make a non-multivalued text field
make a multivalued text field which contains each keyword
I can still query in both cases. So what are the advantages of multivalued over non-multivalued?
Advantages of multivalued: you don't need to change the document design. If a document contains multiple values in one field, Solr/Lucene can handle this field.
Another advantage: multiple values can describe a document more exactly (think of the tags of a blog post, or similar).
Advantages of non-multivalued: you can use specific features that require a single term (word) in one field, like spell checking. It is also a benefit for clustering (Carrot2) or grouping, which mostly work better on non-multivalued fields.
Querying the multivalued field will return what you want.
Example: doc1 has a keyword 'abc', and doc2 has a keyword 'abcd'. If you query by keyword 'abc', only doc1 should be matched.
With the non-multivalued approach both documents will match, because you would have to use LIKE-style syntax.
Multivalued fields can be very handy. Let's say you have many fields and you wish to search several of them but not all. You can create a multivalued field that includes all the fields you want to search, and search in it.
For example, let's say you have fields that may hold either a string or a number, and you wish to search all string values found in the document. You can create a multivalued field for all string values and search in it.
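The 'abc' versus 'abcd' example can be mimicked in plain Python. This is an analogy rather than Solr behaviour: it assumes each keyword is indexed as a whole, untokenized value in the multivalued case, and that the single-valued field is searched with a substring (LIKE-style) match. The function names are invented for the example.

```python
def matches_multivalued(values, keyword):
    """Each keyword is its own value, so the comparison is whole-value."""
    return keyword in values


def matches_substring(joined, keyword):
    """Single comma separated string searched LIKE-style."""
    return keyword in joined


doc1 = ["abc"]
doc2 = ["abcd"]

# Multivalued: only doc1 matches 'abc'.
print(matches_multivalued(doc1, "abc"), matches_multivalued(doc2, "abc"))

# One joined string per document: both match 'abc'.
print(matches_substring(",".join(doc1), "abc"),
      matches_substring(",".join(doc2), "abc"))
```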

Resources