I have an Azure Search index configured with many fields. Four of the fields are searchable (field1, field2, field3, field4). I want to rank my results so that matches found in field1 are displayed first, matches found in field2 come after those but before matches from field3, and so on.
What scoring profile can I use to rank the results as above?
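While creating a scoring profile, you can assign a specific weight to individual fields by using the weights component. In your case you would assign the highest weight to field1, a lower weight to field2, and so on down to field4. Weights boost the relevance score rather than enforce a strict order, so a strong match in a lower-weighted field can still outrank a weak match in a higher-weighted one. You can further specify functions to influence the scoring based on scenarios like freshness, magnitude etc.
"scoringProfiles": [
{
"name": "boostGenre",
"text": {
"weights": {
"field1": 25,
"field2": 3,
"field3": 2,
"field4": 1.5
}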
}
},
{
"name": "somefunction",
"functions": [
...
]
}
],
You can find more details about this here.
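As a rough illustration of the weights idea above, here is a minimal Python sketch that generates text weights which shrink with field priority and prints the resulting index fragment. The helper `build_text_weights`, the profile name `fieldPriority`, and the `top_weight` and `ratio` values are all hypothetical choices for this example, not anything prescribed by Azure Search.

```python
import json


def build_text_weights(fields_by_priority, top_weight=8.0, ratio=2.0):
    """Give each field a weight `ratio` times smaller than the field before it."""
    weights = {}
    weight = top_weight
    for field in fields_by_priority:
        weights[field] = weight
        weight /= ratio
    return weights


# Hypothetical profile name; the field names come from the question.
profile = {
    "name": "fieldPriority",
    "text": {
        "weights": build_text_weights(["field1", "field2", "field3", "field4"])
    },
}

# Fragment to merge into the index definition.
print(json.dumps({"scoringProfiles": [profile]}, indent=2))
```

The profile only takes effect when a query selects it (the scoringProfile query parameter) or when it is set as the index default. How large the gap between weights needs to be depends on your data, so it is worth testing against real queries.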
I have records with "title" and "brand" fields, and I query both fields.
Sometimes a record has the brand in the title, which results in a higher score, but I want such records to score the same as the others.
How can I rate records lower where both fields match?
Your solution is not ideal.
In Solr, there is the Dismax query parser that allows you to search for individual terms across several fields, using some other parameters to influence the final score.
The q parameter defines the main query while the qf parameter can be used to specify a list of fields with which to search.
In addition, the tie parameter lets you control how much the final score of the query will be influenced by the scores of the lower-scoring fields compared to the highest-scoring field.
Let's make a simple example.
Using the standard query parser this is what you will obtain running this query (q=adidas):
http://localhost:8983/solr/indexName/select?q=title:adidas%20OR%20brand:adidas&fl=id,title,brand,score
"docs": [
{
"id": "2",
"title": "Shoes Adidas",
"brand": "Adidas",
"score": 0.9623127
},
{
"id": "1",
"title": "Shoes",
"brand": "Adidas",
"score": 0.31506687
},
{
"id": "6",
"title": "Shirt",
"brand": "Adidas",
"score": 0.31506687
}
]
The doc with id 2 has a higher score than the others because the score is the sum of two clauses ('adidas' in title + 'adidas' in brand).
If you perform a Dismax query with tie=0 (a pure "disjunction max query"):
http://localhost:8983/solr/indexName/select?defType=dismax&q=adidas&qf=brand%20title&fl=id,title,brand,score&tie=0
You will obtain:
"docs": [
{
"id": "2",
"title": "Shoes Adidas",
"brand": "Adidas",
"score": 0.6472458
},
{
"id": "1",
"title": "Shoes",
"brand": "Adidas",
"score": 0.31506687
},
{
"id": "6",
"title": "Shirt",
"brand": "Adidas",
"score": 0.31506687
}
]
The doc with id 2 has a lower score than before because only the maximum scoring subquery contributes to the final score, i.e. it takes the max score between 0.6472458 and 0.31506687 without summing them (0.9623127).
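The arithmetic behind the two result sets can be sketched in a few lines of Python. This is only an illustration of how a disjunction max score combines per-field scores (the best score plus tie times the sum of the others), not Solr's actual code; the function name `dismax_score` is made up for the example, and the two per-field scores are the ones quoted above for doc 2.

```python
def dismax_score(field_scores, tie=0.0):
    """Combine per-field scores: best score plus `tie` times the remaining scores."""
    if not field_scores:
        return 0.0
    best = max(field_scores)
    return best + tie * (sum(field_scores) - best)


# Per-field scores for doc 2 ('adidas' in title, 'adidas' in brand).
doc2_scores = [0.6472458, 0.31506687]

print(dismax_score(doc2_scores, tie=0.0))  # only the best field counts
print(dismax_score(doc2_scores, tie=1.0))  # behaves like summing both clauses
```

Values of tie between 0 and 1 fall between these two extremes, which lets a document that matches in both fields score slightly higher without doubling its score.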
With the qf parameter, it is also possible to assign a boost factor to increase or decrease the importance of a particular field in the query, for example:
&qf=brand^3 title
It makes matches in brand much more significant than matches in title.
In any case, boosting should be used with caution because it may lead to unexpected results. Every decision with boosting should be supported by an online and offline search relevance evaluation.
Can this help you?
I solved it by removing all occurrences of the brand in the title (and other fields) when writing the index.
I have a table, User, with fields such as firstName and lastName.
If I search for Amit with searchFields set to only firstName, I get different scores.
$count=true&search=Amit^2&searchFields=firstName&$select=firstName&queryType=full
"value": [
{
"@search.score": 7.986226,
"firstName": "Amit"
},
{
"@search.score": 7.986226,
"firstName": "Amit"
},
...
...
...
{
"@search.score": 7.986226,
"firstName": "Amit"
},
{
"@search.score": 7.9655724,
"firstName": "Amit"
},
The above is a small result set, but I can see the score changing after 15-20 results.
I was expecting the same score whenever firstName is the same, so that a more complex query could sort on score and then on lastName.
The search score for a document is a combination of how well a document matches a query and how relevant it is compared to "nearby" documents. Depending on the exact partitioning of documents into shards, exact matches may get different scores, but they will always score higher than non-exact matches.
When faceting, Azure Search returns the count for each facet field by default. How do I also get other searchable fields for every facet?
For example, when I facet on area, I want something like this (description is a searchable field):
{
"area": [
{
"count": 1,
"description": "Acrylics",
"value": "ACR"
},
{
"count": 1,
"description": "Power",
"value": "POW"
}
]
}
Can someone please help with the extra parameters I need to send in the query?
Unfortunately, there is no good way to do this, as there is no direct support for nested faceting in Azure Search (you can upvote it here). To achieve the result you want, you would need to store the data together as a composite value, as described by this workaround.
I want to group or facet by multiple fields, say by the "name" and "type" fields in the search index. Is this possible in Azure Search? If so, how can it be done?
It is not possible to facet by the combined values of multiple fields. You'd have to denormalize the fields yourself when you populate the index, then facet by the denormalized field. For example, if you have 'name' and 'type' fields, you'd have to create a combined 'nametype' field containing the combination of 'name' and 'type'. Then you would refer to the 'nametype' field in the 'facet' parameter of the Search request.
If before you had a document like this:
{ "id": "1", "name": "John", "type": "Customer" }
Now you will have a document like this:
{ "id": "1", "name": "John", "type": "Customer", "nametype": "John; Customer" }
(You can use whatever separator you like between the name part and type part of nametype.)
Now, when you search, include facet=nametype in the request, and you'll get a count of all combinations of 'name' and 'type' that exist in the index.
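The denormalization step described above could be done at indexing time with something like the following Python sketch. The helper name `add_combined_field` and its parameters are hypothetical; the field names and the "; " separator come from the example document above.

```python
def add_combined_field(doc, fields=("name", "type"), target="nametype", separator="; "):
    """Return a copy of `doc` with a combined field built from `fields`."""
    combined = dict(doc)  # leave the caller's document untouched
    combined[target] = separator.join(str(doc[field]) for field in fields)
    return combined


before = {"id": "1", "name": "John", "type": "Customer"}
after = add_combined_field(before)
print(after)
```

Pick a separator that cannot occur inside the individual values, so that the combined value can be split back into its parts when you display the facet.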
I have a non-trivial SOLR query, which already involves a filter query and facet calculations over multiple fields. One of the facet fields is a multi-valued integer field that is used to store categories. There are many possible categories and new ones are created dynamically, so using multiple fields is not an option.
What I want to do, is to restrict facet calculation over this field to a certain set of integers (= categories). So for example I want to calculate facets of this field, but only taking categories 3,7,9 and 15 into account. All other values in that field should be ignored.
How do I do that? Is there some built-in functionality which can be used to solve this? Or do I have to write a custom search component?
The facet.prefix parameter can be set separately for each field specified by the facet.field parameter; you can do it by adding a parameter like this: f.field_name.facet.prefix.
I don't know about any way to define the facet base that should be different from the result, but one can use the facet.query to explicitly define each facet filter, e.g.:
facet.query={!key=3}category:3&facet.query={!key=7}category:7&facet.query={!key=9}category:9&facet.query={!key=15}category:15
Given the Solr schema/data from this gist, the results will have something like this:
"facet_counts": {
"facet_queries": {
"3": 1,
"7": 1,
"9": 0,
"15": 0
},
"facet_fields": {
"category": [
"2",
2,
"1",
1,
"3",
1,
"7",
1,
"8",
1
]
},
"facet_dates": {},
"facet_ranges": {}
}
Thus giving the needed facet result.
I have some doubts about performance here (especially when there are more than 4 categories and the initial query returns a lot of results), so it is better to do some benchmarking before using this in production.
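Writing one facet.query per category by hand gets tedious as the list grows, so the parameters can be generated. Below is a minimal Python sketch that builds them in the same `{!key=...}` form as the example above; the helper name `facet_query_params` and the `q=*:*` main query are assumptions for illustration.

```python
from urllib.parse import urlencode


def facet_query_params(field, values):
    """Build one facet.query parameter per value, keyed by the value itself."""
    return [("facet.query", f"{{!key={value}}}{field}:{value}") for value in values]


# Assumed main query; replace with the real query and filters.
params = [("q", "*:*"), ("facet", "true")]
params += facet_query_params("category", [3, 7, 9, 15])

# urlencode accepts a list of pairs, so the repeated facet.query key is preserved.
query_string = urlencode(params)
print(query_string)
```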
Not exactly the answer to my own question, but the solution we are using now: the numbers I want to filter on form distinct groups. So we can prefix the id with a group id like this:
1.3
1.8
1.9
2.4
2.5
2.11
...
Having the data like this in SOLR, we can use facet prefixes to facet only over a single group: http://wiki.apache.org/solr/SimpleFacetParameters#facet.prefix
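A small Python sketch of this scheme, assuming the prefixed values are stored in a string field called `category` (the field name and both helper names are hypothetical). Note the trailing dot in the prefix: without it, the prefix for group 1 would also match values from group 11.

```python
def prefixed_category(group_id, category_id):
    """Encode a category as '<group>.<category>', e.g. 2 and 11 become '2.11'."""
    return f"{group_id}.{category_id}"


def group_facet_params(field, group_id):
    """Build facet parameters that restrict one field to a single group."""
    return {
        "facet": "true",
        "facet.field": field,
        # The trailing dot keeps group 1 from also matching group 11.
        f"f.{field}.facet.prefix": f"{group_id}.",
    }


print(prefixed_category(2, 11))
print(group_facet_params("category", 1))
```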