Solr faceting on a Query Function result - solr

Is it possible to produce Solr facets for a field that is the result of a function query?
I have an index of products with a price field for each store they are available in:
{
"id" : "p1",
"name_s" : "Product 1",
"description_s" : "The first product",
"price_l1_d" : 19.99,
"price_l2_d" : 20.00,
"price_l3_d" : 20.99,
"price_l4_d" : 19.99,
"price_l5_d" : 25.00,
"price_l6_d" : 18.00
},
{
"id" : "p2",
"name_s" : "Product 2",
"description_s" : "The second product",
"price_l1_d" : 12.99,
"price_l2_d" : 15.00,
"price_l3_d" : 13.49,
"price_l4_d" : 14.00,
"price_l5_d" : 12.50,
"price_l6_d" : 16.00
}
and I need my query to return the cheapest price in the customer's 3 closest stores.
I know I can return this value using fl=min(price_l2_d, price_l4_d, price_l6_d), and I can even sort on it, but is it possible to return a "Price" facet based on this value for each document? Ideally I'd like to show all products whose minimum price (in my 3 stores) falls between 0-5, 5-10, 10-15, 15-20 and so on, and filter on this.
I've tried using min(price_l2_d, price_l4_d, price_l6_d) as facet.field but I receive an undefined field error. Is there a better way?
I cannot produce this value at index time because the closest 3 stores could be any combination of three price fields (in this example there are 6, but there are likely to be over 200).

While not THE solution, I have found A solution which should work. Unfortunately it's not possible to create a traditional facet for price ranges as you would with a single integer attribute, but a two-point slider is possible.
Using the JSON facet API (as suggested by a comment on the original question) and the following:
{
"max" : "max(min(price_l2_d, price_l4_d, price_l6_d))",
"min" : "min(min(price_l2_d, price_l4_d, price_l6_d))"
}
I can return the boundaries of the slider with the smallest minimum price at the three stores and the biggest minimum price.
The values on this slider can then be applied using the {!frange} function as follows:
fq={!frange l=0 u=20}min(price_l2_d, price_l4_d, price_l6_d)
where l is the lower bound and u is the upper bound
Hopefully this helps anyone else looking for an answer to this.
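For anyone wiring this into client code, here is a minimal Python sketch of how the two pieces (the slider bounds and the frange filter) could be assembled. The helper names are hypothetical and nothing is sent to Solr; it only builds the parameter values described above, assuming the three store fields from the question.

```python
import json


def min_price_expr(store_fields):
    """Return the Solr function min(f1, f2, ...) over the given price fields."""
    return "min(" + ", ".join(store_fields) + ")"


def slider_bounds_facet(store_fields):
    """JSON facet body asking for the smallest and largest per-document minimum price."""
    expr = min_price_expr(store_fields)
    return {"max": f"max({expr})", "min": f"min({expr})"}


def price_range_filter(store_fields, lower, upper):
    """Build the fq value that keeps documents whose minimum price is in [lower, upper]."""
    return f"{{!frange l={lower} u={upper}}}{min_price_expr(store_fields)}"


stores = ["price_l2_d", "price_l4_d", "price_l6_d"]

# Request 1: fetch the slider boundaries (no documents needed, hence rows=0).
bounds_params = {
    "q": "*:*",
    "rows": 0,
    "json.facet": json.dumps(slider_bounds_facet(stores)),
}

# Request 2: apply the range the user picked on the slider.
filter_params = {"q": "*:*", "fq": price_range_filter(stores, 0, 20)}
```

Both dictionaries can be passed as query parameters to whatever HTTP client you use. Keeping the bounds request free of the fq means the slider limits do not shrink as the user narrows the range.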

Related

Sort solr from least to highest based on facet count

This is my facet query:
http://localhost:8983/solr/test_words/select?q=*:*&facet=true&facet.field=keyword
On searching I get results ordered based on facet count from highest to lowest
Example : {"and" :10, "to": 9, "also" : 8}
But instead I want the results ordered based on facet count from lowest to highest
Example : {"tamil" :1, "english" :2, "french":3}
I also tried
http://localhost:8983/solr/test_words/select?q=*:*&facet=true&facet.field=keyword&facet.sort=count
which does not give the expected results. Please help me with this!
The "old" facet interface doesn't support ascending sort as far as I know - it's always sorted from the most common term to the least common one.
The JSON facet API does however support asc and desc for sorting:
sort - Specifies how to sort the buckets produced.
count specifies document count, index sorts by the index (natural) order of the bucket value. One can also sort by any facet function / statistic that occurs in the bucket. The default is count desc. This parameter may also be specified in JSON like sort:{count:desc}. The sort order may either be asc or desc.
"facet": {
"keywords": {
"type": "terms",
"field": "keyword",
"sort": "count asc",
"limit": 5
}
}
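As a rough illustration, the facet above could be wrapped in a full JSON request body like this. This is only a sketch in Python: the helper name is made up, the field name keyword comes from the question, and the body is built but not sent.

```python
import json


def ascending_terms_facet(field, limit=5):
    """Terms facet whose buckets are ordered from the lowest count to the highest."""
    return {
        "keywords": {
            "type": "terms",
            "field": field,
            "sort": "count asc",
            "limit": limit,
        }
    }


# Body for the JSON Request API; "limit": 0 skips the document list itself.
body = {"query": "*:*", "limit": 0, "facet": ascending_terms_facet("keyword")}
payload = json.dumps(body)
```

The payload would then be POSTed to the collection's /query endpoint (test_words in the question) with a JSON content type.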

Group by and Count(*) in Datastax Search/Solr

Hi, we have a Solr index with different fields in it, such as business, businessType, regionName, StateName, ...
Now I need a Solr query to get the number of businesses with businessType = 'event', grouped by regionName.
As a SQL query, this would be: select region_name, Count(business) from solr where businessType='event' group by region_name
Any pointers would be helpful.
I finally figured out how to do this. Note, if you need to query on a field with a space or a special character, you need to put the search term in quotes, e.g. businessType:"(fun) event".
curl http://localhost:8983/solr/yourCollection/query -d '
{ "query": "*:*",
  "filter": "businessType:event",
  "limit": 0,
  "facet": { "category" : {
    "type": "terms",
    "field" : "region_name",
    "limit" : -1 }}
}'
One more Note: if you want to count over 2 fields, you have to do a nested facet.
curl http://localhost:8983/solr/yourCollection/query -d '
{ "query": "*:*",
  "filter": "businessType:event",
  "limit": 0,
  "facet": { "category1" : {
    "type": "terms",
    "field" : "regionName",
    "limit" : -1,
    "facet" : { "category2" : {
      "type": "terms",
      "field" : "stateName",
      "limit" : -1
    }}}}
}'
Add another facet chunk after the "limit":-1 item if you need to group by a third dimension. I tried this on my company's Solr and it hung, never returning anything but a timeout error. In general, working with Solr isn't very easy, and the documentation, IMO, is pretty terrible. Nothing about the syntax or the names of the commands seems intuitive at all.
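Since nesting by hand gets error-prone beyond two levels, here is a small Python sketch that generates the nested terms facets from a list of fields. The function name and the category1, category2, ... labels are just illustrative choices mirroring the example above; it only builds the JSON and does not talk to Solr.

```python
import json


def nested_terms_facet(fields, limit=-1):
    """Build nested terms facets, one level per field, outermost field first."""
    facet = None
    # Walk from the innermost field outwards so each level can embed the previous one.
    for depth, field in reversed(list(enumerate(fields, start=1))):
        level = {"type": "terms", "field": field, "limit": limit}
        if facet is not None:
            level["facet"] = facet
        facet = {f"category{depth}": level}
    return facet


two_levels = nested_terms_facet(["regionName", "stateName"])
print(json.dumps(two_levels, indent=2))
```

Passing a third field name adds a third level. Note that "limit": -1 on every level can be expensive on large indexes, which may be related to the timeout mentioned above.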
Use facets. Your Solr query will look like this: q=*:*&fq=businessType:event&facet=true&facet.field=region_name&rows=0
If you want to group by multiple fields, use facet.pivot, e.g. facet.pivot=state,region_name

solr search query apply condition in both field (exist OR does not exist)

I have data with different dynamic fields. I want to apply a condition to the records where a field exists, but I also need the records where that field does not exist. My Solr version is 6.1.0.
{"employeeid" : "220", "displayname_s": "abu", "attr_36977": 55 },
{"employeeid" : "910", "displayname_s": "test","attr_36400": 565 },
{"employeeid" : "210", "displayname_s": "sam"},
{"employeeid" : "64", "displayname_s": "wel", "attr_36977": 152},
I wrote a query like this:
(-attrl_36977:[* TO *] OR attrl_36977:[0 TO 100])
but this query does not work.
The ideal result is the first three records (220, 910, 210). How can I solve this?
You have to be explicit about what you're subtracting the first part of your OR statement from:
(*:* -attrl_36977:[* TO *]) OR attrl_36977:[0 TO 100]
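This will give you any document that either has no value in attrl_36977 or has a value between 0 and 100 inclusive. A purely negative clause inside a boolean expression matches nothing on its own, so it needs *:* (all documents) to subtract from.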
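To make the intended semantics concrete, here is a small Python sketch. It builds the query string for an arbitrary field and range, and also re-creates the "missing or in range" logic locally against the sample records from the question. The helper names are hypothetical, and the local check uses the field name attr_36977 as it appears in the sample data.

```python
def missing_or_in_range(field, low, high):
    """Solr query: documents without the field, or with a value in [low, high]."""
    return f"(*:* -{field}:[* TO *]) OR {field}:[{low} TO {high}]"


def matches_locally(doc, field, low, high):
    """Plain-Python equivalent of the query, for checking expectations."""
    return field not in doc or low <= doc[field] <= high


docs = [
    {"employeeid": "220", "displayname_s": "abu", "attr_36977": 55},
    {"employeeid": "910", "displayname_s": "test", "attr_36400": 565},
    {"employeeid": "210", "displayname_s": "sam"},
    {"employeeid": "64", "displayname_s": "wel", "attr_36977": 152},
]

query = missing_or_in_range("attr_36977", 0, 100)
expected_ids = [d["employeeid"] for d in docs if matches_locally(d, "attr_36977", 0, 100)]
print(query)
print(expected_ids)
```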

Retrieving distinct documents from Solr

I've had a hard time explaining and finding what I need, so please put yourself in my shoes for a moment.
My requirement comes from a relational database background. I may be using Solr to do something it wasn't designed to do, or maybe it can do what I need; I still need to confirm that. Hopefully you can assist me.
After indexing numerous documents into Solr, I need to retrieve distinct documents based on a filter. Just think about it as retrieving distinct rows while also applying a WHERE condition.
For example, in a relational database, I may have the following columns
(Country) (City) (Whatever)
Egypt Cairo Hospitals
Egypt Alex Schools
Egypt Mansoura Hospitals
Egypt Cairo Schools
If I perform this query: SELECT DISTINCT Country, City FROM mytable
I should get the following rows
(Country) (City)
Egypt Alex
Egypt Mansoura
Egypt Cairo
Now, after indexing the original table (SELECT * FROM mytable), how can I achieve the SAME output from Solr? How can I retrieve documents by saying that I need them to be distinct based on some fields? I will also need to apply a not-null filter on a specific field.
I don't need statistics of any kind, I only need to get the documents.
I hope I was clear enough. Thank you for your time.
This would be achievable with field collapsing by grouping on multiple fields, but unfortunately only one field is supported right now. There is an open issue; check it out.
Did you try facets?
You should do something like this:
http://localhost:8983/solr/select/?q=*:*&facet=on&facet.field=city&facet.field=country
It will return all the distinct cities and countries, each with its count.
See the Solr wiki if you want to learn more about it.
I hope this helps.
Another good solution available from Solr 4 is based on Pivot (Decision Tree) Faceting.
Try with:
/solr/collection1/select?q=*:*&facet=true&facet.pivot=Country,City
This should return:
"facet_counts" : {
"facet_queries" : {},
"facet_fields" : {},
"facet_dates" : {},
"facet_ranges" : {},
"facet_pivot" : {
"Country,City" : [ {
"field" : "Country",
"value" : "Egypt",
"count" : 4,
"pivot" : [ {
"field" : "City",
"value" : "Cairo",
"count" : 2
}, {
"field" : "City",
"value" : "Alex",
"count" : 1
}, {
"field" : "City",
"value" : "Mansoura",
"count" : 1
} ]
} ]
}
}
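To get from that response to the DISTINCT Country, City rows the question asks for, the pivot tree just needs to be flattened. A minimal Python sketch, assuming the response has already been parsed into a dict shaped like the output above (the function name is hypothetical):

```python
def distinct_pairs(facet_pivot, key):
    """Flatten a two-level pivot facet into (outer value, inner value) tuples."""
    pairs = []
    for outer in facet_pivot.get(key, []):
        for inner in outer.get("pivot", []):
            pairs.append((outer["value"], inner["value"]))
    return pairs


# Trimmed copy of the facet_pivot section shown above.
facet_pivot = {
    "Country,City": [
        {
            "field": "Country",
            "value": "Egypt",
            "count": 4,
            "pivot": [
                {"field": "City", "value": "Cairo", "count": 2},
                {"field": "City", "value": "Alex", "count": 1},
                {"field": "City", "value": "Mansoura", "count": 1},
            ],
        }
    ]
}

rows = distinct_pairs(facet_pivot, "Country,City")
print(rows)
```

The not-null condition from the question would go into the query itself (for example as a filter query on the field), so the pivot only counts matching documents.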

Restrict multi field facet calculation to subset of possible values

I have a non-trivial Solr query, which already involves a filter query and facet calculations over multiple fields. One of the facet fields is a multi-valued integer field that is used to store categories. There are many possible categories and new ones are created dynamically, so using multiple fields is not an option.
What I want to do, is to restrict facet calculation over this field to a certain set of integers (= categories). So for example I want to calculate facets of this field, but only taking categories 3,7,9 and 15 into account. All other values in that field should be ignored.
How do I do that? Is there some built-in functionality that can be used to solve this? Or do I have to write a custom search component?
The facet.prefix parameter can be set for each field specified by the facet.field parameter, using the per-field form f.<field_name>.facet.prefix.
I don't know about any way to define the facet base that should be different from the result, but one can use the facet.query to explicitly define each facet filter, e.g.:
facet.query={!key=3}category:3&facet.query={!key=7}category:7&facet.query={!key=9}category:9&facet.query={!key=15}category:15
Given the solr schema/data from this gist, the results will have something like this:
"facet_counts": {
"facet_queries": {
"3": 1,
"7": 1,
"9": 0,
"15": 0
},
"facet_fields": {
"category": [
"2",
2,
"1",
1,
"3",
1,
"7",
1,
"8",
1
]
},
"facet_dates": {},
"facet_ranges": {}
}
Thus giving the needed facet result.
I have some doubts about performance here (especially when there are more than 4 categories and the initial query returns a lot of results), so it is better to do some benchmarking before using this in production.
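If the list of categories is dynamic, the facet.query parameters can be generated rather than typed by hand. A short Python sketch, with a hypothetical helper name, that only builds the query string:

```python
from urllib.parse import urlencode


def category_facet_queries(field, values):
    """One facet.query per wanted value, keyed by the value itself."""
    return [("facet.query", f"{{!key={v}}}{field}:{v}") for v in values]


params = [("q", "*:*"), ("rows", "0"), ("facet", "true")]
params += category_facet_queries("category", [3, 7, 9, 15])

# A list of tuples lets the same parameter name repeat in the query string.
query_string = urlencode(params)
print(query_string)
```

Each wanted category costs one extra query on the Solr side, which is where the performance concern above comes from.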
Not exactly the answer to my own question, but the solution we are using now: the numbers I want to filter on form distinct groups, so we can prefix each id with a group id like this:
1.3
1.8
1.9
2.4
2.5
2.11
...
With the data stored like this in Solr, we can use facet prefixes to facet over a single group only: http://wiki.apache.org/solr/SimpleFacetParameters#facet.prefix
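A sketch of what the request parameters for one group could look like, in Python. The field name category and the helper names are assumptions for illustration; the values are expected to be indexed as strings such as "1.3", and the per-field parameter form f.<field>.facet.prefix is used.

```python
def group_facet_params(field, group_id):
    """Facet on `field`, restricted to values that start with '<group_id>.'."""
    return {
        "q": "*:*",
        "rows": "0",
        "facet": "true",
        "facet.field": field,
        f"f.{field}.facet.prefix": f"{group_id}.",
    }


def strip_group(value):
    """Turn a prefixed facet value such as '2.11' back into the category id '11'."""
    return value.split(".", 1)[1]


params = group_facet_params("category", 1)
print(params)
print(strip_group("2.11"))
```

The trailing dot in the prefix matters: without it, group 1 would also match values from groups 10, 11 and so on.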
