Solr facet with additional metadata - solr

Is it possible to use additional metadata fields when using Solr facets? I would like to aggregate one attribute by counting its values and display the related group as an additional metadata field.
http://localhost:8983/solr/gitIndex/select?indent=on&q=*:*&rows=0&wt=json&
json.facet={
  Repository_s: {
    type: terms,
    field: Repository_s,
    limit: 10,
    facet: {
      x: "count()"
    }
  }
}
The result should look like this:
...
"facets":{
"count":1354013,
"<name of attribute>":{
"buckets":[{
"val":"<value of attribute>",
"count":173997,
"<metadata_field>":<value of metadata_field>},
...

A solution is to use facet pivots - it'll get you any values in a secondary field under each facet, and if the value is unique for the set of documents, it'll just be a single value.
The reference guide has the syntax for non-json facets.
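As a rough sketch of the pivot approach, the Python below builds the request parameters and flattens a pivot response into the shape asked for in the question. The metadata field name Group_s is a placeholder (the question does not name it) and the sample response is made up; only the facet.pivot parameter and the facet_pivot response layout come from Solr itself. Note that facet.limit applies to both pivot levels here.

```python
from urllib.parse import urlencode

# Repository_s comes from the question; Group_s is a hypothetical
# stand-in for the unnamed metadata field.
FACET_FIELD = "Repository_s"
METADATA_FIELD = "Group_s"


def pivot_params(facet_field, metadata_field, limit=10):
    """Build query parameters for a two-level pivot facet."""
    return {
        "q": "*:*",
        "rows": 0,
        "wt": "json",
        "facet": "on",
        "facet.pivot": f"{facet_field},{metadata_field}",
        "facet.limit": limit,
    }


def flatten_pivot(response, facet_field, metadata_field):
    """Turn pivot entries into {val, count, <metadata_field>} rows."""
    key = f"{facet_field},{metadata_field}"
    rows = []
    for entry in response["facet_counts"]["facet_pivot"].get(key, []):
        children = [child["value"] for child in entry.get("pivot", [])]
        rows.append({
            "val": entry["value"],
            "count": entry["count"],
            # A single value when it is unique per bucket, otherwise a list.
            metadata_field: children[0] if len(children) == 1 else children,
        })
    return rows


# Made-up response fragment in the layout Solr uses for facet_pivot.
sample = {
    "facet_counts": {
        "facet_pivot": {
            "Repository_s,Group_s": [
                {
                    "field": "Repository_s",
                    "value": "repo-a",
                    "count": 173997,
                    "pivot": [
                        {"field": "Group_s", "value": "team-x", "count": 173997}
                    ],
                }
            ]
        }
    }
}

query_string = urlencode(pivot_params(FACET_FIELD, METADATA_FIELD))
rows = flatten_pivot(sample, FACET_FIELD, METADATA_FIELD)
```

The query string can be appended to the /select URL from the question; rows then holds one entry per repository with the metadata value attached.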

Related

SOLR - group by field and then get distinct value by another field

I'm using apache solr for searching records. In my case I'm having table which has columns category and sub-category, etc.
I want to group by category and then get the distinct list of sub-category from grouped results. Is that possible in apache solr?
If yes, please do help me to solve this.
Thanks in advance.
You can do that with a pivot facet:
facet=on&facet.pivot=category,subcategory
This will give you a facet with all the sub categories for each category.
You can also use the Facet JSON API. Example adapted from that page:
top_categories: {
  type: terms,
  field: category,
  limit: 5,
  facet: {
    top_subcategories: {
      type: terms,
      field: subcategory,
      limit: 20
    }
  }
}
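To get from that nested facet to a plain "category to distinct sub-categories" mapping, something like the following Python sketch could be used on the client side. The sample response is invented for illustration; the facet names top_categories and top_subcategories are the ones from the request above, and the buckets/val/count layout is what the JSON Facet API returns for terms facets.

```python
def distinct_subcategories(response, facet_name="top_categories",
                           sub_name="top_subcategories"):
    """Map each category value to the list of its sub-category values."""
    result = {}
    for bucket in response["facets"].get(facet_name, {}).get("buckets", []):
        subs = bucket.get(sub_name, {}).get("buckets", [])
        result[bucket["val"]] = [sub["val"] for sub in subs]
    return result


# Made-up response fragment for illustration only.
sample = {
    "facets": {
        "count": 6,
        "top_categories": {
            "buckets": [
                {
                    "val": "books",
                    "count": 4,
                    "top_subcategories": {
                        "buckets": [
                            {"val": "fiction", "count": 3},
                            {"val": "poetry", "count": 1},
                        ]
                    },
                },
                {
                    "val": "music",
                    "count": 2,
                    "top_subcategories": {
                        "buckets": [{"val": "jazz", "count": 2}]
                    },
                },
            ]
        },
    }
}

subcategories = distinct_subcategories(sample)
```

Keep in mind that the limit values in the request cap how many categories and sub-categories come back, so the lists are only complete if the limits are high enough.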

In Solr, how to use edismax with filter queries (but without a default field)?

I have an edismax query with faceting enabled. I haven't specified a default field (neither in the select clause nor in the solrconfig.xml), as I only want to search on the fields specified in the 'qf' parameter. (I have the impression that if I do specify a default field, that field is also taken into account).
Here's the query:
/select?q=david&defType=edismax&qf=firstname^1+lastname^10&facet=true&facet.field=organization
So far everything works as expected: I get some results and there are also some results from the faceted search, e.g.
"UZ Leuven" (18)
"OLV Aalst" (8)
...
When I now click on one of the organizations, I want to search only within the set of documents that belong to that organization, hence the use of a filter query. However, when I add such a 'filterQuery' (fq), Solr complains that
no field name specified in query and no default specified via 'df' param.
So does that mean I do have to add some kind of 'catch-all' default field? But this doesn't seem logical, as all search fields are already specified in the 'qf'?
Here's my query:
/select?q=david&defType=edismax&qf=firstname^1+lastname^10&fq=organization:UZ+Leuven
And here is the output from the query:
{
responseHeader: {
status: 400,
QTime: 1,
params: {
q: "david",
qf: "firstname^1 lastname^10",
wt: "json",
fq: "organization:UZ Leuven",
defType: "edismax"
}
},
error: {
msg: "no field name specified in query and no default specified via 'df' param",
code: 400
}
}
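You probably want organization:"UZ Leuven". Without the quotes, only UZ is searched in the organization field; Leuven is parsed as a separate term with no field name, which is what the error is complaining about. The default query syntax for the fq parameter is the Lucene syntax.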
If you want to use the edismax/dismax query syntax for the fq parameter, you'll have to tell Solr to use the edismax parser through LocalParams:
fq={!type=edismax qf=$fqqf}organization:UZ Leuven&fqqf=field1 field2 field3
or possibly
fq={!type=edismax qf='field1 field2 field3'}organization:UZ Leuven
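For the common case of drilling down on a facet value, a small helper that quotes the value before it goes into fq avoids this error. This is a minimal Python sketch, assuming the value should be matched as a phrase in the Lucene syntax; quoted_fq is a made-up helper name, and the field and value are the ones from the question.

```python
from urllib.parse import urlencode


def quoted_fq(field, value):
    """Return field:"value" with backslashes and double quotes escaped."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'{field}:"{escaped}"'


params = [
    ("q", "david"),
    ("defType", "edismax"),
    ("qf", "firstname^1 lastname^10"),
    ("fq", quoted_fq("organization", "UZ Leuven")),
]

# URL-encode the parameters; spaces become '+' and quotes become %22.
query_string = urlencode(params)
```

The main query still goes through edismax with qf, while the filter stays a plain fielded phrase query, so no default field is needed.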

Solr facet counts for specific field values

Solr creates multi-select facet counts for me as described here:
https://web.archive.org/web/20131202095639/http://wiki.apache.org/solr/SimpleFacetParameters#Multi-Select_Faceting_and_LocalParams
I also have various predefined searches that allow a user to browse the catalog. Here is one such example and its query parameters:
q=*:*
fq={!tag=g}genre:western
facet=on
facet.field={!ex=g}genre
facet.mincount=1
facet.limit=50
With this search I get up to 50 genre values in the facet list. I then go through and mark which values were selected by the user; western in this case. This works well except when western is pushed out of the top 50. So I manually add it to the list to make a total of 51. This way the user can see that it is indeed selected. The problem is I have to leave the count for western blank because I don't know it.
Is there a way to get counts for specific facet values such as western in this case? Or another approach to solve this issue?
I am using Solr 4.7.0.
Solr allows you to create a query-based facet count by using the facet.query parameter. When creating a filter query (fq) that's based on a facet field value, I now create a corresponding facet query:
facet.query={!ex=g}genre:western
and add it to the rest of my parameters:
q=*:*
fq={!tag=g}genre:western
facet=on
facet.field={!ex=g}genre
facet.query={!ex=g}genre:western
facet.mincount=1
facet.limit=50
The facet_queries object will now be populated in the solr response:
{
...
"facet_counts": {
"facet_queries": {
"{!ex=g}genre:western": 7
},
...
},
...
}
Regardless of what is returned in the facet_fields object, I'm now guaranteed to have a facet count for genre:western. With some parsing, facet field counts can be extracted from the facet queries.
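The parsing step could look roughly like this in Python. It is only a sketch: it assumes facet query keys of the simple form {!ex=g}field:value (no quoted phrases or ranges), and the sample counts are invented. The flat [value, count, value, count, ...] list is how Solr returns facet_fields in its default JSON format.

```python
import re

# Matches keys such as "{!ex=g}genre:western": optional local params,
# followed by field:value.
FACET_QUERY_KEY = re.compile(
    r"^(?:\{![^}]*\})?(?P<field>[^:]+):(?P<value>.+)$"
)


def pairs(flat):
    """Convert a flat [value, count, value, count, ...] list to a dict."""
    return dict(zip(flat[0::2], flat[1::2]))


def merge_selected(facet_counts, field):
    """Add counts from facet_queries for values missing in facet_fields."""
    counts = pairs(facet_counts["facet_fields"].get(field, []))
    for key, count in facet_counts.get("facet_queries", {}).items():
        match = FACET_QUERY_KEY.match(key)
        if match and match.group("field") == field:
            # Keep the facet_fields count when the value is already present.
            counts.setdefault(match.group("value"), count)
    return counts


# Made-up facet_counts section where western fell outside the returned list.
sample = {
    "facet_queries": {"{!ex=g}genre:western": 7},
    "facet_fields": {"genre": ["drama", 120, "comedy", 95]},
}

merged = merge_selected(sample, "genre")
```

With this, the selected value always has a count, whether or not it made it into the top 50.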

SOLR sort by IN Query

I was wondering if it is possible to sort by the order in which you request documents from SOLR. I am running an IN-based query and would just like SOLR to return the documents in the order that I ask for them.
In (4,2,3,1) should return me documents ordered 4,2,3,1.
Thanks.
You need sorting in Solr to order them by a field.
I assume that "IN-based query" means something like: fetch docs whose fieldx has a value in (val1, val2). You can define the field as a multi-valued field and facet on it. A facet query is an 'is in' search out of the box (so to speak), and it can do more sophisticated searches too.
Edited on OP's query:
Updating a document with a multi-valued field in JSON here. See the line
"my_multivalued_field": [ "aaa", "bbb" ] /* use an array for a multi-valued field */
As for doing a facet query, check this.
You need to do one or more fq statements:
&fq=field1:[400 TO 500]
&fq=field2:(johnson OR thompson)
Also do read up (in the link above) on which fields you can facet on: faceting works on the indexed terms (or docValues) of a field, not on its stored value.
You can easily apply sorting through QueryOptions by passing a sort entry in the ExtraParams property (here I am sorting by the savedate field, descending):
var results = _solr.Query(textQuery,
    new QueryOptions
    {
        Highlight = new HighlightingParameters
        {
            Fields = new[] { "*" },
        },
        ExtraParams = new Dictionary<string, string>
        {
            {"fq", dateQuery},
            {"sort", "savedate desc"}
        }
    });

Solr Faceting on Multiple Concatenated Fields

I need a way to get facets on two combined field names. To show you what I mean, take a look at the query as it is now:
{
"responseHeader":{
"status":0,
"QTime":16,
"params":{
"facet":"true",
"indent":"true",
"q":"productId:(1 OR 2 OR 3 OR 4)",
"facet.field":["productMetaType",
"productId"],
"rows":"10"}},
"response":{"numFound":4,"start":0,"docs":[
{
"productId":1,
"productMetaType":"PRIMARY_PHOTO",
"url":"1_PRIM.JPG"},
{
"productId":1,
"productMetaType":"OTHER_PHOTO",
"url":"1_1.JPG"},
{
"productId":1,
"productMetaType":"OTHER_PHOTO",
"url":"1_2.JPG"},
{
"productId":2,
"productMetaType":"OTHER_PHOTO",
"url":"2_1.JPG"}]
},
"facet_counts":{
"facet_queries":{},
"facet_fields":{
"productMetaType":[
"PRIMARY_PHOTO",1,
"OTHER_PHOTO",3],
"productId":[
"1",3,
"2",1]},
"facet_dates":{},
"facet_ranges":{}
}
}
I get two facet fields, productMetaType and productId. What I need to do is somehow combine those fields so I get data back something like this:
1_PRIMARY_PHOTO, 1,
1_OTHER_PHOTO, 2,
2_PRIMARY_PHOTO, 0,
2_OTHER_PHOTO, 1
Does the pivot functionality do this? Unfortunately, we're running Solr 3.1, so pivot isn't available, but if that is the only way to do this, I might have some ammo for upgrading.
The only other thing I could think of was somehow concatenating the field names. I am new to Solr and don't know what is possible. Any advice or assistance is appreciated. Thank you for your time.
Yes, pivot would do the trick, but as you observed, this feature is only available in Solr trunk.
Your idea to combine both fields would work too. Actually, if your fields have a limited number of values, the easiest and most flexible way to do this would be to use facet queries:
productId:1 AND productMetaType:PRIMARY_PHOTO
productId:2 AND productMetaType:OTHER_PHOTO
productId:1 AND productMetaType:OTHER_PHOTO
productId:2 AND productMetaType:PRIMARY_PHOTO
Otherwise, add a new field of string type to your Solr schema.xml and rebuild your index, adding your documents as before but with this new field populated. You can generate its value however you wish; joining the two field values with '_' as a separator would work perfectly.
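For the facet query route, the queries do not have to be written by hand. Here is a minimal Python sketch that generates one facet.query per combination and relabels the counts with the combined key from the question. The helper names are made up, and the sample facet_queries section simply mirrors the counts the question asks for; it assumes the response is keyed by the exact query strings that were sent.

```python
from itertools import product


def combo_facet_queries(product_ids, meta_types):
    """Build one facet query string per (productId, productMetaType) pair."""
    return {
        f"{pid}_{meta}": f"productId:{pid} AND productMetaType:{meta}"
        for pid, meta in product(product_ids, meta_types)
    }


def combined_counts(facet_queries_response, queries):
    """Relabel facet_queries counts with the combined 'id_type' key."""
    return {
        label: facet_queries_response.get(query, 0)
        for label, query in queries.items()
    }


queries = combo_facet_queries([1, 2], ["PRIMARY_PHOTO", "OTHER_PHOTO"])

# Each query is sent as its own facet.query parameter.
params = [("facet", "true")] + [("facet.query", q) for q in queries.values()]

# Made-up facet_queries section of a response, keyed by the query strings.
sample = {
    "productId:1 AND productMetaType:PRIMARY_PHOTO": 1,
    "productId:1 AND productMetaType:OTHER_PHOTO": 2,
    "productId:2 AND productMetaType:PRIMARY_PHOTO": 0,
    "productId:2 AND productMetaType:OTHER_PHOTO": 1,
}

counts = combined_counts(sample, queries)
```

This only stays practical while the number of combinations is small; past that point the extra concatenated field is the better option.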
