Schema-based facets in Vespa

Is there any way to implement schema-based facets in Vespa?
For example:
My schema contains the fields album, artist and year. For that schema, the facet would be:
[{"field":".fields", "label":"artist", "count":300},{"field":".fields", "label":"album", "count":300},{"field":".fields", "label":"year", "count":250}]

Facets in Vespa are called result grouping; see https://docs.vespa.ai/en/grouping.html
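As a rough illustration of result grouping, the sketch below builds a YQL query that counts matching documents per value of one field. The document type name "music", the local endpoint, and the helper name grouping_query are assumptions for illustration; the field must also be configured so that it can be grouped on. Note that this yields counts per artist value, which is not the same thing as the per-field-name counts shown in the question.

```python
from urllib.parse import urlencode


def grouping_query(doc_type: str, field: str, max_groups: int = 10) -> str:
    """Build a YQL query that counts matching documents per value of `field`.

    `doc_type` and `field` are hypothetical names taken from the question.
    """
    return (
        f"select * from {doc_type} where true "
        f"| all(group({field}) max({max_groups}) each(output(count())))"
    )


yql = grouping_query("music", "artist")
# hits=0 asks for the grouping result only, without the matching documents.
params = urlencode({"yql": yql, "hits": 0})
# Assumed local endpoint; adjust host and port to your deployment.
print(f"http://localhost:8080/search/?{params}")
```

Running one such grouping per field (album, artist, year) would give the value distribution for each field.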

Related

Sorting of solr documents based on search term in solr

I would like to sort Solr documents based on the searched term. For example, if the search term is "stringABC",
then the order of the results should be
stringABC,
stringABCxxxx,
xxxxstringABCxxxx
The Solr document will contain a lot of fields, e.g. title, description, path, article no., product code, etc.
The default field will be built from more than one field, e.g. title, description and path.
So the Solr doc will only be returned when the search term matches one of the fields that make up the default field.
Use three fields: one with the exact string, one with an EdgeNgramTokenizer and one with an NgramTokenizer. You can then use qf=field1^10 field2^5 field3 to score hits in these fields according to how you want to prioritize them relative to each other.
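A minimal sketch of how the request parameters for that answer might be assembled. The field names field1, field2 and field3 come from the answer; the helper name build_params and the choice of the edismax query parser are assumptions (qf is only honoured by the dismax family of parsers).

```python
from urllib.parse import urlencode


def build_params(term, weighted_fields):
    """Return Solr query parameters that weight each field via `qf`.

    `weighted_fields` is a list of (field_name, boost) pairs; a boost of 1
    is written without the ^ suffix.
    """
    qf = " ".join(
        name if boost == 1 else f"{name}^{boost}"
        for name, boost in weighted_fields
    )
    return {"q": term, "defType": "edismax", "qf": qf}


params = build_params(
    "stringABC",
    [("field1", 10), ("field2", 5), ("field3", 1)],  # exact, edge n-gram, n-gram
)
print(urlencode(params))
```

With these weights, an exact match scores highest, a prefix match next, and an infix match lowest, which corresponds to the ordering asked for in the question.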

dynamic fields as facet in solr

I am trying to develop a filter system using dynamic fields in solr. These dynamic fields may vary from product to product and have a prefix attribute_filter_ to help me recognize the filter field. So given a search query, I want to get faceted results based on these dynamic fields.
For example, I have 3 products as docs in solr
{ID:1, attribute_filter_color:"white", attribute_filter_brand:"Dell"}
{ID:2, attribute_filter_color:"red", attribute_filter_category:"electronics"}
{ID:3, attribute_filter_size:"mobiles", attribute_filter_brand:"samsung"}
When my search query matches doc 1 and doc 2, I want only the filters color, brand and category, so the facet fields are attribute_filter_color, attribute_filter_brand and attribute_filter_category.
When my search query matches doc 2 and doc 3, I want the filters color, size, category and brand, so the facet fields are attribute_filter_color, attribute_filter_size, attribute_filter_category and attribute_filter_brand.
When my search query matches doc 1 and doc 3, I want the filters color, brand and size, so the facet fields are attribute_filter_color, attribute_filter_brand and attribute_filter_size.
Also, there can be ~300 of these filters in total over 10^5 products. This creates another problem: a GET URL with 300 facet fields might exceed the URL length limit.
This jira ticket shows how regex could have helped in this situation.
My solution would be to index the field names in an additional field, so that you also have "facet_fields": ["attribute_filter_color","attribute_filter_brand"] on each document containing those fields.
Generate a facet on that field across your document result set, then use the result in a new query to generate facets across the fields you're interested in. It will be an extra query, but should scale decently. The expensive part will be the larger number of different fields you're faceting on anyway; the facet_fields field will be quick to calculate and return.
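The two-step approach could be sketched roughly as follows. The helper names are hypothetical, and the sample facet response is made up for illustration; it assumes Solr's default JSON layout for field facets, where values and counts alternate in one flat list.

```python
def field_names_from_facet(flat_counts):
    """Extract field names with a non-zero count from a flat facet list.

    Assumes the layout [value1, count1, value2, count2, ...].
    """
    names = flat_counts[0::2]
    counts = flat_counts[1::2]
    return [name for name, count in zip(names, counts) if count > 0]


def second_query_params(q, field_names):
    """Build the parameters of the follow-up query, one facet.field per name.

    A list of pairs is used because facet.field repeats. Sending these as a
    POST form body instead of a GET query string avoids URL length limits.
    """
    params = [("q", q), ("facet", "true")]
    params += [("facet.field", name) for name in field_names]
    return params


# Illustrative result of the first query, which facets on facet_fields only.
first_facet = [
    "attribute_filter_color", 2,
    "attribute_filter_brand", 1,
    "attribute_filter_category", 1,
    "attribute_filter_size", 0,
]
names = field_names_from_facet(first_facet)
params = second_query_params("laptop", names)
```

Only the fields that actually occur in the matched documents end up in the second request, so the number of facet.field parameters stays proportional to the result set rather than to the ~300 possible filters.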

Solr : Boost Results from a specific collection

We have a Solr index with multiple collections, e.g. collection_data_sales and collection_data_marketing. When the user performs a search query, both collections are queried through a collection alias. Both collections have the same Solr schema.
Is there a way to boost the results from a specific collection?
For example, suppose the user specifies the sales data collection. The search should then run on both collection_data_sales and collection_data_marketing, but documents from collection_data_sales should be boosted.
If you can tell the two collections apart using data in the documents themselves, that is enough. Let's imagine that the schema has a field named type, so that documents in collection_data_marketing have type:marketing and documents in collection_data_sales have type:sales.
The only thing you have to do now is use a boost function, for example:
bf=sum(product(query($q1),10),product(query($q2),3))&q1=type:sales&q2=type:marketing
In this example sales will have weight 10 and marketing will have weight 3.
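A small sketch of generating such a boost function from a weight table, so the nesting of the parentheses is produced mechanically. The helper name boost_function is hypothetical, and the type field is the assumption made in the answer above. The bf parameter itself belongs to the dismax and edismax query parsers.

```python
def boost_function(weights):
    """Return bf plus the q1, q2, ... parameters it dereferences.

    `weights` maps a value of the assumed `type` field to its boost weight.
    """
    params = {}
    terms = []
    for i, (type_value, weight) in enumerate(weights.items(), start=1):
        params[f"q{i}"] = f"type:{type_value}"
        # Each term multiplies the score of the sub-query by its weight.
        terms.append(f"product(query($q{i}),{weight})")
    params["bf"] = "sum(" + ",".join(terms) + ")"
    return params


params = boost_function({"sales": 10, "marketing": 3})
print(params["bf"])
```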

Difference between Solr Facet Fields and Filter Queries

I am using SolrMeter to test the Apache Solr search engine. The difference between facet fields and filter queries is not clear to me. The SolrMeter tutorial lists this as an example of facet fields:
content
category
fileExtension
and this as an example of filter queries:
category:animal
category:vegetable
category:vegetable price:[0 TO 10]
category:vegetable price:[10 TO *]
I am having a hard time wrapping my head around it. Could somebody explain by example? Can I use SolrMeter without specifying either facets or filters?
Facet fields are used to get statistics about the returned documents - specifically, for each value of that field, how many returned documents have that value for that field. So for example, if you have 10 products matching a query for "soft rug" if you facet on "origin," you might get 6 documents for "Oklahoma" and 4 for "Texas." The facet field query will give you the numbers 6 and 4.
Filter queries on the other hand are used to filter the returned results by adding another constraint. The thing to remember is that the query when used in filtering results doesn't affect the scoring or relevancy of the documents. So for example, you might search your index for a product, but you only want to return results constrained by a geographic area or something.
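The distinction can be illustrated with a plain in-memory analogy. This is not how Solr implements either feature; the documents and helper names are made up to mirror the "soft rug" example above.

```python
from collections import Counter

# Ten hypothetical documents matching the query "soft rug".
results = (
    [{"name": f"soft rug {i}", "origin": "Oklahoma"} for i in range(6)]
    + [{"name": f"soft rug {i}", "origin": "Texas"} for i in range(6, 10)]
)


def facet_counts(docs, field):
    """Faceting: count how many of the returned documents have each value."""
    return Counter(doc[field] for doc in docs)


def apply_filter(docs, field, value):
    """Filtering: keep only documents satisfying the extra constraint."""
    return [doc for doc in docs if doc[field] == value]


counts = facet_counts(results, "origin")
texas_only = apply_filter(results, "origin", "Texas")
print(counts["Oklahoma"], counts["Texas"], len(texas_only))
```

Faceting leaves the result set unchanged and reports statistics about it, while filtering shrinks the result set. In a typical UI the two work together: the facet counts are displayed, and clicking one adds the corresponding filter query.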
A facet is a field (type) of the document, so category is the field. As Ansari said, facets are used to get statistics and provide grouping capabilities. You could apply grouping on the category field to show everything that is a vegetable as one group.
Edit: The parts about searching inside of a specific field are wrong. It will not search inside of the field only. It should be 'adding a constraint to the search' instead.
Performing a filter query of category:vegetable will search for vegetable in the category field and no other fields of the document. It is used to search just specific fields rather than every field. Sometimes you know that the term you want only is in one field so you can search just that one field.

Solr Faceting Multi-valued vs Tokenizers

I'm trying to set up a subject field in my schema. I'm drawing from a database where a single record can have multiple subjects and the subjects are listed in a comma delimited string. Is there a way to facet on just one of the subjects?
Thanks
Check SolrFacetingOverview for a faceting overview.
The Facet Indexing section mentions the field type you should choose for the field that you want to facet on.
You can customize the faceting using SimpleFacetParameters.
You can restrict the results to entities having a particular value for a subject using a filter query, e.g. fq=subject:"MATH"
The filtering would produce only the results matching the criteria, and the facet results would include only the facets from that result set.
If I understand correctly, you want this in the DIH file:
<entity name="entity" pk="id" query="..." transformer="RegexTransformer">
<field column="subjects" splitBy=","/>
</entity>
and the query for faceting:
http://localhost:8983/solr/select?q=...&facet=true&facet.field=subjects&facet.query=subjects:the-one-you-want
Would that work?
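To make the effect of splitting concrete, here is a small in-memory sketch of what happens once the comma-delimited string becomes a multi-valued field. The records and the helper name split_subjects are made up; trimming whitespace around each value is an assumption of this sketch, not something the splitBy attribute is claimed to do.

```python
from collections import Counter


def split_subjects(raw):
    """Split a comma-delimited subject string into individual values.

    Whitespace trimming and dropping empty entries are choices made here.
    """
    return [part.strip() for part in raw.split(",") if part.strip()]


# Hypothetical database rows with comma-delimited subjects.
records = [
    {"id": 1, "subjects": "MATH,PHYSICS"},
    {"id": 2, "subjects": "MATH"},
    {"id": 3, "subjects": "HISTORY, MATH"},
]

# Each document ends up with a multi-valued subjects field.
indexed = [
    {"id": row["id"], "subjects": split_subjects(row["subjects"])}
    for row in records
]

# Faceting on the multi-valued field counts each subject separately.
counts = Counter(s for doc in indexed for s in doc["subjects"])
print(counts["MATH"])
```

Because each subject is its own value, a facet count or a filter for a single subject no longer depends on which other subjects share the record.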
