Solr group by multiple fields - solr

Does Solr allowing grouping by multiple field much like in SQL with GROUP BY?
If so how?
For example: If I wanted to group by name and email. I have tried adding multiple group fields...
group=true&group.field=name&group.field=email
But it only groups by one of the fields.
I have looked at other posts with similar questions, but none had a verified answer.

I fear you cannot have intersection between grouped results at query time based on group.field.
The remaining solution would be to create the combination of the two fields into a new one at index time and use that new field for grouping which would give you the results.
See : https://lucene.apache.org/solr/guide/6_6/result-grouping.html

Related

NetSuite - UNION ALL equivalent in saved search?

I'm in the process of writing a SuiteTalk integration, and I've hit an interesting data transformation issue. In the target system, we have a sort of notes table which has a category column and then the notes column. Data going into that table from NetSuite could be several different fields on a single entity in NetSuite terms, but several records of different categories in our terms.
If you take the example of a Sales Order, you might have two text fields that we need to bring across as notes. For each of those fields I need to create a row, with both the notes field in the same column but separate rows. This would allow me to add a dynamic column that give the category for each of those fields.
So instead of
SO number notes 1 notes 2
SO1234567 some text1 some text2
You’d get
SO Number Category Text
SO1234567 category 1 some text1
SO1234567 category 2 some text2
The two problems I’m really trying to solve here are:
Where can I store the category name? It can’t be the field name in NetSuite. It needs to be configurable per customer as the number of notes fields in each record type might vary across implementations. This is currently my main blocker.
Performance – I could create a saved search for each type of note, and bring one row across each time, but that’s not really an acceptable performance hit if I can do it all in one call.
I use Saved Searches in NetSuite to provide a configurable way of filtering the data to import into the target system.
If I were writing a SQL query, i would use the UNION clause, with the first column being a dynamic column denoting the category and the second column being the actual data field from NetSuite. My ideal would be if I could somehow do a similar thing either as a single saved search, or as one saved search per entity, without having to create any additional fields within NetSuite itself, so that from the SuiteTalk side I can just query the search and pull in the data.
As a temporary kludge, I now have multiple saved searches in NetSuite, one per category, and within the ID of the saved search I expect the category name and an indicator of the record type. I then have a parent search which gives me the searches for that record type - it's very clunky, and ultimately results in far too many round trips for me to be satisfied.
Any idea if something like this is at all possible?? Or if not, is there a way of solving this without hard-coding the category values in the front end? Even if I can bring back multiple recordsets in one call, that would be a performance enhancement.
I've asked the same question on the NetSuite forums but to no avail.
Thanks
At first read it sounds like you are trying to query a set of fields from entities. The fields may be custom fields or built in fields. Can you not just query the entities where your saved search has all the potential category columns and then transform the received data into categories?
Otherwise please provide more specifics in Netsuite terms about what you are trying to do.

Querying Solr multiple indexes with different schema in single query

We have a situation where we are keeping two indexes with different schemas.
For example: suppose we have an index for seller where the key value is seller id and other attributes are seller information. Now another index is book where book id is unique key and it keeps book related information.
Is it possible to query both these indexes in a single query and get collective results?
I have checked Solr but as per my findings we can do this through distributed search in Solr but it works on same kind of schema being distributed in at max 3 indexes.
I am a newbie to Solr so please ignore if this is a stupid question.
You need to think about what makes sense for a search query but there are some rules.
The first requirement is that the unique keys need to have the same name and be unique across collections or Solr cannot collate results.
If you are then hoping to get some kind of sensible ranking of your results you need some common fields. For example I have two collections: one of product data and one containing product related documents. I have a unique key: id and I have common title and contents fields for when I want to query across the two collections. I also have an advanced search interface where I can query on specific fields like product id.
A "unification core" is a typical way of handling search across two or more cores, see this Stack Overflow answer on how to set that up
Query multiple collections with different fields in solr
Other techniques are to use federated search with something like Carrot or to issue two queries and show the results in different tabs in the search results.

How could I go about getting distinct field counts in Azure Search

I have an index with around 35 million documents. When a user issues a query with any combination of search words and filters, I need to get a count of unique values on another field. The purpose is to answer the question "How many unique (field x) are there with a given query?".
I'm pretty sure that Azure Search doesn't have any capability to do this, so I thought I would try to do another query where I select just the field I want to count distinct values of, but I think this would be very time consuming with such a large index. I'm also under the impression that I can only skip at max 100,000 records, which would make it impossible for me to do this if a query returned more than 100k results.
Any ideas on how to go about this?
Thanks!
Azure Search doesn't directly support distinct count of values today. In order to support it in a single query combined with $filter, it would either have to be supported as a new facet type, or maybe with a combination of $count and $filter where the field being counted is the key field (note that $count and $filter can't be combined today).
Feel free to add distinct count to the Azure Search feedback forum to help prioritize the feature.
Original Answer
If you wanted a count of documents per unique value, you could use facets. For example, if you're searching for shoes under $100 dollars and you want to know, out of the hits, how many shoes of each color there are, you would do this:
GET /indexes/products/docs?search=shoes&$filter=price+lt+100&facet=color&api-version=2015-02-28
The response will contain a #search.facets property that contains buckets for each unique value along with a count. You can find more info here and here.

How to approach a hierarchical search in Solr?

I'm trying to understand how to approach search requirements I have.
The first one is a normal product search that I know Solr can handle appropriately, where you search for a term and Solr returns relevant documents.
The second one is a search for products within a certain category. I have a hierarchical structure in my database that consists in a category with many subcategories and those have products.
The thing is, when some very specific words are searched for, the first approach shouldn't be used, instead a search for a category should be done and only products within that category should be returned, which for me is a very basic SQL query (select * from products where categoryId = 1000).
Does Solr should or can be used in the second case? If so, what is the normal approach to use?
Besides what #Mysterion proposed of filter queries you should take a look at Solr Facets which gives you very powerful catogory-like searching.
You also might want to consider multivalue field for categoryParentIds which will contain the parent categories that the product is in and thus combined with filter query and or facets will get your parent category searching.
Yes, you could use similar approach in Solr, by attributing your products with categoryId and later, while searching add filter query similiar to SQL, categoryId:10000
For more info about filter query, take a look here - http://wiki.apache.org/solr/CommonQueryParameters#fq

Solr: one of each in all categories

I have product index at solr, product has category field and I need to select one product (better would be random) from each category, how query would look like?
if you are looking for sql group by feature,
with solr 3.3 on-wards,
it has the similar feature called FieldCollapsing
Field Collapsing collapses a group of results with the same field value down to a single (or fixed number) of entries. For example, most search engines such as Google collapse on site so only one or two entries are shown, along with a link to click to see more results from that site. Field collapsing can also be used to suppress duplicate documents.

Resources