Solr: ordering by a multivalued field

I need to create a new collection on my Solr 6.1.0 cluster where every row is a content item and every item can belong to one or more categories, which are stored in a multivalued field, categories.
In my web app the user can search by category and, if desired, group the results by category. If the user sorts by category, what happens to content that belongs to more than one category?
In that case the search results page should show the same content several times, once under each of its categories. I don't want the web application to filter and sort the results itself, because it would then have to fetch every row from Solr (which I know is discouraged for performance reasons). Is there a way to have Solr do this? For example, could it repeat the same content under two categories when a flag is enabled, or whenever I ask it to sort by category?
So far I have worked around the problem by cloning the record once per category and storing the category ID in a single-valued int field. This is wasteful: the index is much bigger than it needs to be, and all metadata apart from the category is identical across the clones. I would therefore like to have 1 content = 1 Solr record.
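For reference, the cloning workaround described above can be sketched roughly as follows. This only illustrates the current approach, not a Solr feature; the field names `id`, `title`, `categories`, and `category_id` and the helper `expand_by_category` are assumptions made for the example.

```python
from typing import Any, Dict, List


def expand_by_category(content: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Clone one content item into one record per category."""
    records = []
    for category_id in content["categories"]:
        # Copy every metadata field except the multivalued one.
        record = {k: v for k, v in content.items() if k != "categories"}
        # Hypothetical single-valued int field used for sorting and grouping.
        record["category_id"] = category_id
        # Each clone needs its own unique key in the index.
        record["id"] = f"{content['id']}_{category_id}"
        records.append(record)
    return records


content = {"id": "42", "title": "Some content", "categories": [3, 7]}
for record in expand_by_category(content):
    print(record)
```

Every clone repeats the full metadata, which is exactly the index growth I would like to avoid.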

Related

NetSuite - UNION ALL equivalent in saved search?

I'm in the process of writing a SuiteTalk integration, and I've hit an interesting data transformation issue. In the target system, we have a sort of notes table which has a category column and then the notes column. Data going into that table from NetSuite could be several different fields on a single entity in NetSuite terms, but several records of different categories in our terms.
If you take the example of a Sales Order, you might have two text fields that we need to bring across as notes. For each of those fields I need to create a row, so that both notes fields end up in the same column but in separate rows. This would allow me to add a dynamic column that gives the category for each of those fields.
So instead of
SO number    notes 1       notes 2
SO1234567    some text1    some text2
You’d get
SO Number    Category      Text
SO1234567    category 1    some text1
SO1234567    category 2    some text2
The two problems I’m really trying to solve here are:
1. Where can I store the category name? It can’t be the field name in NetSuite. It needs to be configurable per customer, as the number of notes fields in each record type might vary across implementations. This is currently my main blocker.
2. Performance: I could create a saved search for each type of note and bring one row across each time, but that’s not really an acceptable performance hit if I can do it all in one call.
I use Saved Searches in NetSuite to provide a configurable way of filtering the data to import into the target system.
If I were writing a SQL query, I would use the UNION clause, with the first column being a dynamic column denoting the category and the second column being the actual data field from NetSuite. Ideally I could do something similar either as a single saved search or as one saved search per entity, without having to create any additional fields within NetSuite itself, so that from the SuiteTalk side I can just query the search and pull in the data.
As a temporary kludge, I now have multiple saved searches in NetSuite, one per category, and I encode the category name and an indicator of the record type in the ID of each saved search. I then have a parent search which gives me the searches for that record type. It's very clunky, and ultimately results in far too many round trips for me to be satisfied.
Any idea if something like this is at all possible? If not, is there a way of solving this without hard-coding the category values in the front end? Even bringing back multiple recordsets in one call would be a performance enhancement.
I've asked the same question on the NetSuite forums but to no avail.
Thanks
At first read it sounds like you are trying to query a set of fields from entities. The fields may be custom fields or built in fields. Can you not just query the entities where your saved search has all the potential category columns and then transform the received data into categories?
Otherwise please provide more specifics in NetSuite terms about what you are trying to do.
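A minimal sketch of that "fetch wide, then transform" idea, assuming the saved search rows have already been retrieved and converted to plain dictionaries. The column names, the `CATEGORY_BY_COLUMN` mapping, and the `rows_to_notes` helper are hypothetical; nothing here calls SuiteTalk itself, and the mapping would live in per-customer configuration on the integration side.

```python
from typing import Dict, Iterable, List

# Hypothetical per-customer configuration: search column -> category name.
CATEGORY_BY_COLUMN = {
    "notes 1": "category 1",
    "notes 2": "category 2",
}


def rows_to_notes(rows: Iterable[Dict[str, str]],
                  key_column: str = "SO number") -> List[Dict[str, str]]:
    """Turn one wide row per entity into one note row per configured column."""
    notes = []
    for row in rows:
        for column, category in CATEGORY_BY_COLUMN.items():
            text = row.get(column)
            if text:  # skip empty or missing note fields
                notes.append({
                    "SO Number": row[key_column],
                    "Category": category,
                    "Text": text,
                })
    return notes


wide = [{"SO number": "SO1234567", "notes 1": "some text1", "notes 2": "some text2"}]
for note in rows_to_notes(wide):
    print(note)
```

This keeps everything in one search call per record type, at the cost of holding the category names outside NetSuite.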

Solr faceted search building widgets

We want to build a faceted search within our application. For example, we have a quantity field whose values range from 1 to 20 across 2000 records, and we need to let the user filter by those values.
To accomplish this we are planning to extract the quantity field, sort it, eliminate duplicate values, and build a widget on the left-hand side of the screen so the user can select the values they need.
Is there a way to get these facet criteria from Solr, or is there a better way to implement this?
This is what Solr calls a facet, and it is enabled with facet=true:
&facet=true&facet.field=quantity
This will give you a facet entry in the response, containing a count for each unique value in the quantity field. When the user clicks a quantity link, apply an fq for that particular quantity value, such as fq=quantity:4.
You can use facet.sort to determine whether the facet is sorted by hits (facet.sort=count, most popular quantity first) or by value (facet.sort=index).
Multi-Select Facets and Local Params might also be useful if you want to keep showing the original counts while the user drills down into the selection by applying an fq with the selected quantity as a criterion.
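As a rough sketch, the request parameters above could be assembled like this. The helper names `facet_params` and `facet_counts` and the tag name `qty` are made up for the example, the `{!tag=...}` / `{!ex=...}` local params are the multi-select mechanism mentioned above, and the response parsing assumes Solr's default flat JSON layout for facet fields.

```python
from typing import Dict, List, Optional, Tuple
from urllib.parse import urlencode


def facet_params(selected: Optional[int] = None,
                 sort: str = "count",
                 multi_select: bool = False) -> List[Tuple[str, str]]:
    """Build request parameters for a facet on the quantity field."""
    facet_field = "quantity"
    params = [("q", "*:*"), ("wt", "json"), ("facet", "true")]
    if selected is not None:
        fq = f"quantity:{selected}"
        if multi_select:
            # Tag the filter and exclude it when counting, so the original
            # counts stay visible ("qty" is an arbitrary tag name).
            fq = "{!tag=qty}" + fq
            facet_field = "{!ex=qty}quantity"
        params.append(("fq", fq))
    params.append(("facet.field", facet_field))
    # "count" puts the most frequent values first, "index" sorts by value.
    params.append(("facet.sort", sort))
    return params


def facet_counts(response: dict, field: str = "quantity") -> Dict[str, int]:
    """Pair up the flat [value, count, value, count, ...] list."""
    flat = response["facet_counts"]["facet_fields"][field]
    return dict(zip(flat[0::2], flat[1::2]))


# Query string to append to the collection's /select URL.
print(urlencode(facet_params(selected=4)))

# Hand-written response fragment in the assumed layout, not real Solr output.
sample = {"facet_counts": {"facet_fields": {"quantity": ["4", 120, "1", 95]}}}
print(facet_counts(sample))
```

The resulting dictionary can feed the left-hand widget directly, so there is no need to extract and de-duplicate the field values yourself.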

How do I get the first and last document per SOLR facet, sorted by some field?

I have documents with multiple facets. I have different views on the website I'm creating to view the facet stats.
As well as showing the facet stats, I would like to show example documents from each facet - specifically, the first and last documents ordered by another field.
For example, properties for sale, I want to see the first and last (based on price) for each facet (the facet can be street, area, city, post code etc).
I can solve this by calling Solr once per facet, but it seems like something that should be built in. If it is, it would reduce round trips a lot: probably 2 Solr calls per page instead of 30 or more.
Instead of faceting, you can look into field collapsing (result grouping):
https://wiki.apache.org/solr/FieldCollapsing
Then you only need two queries, one with group.sort ascending and one with group.sort descending on the field you want to sort by.
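The two queries might look something like the sketch below. The field names `city` and `price` come from the property example in the question, the helper `group_params` is hypothetical, and `group.limit=1` is an assumption so that only the first document of each group comes back.

```python
from typing import Dict
from urllib.parse import urlencode


def group_params(group_field: str, sort_field: str, direction: str) -> Dict[str, str]:
    """Parameters for one grouped query returning the top document per group."""
    if direction not in ("asc", "desc"):
        raise ValueError("direction must be 'asc' or 'desc'")
    return {
        "q": "*:*",
        "wt": "json",
        "group": "true",
        "group.field": group_field,
        # Order of documents inside each group.
        "group.sort": f"{sort_field} {direction}",
        # Keep only the first document of each group.
        "group.limit": "1",
    }


cheapest = group_params("city", "price", "asc")   # first document per city
priciest = group_params("city", "price", "desc")  # last document per city
print(urlencode(cheapest))
print(urlencode(priciest))
```

Each view (street, area, city, post code) would need its own pair of queries with a different `group.field`.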

Solr: one of each in all categories

I have a product index in Solr. Each product has a category field, and I need to select one product (ideally a random one) from each category. What would the query look like?
If you are looking for something like SQL's GROUP BY, Solr 3.3 onwards has a similar feature called FieldCollapsing:
Field Collapsing collapses a group of results with the same field value down to a single (or fixed number) of entries. For example, most search engines such as Google collapse on site so only one or two entries are shown, along with a link to click to see more results from that site. Field collapsing can also be used to suppress duplicate documents.
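A small sketch of reading one product per category out of a grouped response, assuming the query was sent with group=true&group.field=category&group.limit=1 and that Solr returned its default grouped JSON layout. The helper `one_per_group` and the sample data are invented for the example, and it does not address the "random" part of the question.

```python
from typing import Any, Dict


def one_per_group(response: Dict[str, Any], field: str) -> Dict[Any, Dict[str, Any]]:
    """Map each group value to the first document returned for that group."""
    groups = response["grouped"][field]["groups"]
    return {
        group["groupValue"]: group["doclist"]["docs"][0]
        for group in groups
        if group["doclist"]["docs"]  # ignore groups that came back empty
    }


# Hand-written response fragment in the assumed layout, not real Solr output.
sample = {
    "grouped": {
        "category": {
            "matches": 3,
            "groups": [
                {"groupValue": "books",
                 "doclist": {"numFound": 2, "start": 0, "docs": [{"id": "p1"}]}},
                {"groupValue": "toys",
                 "doclist": {"numFound": 1, "start": 0, "docs": [{"id": "p7"}]}},
            ],
        }
    }
}
print(one_per_group(sample, "category"))
```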

Index document "linked" to multiple users

Hi, I want to index a Solr document and tag it with multiple associated users. I want to enable searches like "give me the documents associated with userid 1000,1003...9300 containing the word X". More people will be added to the document during its lifetime, and I may want to associate thousands of users with one document. There is no need to show the associated users in the results; they are only needed for searching. Would indexing the userid or the username be more performant and scalable? And which field type would be more performant and scalable: appending to a text field, a multivalued field, or some other approach?
I believe that using the userid (as an integer) would be the most performant. (At least from my experience so far). Also, using a multivalued field will allow you to use a filter query on the userid field to help improve the query response time.
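A rough sketch of what such a request could look like, assuming a multivalued integer field named `userid` and a searchable field named `text` (both names, and the helper `user_filter_params`, are assumptions). The word is assumed to be a single plain term with no characters that need escaping.

```python
from typing import Dict, Iterable
from urllib.parse import urlencode


def user_filter_params(word: str,
                       user_ids: Iterable[int],
                       text_field: str = "text",
                       user_field: str = "userid") -> Dict[str, str]:
    """Search for a word, restricted to documents tagged with any given user."""
    # int() guards against accidentally passing non-numeric ids.
    ids = " OR ".join(str(int(user_id)) for user_id in user_ids)
    return {
        "q": f"{text_field}:{word}",
        # The filter query restricts the result set without affecting scoring.
        "fq": f"{user_field}:({ids})",
        # Associated users are not needed in the results, so return only ids.
        "fl": "id",
    }


params = user_filter_params("X", [1000, 1003, 9300])
print(params["fq"])
print(urlencode(params))
```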
