Index few attributes of document in hybris Solr

We have around 100k products on our website, and each product has around 30 indexed attributes. Most of the time we only update the price of a product, but we still have to reindex the whole product. Is it possible in hybris to index only the price attribute (or the description attribute) for all 100k products?

This has been possible since Solr 4.0. The feature is called partial update (the Solr documentation also calls it atomic update): you send only the fields that changed, in your case price and description.
The official documentation is here.
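
As a rough illustration of what a partial update looks like on the wire, here is a minimal Python sketch that builds the JSON body for Solr's update handler. The field names `id` and `price`, the product ids and the prices are assumptions made up for the example; the `{"set": ...}` modifier is Solr's syntax for replacing a single field. Note that Solr generally needs the remaining fields to be stored (or to have docValues) so it can rebuild the rest of the document.

```python
import json


def build_price_update(product_id, price):
    """Build an atomic-update document that changes only the price field."""
    # The {"set": value} modifier asks Solr to replace this one field and
    # keep the other fields of the existing document as they are.
    return {"id": product_id, "price": {"set": price}}


# Hypothetical product ids and prices, for illustration only.
updates = [
    build_price_update("prod-1", 19.99),
    build_price_update("prod-2", 5.49),
]

# This JSON body would be POSTed to the core's /update handler.
payload = json.dumps(updates)
print(payload)
```

Sending the payload (and committing) is left out on purpose, since the URL and core name depend on your installation.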

Marco is right. You can do a Partial Update.
For hybris, there is some documentation in Creating and Configuring Indexed Types. The SolrIndexerQuery.type attribute lets you choose PARTIAL_UPDATE.
You have the following values to choose from:
FULL: recreates the index
UPDATE: updates some documents in the index
PARTIAL_UPDATE: allows you to select the fields for the update
DELETE: deletes documents from the index

Related

Apache Solr Querying by search term from multiple tables and in all columns

I am new to Apache Solr and have so far worked with a single table, importing it into Solr and querying the data.
Now I want to do the following:
Query across multiple tables: if I search for a word, it should return all occurrences in multiple tables.
Search in all fields of a table: a search for a word should match any field, in a single table too.
Do I need to create single document by importing data from multiple tables using joins in data-config.xml? And then querying over it?
Any leads and guidance are welcome.
TIA.

Do I need to create single document by importing data from multiple tables using joins in data-config.xml? And then querying over it?
Yes. Solr uses a document model (rather than a relational model) and the general approach is to index a single document with the fields that you need for searching.
From the Apache Solr guide:
Solr’s basic unit of information is a document, which is a set of data
that describes something. A recipe document would contain the
ingredients, the instructions, the preparation time, the cooking time,
the tools needed, and so on. A document about a person, for example,
might contain the person’s name, biography, favorite color, and shoe
size. A document about a book could contain the title, author, year of
publication, number of pages, and so on.
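
To make the "one flat document" idea concrete, here is a small Python sketch that joins two tables and turns each joined row into a single document with a catch-all field for searching across all columns. The `book` and `author` tables, their columns and the `text` field name are hypothetical; in a real setup the join would live in the entity query of data-config.xml and the catch-all field would typically be filled by a copyField rule in the schema.

```python
import sqlite3

# In-memory database with two hypothetical tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT, author_id INTEGER);
    INSERT INTO author VALUES (1, 'Jane Doe');
    INSERT INTO book VALUES (10, 'Search Basics', 1);
""")

# The same kind of join could serve as the entity query in data-config.xml.
rows = conn.execute("""
    SELECT book.id, book.title, author.name
    FROM book JOIN author ON author.id = book.author_id
""").fetchall()


def to_document(row):
    """Flatten one joined row into a single Solr-style document."""
    book_id, title, author_name = row
    return {
        "id": str(book_id),
        "title": title,
        "author": author_name,
        # Catch-all field so one query term can match any column.
        "text": " ".join([title, author_name]),
    }


docs = [to_document(r) for r in rows]
print(docs)
```

Querying the catch-all field then finds a word regardless of which source table or column it came from.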

Solr: deltaQuery / parentDeltaQuery / deltaImportQuery

Solr's documentation for DataImportHandler gives this table for the entity query attributes.
That is not very descriptive. Can someone explain the differences and interactions between these query attributes? I have seen some code use deltaQuery and parentDeltaQuery to support nested entities, and other code use deltaQuery and deltaImportQuery.
What is the purpose of choosing one of those over the other?

I see it now in the Solr Wiki:
* The query gives the data needed to populate fields of the Solr document in full-import
* The deltaImportQuery gives the data needed to populate fields when running a delta-import
* The deltaQuery gives the primary keys of the current entity which have changes since the last index time
* The parentDeltaQuery uses the changed rows of the current table (fetched with deltaQuery) to give the changed rows in the parent table. This is necessary because whenever a row in the child table changes, we need to re-generate the document which has that field.
I missed this explanation on the first pass, and expected that information to show up in the table I posted. Strangely enough, Solr In Action spent less than 1 page of 600 explaining how to use DataImportHandler to read a database.
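
The interaction of the four attributes can be simulated outside Solr. The sketch below is not DataImportHandler code; it only mimics the order of the steps with SQLite, under the assumption of a hypothetical parent table `item` and child table `feature`, each with a `last_modified` column, and a fixed `last_index_time` standing in for `${dataimporter.last_index_time}`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT, last_modified TEXT);
    CREATE TABLE feature (id INTEGER PRIMARY KEY, item_id INTEGER,
                          description TEXT, last_modified TEXT);
    INSERT INTO item VALUES (1, 'lamp', '2020-01-01');
    INSERT INTO item VALUES (2, 'desk', '2020-01-01');
    INSERT INTO feature VALUES (100, 1, 'dimmable', '2020-01-01');
    INSERT INTO feature VALUES (101, 2, 'oak', '2020-03-01');
""")

last_index_time = "2020-02-01"  # stands in for ${dataimporter.last_index_time}

# deltaQuery on the child entity: primary keys of changed child rows.
changed_features = [r[0] for r in conn.execute(
    "SELECT id FROM feature WHERE last_modified > ?", (last_index_time,))]

# parentDeltaQuery: map each changed child row to its parent's primary key.
changed_items = set()
for feature_id in changed_features:
    for (item_id,) in conn.execute(
            "SELECT item_id FROM feature WHERE id = ?", (feature_id,)):
        changed_items.add(item_id)

# deltaQuery on the parent entity: parents whose own row changed.
changed_items.update(r[0] for r in conn.execute(
    "SELECT id FROM item WHERE last_modified > ?", (last_index_time,)))

# deltaImportQuery: fetch the full data for every parent key to re-index.
docs = []
for item_id in sorted(changed_items):
    row = conn.execute(
        "SELECT id, name FROM item WHERE id = ?", (item_id,)).fetchone()
    features = [f[0] for f in conn.execute(
        "SELECT description FROM feature WHERE item_id = ?", (item_id,))]
    docs.append({"id": row[0], "name": row[1], "features": features})

print(docs)
```

In this toy data only the `oak` feature changed after the last index time, so only its parent item is rebuilt. That is the case parentDeltaQuery exists for; without nested entities, deltaQuery plus deltaImportQuery is enough.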

Solr roll up query

I have a specific query problem with Solr that I cannot find a solution for. I have an index full of products and SKUs. A product has multiple SKUs and every SKU belongs to one product. I want to search against my SKUs only, group by the parent product, and return just the details of the parent product (not the details of the items). However, I want the facets to represent the original list of items. Is this possible with Solr today, and in which version is it available?

I think it is possible. My suggestion is to design your core so that each document represents exactly one SKU (one item), so the unique id is the SKU id. Then add a productId field that is not unique and holds the same value for all SKUs that share a parent product.
You can also denormalize the product details across all documents, so that when you return the details of an item you also have the details of its product.
The trick on the query side is to use the grouping (field collapsing) feature in Solr.
See more details here: https://wiki.apache.org/solr/FieldCollapsing
But as a start I suggest setting these values in the query:
Set group=true (this will enable grouping)
Set group.field=productId (to group, or collapse items by productId)
Set group.facet=false (to include details of all items in facet counts)
So, this will enable you to search across all items, return results grouped by ProductId, and facet numbers will be applied to all items.
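
As a sketch, the three settings above can be combined into a request like the following. The query `color:red` and the facet field `size` are made-up examples, and `productId` is the grouping field suggested above; only the parameter names come from Solr.

```python
from urllib.parse import urlencode

params = {
    "q": "color:red",            # hypothetical search against SKU documents
    "group": "true",             # enable grouping
    "group.field": "productId",  # collapse SKUs by parent product
    "group.facet": "false",      # facet counts are computed over all SKUs
    "facet": "true",
    "facet.field": "size",       # hypothetical facet field
    "wt": "json",
}

query_string = urlencode(params)
print("/select?" + query_string)
```

The resulting path would be appended to the core URL of your Solr instance.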
This is not a new feature: grouping is available from Solr 3.3 onwards, including 4.x.

You could use :
"sort":"map(special_price,1,99999,special_price,price) desc"
"sort":"map(special_price,1,99999,special_price,price) asc"

Solr: one of each in all categories

I have a product index in Solr. Each product has a category field, and I need to select one product (preferably a random one) from each category. What would the query look like?

If you are looking for something like SQL's GROUP BY, Solr has a similar feature called field collapsing, available from Solr 3.3 onwards:
Field Collapsing collapses a group of results with the same field value down to a single (or fixed number) of entries. For example, most search engines such as Google collapse on site so only one or two entries are shown, along with a link to click to see more results from that site. Field collapsing can also be used to suppress duplicate documents.
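
A possible set of request parameters for "one product per category" is sketched below. It assumes the field is named `category` and that the schema defines a `random_*` dynamic field of type `solr.RandomSortField` (as the example schemas shipped with Solr do); the helper name and the seed are made up for illustration.

```python
from urllib.parse import urlencode


def one_per_category_params(seed):
    """Build grouping parameters that keep one document per category."""
    return {
        "q": "*:*",
        "group": "true",
        "group.field": "category",
        "group.limit": "1",           # keep a single document per group
        # Random order; a different seed gives a different shuffle.
        "sort": f"random_{seed} asc",
    }


params = one_per_category_params(42)
print("/select?" + urlencode(params))
```

Because the order inside each group follows the main sort unless group.sort is set, the single document kept per category should vary with the seed.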

return only one document for each filter defined in the query

In one of my latest projects I use Solr 1.4 for searching products. However, I have run into a slight problem, and I am not sure whether it can be solved with Solr.
All products are indexed by "country" and "category", and "id", "class" and "description" are stored values. I have now been asked to extract a sample list of the products we have for a given "category", returning ONLY ONE product for each country where the product is available.
In my current implementation, I run a dismax query to get a list of all the countries that correspond to the category, then I call Solr again to extract the products for each country, limiting the number of rows to the number of countries found by the previous query.
The problem with this implementation is that I cannot be certain I get one product for each country in the list. Does anyone know whether it is possible to tell Solr that you want only one product per country provided in the query?
Any guidance would be useful.

Take a look at Field collapsing. It allows you to do something analogous to a GROUP BY in SQL. Unfortunately, as of this writing, this feature is not even in trunk. You have to download the latest patch from the issue tracker, apply it to trunk and compile.