We have a situation where we are keeping two indexes with different schemas.
For example, suppose we have a seller index whose unique key is the seller id and whose other attributes hold seller information, and a book index whose unique key is the book id and which holds book-related information.
Is it possible to query both these indexes in a single query and get collective results?
I have looked into Solr, and as far as I can tell this can be done through distributed search, but that only works when the same kind of schema is distributed across (at most 3) indexes.
I am new to Solr, so please excuse me if this is a silly question.
You need to think about what makes sense for a search query, but there are some rules.
The first requirement is that the unique keys need to have the same name and be unique across collections, or Solr cannot collate results.
If you are then hoping to get some kind of sensible ranking of your results, you need some common fields. For example, I have two collections: one of product data and one containing product-related documents. I have a unique key, id, and common title and contents fields for when I want to query across the two collections. I also have an advanced search interface where I can query specific fields such as product id.
A "unification core" is a typical way of handling search across two or more cores; see this Stack Overflow answer on how to set that up:
Query multiple collections with different fields in solr
Other techniques are to use federated search with something like Carrot or to issue two queries and show the results in different tabs in the search results.
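As a rough illustration of the distributed-search route, a single request can fan out over two cores via the shards parameter, provided they share the uniqueKey name and the fields being queried. This is only a sketch; the host, core names, and field names below are assumptions:

```python
import requests

# Sketch: one request fanning out over two cores via the "shards"
# parameter. Host, core names, and field names are assumptions; both
# cores must share the uniqueKey name ("id") and the queried fields.
SOLR = "http://localhost:8983/solr"

params = {
    "q": "title:widget OR contents:widget",  # common fields in both schemas
    "fl": "id,title,score",
    "shards": "localhost:8983/solr/products,localhost:8983/solr/documents",
}
resp = requests.get(f"{SOLR}/products/select", params=params, timeout=10)
for doc in resp.json()["response"]["docs"]:
    print(doc["id"], doc.get("title"))
```

Scoring across the cores is only meaningful if those shared fields are analysed the same way in both schemas.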
I have the following database (fields left out for the sake of simplicity):
companies: id, name, registration_number, address
machines: id, company_id, type, serialnr, make, model, year
I created a SOLR index for companies. By adding fields like type_x_count and type_y_count we can do selections like 'get all companies that have at least 2 type x machines'. This works perfectly. The same goes for an available_brands field that contains all the brands the company owns.
Now we want to improve on it with queries like:
'get all companies that have machines with make=x and year<2019'. So I came across embedded / child documents in SOLR, which seem to do the trick. Unfortunately, there is another requirement:
'get all companies that have at least 2 machines with make=x and year<2019'. Is this possible with child documents? Or is there another option? Or is it not possible at all with SOLR?
I found out about a path component to store make/model/year/etc., but the fields shown above are just a small selection of the real ones. We would also like to allow any combination, so that would require an almost infinite number of paths.
I'm using SOLR together with the PHP Solarium library, but I can always work around it if necessary.
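For reference, this is roughly the nested structure I have in mind. Field names, the core name, and indexing via plain HTTP (instead of Solarium) are only illustrative:

```python
import requests

# Roughly the nested structure described above. Field names, the core
# name ("companies"), and plain-HTTP indexing are only illustrative.
company = {
    "id": "company-42",
    "doc_type_s": "company",
    "name_s": "Acme Machining",
    "_childDocuments_": [
        {"id": "machine-1", "doc_type_s": "machine",
         "make_s": "x", "year_i": 2017},
        {"id": "machine-2", "doc_type_s": "machine",
         "make_s": "x", "year_i": 2021},
    ],
}
requests.post(
    "http://localhost:8983/solr/companies/update?commit=true",
    json=[company],
    timeout=10,
)
```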
I am new to Apache Solr and so far have worked with a single table, importing it into Solr and querying the data.
Now I want to do the following:
Query across multiple tables, i.e. if I search for a word, it should return all occurrences across the tables.
Search in all fields of a table, i.e. query a word against all fields of a single table as well.
Do I need to create a single document by importing data from multiple tables using joins in data-config.xml, and then query over it?
Any leads and guidance is welcome.
TIA.
Do I need to create a single document by importing data from multiple tables using joins in data-config.xml, and then query over it?
Yes. Solr uses a document model (rather than a relational model) and the general approach is to index a single document with the fields that you need for searching.
From the Apache Solr guide:
Solr’s basic unit of information is a document, which is a set of data that describes something. A recipe document would contain the ingredients, the instructions, the preparation time, the cooking time, the tools needed, and so on. A document about a person, for example, might contain the person’s name, biography, favorite color, and shoe size. A document about a book could contain the title, author, year of publication, number of pages, and so on.
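As a sketch of that idea (the core name and field names below are invented for illustration), a single flattened document built from a join across several tables might be indexed like this:

```python
import requests

# A single flattened document built from a join across several tables.
# The core name ("catalog") and field names are invented for illustration.
doc = {
    "id": "book-101",
    "title_t": "Example Title",        # from the books table
    "author_t": "Jane Doe",            # joined in from the authors table
    "publisher_t": "Example Press",    # joined in from the publishers table
    "year_i": 2019,
}
requests.post(
    "http://localhost:8983/solr/catalog/update?commit=true",
    json=[doc],
    timeout=10,
)
```

Searching "all fields" is then usually handled in the schema with a catch-all copyField target, so one field can be queried no matter which source column a value came from.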
I'm trying to understand how to approach search requirements I have.
The first one is a normal product search that I know Solr can handle appropriately, where you search for a term and Solr returns relevant documents.
The second one is a search for products within a certain category. I have a hierarchical structure in my database that consists of categories with many subcategories, and those have products.
The thing is, when certain very specific words are searched for, the first approach shouldn't be used; instead, a search for a category should be done and only products within that category returned, which for me is a very basic SQL query (select * from products where categoryId = 1000).
Should or can Solr be used in the second case? If so, what is the normal approach?
Besides the filter queries that @Mysterion proposed, you should take a look at Solr facets, which give you very powerful category-like searching.
You might also want to consider a multivalued categoryParentIds field containing the parent categories the product is in; combined with a filter query and/or facets, that gives you parent-category searching.
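For example (the collection and field names are assumptions), asking Solr for per-category counts via faceting might look like this:

```python
import requests

# Sketch: per-category counts via faceting on a multivalued
# categoryParentIds field. Collection and field names are assumptions.
params = {
    "q": "*:*",            # or the user's product search
    "facet": "true",
    "facet.field": "categoryParentIds",
    "rows": 0,             # only the facet counts are of interest here
}
resp = requests.get("http://localhost:8983/solr/products/select",
                    params=params, timeout=10)
print(resp.json()["facet_counts"]["facet_fields"]["categoryParentIds"])
```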
Yes, you could use a similar approach in Solr, by attributing your products with a categoryId and then, while searching, adding a filter query similar to the SQL one: categoryId:1000
For more info about filter queries, take a look here: http://wiki.apache.org/solr/CommonQueryParameters#fq
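A minimal sketch of that approach, assuming a products collection, a name field, and the categoryId field from the question:

```python
import requests

# Sketch of the filter-query approach: a normal search narrowed to one
# category. The collection name and "name" field are assumptions;
# categoryId is the field from the question.
params = {
    "q": "name:shoes",
    "fq": "categoryId:1000",
    "fl": "id,name",
}
resp = requests.get("http://localhost:8983/solr/products/select",
                    params=params, timeout=10)
print(resp.json()["response"]["numFound"])
```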
I have two tables contacts and inventory. These two tables are not related. I want to index these two tables and search using Solr.
Is this possible?
If some part of your application needs to search for contacts, and another one needs to search in the inventory, create two separate indices. Storing wildly different data in the same index is almost never a good idea, it complicates things unnecessarily. As the Solr wiki wisely says:
The more heterogeneous (different kinds of data) you have in one field or in one index, the less useful it is.
You don't need multiple Solr instances to accommodate multiple indices; you can easily manage this with multi-core.
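With two cores, each part of the application simply talks to its own index. A minimal sketch, assuming a default Solr URL and cores named contacts and inventory:

```python
import requests

# Sketch: two separate cores, one per kind of data. The Solr URL, core
# names, and field names are assumptions.
SOLR = "http://localhost:8983/solr"

contacts = requests.get(f"{SOLR}/contacts/select",
                        params={"q": "name:smith"}, timeout=10).json()
inventory = requests.get(f"{SOLR}/inventory/select",
                         params={"q": "sku:AB-123"}, timeout=10).json()

print(contacts["response"]["numFound"], inventory["response"]["numFound"])
```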
I found a very helpful answer to this question here, including some guidance on using "multiple indexes" vs. "multiple document types in one index". The post also links to example code on github that I found very useful.
Yes, you can do that. Simply create a Solr schema that contains all the fields necessary for both tables, and add another field that contains the table name. During indexing, populate the table name field for each document. During searching, always include a query parameter on the table name field.
As an alternative, you can set up multiple instances of Solr. But you should only do this if we are talking about massive amounts of data (like millions of table rows).
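A minimal sketch of the single-index approach (the core name and field names are assumptions): each document carries its source table name, and every search filters on it:

```python
import requests

# Sketch of the single-index approach: every document carries a field
# naming its source table, and searches filter on it. All names here
# are assumptions.
SOLR = "http://localhost:8983/solr/combined"

docs = [
    {"id": "contact-1", "source_table_s": "contacts", "text_t": "Jane Smith"},
    {"id": "item-7", "source_table_s": "inventory", "text_t": "Blue widget"},
]
requests.post(f"{SOLR}/update?commit=true", json=docs, timeout=10)

resp = requests.get(f"{SOLR}/select",
                    params={"q": "text_t:smith",
                            "fq": "source_table_s:contacts"},
                    timeout=10)
print(resp.json()["response"]["numFound"])
```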
I am a bit confused as to where SOLR usage begins and ends.
I use php with a relational mysql db for a shopping site where all tables are related to the product table, joining the tables as they're queried. Needless to say, it's too slow!
e.g.
Category table - catid, catname, catdesc
Brand table - brandid, brandname, branddesc
Product table - productid, productname, productdesc, catid, brandid
(I also use ranges for price ranges etc)
I am wondering whether I should use SOLR to index the whole relational schema, or just index the product table alone and let my application work as it currently does.
If I just switch the product table to use SOLR, are there any caveats to this?
e.g. in mysql I can do a fulltext search while joining the brand table, which allows brands to be searched as well. Is it possible to achieve the same thing just by switching the product table to SOLR? Are there any other caveats I should be looking out for?
I also would like to create a new table for "searches". This would allow me to use keywords in a mysql table in the following way:
Searches table - searchterm (e.g. lipstick), synonyms (e.g. lipstick, lips etc.)
i.e. this would allow me to search on multiple terms at the same time. Maybe this is a good case for SOLR facets instead of storing searches in mysql? Or should I just use mysql to store the searches and pull the products from SOLR?
Any help is gladly appreciated
NO NEED TO SWITCH
You don't want to "switch": just as with full-text indexing in MySQL (or something like Sphinx), the full-text index is separate from the database tables.
What you want to do is figure out what you're searching for and index that in Solr; it may well be just products. That's certainly an easy first step.
Basically you'll:
index the appropriate column(s) into Solr
use Solr for the searching
use the Solr results to point back to the records from the database
I'm more Ruby and Java than PHP, but you'll basically be talking to Solr for the full-text search and using that to find the records you want to display.
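A rough sketch of that flow; Python and SQLite stand in here purely for illustration (the Solarium/MySQL equivalents are analogous), and all field, table, and core names are assumptions:

```python
import sqlite3  # stands in for the real MySQL connection

import requests

# 1. Full-text search in Solr returns matching product ids.
resp = requests.get(
    "http://localhost:8983/solr/products/select",
    params={"q": "productname_t:lipstick", "fl": "id", "rows": 20},
    timeout=10,
)
ids = [doc["id"] for doc in resp.json()["response"]["docs"]]

# 2. Pull the full records back out of the database by primary key.
db = sqlite3.connect("shop.db")
if ids:
    placeholders = ",".join("?" for _ in ids)
    rows = db.execute(
        f"SELECT * FROM product WHERE productid IN ({placeholders})", ids
    ).fetchall()
```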