Joins & Auto Generated Id in Solr 4.9 - solr

I have the following 2 questions:
1) I was going through the Join query parser in Solr here https://wiki.apache.org/solr/Join. From the above example, what I understand is that it is not possible to join between 2 separate schemas in Solr. The only join that is feasible is a self-join. Did I understand it correctly?
2) I was trying to find a way to create auto-generated id's in Solr.
I came across this link https://wiki.apache.org/solr/UniqueKey, what I understand from this link is that there is a way to create the unique id in Solr, but what if I have 2 separate fields in my schema that I want to be auto-generated? Is there a way to achieve that?

You can join across different cores the syntax is something like:
{!join from=fromId to=toId fromIndex=Core2}query
So if you have two core that looks something like
PersonCore - ID, Name
AddressCore - ID, Address, PersonID
You can find all people at a specific address by querying the PersonCore like this:
{!join from=PersonID to=ID fromIndex=AddressCore}Address:Address1
You can only have one UniqueKey. It might be possible to auto generate another unique field value though, have a look here:
http://solr.pl/en/2013/07/08/automatically-generate-document-identifiers-solr-4-x/
I've never used this because I always used a unique key from the data being indexed but it might be worth looking into? If you were to add another field name into the updateRequestProcessorChain section descibed in solrconfig.xml would that generate another unique ID? I'm not sure but it's something to try

Related

Querying Solr multiple indexes with different schema in single query

We have a situation where we are keeping two indexes with different schemas.
For example: suppose we have an index for seller where the key value is seller id and other attributes are seller information. Now another index is book where book id is unique key and it keeps book related information.
Is it possible to query both these indexes in a single query and get collective results?
I have checked Solr but as per my findings we can do this through distributed search in Solr but it works on same kind of schema being distributed in at max 3 indexes.
I am a newbie to Solr so please ignore if this is a stupid question.
You need to think about what makes sense for a search query but there are some rules.
The first requirement is that the unique keys need to have the same name and be unique across collections or Solr cannot collate results.
If you are then hoping to get some kind of sensible ranking of your results you need some common fields. For example I have two collections: one of product data and one containing product related documents. I have a unique key: id and I have common title and contents fields for when I want to query across the two collections. I also have an advanced search interface where I can query on specific fields like product id.
A "unification core" is a typical way of handling search across two or more cores, see this Stack Overflow answer on how to set that up
Query multiple collections with different fields in solr
Other techniques are to use federated search with something like Carrot or to issue two queries and show the results in different tabs in the search results.

How to approach a hierarchical search in Solr?

I'm trying to understand how to approach search requirements I have.
The first one is a normal product search that I know Solr can handle appropriately, where you search for a term and Solr returns relevant documents.
The second one is a search for products within a certain category. I have a hierarchical structure in my database that consists in a category with many subcategories and those have products.
The thing is, when some very specific words are searched for, the first approach shouldn't be used, instead a search for a category should be done and only products within that category should be returned, which for me is a very basic SQL query (select * from products where categoryId = 1000).
Does Solr should or can be used in the second case? If so, what is the normal approach to use?
Besides what #Mysterion proposed of filter queries you should take a look at Solr Facets which gives you very powerful catogory-like searching.
You also might want to consider multivalue field for categoryParentIds which will contain the parent categories that the product is in and thus combined with filter query and or facets will get your parent category searching.
Yes, you could use similar approach in Solr, by attributing your products with categoryId and later, while searching add filter query similiar to SQL, categoryId:10000
For more info about filter query, take a look here - http://wiki.apache.org/solr/CommonQueryParameters#fq

Can Solr use field values of a known document in a query?

I would like to perform a Solr search using the values of certain fields of an indexed document which I can identify by its id. With MLT this is somehow possible, but I would prefer a regular query parser. Can I somehow use subqueries to inject the result of a subquery into the main query?
For example, let's say I have indexed information about books into solr, where each document represents a book, with an id, title and author field. At query time I have only the document id availible and I would like to search for books by the same author in a single step. Is this possible without using MLT?
You can use JOIN.
http://HOST:PORT/CORE/select?q={!join from=author to=author}id:<ID>

Query two custom objects joining on the Name field

I want to create a join on two custom objects joining on the Name field. Normally joins require a lookup or master-detail relationship between the two objects, but I just want to do a text match.
I think this is a Salesforce limitation but I couldn't find any docs on whether this was so. Can anyone confirm this?
Yes, you can make a join (with dot notation or as subquery) only if there's a relationship present. And relationships (lookup or master-detail) can be made only by Id. There are several "mutant fields" (like Task.WhoId) but generally speaking you can't write a JOIN in SOQL and certainly can't use a text column as a foreign key.
http://www.salesforce.com/us/developer/docs/soql_sosl/Content/sforce_api_calls_soql_relationships.htm#relate_query_limits
Relationship queries are not the same as SQL joins. You must have a
relationship between objects to create a join in SOQL.
There are some workarounds though. Why exactly do you need the join?
Apex / SOQL - have a look at SOQL in apex - Getting unmatched results from two object types for example. Not the prettiest thing in the world but it works. If you want to try something really crazy - SOSL that would search your 2 objects at the same time?
Reports - you should have no problem grouping by text field - that means a joined report might give you results you're after. Since Winter'13 joined reports allow charts and exporting, that was quite a limiting factor...
Easy building of links between data - use external ids and upsert operation, especially if you plan to load data from outside SF easily. Check my answer for Can I insert deserialized JSON SObjects from another Salesforce org into my org?
Uniqueness constraints - you can still mark fields as required & unique.
Check against "dictionary" of allowed values - Validation rule with VLOOKUP might do what you're after.

Use SOLR on Mysql relational database. Use SOLR for only product table

I am a bit confused as to where SOLR usage ends and where it begins.
I use php with a relational mysql db for a shopping site where all tables are related to the product table joining the tables as theyre queried. Needless to say its too slow!
e.g.
Category table - catid, catname, catdesc
Brand table - brandid, brandname, branddesc
Product table - productid, productname, productdesc, catid, brandid
(I also use ranges for price ranges etc)
I am wondering whether I should use SOLR to index the whole relational schema or whether just to index the product table alone and let my application work as it currently does.
If I just switch the product table to use SOLR are there any caveats to this?
e.g. in mysql I can do a fulltext search while joining the brand table. This will allow brands to also be searched upon. Is it possible to achieve the same thing just by switching the product table to SOLR? Are there any other caveats I should be looking out for.
I also would like to create a new table for "searches". This would allow me to use keywords in a mysql table in the following way:
Searches table - searchterm (e.g. lipstick), synonyms (e.g. lipstick, lips etc.)
ie. this would allow me to search upon multiple terms at the same time - a good time to use SOLR facets maybe instead of storing searches in mysql?, or should I just use mysql to store the searches and pull the products from SOLR?
Any help is gladly appreciated
NO NEED TO SWITCH
You don't want to "switch" -- just like using full-text indexing in MySQL (or using something like Sphinx), the full-text index is separate from the database tables.
What you want to do is figure out what you're searching for and index that in Solr -- it may well be just products. That's certainly an easy first step.
Basically you'll:
index the appropriate column(s) into Solr
use Solr for the searching
use the Solr results to point back to the records from the database
I'm more Ruby and Java than PHP, but you'll basically be talking to Solr for the full-text search and using that to find the records you want to display.

Resources