I currently have two different schema sets (setA/ and setB/) sitting under the multicore/ folder of a Jetty Solr install, at /opt/solr/example/multicore.
If I want to create shards for each schema, how should I go about it?
Thanks,
Two shards will have the same configuration, but different documents. So you make a copy of your configuration on a new server, then put half the documents on each server.
The Solr page on distributed search gives a little bit of information about querying across multiple shards.
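As a rough sketch of the two steps described above (split the documents, then query both halves): the `shards` request parameter is the one the distributed search page describes, but the host names `solr1` and `solr2`, the port, the `setA` core path, and the CRC-based split are all illustrative assumptions, not something Solr requires.

```python
import zlib
from urllib.parse import urlencode

# Hypothetical servers; each one runs a copy of the same configuration.
SHARDS = ["solr1:8983/solr/setA", "solr2:8983/solr/setA"]


def shard_for(doc_id: str) -> str:
    """Pick a shard by hashing the document id (stable across runs)."""
    index = zlib.crc32(doc_id.encode("utf-8")) % len(SHARDS)
    return SHARDS[index]


def distributed_query_url(query: str) -> str:
    """Build a select URL that asks the first server to query every shard."""
    params = urlencode({"q": query, "shards": ",".join(SHARDS)})
    return f"http://{SHARDS[0]}/select?{params}"
```

When indexing, you would post each document to the server returned by `shard_for(doc_id)`; at query time any one of the servers can receive the request built by `distributed_query_url`.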
Currently I have a ZooKeeper, multi-server Solr setup with a single shard. Unique IDs are generated automatically by Solr.
I now have a requirement for a ZooKeeper, multi-server, multi-shard setup. I need to be able to route updates to a specific shard.
After reading http://searchhub.org/2013/06/13/solr-cloud-document-routing/ I am concerned that I cannot let Solr generate random unique IDs if I want to route updates to a specific shard.
Can anyone confirm this for me and perhaps explain the best approach?
Thanks
There is no way you can route your documents to a particular shard, since that is managed by ZooKeeper.
The solution to your problem is to create two collections instead of two shards. Use your first collection with two servers and put the second collection on the third server; then you can send your updates to particular servers. The design should look like:
collection1---->shard1---->server1,server2
collection2---->shard1----->server3
This way you can separate your indexes as per your requirement.
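A minimal sketch of how a client could then pick where to send an update under this layout. The server names, the port, and the `update_url` helper are hypothetical; only the `/solr/<collection>/update` path shape is standard Solr.

```python
# Hypothetical mapping that mirrors the layout above.
COLLECTION_HOSTS = {
    "collection1": ["server1:8983", "server2:8983"],
    "collection2": ["server3:8983"],
}


def update_url(collection: str, replica: int = 0) -> str:
    """Return the update endpoint on one server hosting the collection."""
    hosts = COLLECTION_HOSTS[collection]
    host = hosts[replica % len(hosts)]  # wrap around if replica is too large
    return f"http://{host}/solr/{collection}/update"
```

For example, `update_url("collection2")` always points at server3, while `update_url("collection1", 1)` points at server2.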
What are the pros and cons of having multiple Solr applications for completely different searches, compared to having a single Solr application with the different searches set up as separate cores?
What is Solr's preferred method? Is a single Solr application with a multicore setup (for various search indexes) always the right way?
There is no preferred method. It depends on what you are trying to solve. By design, Solr can host multiple cores on a single instance, spread cores across several Solr application servers, or manage collections (in SolrCloud).
Having said that, you would usually go for:
1) A single core on a Solr instance if your data is fairly small, say a few million documents.
2) Multiple Solr instances with a single core on each if you want to shard your data, in case of billions of documents, and want better indexing and query performance.
3) Multiple cores on a single or multiple Solr instances if you need multitenancy separation, for example a core for each customer, or one core for the catalog and another core for SKUs.
It depends on your use case, the volume of data, query response times, etc.
Can I use a different format (schema.xml) for a document (like a car document)?
So that I can use a different index to query the same class of documents differently?
(OK, I can use two instances of Solr, but is that the only way?)
Only one schema is possible for a Core.
You can always have different cores within the same Solr instance with the multicore configuration.
However, if you have the same entity and want to query it differently, you can keep a single schema.xml that holds the values in different fields with different field types (check copyField), and define different query handlers that weight the queries according to your needs.
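To illustrate the "same schema, different weighting" idea, here is a sketch that builds two differently weighted queries against one core. It passes the weights as request parameters (`defType=edismax` and `qf`) instead of defining separate handlers in solrconfig.xml; the core name `cars` and the fields `model` and `description` are made up for the example.

```python
from urllib.parse import urlencode

BASE = "http://localhost:8983/solr/cars/select"  # hypothetical core


def weighted_query(text: str, weights: dict) -> str:
    """Build an edismax query whose qf boosts come from `weights`."""
    qf = " ".join(f"{field}^{boost}" for field, boost in weights.items())
    return BASE + "?" + urlencode({"q": text, "defType": "edismax", "qf": qf})


# Same documents, same schema, two ways of ranking them.
by_model = weighted_query("golf", {"model": 5, "description": 1})
by_text = weighted_query("golf", {"model": 1, "description": 3})
```

The same defaults could instead live in two request handlers, so that clients only choose a handler path and never see the weights.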
As far as I know you can only have one schema file per Solr core.
Each core uses its own schema file, so if you want two different schema files, either set up a second Solr core or run another instance of Solr.
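If you go the second-core route, a core can be registered through the CoreAdmin API once its directory (with its own conf/schema.xml) exists. The sketch below only builds the request URL; the base URL is an assumption about where your Solr runs, and the core name used in the example call is hypothetical.

```python
from urllib.parse import urlencode


def create_core_url(name: str, instance_dir: str,
                    base: str = "http://localhost:8983/solr") -> str:
    """Build a CoreAdmin CREATE request for a core with its own schema."""
    params = urlencode({
        "action": "CREATE",
        "name": name,
        "instanceDir": instance_dir,  # directory that holds conf/schema.xml
    })
    return f"{base}/admin/cores?{params}"


url = create_core_url("cars_v2", "cars_v2")
```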
In SOLR, what is multicore?
Is it a way to create multiple tables (inside a single solr app) with their own set of schema files, or is it about creating different databases (inside a single solr app)?
If we want to create multiple tables (with their respective schema.xml files) for solr web app then what is the best way to do this, or how can we achieve this in SOLR?
Solr multicore is basically a setup that allows Solr to host multiple cores.
These cores can each host a completely different set of unrelated entities.
You can have a separate core for each table as well.
For example, if you have collections for Documents, People, and Stocks, which are completely unrelated entities, you would want to host them in different cores.
A multicore setup allows you to:
Host unrelated entities separately so that they don't impact each other
Have a different configuration for each core, with different behavior
Perform activities on each core differently (update data, load, reload, replication)
Keep the size of each core in check and configure caching accordingly
I don't understand from the Solr wiki whether Solr takes one schema.xml or can have multiple ones.
I took the schema from Nutch and placed it in Solr, and later tried to run examples from Solr. The message was clear that there was an error in the schema.
If I have a Solr instance, am I stuck with a specific schema? If not, where is the information on using multiple ones?
From the Solr Wiki - SchemaXml page:
The schema.xml file contains all of the details about which fields
your documents can contain, and how those fields should be dealt with
when adding documents to the index, or when querying those fields.
Now, you can only have one schema.xml file per instance/index within Solr. You can implement multiple instances/indexes within Solr by using the following strategies:
Running Multiple Indexes - please see this Solr Wiki page for more details.
There are various strategies to take when you want to manage multiple "indexes" in a Single Servlet Container
Running Multiple Cores within a Solr instance. - Again, see the Solr Wiki page for more details...
Multiple cores let you have a single Solr instance with separate
configurations and indexes, with their own config and schema for very
different applications, but still have the convenience of unified
administration. Individual indexes are still fairly isolated, but you
can manage them as a single application, create new indexes on the fly
by spinning up new SolrCores, and even make one SolrCore replace
another SolrCore without ever restarting your Servlet Container.
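The "make one SolrCore replace another" part corresponds to the CoreAdmin SWAP action. A small sketch of the request, assuming Solr listens on localhost:8983 and using hypothetical core names `live` and `staging`:

```python
from urllib.parse import urlencode


def swap_cores_url(core: str, other: str,
                   base: str = "http://localhost:8983/solr") -> str:
    """Build a CoreAdmin SWAP request that exchanges two cores' names."""
    params = urlencode({"action": "SWAP", "core": core, "other": other})
    return f"{base}/admin/cores?{params}"


url = swap_cores_url("live", "staging")
```

A typical use would be to rebuild an index in the `staging` core with a new schema and then swap it in, so clients keep querying `live` without a container restart.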