Automate creation of some Solr cores on a Linux machine - solr

I need to create a bunch of Solr cores on a Linux box. I can do this relatively easily by combining command-line steps, which create the necessary directory structure, with the Solr admin console, which actually creates the cores.
I would like to automate this process, but I'm not sure how to proceed. I can create the cores using the REST API, but as far as I can tell the directory structure needs to exist already. Also, I am a Windows user. Is there any way this can be done entirely from a Windows machine?
I'm not looking for code samples; I'm looking for advice on the technology and techniques I would use to accomplish this.

The URL for creating a core is "http://localhost:8983/solr/admin/cores?action=CREATE&name=core-name&instanceDir=path/to/dir&config=solrconfig.xml&dataDir=data"
You can write a scheduled job that creates the cores. Before creating a core, check whether the instanceDir exists. If it does not, create it and pass its path in the core creation URL.
A Solr core also requires a configset. You can create your own configset and add the required files to it, then pass the configset path in the core creation URL as well.
dataDir is the path where the indexes are stored. Create the folder and pass its path in the core creation URL.
You can also store these values (configset, instanceDir, and so on) in database tables and read them when creating each core. You can then change the values in the database as required, without modifying the code.
If you are running on Unix, you can also use a cron job to create the cores.
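As a rough illustration of the steps above, here is a minimal Python sketch. The function names `prepare_core_dirs` and `build_create_url` are made up for this example, and the base URL is an assumption about where Solr is listening. The directories have to be created on the machine where Solr runs, so the first function would need to run on the Linux box itself (for example from a cron job), not on a remote Windows client.

```python
from pathlib import Path
from urllib.parse import urlencode


def prepare_core_dirs(instance_dir: Path, data_dir_name: str = "data") -> Path:
    """Create the instance directory and its data directory if they are missing."""
    data_dir = instance_dir / data_dir_name
    # parents=True also creates instance_dir; exist_ok=True makes reruns harmless.
    data_dir.mkdir(parents=True, exist_ok=True)
    return data_dir


def build_create_url(solr_base: str, name: str, instance_dir: str,
                     config: str = "solrconfig.xml", data_dir: str = "data") -> str:
    """Build a CoreAdmin CREATE URL; urlencode escapes the parameter values."""
    params = {
        "action": "CREATE",
        "name": name,
        "instanceDir": instance_dir,
        "config": config,
        "dataDir": data_dir,
    }
    return f"{solr_base.rstrip('/')}/admin/cores?{urlencode(params)}"
```

The resulting URL can be requested with any HTTP client (curl, a browser, `urllib.request`), which is the part that can be done from a different machine.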

Related

Update managed-schema file in solr in multiple cores at once

I am working with Solr. I have multiple cores with the same fields and types (the same schema). Every core has its own schema (managed-schema) file. I want to add a new field to the schema of all cores.
I am currently doing this manually via the admin panel for each core. Is there any way to add new fields to the schema of all cores at once?
Configure your cores to use a configset instead.
On a multicore Solr instance, you may find that you want to share configuration between a number of different cores. You can achieve this using named configsets, which are essentially shared configuration directories stored under a configurable configset base directory.
From the reference manual:
If you are using Solr in standalone mode, configsets are created on the filesystem.
To create a configset, add a new directory under the configset base directory. The configset will be identified by the name of this directory. Then into this copy the configuration directory you want to share. The structure should look something like this:
/configset1
    /conf
        /managed-schema
        /solrconfig.xml
/configset2
    /conf
        /managed-schema
        /solrconfig.xml
The default base directory is $SOLR_HOME/configsets
To create a new core using a configset, pass configSet as one of the core properties. For example, if you do this via the CoreAdmin API:
curl "http://localhost:8983/solr/admin/cores?action=CREATE&name=mycore&instanceDir=path/to/instance&configSet=configset2"
As far as I know there is no way to make an existing core use a configset, so you'll have to back up your configuration and cores, remove the cores from Solr (do not delete the directories), and then re-add the cores with the configSet parameter set to the name of your configset.
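Creating the configset directory itself can be scripted. The following is a minimal Python sketch, assuming standalone mode and that you already have a working conf directory to share. `create_configset` is a hypothetical helper, and the base directory you pass in would be your own $SOLR_HOME/configsets.

```python
import shutil
from pathlib import Path


def create_configset(configset_base: Path, name: str, source_conf: Path) -> Path:
    """Copy an existing conf directory to <configset_base>/<name>/conf."""
    target = configset_base / name / "conf"
    if target.exists():
        # Refuse to overwrite a configset that other cores may already share.
        raise FileExistsError(f"configset already exists: {target.parent}")
    # copytree creates the missing parent directories as well.
    shutil.copytree(source_conf, target)
    return target.parent
```

After this, cores created with configSet set to that name share the one managed-schema, so a field added there applies to all of them.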

Create new core in Solr with new schema without using solr command

My use case is to create a new core in Solr with a new schema dynamically from a program, without pre-creating the schema and configuration in the instanceDir.
I have tried the Solr Core Admin API by calling:
~/admin/cores?action=CREATE&name=core-name&configSet=basic_configs
This created a new core with the schema of basic_configs.
However, I later realised that when I change the schema in that core, the changes are also reflected in the schema of basic_configs (because a configSet is shared configuration). Hence, I cannot reuse the same API call to create subsequent cores with a new schema.
I understand that this could be achieved by using the solr command to create cores, but I would like to do it through the REST API or SolrJ.
Also, I am not using Solr in SolrCloud mode.
You can give an explicit instanceDir when creating the core:
admin/cores?action=CREATE&name=core-name&instanceDir=path/to/dir&config=solrconfig.xml&dataDir=data
If there are no config sets, then the instanceDir specified in the CREATE call must already exist, and it must contain a conf directory which in turn must contain solrconfig.xml, your schema, which is usually named either managed-schema or schema.xml, and any files referenced by those configs.
The config and schema filenames can be specified with the config and schema parameters, but these are expert options. One thing you could do to avoid creating the conf directory is use config and schema parameters that point at absolute paths, but this can lead to confusing configurations unless you fully understand what you are doing.
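A quick pre-flight check of that requirement, as a minimal Python sketch. `missing_config_files` is a name invented for this example, and it only checks for the two schema file names mentioned above, not for other files those configs may reference.

```python
from pathlib import Path
from typing import List

# Schema file names mentioned above; adjust if your setup uses another name.
SCHEMA_NAMES = ("managed-schema", "schema.xml")


def missing_config_files(instance_dir: Path) -> List[str]:
    """Return the required config files that are missing from instance_dir/conf."""
    conf = instance_dir / "conf"
    missing = []
    if not (conf / "solrconfig.xml").is_file():
        missing.append("conf/solrconfig.xml")
    if not any((conf / name).is_file() for name in SCHEMA_NAMES):
        missing.append("conf/managed-schema or conf/schema.xml")
    return missing
```

An empty list means the directory has the minimum the CREATE call expects; anything else is worth fixing before calling the API.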

How do I create a Solr core without creating the config file first?

I am building a Solr web-based application, and one of its features is that the user can create a core and schema in Solr. My friend implemented this with a child process that changes to the Solr directory and then runs the command 'bin/solr create -c...' to create the core. I am considering another approach, such as an HTTP API request. I found this:
http://localhost:8983/solr/admin/cores?action=CREATE&name=mycore&instanceDir=path/to/instance&configSet=configset2
But apparently it does not work, because you need to create the config for the core first. The error is:
Error CREATEing SolrCore 'mycore': Unable to create core [mycore] Caused by: Could not load configuration from directory /opt/solr/server/solr/configsets/configset2
So I am wondering what approach I can take, since it seems I can't create a core without setting up a config first. Or should I build an input form with create core and create schema, and only after the user clicks 'submit' process everything: make the config file, create the schema, and finally create the core? I wonder if that is the best approach.
I am looking forward to any help.
You always need to provide a configuration when creating a core.
When your friend ran the command, it actually used the default configuration, data_driven_schema_configs, which you can confirm by reading the help text of the create_core command (create is an alias for create_core in a non-Cloud setup):
bin/solr create_core -h
The solr script copied that configuration and then created the core with it.
The example you showed is only valid for SolrCloud. If you are not using SolrCloud, you need to use the Core Admin API directly and manually set up the directory with the configuration.
Note that configsets are a bit tricky, in the sense that if you create several cores from the same configset, that configset is shared and changes made to it by one core affect all of them. So you most likely don't want to use them, but instead copy the configuration as I described above.
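The copy step can be sketched in Python like this. It is an assumption-laden illustration, not what the solr script literally does: `copy_config_for_core` is a hypothetical helper, `template_conf` stands for whichever conf directory you want to start from, and `solr_home` is your Solr home directory on the server.

```python
import shutil
from pathlib import Path


def copy_config_for_core(template_conf: Path, solr_home: Path, core_name: str) -> Path:
    """Give a new core its own private copy of a template conf directory."""
    instance_dir = solr_home / core_name
    # Each core gets <solr_home>/<core_name>/conf, so later schema edits stay local.
    shutil.copytree(template_conf, instance_dir / "conf")
    return instance_dir
```

Once the copy exists, the CoreAdmin CREATE call can point instanceDir at the returned directory instead of passing configSet.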

SOLR - need to create core dynamically (from php script)

I have one core defined as a template for creating the rest. I need to create new cores from this template core. Each new core's configuration will be only slightly different: the schema is the same and the data config is also the same, except for some JDBC connection details (database schema/username/password).
I can copy my core directory, add the corresponding core definition to solr.xml like this <core name="NewCore" instanceDir="NewCore" />, edit my data config XML file, and then restart Solr (a webapp on Tomcat).
It works; however, I need all of this to be done automatically from a PHP script. When an end user creates a new page, a new core should be created for it automatically.
What is the best way to do this?
Solr exposes the CoreAdmin Handler, which allows you to do core management through a REST-ish interface.
Use CREATE for creating a new core (giving the relevant options where necessary).
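Before calling CREATE, the copied data config still needs its connection details changed. Here is a minimal sketch of that one step, written in Python for illustration (the same idea carries over to PHP). It assumes a DataImportHandler config with a single <dataSource> element directly under the root that carries url, user and password attributes; `set_jdbc_details` is a made-up name.

```python
import xml.etree.ElementTree as ET


def set_jdbc_details(data_config_xml: str, url: str, user: str, password: str) -> str:
    """Return the data config XML with the dataSource connection details replaced."""
    root = ET.fromstring(data_config_xml)
    # Assumes <dataSource> is a direct child of the root element.
    data_source = root.find("dataSource")
    if data_source is None:
        raise ValueError("no <dataSource> element found")
    data_source.set("url", url)
    data_source.set("user", user)
    data_source.set("password", password)
    # Note: ElementTree drops XML comments when parsing with default settings.
    return ET.tostring(root, encoding="unicode")
```

The rewritten XML would be saved into the new core's conf directory, after which the CREATE call registers the core without a restart.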

Apache Solr setup for two different projects

I just started using Apache Solr with its data import functionality on my project by following the steps in http://tudip.blogspot.in/2013/02/install-apache-solr-on-ubuntu.html
Now I have to run two different instances of my project on the same server, with different databases but the same Solr configuration for both. How can I do that?
Any help would be appreciated.
Probably the closest you can get is having two different Solr cores. They will run under the same server but will have different configuration (which you can copy paste).
When you say "different databases", do you mean you want to copy from several databases into one joint collection/core? If so, you just define multiple entities, and possibly multiple datasources, in your DataImportHandler config and run either all of them or each one individually. See the other question for some tips.
If, on the other hand, you mean different Solr cores/collections, then you just want to run Solr with multiple cores. This is easy: you just need a solr.xml file above your collection level, as described on the Solr wiki. Once you have the basics, you may want to look at sharing instance directories and having separate data directories, to avoid changing the same config twice (instanceDir vs. dataDir settings on each core).
