How does solr loads configsets in examples provided by solr? - solr

I have started learning solr.I have downloaded the latest zip(5.1.0) provided by solr and run the server using bin/solr start -e cloud -noprompt.
I check that this internally calls
bin/solr start -cloud -s example/cloud/node1/solr -p 8983
bin/solr start -cloud -s example/cloud/node2/solr -p 7574 -z localhost:9983
I check that these is no config(conf/solrconfig.xml) defined in example/cloud/node1/solr so how does solr load config from the SOLR_HOME/configsets directory?
I read the documentation on several places but i am still unable to figure out the use of cloud like in 'bin/solr start -cloud -s ... ' and use of zookeeper.
Please help.

when you are working on solr cloud with zookeeper, you have to upload your solr config on zookeeper.
./bin/solr zk -upconfig -z localhost:2181,localhost:2182,localhost:2182 -n my-config -d server/solr/files/conf/
using upconfig you can upload your solr config, only have to provide path of your config directory.
You can use config name(my-config) for create core using api.
http://XXX.XXX.XXX.XXX:8983/solr/admin/collections?action=CREATE&name=irTest&numShards=3&replicationFactor=2&maxShardsPerNode=3&collection.configName=my-config
So it will create core using your config only.

Download the latest version of the Apache solr reference guide.
https://www.apache.org/dyn/closer.cgi/lucene/solr/ref-guide/apache-solr-ref-guide-5.1.pdf
check this section in the PDF.
Configuration Directories and SolrCloud
Since you are not specifying a specific configset, the default is loaded.
First, if you don't provide the -d or -n options, then the default
configuration ($SOLR_HOME/server/solr/con
figsets/data_driven_schema_configs/conf) is uploaded to ZooKeeper
using the same name as the collection

Related

How to start Solr Cloud collection without -e -cloud

I use Solr Cloud for my project and I always start it with bin/solr start -e -cloud that always prompt all the questions for making a new collection or re-use one.
But is there a command that can directly start the collection I want, like bin/solr start [collectionName] -cloud -p 8983?
I didn't find anything in the Solr Manual.
Thanks :)
All collections are available by default, so starting Solr in cloud mode should be enough.
bin/solr start -c
The default port is 8983, so there is no need to give the -p parameter unless you're changing it.

Setting Up Apache Solr in Cloud Mode

I have to do the following:
I have to deploy Solr on 2 servers/nodes.
Deploy Zookeeper on another server.
Upload a custom config to Zookeeper
Create a custom collection with 2 shards and 2 replicas
Version of Solr 7.4.0 & Zookeeper: 3.4.12
I have done the following:
Set Up Zookeeper:
Created a Zookeeper data folder & made a zoo.conf & put the dataDir there.
Started zookeeper using ./zkServer.sh start
Set up Solr:
Started Solr using:
./solr start -cloud -s /home/demo/LocalFolder/Downloads/SolrHome -p 8987 -z localhost:2181
Trying to upload config in Zookeeper using:
./solr create -c mycollection -d /media/sf_VM/Dump/conf
It is giving me an exception:
Caused by: javax.servlet.UnavailableException: Error processing the request. CoreContainer is either not initialized or shutting down.
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:341)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:323)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1634)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:533)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:146)
at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:257)
at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1595)
at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:255)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1253)
at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:203)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:473)
at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1564)
at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:201)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1155)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:144)
at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:219)
at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:126)
... 17 more
</pre>
I have searched many pages & seen Solr tutorials but there they have used the default examples. I just dont have any step by step idea to
How to Upload a config in Zookeeper?
Then what I need to do to create a collection pointing to that config. I want that collection to have 2 shards & 2 replicas.
Where will be the solr.xml. If it should be in Zookeeper how do I upload it there
How do I see in Zookeeper that the config has been uploaded?
I know this question might be a duplicate. I have read several of posts but not able to come up a solution. Please help.
Well I figured out how to setup. Please note that I have read these steps from different site & gathered here step by step:
1. Set Up ZooKeeper:
This is used to keep the collection specific configuration files to store in a central space & map it with a name. Later we can use this name to create collection which would point to this config. This config is the same as you find in the below folder:
solr-7.4.0\server\solr\configsets\sample_techproducts_configs\conf
1.1 Download latest Zookeeper. I have used 3.4.12
1.2 Unpack downloaded archive and copy conf/zoo_sample.cfg to conf/zoo.cfg
1.3 Modify zoo.cfg:
1.3.1 Change dataDir to directory where you want to hold all cluster configuration data.
dataDir=/var/zookeeper/data
1.3.2 Add information about all Zookeeper servers: I have used only 1 ZooKeeper server so this is not required. If you want to add more servers then please go through the below links:
DZone Tutorial
Apache Tutorial
1.3.3 Start ZooKeeper using (after going to zookeeper-3.4.12/):
./bin/zkServer.sh start-foreground conf/zoo.cfg
OR
./bin/zkServer.sh start conf/zoo.cfg
Note: You can stop the ZooKeeper using the below command:
bin/zkServer.sh stop
1.3.4 ZooKeeper Status:
Hit the below:
bin/zkServer.sh status
Or do telnet localhost 2181 and hit stats when connected.
2. Set Up Solr
2.1 Download the Solr
2.2 Extract install_solr_service.sh from the .tar file. Solr includes a service installation script (bin/install_solr_service.sh) to help you install Solr as a service on Linux. For more info click here.
tar -xzf solr-7.4.0.tgz solr-7.4.0/bin/install_solr_service.sh --strip-components=2
2.3 Install Solr as a service using the above script:
sudo bash ./install_solr_service.sh solr-7.4.0.tgz
This will also extracting solr-7.4.0.tgz to /opt/solr
2.4 Go to /opt/solr and do the following:
mkdir solr/server/solr2
mkdir solr/server/solr3
mkdir solr/server/solr4
cp solr/server/solr/solr.xml solr/server/solr2
cp solr/server/solr/solr.xml solr/server/solr3
cp solr/server/solr/solr.xml solr/server/solr4
2.5 Change the jetty port in solr.xml. Do this for all the 3 solr.xml mentioned in the above step:
vi solr/server/solr2/solr.xml
Search for the port 8983 & change it 8984 (for solr2), change it 8985 (for solr3), change it 8986 (for solr4)
2.6 Stop the Solr running at 8983
root#dev-base:/opt/solr/bin# ./solr stop -p 8983
2.7 Start all the solr instances:
root#dev-base:/opt/solr# bin/solr start -c -s server/solr -p 8983 -z localhost:2181 -noprompt -force
root#dev-base:/opt/solr# bin/solr start -c -s server/solr2 -p 8984 -z localhost:2181 -noprompt -force
root#dev-base:/opt/solr# bin/solr start -c -s server/solr3 -p 8985 -z localhost:2181 -noprompt -force
root#dev-base:/opt/solr# bin/solr start -c -s server/solr4 -p 8986 -z localhost:2181 -noprompt -force
Note: Running solr as a root is not recommended for security reason.
2.8 See Solr Status:
root#dev-base:/opt/solr# bin/solr status
3 Make a Custom Config
3.1 Copy the conf directory from solr-7.4.0\server\solr\configsets\sample_techproducts_configs\conf to another location (In my case it is /media/sf_VM/Dump/new/conf ).
3.2 Change the managed-schema file inside conf to specify the fields you are using.
4 Upload the config to ZooKeeper:
root#dev-base:/opt/solr# bin/solr zk -z localhost:2181 upconfig -d /media/sf_VM/Dump/new/conf -n myConf6
The name of the config I have uploaded is myConf6
5 Create a Solr Collection using this custom config
root#dev-base:/opt/solr# bin/solr create -c myNewCollection -n myConf6 -shards 2 -replicationFactor 2 -force
Hit Solr Admin URL
6 Index Data Using POST API using Json
URL: http://localhost:8983/solr/myNewCollection/update
Method: POST
Body:
[{
"_id": "99999999999999999999",
"author": [
"New Inserted 9000"
],
"authorLastName": [
"New Inserted 9000"
],
"impn": "New Inserted 9000",
"isbn10": "9999999999",
"isbn13": "9999999999999",
"title": "New Inserted 9000",
"publisher": "New Inserted 9000",
"sales_a": 5.0,
"sales_t": 5.0,
"haveImage": 1,
"pages": "76",
"image": "http://ip.ip.com/is/image/",
"format": "Paper",
"mtc_id": "99999999999",
"subjects" : [
"9000"
]
"rating": 0,
"description_long": "Snahashis call me in your marriage."
}
Delete a configuration in Zookeeper:
If you want to delete an old/wrong configuration already uploaded to ZooKeeper run the below command:
./server/scripts/cloud-scripts/zkcli.sh -cmd clear -z "<ZK_HOST>:<ZK_PORT>" /configs/AAA
The path of the configuration is /configs/< name of the configset >
To delete a specific file:
zkcli.sh --zkhost <ZK_HOST>:<ZK_PORT> -cmd clear /configs/<MY_COLLECTION>/solrconfig.xml
Upload an updated file:
zkcli.sh --zkhost <ZK_HOST>:<ZK_PORT> -cmd putfile /configs/<MY_COLLECTION>/solrconfig.xml /<MY_UPDATED_FILE_LOCAL_FOLDER>/solrconfig.xml
Then we need to restart the solr nodes.
To Delete a Collection Via API:
First delete alias created on this collection (if any)
http://localhost:8983/solr/admin/collections?action=DELETEALIAS&name=aliasName
Delete Collection:
http://localhost:8983/solr/admin/collections?action=DELETE&name=collectionName

How do we edit config files uploaded to zookeeper ensemble

I have uploaded solr config files to zookeeper and created collection indexed few documents into collection. Now I want to update schema file. Is it possible to edit the schema config file uploaded to zookeeper. ? If yes how do we do it. some source say upload updated schema config file to zookeeper which overwrites old one.
uploading updated config file is not tedious task But want to know if there is a way to edit the existing config file in zookeeper.
Thanks in advance,
vinod
To update schema download latest configset associated with schema
Latest configset can be downloaded using following command:
solr zk -downconfig -d directory to download -n configset name -z ip:port of zookeeper
for example : solr zk -downconfig -d C:\solr\workingConfig -n configsetName -z localhost:2181
make required changes
Then Upload latest configset to zookeeper using following command.
solr zk -upconfig -d directory to upload -n configset name -z ip:port of zookeeper
For example: solr zk -upconfig -d C:\solr\workingConfig -n configsetName -z localhost:2181
Then you need to Reload schema so that changes can take effect immediately.
Schema can be reloaded using following command:
http://ip:port/solr/admin/collections?action=RELOAD&name=

Solr : Path for the config files in embedded zookeeper

I am using solr-6.0.0
Using the cloud example,
I started solr in cloud mode using the following commands
bin/solr start -cloud -p 8983 -s "example/cloud/node1/solr"
bin/solr start -cloud -p 7574 -s "example/cloud/node2/solr" -z localhost:9983
I would like to index data from my database. Had it been stand-alone mode I would have edited the managed-schema and solrconfig.xml files accordingly. But for cloud-mode I cannot find those files.
According to the docs :
Note that the SolrCloud example does not include a conf directory for
each Solr Core (so there is no solrconfig.xml or Schema file). This is
because the configuration files usually found in the conf directory
are stored in ZooKeeper so they can be propagated across the cluster.
So where can I edit those files or do I need to upload a new set of config files and override the already uploaded ones?
Found this in the docs:
See the section : Uploading configs using zkcli or SolrJ
at Uploading configs using zkcli
You can do something like this to push a file :
zkcli.sh -zkhost localhost:2181 -cmd putfile /solr.xml /path/to/solr.xml
and something like this to upload the config files:
./server/scripts/cloud-scripts/zkcli.sh -zkhost localhost:9983 -cmd upconfig -confname <conf-name> -confdir <path-of-the-local-conf-dir>

docker of solr and nutch working together?

I'm trying to activate nutch and solr as dockers out of the box to consume that in REST.
Tried
same docker - started it with
docker run --name nutch_solr -d -p 8899:8899 -p 8983:8983 -t momer/docker-solr-nutch:4.6.1
can't surf to ..:8983, or ...:8899
Tried different dockers - nutch and solr
started it with:
docker run --name my_nutch -d -p 8899:8899 -e SOLRURL=192.168.99.100:8983 -t meabed/nutch
Solr is up and running, but can't really work with nutch (my question)
tried more another nutch docker - what am I missing? I'd like to write a post on how to raise solr and nutch dockers in 5 minutes, but I just can't seem to work with it..
so, does someone know how to activate it including send a sample of crawling job for work?

Resources