SolrCloud - remove a node - solr

I have a SolrCloud setup running 3 ZooKeepers and 3 Solr servers (version 4.10.3). I would like to take one of the Solr servers completely out of this setup so that I only have 2 Solrs and 3 ZKs.
I've tried googling, and the only results I can find are for removing replicas or collections, not for removing a node.
Any idea how I can remove a node in SolrCloud?

You can first remove the shards and replicas using the command below:
<SOLR_URL>/solr/admin/collections?action=DELETEREPLICA&collection=<Collection_name>&shard=<shard_name>&replica=<replica_node_name>
The shard name (shard) and replica node name (coreNodeName) values can be found in the core.properties file located under /solr/server/solr/.
For example:
http://localhost:9501/solr/admin/collections?action=DELETEREPLICA&collection=wblib&shard=shard1&replica=core_node1
Then you can delete the installation files for the removed Solr node.
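If the node you are removing hosts replicas of more than one shard or collection, repeat DELETEREPLICA for each of them before shutting the node down. A rough sketch (host, port, and collection follow the example above; the second core node name is hypothetical):
curl "http://localhost:9501/solr/admin/collections?action=DELETEREPLICA&collection=wblib&shard=shard1&replica=core_node1"
# Hypothetical second replica hosted on the same node:
curl "http://localhost:9501/solr/admin/collections?action=DELETEREPLICA&collection=wblib&shard=shard2&replica=core_node4"
# Once the node hosts no replicas, stop Solr on it and remove its files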

Related

On SolrCloud (with ZooKeeper setup) a particular node's tlog file is growing unexpectedly

I have a cluster of 7 nodes (6 Solr + 1 ZooKeeper) set up on my local system, running Solr 5.5.0 and ZooKeeper 3.4.8. This is a local testing environment before moving to production: I created 6 Solr instances (one per laptop) and 1 ZooKeeper instance on a separate laptop. Now when I test it locally using SOAP requests, one node's tlog grows to an unexpectedly large size while the rest are fine.
What could the issue be, and how should I approach rectifying it? I cannot understand the problem, because all my nodes are fine except that one.
Note: every machine (laptop) has the same solrconfig.xml.

Solr AutoScaling - Add replicas on new nodes

Using Solr version 7.3.1
Starting with 3 nodes:
I have created a collection like this:
wget "localhost:8983/solr/admin/collections?action=CREATE&autoAddReplicas=true&collection.configName=my_col_config&maxShardsPerNode=1&name=my_col&numShards=1&replicationFactor=3&router.name=compositeId&wt=json" -O /dev/null
In this way I have a replica on each node.
GOAL:
Each shard should add a replica to new nodes joining the cluster.
When a node is shut down, its replicas should just go away.
Only one replica for each shard on each node.
I know this should be possible with the new AutoScaling API, but I am having a hard time finding the right syntax. The API is very new and all I can find is the documentation. It's not bad, but I am missing some more examples.
This is how it looks today: there are many small shards, each with a replication factor that matches the number of nodes. Right now there are 3 nodes.
This video was uploaded yesterday (2018-06-13), and around 30 minutes into the video there is an example of the Solr.HttpTriggerListener that can be used to call any kind of service, for example an AWS Lambda, to add new nodes.
The short answer is that your goals are not achievable today (as of Solr 7.4).
The NodeAddedTrigger only moves replicas from other nodes to the new node in an attempt to balance the cluster. It does not support adding new replicas. I have opened SOLR-12715 to add this feature.
Similarly, the NodeLostTrigger adds new replicas on other nodes to replace the ones on the lost node. It, too, has no support for merely deleting replicas from cluster state. I have opened SOLR-12716 to address that issue. I hope to release both the enhancements in Solr 7.5.
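For context, registering a nodeAdded trigger with the autoscaling API of that era looks roughly like this (a sketch against the Solr 7.x write endpoint; the default trigger actions compute and execute a plan that moves existing replicas onto the new node rather than adding new ones):
curl -X POST -H 'Content-Type: application/json' http://localhost:8983/solr/admin/autoscaling -d '{
  "set-trigger": {
    "name": "node_added_trigger",
    "event": "nodeAdded",
    "waitFor": "5s",
    "enabled": true
  }
}'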
As for the third goal:
Only one replica for each shard on each node.
To achieve this, a policy rule given in the "Limit Replica Placement" example should suffice. However, looking at the screenshot you've posted, you actually mean a (collection, shard) pair, which is unsupported today. You'd need a policy rule like the following (it does not work because collection:#EACH is not supported):
{"replica": "<2", "collection": "#EACH", "shard": "#EACH", "node": "#ANY"}
I have opened SOLR-12717 to add this feature.
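For the supported per-shard variant, installing the "Limit Replica Placement" rule as the cluster policy is a one-liner (a sketch, with the same endpoint assumption as above):
curl -X POST -H 'Content-Type: application/json' http://localhost:8983/solr/admin/autoscaling -d '{
  "set-cluster-policy": [
    {"replica": "<2", "shard": "#EACH", "node": "#ANY"}
  ]
}'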
Thank you for these excellent use-cases. I'd recommend asking questions such as these on the solr-user mailing list, because not a lot of Solr developers frequent Stack Overflow. I could only find this question because it was posted on the docker-solr project.

How to scale and distribute the SOLR CLOUD nodes

I have initially set up SOLR CLOUD with two Solr nodes, as shown below.
I have to add a new Solr node, i.e. with an additional shard and the same number of replicas as the existing SOLR CLUSTER nodes.
I have already gone through the SOLR scaling and distribution documentation: https://cwiki.apache.org/confluence/display/solr/Introduction+to+Scaling+and+Distribution
But the above link only covers scaling for SOLR standalone mode. That's the sad part.
I have started the SOLR CLUSTER nodes using the following command
./bin/solr start -c -s server/solr -p 8983 -z [zkip's] -noprompt
Kindly share the command for creating the new shard when adding a new node.
Thanks in advance.
Sharing this answer from my knowledge.
Adding a new SOLR CLOUD/SOLR CLUSTER node means getting a copy of all the SHARDs onto the new box (through replication of all SHARDs).
SHARD: the actual data is split equally across the number of SHARDs we create (while creating the collection).
So while adding the new SOLR CLOUD node, make sure that all the SHARDs are available on the new node (RECOMMENDED) or as required.
Naming Standards of SOLR CORE in SOLR CLOUD MODE/CLUSTER MODE
Syntax:
<COLLECTION_NAME>_shard<SHARD_NUMBER>_replica<REPLICA_NUMBER>
Example
CORE NAME : enter_2_shard1_replica1
COLLECTION_NAME : enter_2
SHARD_NUMBER : 1
REPLICA_NUMBER : 1
STEPS FOR ADDING THE NEW SOLR CLOUD/CLUSTER NODE
Create a core with the same collection name as used on the existing SOLR CLOUD nodes.
Notes while creating a new core on the new node
Example :
enter_2_shard1_replica1
enter_2_shard1_replica2
From the above example, the maximum replica number of the corresponding shard is 2 (enter_2_shard1_replica2).
So on the new node, while creating the core, give the replica number as 3, "enter_2_shard1_replica3", so that SOLR will treat it as the third replica of the corresponding SHARD (see the sketch after these steps).
Note: replica numbering should increment by 1.
Give time to replicate the data from the existing node to the new node.
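A sketch of step 1 with the CoreAdmin API (the new node's host and port are hypothetical; the collection and shard names follow the example above):
# Run against the new node; registers it as the third replica of shard1
curl "http://newnode:8983/solr/admin/cores?action=CREATE&name=enter_2_shard1_replica3&collection=enter_2&shard=shard1"
Note that later Solr versions recommend the Collections API ADDREPLICA action for this instead.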

Rebuild Solr index in a specific Search node

I've accidentally cancelled the Solr index build on one of my Search nodes. How do I restart the indexing on that node?
nodetool rebuild_index doesn't work; the command exits almost immediately, probably because it is meant to work with native Cassandra indexes, whereas my table's indexes are of the custom type "com.datastax.bdp.search.solr.Cql3SolrSecondaryIndex".
Clicking the "Reindex"/"Full reindex" button in the Solr core admin UI, on the other hand, will trigger re-indexing of the whole column family across all Search nodes.
Is there a way to trigger the indexing in that node only? I'm using DSE 4.0.1 (Cassandra 2.0.5, Solr 4.6.0.1)
In order to reindex a single node, you have to reload its core with the reindex=true and distributed=false parameters, as explained in: http://www.datastax.com/documentation/datastax_enterprise/4.0/datastax_enterprise/srch/srchReldCore.html
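A sketch of that call, issued directly against the node you want to reindex (the core name is hypothetical; parameter names are taken from the linked DSE 4.0 documentation, so verify them for your DSE version):
# Reloads the core on this node only and re-indexes it in place
curl "http://node-to-reindex:8983/solr/admin/cores?action=RELOAD&name=mykeyspace.mytable&reindex=true&deleteAll=false&distributed=false"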

Solr cloud distributed search on collections

Currently I have a ZooKeeper instance controlling replication on 3 physical servers. It is the Solr-integrated ZooKeeper. 1 shard, 1 collection.
I have a new requirement in which I will need a new static solr instance (1 new collection, no replication). Same schema as previous collection. A copy of this instance will also be placed on the 3 physical servers mentioned above. A caveat is that I need to perform distributed searches across the 2 collections and have the results blended.
Thanks to javacreed I now know that sharding is not part of my solution. Previous questions' answers are here and here.
In my current setup I run the following command on the server running zookeeper -
java -Dbootstrap_confdir=solr/myApp/conf -Dcollection.configName=myConfig -DzkRun -DnumShards=1 -jar start.jar
Am I correct in saying that this will not change, and that I will now also manually start the non-replicated collection? Do I really only need to change my search queries to include the 'collection' parameter? Something like:
http://localhost:8983/solr/collection1/select?collection=collection1,collection2
This example is from the Solr documentation. I am slightly confused as to whether it should be ...solr/collection1/select?... or ...solr/collection2/select?..., or if it even matters.
Thanks
Thanks for your kind words, stewart. You can search it directly on Solr as:
http://localhost:8983/solr/select?collection=collection1,collection2
There is no need to mention any collection in the path, since you are specifying the collections in the collection parameter.
