Solr - Collections API timeouts

I've got a setup with 3x ZooKeeper nodes and 4x SolrCloud nodes.
This is all working: all nodes can see each other, and I initially had a default collection.
From there, I used the Collections API to create a new collection, which completed successfully and is sharded across 2 nodes, with the other 2 nodes holding the replicas. I can also successfully save documents to that collection. Browsing the Solr web GUI on any of the boxes works fine, with no speed issues.
However, any time I try to use the Collections API I get timeouts. Creating a new collection, reloading one of the existing collections, deleting a collection... all of them time out.
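For reference, these are the kinds of Collections API calls that time out (the hostname and collection names are placeholders):
curl 'http://solr-node1:8983/solr/admin/collections?action=CREATE&name=newcollection&numShards=2&replicationFactor=2'
curl 'http://solr-node1:8983/solr/admin/collections?action=RELOAD&name=newcollection'
curl 'http://solr-node1:8983/solr/admin/collections?action=DELETE&name=newcollection'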
Any thoughts on why this happens would be much appreciated.
Cheers

I have also faced a similar issue:
Solr process 24214 running on port 8983
Failed to get system information from http://localhost:8983/solr/ due to: org.apache.solr.client.solrj.SolrServerException: clusterstatus the collection time out:180s
at org.apache.solr.util.SolrCLI.getJson(SolrCLI.java:537)
at org.apache.solr.util.SolrCLI.getJson(SolrCLI.java:471)
at org.apache.solr.util.SolrCLI$StatusTool.getCloudStatus(SolrCLI.java:721)
at org.apache.solr.util.SolrCLI$StatusTool.reportStatus(SolrCLI.java:704)
at org.apache.solr.util.SolrCLI$StatusTool.runTool(SolrCLI.java:662)
at org.apache.solr.util.SolrCLI.main(SolrCLI.java:215)
To solve this issue, I followed the steps below and it was resolved (a command sketch follows the list):
Stop all Solr instances
Stop all Zookeeper instances
Start all Zookeeper instances
Start Solr instances one at a time.
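A rough sketch of that sequence, assuming the standard bin/solr and zkServer.sh scripts and placeholder ZooKeeper hostnames:
# on each Solr host: stop Solr first
bin/solr stop -all
# on each ZooKeeper host: restart ZooKeeper
zkServer.sh stop
zkServer.sh start
# then on each Solr host, one at a time: start Solr against the ensemble
bin/solr start -cloud -z zk1:2181,zk2:2181,zk3:2181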

Such a timeout can occur when Solr is not able to obtain the cluster state. If the following call also results in a timeout, then this is the case:
http://solr-hostname:8983/solr/admin/collections?action=CLUSTERSTATUS&wt=json
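You can test this quickly from the command line (the hostname is a placeholder and the timeout value is arbitrary):
curl --max-time 60 'http://solr-hostname:8983/solr/admin/collections?action=CLUSTERSTATUS&wt=json'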
This may be caused by incorrect entries present in /clusterstate.json
To fix this:
Get the cluster state from ZooKeeper by calling
zkcli.sh -zkhost localhost:2181 -cmd get /clusterstate.json > clusterstate.json
Edit the extracted clusterstate.json file and remove the sections with wrong IPs or non-existent hosts.
Clear the cluster state in ZooKeeper by calling
zkcli.sh -zkhost localhost:2181 -cmd clear /clusterstate.json
Save the corrected state in ZooKeeper by uploading the updated JSON file:
zkcli.sh -zkhost localhost:2181 -cmd putfile /clusterstate.json ./clusterstate.json
Restart the Solr instances.
After that, if your cluster state shows the correct info, you should no longer have timeouts when accessing the Collections API.
Note
Be careful when editing the clusterstate JSON; limit your changes to removing non-existent hosts/replicas/shards.

I also had timeout issues with the Collections API. To fix the problem, I added the server's IP address to the solr.xml file found at /var/solr/data/solr.xml. My setup consists of 3 Ubuntu servers, each running ZooKeeper (3.4.6) and SolrCloud (5.2.1).
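As a rough sketch (the path and IP are assumptions for a default package install), the change amounts to making the host entry in the solrcloud section of solr.xml carry the node's routable address:
# the <solrcloud> block in /var/solr/data/solr.xml should advertise this
# server's IP rather than a hostname that resolves to localhost, e.g.
#     <str name="host">192.168.1.10</str>
#     <int name="hostPort">${jetty.port:8983}</int>
grep -A 4 '<solrcloud>' /var/solr/data/solr.xml   # check what the node currently advertises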

Ended up being a ZooKeeper config mismatch.
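For anyone hitting the same thing, one common form of mismatch is the nodes not agreeing on the same ensemble definition; a sketch of a consistent setup (hostnames are placeholders):
# every Solr node should point at the same ensemble string, e.g. in solr.in.sh:
ZK_HOST="zk1:2181,zk2:2181,zk3:2181"
# and every zoo.cfg should list the same server entries with matching ids:
#   server.1=zk1:2888:3888
#   server.2=zk2:2888:3888
#   server.3=zk3:2888:3888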

Related

SOLR CLOUD - Uploading LTR features

We have a SOLR Cloud cluster with 4 nodes. Collections are created with 4 shards and 2 replicas.
I was using REST endpoints (pointing to a single instance for all operations), to create feature(s) and model(s).
http://{{SOLRCLOUD-HOST}}:8983/solr/{{ACTIVE_INDEX_NAME}}/schema/feature-store
http://{{SOLRCLOUD-HOST}}:8983/solr/{{ACTIVE_INDEX_NAME}}/schema/model-store
When I execute the REST endpoints to fetch the existing feature(s) and model(s)
http://{{SOLRCLOUD-HOST}}:8983/solr/{{ACTIVE_INDEX_NAME}}/schema/feature-store
http://{{SOLRCLOUD-HOST}}:8983/solr/{{ACTIVE_INDEX_NAME}}/schema/model-store
I sometimes see the feature/model I created, and other times it says they don't exist.
At this point, when I restart my cluster, the GET calls always return the created features and models.
A couple of questions:
Like config sets, is there a way to upload features and models without using the REST endpoints?
Is a restart required after uploading features and models?
Should the feature/model upload be executed against all collections in the cluster? (Assume I have more than one collection with the same data, created for different purposes; please don't ask why, I have them.)
Are the features/models available for collections created in the future with the same config set? I ask because the uploaded feature/model is seen inside the config set as _schema_model-store.json and _schema_feature-store.json.
Please advise. Thanks!
Did you find any answers?
I was stuck with feature-store not being available on all shards. Your suggestion of restarting solr helped. Is that the permanent solution?
To answer your Question #3:
You need to upload the features/models for each collection, since the collection is part of the upload URL; notice the "techproducts" in this feature upload example from the Solr docs:
curl -XPUT 'http://localhost:8983/solr/techproducts/schema/feature-store' --data-binary "@/path/myFeatures.json" -H 'Content-type:application/json'
Just reload the collection to make the feature and model JSON files available on all shards of the collection; a restart of Solr is not required.
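The reload itself is just a Collections API call, something like this (host and collection name are placeholders):
curl 'http://localhost:8983/solr/admin/collections?action=RELOAD&name=techproducts'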

solr error Error getting leader from zk

org.apache.solr.common.SolrException: There is conflicting information about the leader of shard: shard2 our state says:http://xxxxx:9003/solr/collectionname_shard2_replica1/ but zookeeper says:http://xxxxxx:9006/solr/collectionname_shard2_replica1/
at org.apache.solr.cloud.ZkController.getLeader(ZkController.java:1013)
at org.apache.solr.cloud.ZkController.register(ZkController.java:940)
at org.apache.solr.cloud.ZkController.register(ZkController.java:883)
at org.apache.solr.core.ZkContainer$2.run(ZkContainer.java:184)
The above error is displayed in the Solr admin console. 9003 is the valid instance. I want to remove 9006 from clusterstate.json and the leader file. How?
Look into your Solr GUI under Cloud -> Tree. Make sure that the folder /overseer_elect/election contains only your current Solr instances.
A simple way to recognize whether there are dead Solr instances in the /overseer_elect/election folder is to shut down Solr and then use the zkCli.sh ZooKeeper script to look into the /overseer_elect/election folder. If you still have entries in this folder, you have dead Solr instances. To solve this issue, remove those entries with the zkCli.sh script and restart Solr.
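A sketch of that cleanup (the ZooKeeper address and the election entry name are placeholders; only delete entries belonging to instances you know are gone):
# with all Solr instances stopped, open the ZooKeeper CLI
./zkCli.sh -server localhost:2181
ls /overseer_elect/election
# delete the entries that belong to the dead node (here the one on port 9006)
delete /overseer_elect/election/<entry-for-the-dead-9006-node>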

Solr with Zookeeper and own schema

I'm a newbie to SOLR and there's a problem I can't solve so far: when I start SOLR Cloud with ZooKeeper, I'd like to create a collection with my own schema. However, SOLR only loads the default 'example-data-driven-schema'.
Any suggestion as to what I should do in order to apply my own schema?
In order to create a new collection with your own schema, you need to use zkcli.sh and the SolrCloud Collections API.
In particular, you could:
a) upload to ZooKeeper (using the Solr zkcli script) the configuration directory for your new collection, for instance under the name
<my_new_config>
Examples of Solr zkcli commands to upload your changes to ZooKeeper can be found here.
In particular, if you want to upload your configuration directory on Zk, you can:
STEP 1) run the command:
./server/scripts/cloud-scripts/zkcli.sh -zkhost 127.0.0.1:9983 -cmd upconfig -confname my_new_config -confdir server/solr/configsets/basic_configs/conf
STEP 2) Restart your Solr nodes so they can pick up the configuration changes.
Please remember that if you wish to replace an existing file in Zk you will need to use zkCli.sh clear to delete the existing one from ZooKeeper and then the putfile command to add the new one.
b) call the following API from your browser:
/admin/collections?action=CREATE&name=<my_collection_name>&collection.configName=<my_new_config>
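Putting parts a) and b) together, a minimal sketch (the ZooKeeper address, config directory, config name, and collection name are placeholders):
# upload the config set to ZooKeeper
./server/scripts/cloud-scripts/zkcli.sh -zkhost 127.0.0.1:9983 -cmd upconfig -confname my_new_config -confdir /path/to/my/conf
# create a collection that uses it
curl 'http://localhost:8983/solr/admin/collections?action=CREATE&name=my_collection_name&numShards=1&replicationFactor=1&collection.configName=my_new_config'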

Solr Cloud : no servers hosting shard

We have a cluster of standalone Solr cores (Solr 4.3) for which we had built some custom plugins. I'm now trying to prototype converting the cluster to a Solr Cloud cluster. This is how I am trying to deploy the cores (in 4.7.2).
Start solr with zookeeper embedded.
java -DzkRun -Djetty.port=8985 -jar start.jar
upload a config into Zookeeper (same config as the standalone cores)
zkcli.bat -zkhost localhost:9985 -cmd upconfig -confdir myconfig -confname myconfig
Create a new collection (mycollection) of 2 shards using the Collections API
http://localhost:8985/solr/admin/collections?action=CREATE&name=mycollection&numShards=2&replicationFactor=1&maxShardsPerNode=2&collection.configName=myconfig
So at this point I have two shards under my solr directory with the appropriate core.properties
But when I go to http://localhost:8985/solr/#/~cloud, I see that the two shards' status is "Down" when they are supposed to be active by default.
And when I try to index documents in them using SolrJ (via the CloudSolrServer API), I get the error "No live SolrServers available to handle this request". I restarted Solr, but the issue persists.
// SolrJ 4.x cloud client: connect via ZooKeeper and index into the collection
private CloudSolrServer cloudSolr;
cloudSolr = new CloudSolrServer(zkHOST);
cloudSolr.setZkClientTimeout(zkClientTimeout);
cloudSolr.setDefaultCollection(collectionName);
cloudSolr.connect();
cloudSolr.add(doc);
What am I doing wrong? I did a lot of digging around and saw an old JIRA issue saying that SolrCloud shards won't be active until there are some documents in the index. If that is the reason, that's kind of a catch-22, isn't it?
So anyway, I also tried adding some test documents manually and committing, to see if things improved. Now the shard statistics page correctly gives me the numDocs count, but when I try to query it says "no servers hosting shard". I next tried passing shards.tolerant=true as a query parameter and searching, but no cigar. It says 0 documents found.
Any help would be appreciated. My main objective is to rebuild the old standalone cores using SolrCloud and test whether our custom request handlers still work as expected. And at this point, I can't index documents into the 4.7 SolrCloud collection I have created.
Thanks and Regards

Query solr cluster for state of nodes

I'm trying to tweak our system status check to see the state of the Solr nodes in our SolrCloud. I'm facing the following problems:
We send a query to each of the Solr nodes separately. If we get a response and the status of the response is 0, we assume the node is running. Unfortunately, we've seen cases in which the node is recovering or even down and select queries are still handled.
Hoping to prevent this, we've added a check which sends a ping request to Solr. If the status returned by this request reads 'OK', we assume the node is up. Unfortunately, even with this request, the check won't fail if the node is recovering or down.
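For reference, the check is roughly this (hostname and collection are placeholders):
curl 'http://solr-node:8983/solr/mycollection/admin/ping?wt=json'
# returns "status":"OK" even while the replica on this node is recovering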
My question is: What is the correct way to check the status of a node in SolrCloud?
If you are using SolrCloud, it's recommended to maintain an explicit ZooKeeper ensemble as well, because the ZooKeeper ensemble maintains the current status of each node and each shard in the SolrCloud. This status is reflected in the SolrCloud admin window.
Go to the Admin window. Click on "Cloud".
Then click on "Tree" to get a tree view of your SolrCloud architecture.
Click /clusterstate.json to view the SolrCloud status.
This clusterstate.json file holds the SolrCloud status information. If you are running an explicit ZooKeeper ensemble, the following are the steps to get the SolrCloud status:
Go to the path "zookeeper/installation/directory/bin"
Execute ./zkCli.sh -server ZK_IP:ZK_PORT (e.g. ./zkCli.sh -server localhost:2181)
Execute get /clusterstate.json
You'll find the SolrCloud status.
Note: ZK_IP is the host IP where ZooKeeper is running; ZK_PORT is ZooKeeper's client port.
You actually don't want /clusterstate.json, as this only covers the case where collections are already present. From ZooKeeper you need /live_nodes.
Because ZooKeeper is the authority for which Solr nodes are members of the SolrCloud cluster, it follows that you should go to it first to discover what members are accessible. This is how all SolrCloud clients work, and it is probably the best way to approach the problem.
/live_nodes contains an entry for each live Solr node, regardless of what collections exist or where the replicas are located.
Once you have read /live_nodes, you can call CLUSTERSTATUS on any Solr instance, using the address and port from one of the live nodes.
http://localhost:8983/solr/admin/collections?action=clusterstatus&wt=json
clusterstatus provides a detailed overview of Solr nodes, collections, replicas, etc. Everything you would want to know.
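A sketch of the two-step check (the ZooKeeper address and node addresses are placeholders):
# 1) ask ZooKeeper which Solr nodes are alive
./zkCli.sh -server localhost:2181 ls /live_nodes
#    e.g. [192.168.1.10:8983_solr, 192.168.1.11:8983_solr]
# 2) ask any live node for the full cluster status (collections, shards, replica states)
curl 'http://192.168.1.10:8983/solr/admin/collections?action=CLUSTERSTATUS&wt=json'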
As a final note, it's very wise to set SOLR_HOST in the solr.in.sh configuration (/etc/default/solr.in.sh); by default 'localhost' is used to reference the Solr node. Setting this value to the public address you want the Solr node identified by will prevent ZooKeeper from returning the address "localhost" to clients when they attempt to reach a Solr node.
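A minimal sketch, assuming a package install with solr.in.sh at /etc/default/solr.in.sh and a placeholder IP:
# /etc/default/solr.in.sh
SOLR_HOST="192.168.1.10"   # advertise this node by its routable address, not localhost
# restart Solr afterwards so the node re-registers in ZooKeeper under that address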
