I am having an interesting situation with SolrCloud. Basically, I don't know why, but a Solr instance that is not normally part of the cloud is displayed on the SolrCloud page and is also visible under the live_nodes path in ZooKeeper.
Here are details about the situation:
I have one Solr instance, running as a standalone application on a virtual machine, located on a remote machine. We will call it virtual1 from now on.
This is the script for running it:
java
-server
-XX:+UnlockExperimentalVMOptions
-XX:+UseG1GC
-XX:+UseCompressedStrings
-Dcom.sun.management.jmxremote
-d64
-Xmx4096m
-Dcom.sun.management.jmxremote.port=9999
-Dcom.sun.management.jmxremote.ssl=false
-Dcom.sun.management.jmxremote.authenticate=false
-Djava.rmi.server.hostname=remotehost -jar start.jar
This instance is running on port 8983, so when you go to virtual1:8983, you see the classic Solr admin page. The rest of the configuration is identical to the example Solr that comes with the Solr distribution.
Then, on my local machine (called local from now on), I am running my ZooKeeper servers on ports 2181 and 2182.
Then, to add my Solr instances to the cloud, I am running one instance on my local machine and two more on virtual1; the scripts for starting them are below:
Solr Instance on my local:
java -Dbootstrap_conf=true -DzkHost=zkhost:2181 -Djetty.port=8984 -jar start.jar
Solr Instances on remote:
java -DzkHost=zkhost:2181 -Djetty.port=8985 -jar start.jar
java -DzkHost=zkhost:2182 -Djetty.port=8986 -jar start.jar
Up to this point, there are no exceptions or errors in either the Solr or the ZooKeeper logs.
When I check virtual1:8985 and virtual1:8986, they are all running, as is the instance on my local machine.
But when I check the cloud (both from the Solr admin page and from the ZooKeeper CLI), I can only see local:8983 and virtual1:8983 in the cloud, while virtual1:8985 and virtual1:8986 are not added at all... The weird point is that virtual1:8983 doesn't know anything about the ZooKeeper servers, as you can see from the start scripts above.
In addition to the facts above, I have tried another thing. On another virtual machine (virtual2), which runs on the same physical machine as virtual1, I have created Solr instances with:
java -DzkHost=zkhost:2181 -Djetty.port=8985 -jar start.jar
java -DzkHost=zkhost:2182 -Djetty.port=8986 -jar start.jar
So in this case I should have instances at virtual2:8985 and virtual2:8986, which should be in the cloud. But that doesn't happen... I can only see virtual2:8983, which does not actually exist. It simply shows the standalone Solr's port, which is running on virtual1.
Can anyone explain why this is happening?
You should try removing the ZooKeeper data and specifying -Dbootstrap_confdir=/path/to/conf/dir, which pushes configs and node state to ZooKeeper. Also, if you're running a ZooKeeper ensemble, check each node's state (leader or follower, or whether the node accepts connections at all) with
echo stat | nc zk_host zk_port
And you can use zkCli.sh to see the cluster state with
get /clusterstate.json
Also check your ZooKeeper configs; they should look something like this:
clientPort=2181 (2182 for the second instance)
server.1=zk_host:2888:3888
server.2=zk_host:2889:3889
(If both servers run on the same host, as in your setup, server.2 must use different quorum and election ports than server.1.)
And check both the Solr and ZooKeeper logs for successful connections.
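One more thing worth checking, as a hedged suggestion: every Solr node that should join the same cloud has to point at the same ensemble, and zkHost accepts a comma-separated list of servers. A sketch of what the start commands would then look like (host names follow the question's naming, ports as above):

java -Dbootstrap_conf=true -DzkHost=local:2181,local:2182 -Djetty.port=8984 -jar start.jar
java -DzkHost=local:2181,local:2182 -Djetty.port=8985 -jar start.jar
java -DzkHost=local:2181,local:2182 -Djetty.port=8986 -jar start.jar

If each node is pointed at a different standalone ZooKeeper, the nodes will not see each other as one cluster.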
For a customer, I need to write a search engine running on Linux. I am using SolrJ and have not configured anything else so far.
I followed https://lucene.apache.org/solr/guide/7_4/using-solrj.html#common-build-systems and thus added SolrJ to the project pom.xml, and also followed that tutorial.
The Solr client is instantiated like:
solrClient = new HttpSolrClient.Builder(
        GeneralSettings.getRootSolrPath() + "/" + getCollectionName())
    .build();
But for any query or commit I keep getting org.apache.solr.client.solrj.SolrServerException: Server refused connection at: http://localhost:8983/solr/test. I read http://lucene.472066.n3.nabble.com/Default-query-error-quot-Server-refused-connection-quot-td4010806.html but I am already using the expected port.
My understanding of the javadoc ("SolrClient's handle the work of connecting to and communicating with Solr, and are where most of the user configuration happens.") is that I only need to import the jar and then everything will work out of the box.
But as I keep getting this "Server refused connection" error, I probably have to configure something, yet I could not find how to configure SolrJ (use solrconfig.xml or core.properties, call System.setProperty, or call an API).
Please note that Apache may be running somewhere because I used to test some sites on it.
So how to get rid of this "Server refused connection" error?
Any help or tutorial on setting SolrJ up based on the available Solr docs would be very much appreciated.
Edit 2018-08-12 16:10
I thought SolrJ could work like Lucene, without a server, but it looks like I missed one essential piece: installing Solr (see https://www.baeldung.com/apache-solrj). I'll give it a try and post updates.
In case it might help someone else starting with SolrJ, here are the steps I took to get rid of the error mentioned in the title (actually I followed https://www.baeldung.com/apache-solrj):
Downloaded the latest binary release of Solr
Extracted it somewhere
CDed into that dir
Launched bin/solr start from that dir
Created a core with bin/solr create -c coreName (maybe another way exists, but I haven't been able to make it work so far)
Then Solr was running and listening on port 8983, and my Java app could connect to it via SolrJ.
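For reference, a minimal SolrJ sketch of the connection check I ended up with; the core name coreName is a placeholder for whatever you created with bin/solr create:

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.response.SolrPingResponse;

public class SolrPingCheck {
    public static void main(String[] args) throws Exception {
        // point the client at the running core; adjust host, port and core name
        SolrClient client = new HttpSolrClient.Builder("http://localhost:8983/solr/coreName").build();
        SolrPingResponse ping = client.ping(); // throws SolrServerException if the server refuses the connection
        System.out.println("Ping status: " + ping.getStatus()); // 0 means OK
        client.close();
    }
}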
org.apache.solr.common.SolrException: There is conflicting information about the leader of shard: shard2 our state says:http://xxxxx:9003/solr/collectionname_shard2_replica1/ but zookeeper says:http://xxxxxx:9006/solr/collectionname_shard2_replica1/
at org.apache.solr.cloud.ZkController.getLeader(ZkController.java:1013)
at org.apache.solr.cloud.ZkController.register(ZkController.java:940)
at org.apache.solr.cloud.ZkController.register(ZkController.java:883)
at org.apache.solr.core.ZkContainer$2.run(ZkContainer.java:184)
The above error is displayed in the Solr admin console. 9003 is the valid instance. I want to remove 9006 from clusterstate.json and the leader file. How?
Look into your Solr GUI under Cloud -> Tree. Make sure that the folder /overseer_elect/election contains only your current Solr instances.
A simple way to recognize whether there are dead Solr instances in the /overseer_elect/election folder is to shut down Solr and then use ZooKeeper's zkCli.sh script to look into the /overseer_elect/election folder. If you still have entries in this folder, you have dead Solr instances. To solve this issue, remove those entries with the zkCli.sh script and restart Solr.
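As a sketch with ZooKeeper's zkCli.sh (the entry name below is a placeholder; real entries include the node name and a sequence number):

# connect to your ZooKeeper (adjust host and port)
zkCli.sh -server localhost:2181

# inside the zkCli shell:
ls /overseer_elect/election
delete /overseer_elect/election/<entry-of-a-dead-instance>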
I am using Apache Solr 6.5 with the Query Elevation Component (the /elevate endpoint) to support elevation of particular documents in my queries.
I am using an elevate.xml file in the
<instanceDir>/conf/<config-file>
directory. However, on the production server (Solr configured as a cloud with ZooKeeper maintaining the configuration), changes to elevate.xml are not loaded. To check whether it works, I just query the elevated phrase, and as a result I get elevated documents from the previous version of elevate.xml.
Of course I am restarting all Solr cloud instances after loading the new version of elevate.xml, and the updated file is visible in the Files section of Solr's Admin UI (for the particular core, of course).
The query I am using to test results, to prove that I am not using the standard /query component:
/elevate?df=name&fl=id,name,[elevated]&indent=on&q=heart&wt=json
What should I do to actually tell Solr that the new elevate.xml should be loaded? This works fine in my development standalone configuration of Solr (not the cloud one): after the Solr service is restarted, the documents are elevated according to the updated elevate.xml file.
You should be doing this:
upload the elevate.xml file to ZooKeeper, as explained here
reload the collection with the RELOAD Collections API call, no need to restart Solr (see the sketch below)
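A sketch of those two steps (paths, config name and collection name are placeholders for your setup):

# push the updated config, including elevate.xml, to ZooKeeper
server/scripts/cloud-scripts/zkcli.sh -zkhost localhost:2181 -cmd upconfig -confdir /path/to/conf -confname myconfig

# reload the collection so the change takes effect, without restarting Solr
curl "http://localhost:8983/solr/admin/collections?action=RELOAD&name=mycollection"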
OK, I think I figured it out. The problem was with the directory in which Solr expected the elevate.xml file.
The documentation states (as of version 6.5.0):
config-file
Path to the file that defines query elevation.
This file must exist in <instanceDir>/conf/<config-file> or <dataDir>/<config-file>.
In my dev configuration that was true for both cases (the file was read from either the conf or the data directory). However, on production (in a cloud environment with external ZooKeepers), the file was read from the root directory of the core's (aka collection's) config, where solrconfig.xml and schema.xml also live.
There is probably some explanation for it, and that location may just be the dataDir itself, but since I didn't know how to check the values of the instanceDir and dataDir variables, the documentation was misleading for me.
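(For what it's worth, if I read the API correctly, the Core Admin STATUS call reports both values per core:

http://localhost:8983/solr/admin/cores?action=STATUS&wt=json

Each core's entry in the response contains its instanceDir and dataDir.)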
I hope it helps other Solr adepts.
We have a cluster of standalone Solr cores (Solr 4.3) for which we had built some custom plugins. I'm now trying to prototype converting the cluster to a SolrCloud cluster. This is how I am trying to deploy the cores (in 4.7.2):
Start Solr with ZooKeeper embedded:
java -DzkRun -Djetty.port=8985 -jar start.jar
Upload a config into ZooKeeper (the same config as the standalone cores):
zkcli.bat -zkhost localhost:9985 -cmd upconfig -confdir myconfig -confname myconfig
Create a new collection (mycollection) of 2 shards using the Collections API:
http://localhost:8985/solr/admin/collections?action=CREATE&name=mycollection&numShards=2&replicationFactor=1&maxShardsPerNode=2&collection.configName=myconfig
So at this point I have two shards under my solr directory with the appropriate core.properties.
But when I go to http://localhost:8985/solr/#/~cloud, I see that the two shards' status is "Down", although they are supposed to be active by default.
And when I try to index documents in them using SolrJ (via the CloudSolrServer API), I get the error "No live SolrServers available to handle this request". I restarted Solr, but the issue persists.
private CloudSolrServer cloudSolr;
// connect to the ZooKeeper ensemble and target the collection
cloudSolr = new CloudSolrServer(zkHOST);
cloudSolr.setZkClientTimeout(zkClientTimeout);
cloudSolr.setDefaultCollection(collectionName);
cloudSolr.connect();
cloudSolr.add(doc);
What am I doing wrong? I did a lot of digging around and saw an old Jira issue saying that SolrCloud shards won't be active until there are some documents in the index. If that is the reason, it's kind of a catch-22, isn't it?
So anyway, I also tried adding some test documents manually and committing, to see if things improved. Now the shard statistics page correctly gives me the numDocs count, but when I try to query, it says "no servers hosting shard". I next tried passing shards.tolerant=true as a query parameter and searching, but no cigar. It says 0 documents found.
Any help would be appreciated. My main objective is to rebuild the old standalone cores using SolrCloud and test whether our custom request handlers still work as expected. And at this point, I can't index documents into the 4.7 SolrCloud collection I have created.
Thanks and Regards
I've got a setup with 3 ZooKeeper nodes and 4 SolrCloud nodes.
This is all working; all nodes see each other, and I initially had a default collection.
From there, I used the Collections API to create a new collection, which completed successfully; it is sharded across 2 nodes, with the other 2 being used for replicas. I can also successfully save documents to that collection. Browsing the Solr web GUI on any of the boxes works, with no speed issues.
However, any time I try to use the Collections API, I get timeouts. Creating a new collection, reloading one of the existing collections, deleting a collection... all of them time out.
Any thoughts on why would be much appreciated
Cheers
I also faced a similar issue:
Solr process 24214 running on port 8983
Failed to get system information from http://localhost:8983/solr/ due to: org.apache.solr.client.solrj.SolrServerException: clusterstatus the collection time out:180s
at org.apache.solr.util.SolrCLI.getJson(SolrCLI.java:537)
at org.apache.solr.util.SolrCLI.getJson(SolrCLI.java:471)
at org.apache.solr.util.SolrCLI$StatusTool.getCloudStatus(SolrCLI.java:721)
at org.apache.solr.util.SolrCLI$StatusTool.reportStatus(SolrCLI.java:704)
at org.apache.solr.util.SolrCLI$StatusTool.runTool(SolrCLI.java:662)
at org.apache.solr.util.SolrCLI.main(SolrCLI.java:215)
To solve this issue, I followed the steps below (a command sketch follows the list):
Stop all Solr instances
Stop all Zookeeper instances
Start all Zookeeper instances
Start Solr instances one at a time.
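A rough sketch of that sequence, assuming the standard bin/solr and zkServer.sh scripts and placeholder ZooKeeper host names:

bin/solr stop -all                                  # on every Solr host
zkServer.sh stop                                    # on every ZooKeeper host
zkServer.sh start                                   # on every ZooKeeper host, once all are stopped
bin/solr start -c -z zk1:2181,zk2:2181,zk3:2181     # on each Solr host, one at a time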
Such a timeout can occur when Solr is not able to obtain the cluster state. If the following call also results in a timeout, then this is the case:
http://solr-hostname:8983/solr/admin/collections?action=CLUSTERSTATUS&wt=json
This may be caused by incorrect entries in /clusterstate.json.
To fix this:
get clusterstate from ZooKeeper by calling
zkcli.sh -zkhost localhost:2181 -cmd get /clusterstate.json > clusterstate.json
edit the extracted clusterstate.json file and remove the sections with wrong IPs or non-existing hosts
clear the clusterstate in ZooKeeper by calling
zkcli.sh -zkhost localhost:2181 -cmd clear /clusterstate.json
save corrected state in ZooKeeper by sending updated JSON file
zkcli.sh -zkhost localhost:2181 -cmd putfile /clusterstate.json ./clusterstate.json
restart Solr instances
After that, if your cluster state shows correct info, you should no longer have timeouts when accessing the Collections API.
Note
Be careful when editing the clusterstate JSON; limit your changes to removing non-existing hosts/replicas/shards.
I also had timeout issues with the Collections API. To fix the problem, I added the server's IP address to the solr.xml file found at /var/solr/data/solr.xml. My setup consists of 3 Ubuntu servers, each running ZooKeeper (3.4.6) and SolrCloud (5.2.1).
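As a hedged sketch of the fragment I mean (the IP address is a placeholder; host and hostPort are the standard solrcloud settings in solr.xml):

<solr>
  <solrcloud>
    <!-- use the address the other nodes can actually reach -->
    <str name="host">192.168.1.10</str>
    <int name="hostPort">${jetty.port:8983}</int>
  </solrcloud>
</solr>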
Ended up being a ZooKeeper config mismatch.