I am using an external ZooKeeper for testing on my local system. The steps I followed are as below.
Step 1. Created 3 ZooKeeper servers, each with a data directory containing a myid file holding a unique number (1, 2, 3 respectively).
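A minimal sketch of this step (the directory layout is an assumption):
```
# one data directory per server, each with a unique myid
for i in 1 2 3; do
  mkdir -p zk$i/data
  echo $i > zk$i/data/myid
done
```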
Step 2. Started all three ZooKeeper servers using the command
./zkServer.sh start
Step 3. check status of each server 2 showing status as leader and remaining 2 as Mode: follower
Step 4. Tried to run the SolrCloud example as
/opt/solr$ bin/solr start -e cloud -z localhost:2181,localhost:2182,localhost:2183
It asks me for the number of shards, replicas, etc. Then the system asks for a collection name; I entered test, but it throws an exception like:
```
basic_configs, data_driven_schema_configs, or sample_techproducts_configs [data_driven_schema_configs]
Exception in thread "main" org.apache.solr.client.solrj.SolrServerException: Error loading config name for collection phrases
at org.apache.solr.util.SolrCLI.getJson(SolrCLI.java:537)
at org.apache.solr.util.SolrCLI.getJson(SolrCLI.java:471)
at org.apache.solr.util.SolrCLI$StatusTool.getCloudStatus(SolrCLI.java:721)
at org.apache.solr.util.SolrCLI$StatusTool.reportStatus(SolrCLI.java:704)
at org.apache.solr.util.SolrCLI.getZkHost(SolrCLI.java:1160)
at org.apache.solr.util.SolrCLI$CreateCollectionTool.runTool(SolrCLI.java:1210)
at org.apache.solr.util.SolrCLI.main(SolrCLI.java:215)
Enabling auto soft-commits with maxTime 3 secs using the Config API
POSTing request to Config API: http://localhost:8990/solr/sai/config
{"set-property":{"updateHandler.autoSoftCommit.maxTime":"3000"}}
Exception in thread "main" org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://localhost:8990/solr: Expected mime type application/octet-stream but got text/html. <html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>
<title>Error 404 Not Found</title>
</head>
<body><h2>HTTP ERROR 404</h2>
<p>Problem accessing /solr/sai/config. Reason:
<pre> Not Found</pre></p><hr><i><small>Powered by Jetty://</small></i><hr/>
</body>
</html>
at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:529)
at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:235)
at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:227)
at org.apache.solr.client.solrj.SolrClient.request(SolrClient.java:1220)
at org.apache.solr.util.SolrCLI.postJsonToSolr(SolrCLI.java:1882)
at org.apache.solr.util.SolrCLI$ConfigTool.runTool(SolrCLI.java:1856)
at org.apache.solr.util.SolrCLI.main(SolrCLI.java:215)
SolrCloud example running, please visit http://localhost:8990/solr
```
It shows the error Exception in thread "main" org.apache.solr.client.solrj.SolrServerException: Error loading config name for collection phrases, yet the collection I am trying to create is test.
If you are trying to create a collection on SolrCloud with your custom configuration, you have to upload it to ZooKeeper first. Then you can create a collection using that config. You can also check which configurations are currently in SolrCloud through the Solr admin UI (http://localhost:8983/solr/#/~cloud?view=tree).
Uploading configuration to ZooKeeper:
Use the ZooKeeper client bundled with Solr (solr-6.5.1\server\scripts\cloud-scripts):
zkcli -zkhost <zookeeper host> -cmd upconfig -confname <configname> -solrhome <solr home directory> -confdir <config directory path>
ex: zkcli -zkhost localhost:2181 -cmd upconfig -confname sampleconfig -solrhome ../solr -confdir ../../solr/configsets/sampleconfig/conf
Now that your configuration is uploaded, you can create a collection on SolrCloud.
Start SolrCloud with the external ZooKeeper:
solr start -c -z localhost:2181
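With the three-node ensemble from the question, the full host list is passed the same way:
solr start -c -z localhost:2181,localhost:2182,localhost:2183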
Create the collection:
http://localhost:8983/solr/admin/collections?action=CREATE&name=<collectionname>&numShards=1&replicationFactor=1&collection.configName=<configname>
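For example, the same call from the command line, using the config uploaded above (the collection name test is taken from the question):
curl "http://localhost:8983/solr/admin/collections?action=CREATE&name=test&numShards=1&replicationFactor=1&collection.configName=sampleconfig"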
-e cloud is an example provided by Solr, and by default it works with the embedded ZooKeeper.
For an external ZooKeeper setup, refer to either of the below:
http://amn-solr.blogspot.in/
SolrCloud 5 and Zookeeper config upload
I am trying to set up, on my local laptop, 3 Solr instances, 3 ZooKeeper instances, and 1 load balancer.
I have followed the posts https://www.codehousegroup.com/insight-and-inspiration/tech-stream/how-to-configure-sitecore-with-solr-cloud and https://medium.com/@sarkaramrit2/setting-up-solr-cloud-6-3-0-with-zookeeper-3-4-6-867b96ec4272, and a few others.
I have successfully set up the 3 Solr instances on my laptop; the URLs are as follows:
https://solrcloud1:6161/solr/#/
https://solrcloud2:6162/solr/#/
https://solrcloud3:6163/solr/#/
I have installed the local load balancer "GoBetween" and mapped my 3 Solr paths above in it. When I hit the URL https://solrcloud:3010/solr/#/, I receive responses from the different Solr instances, so it looks like it's working fine as well.
ZooKeeper
I have downloaded ZooKeeper and placed it in all 3 Solr locations as well.
E.g.: SolrCloud1 = \LocalSolrCloud\SolrCloud1\ contains "solr-8.8.2" and "zookeeper-3.5.6"; the same structure applies for SolrCloud2 and SolrCloud3.
In each ZooKeeper location, I have created a data folder with a "myid" file containing the value "1", "2", or "3" respectively.
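For example, from a Windows command prompt the three files could be created like this (paths taken from the layout above):
```
echo 1 > \LocalSolrCloud\SolrCloud1\zookeeper-3.5.6\data\myid
echo 2 > \LocalSolrCloud\SolrCloud2\zookeeper-3.5.6\data\myid
echo 3 > \LocalSolrCloud\SolrCloud3\zookeeper-3.5.6\data\myid
```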
Each ZooKeeper's "zoo.cfg" file contains the below:
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/SolrCloud1/zookeeper-3.5.6/data
clientPort=2181
autopurge.snapRetainCount=4
autopurge.purgeInterval=24
server.1=localhost:2888:3888
server.2=localhost:2889:3889
server.3=localhost:2890:3890
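Note that with all three servers on one machine, each instance's zoo.cfg would normally need its own dataDir and a distinct clientPort; a sketch of how server 2's file would differ (the port value is an assumption):
```
dataDir=/SolrCloud2/zookeeper-3.5.6/data
clientPort=2182
```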
When I run ZooKeeper from the command prompt, I get the below error.
2022-12-15 11:47:20,051 [myid:1] - WARN [WorkerSender[myid=1]:QuorumCnxManager@679] -
Cannot open channel to 2 at election address localhost/127.0.0.1:3889
java.net.ConnectException: Connection refused: connect
at java.net.DualStackPlainSocketImpl.waitForConnect(Native Method)
at java.net.DualStackPlainSocketImpl.socketConnect(DualStackPlainSocketImpl.java:85)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:172)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:589)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:650)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:707)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.toSend(QuorumCnxManager.java:620)
at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.process(FastLeaderElection.java:477)
at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.run(FastLeaderElection.java:456)
at java.lang.Thread.run(Thread.java:748)
I am trying to apply patch LUCENE-2899.patch to Solr.
I have done this:
Cloned Solr from the official repo (I am on the master branch)
Downloaded and installed Ant and GNU patch; I got the latter here: http://gnuwin32.sourceforge.net/packages/patch.htm
Added Ant and GNU patch to the PATH env var.
And I got this...
```
D:\utils\solr_master\lucene-solr>patch -p1 -i LUCENE-2899.patch --dry-run
patching file dev-tools/idea/.idea/ant.xml
Assertion failed: hunk, file ../patch-2.5.9-src/patch.c, line 354
This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.
```
UPDATE 1
I tried to compile, but the build failed.
```
D:\utils\solr_master\lucene-solr>ant compile
Buildfile: D:\utils\solr_master\lucene-solr\build.xml
BUILD FAILED
D:\utils\solr_master\lucene-solr\build.xml:21: The following error occurred while executing this line:
D:\utils\solr_master\lucene-solr\lucene\common-build.xml:623: java.lang.NullPointerException
at java.util.Arrays.stream(Arrays.java:5004)
at java.util.stream.Stream.of(Stream.java:1000)
at java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:267)
at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
at java.util.Spliterators$ArraySpliterator.forEachRemaining(Spliterators.java:948)
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:545)
at java.util.stream.AbstractPipeline.evaluateToArrayNode(AbstractPipeline.java:260)
at java.util.stream.ReferencePipeline.toArray(ReferencePipeline.java:438)
at org.apache.tools.ant.util.ChainedMapper.lambda$mapFileName$1(ChainedMapper.java:36)
at java.util.stream.ReduceOps$1ReducingSink.accept(ReduceOps.java:80)
at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1382)
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
at java.util.stream.ReferencePipeline.reduce(ReferencePipeline.java:484)
at org.apache.tools.ant.util.ChainedMapper.mapFileName(ChainedMapper.java:35)
at org.apache.tools.ant.util.CompositeMapper.lambda$mapFileName$0(CompositeMapper.java:32)
at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1382)
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:545)
at java.util.stream.AbstractPipeline.evaluateToArrayNode(AbstractPipeline.java:260)
at java.util.stream.ReferencePipeline.toArray(ReferencePipeline.java:438)
at org.apache.tools.ant.util.CompositeMapper.mapFileName(CompositeMapper.java:33)
at org.apache.tools.ant.taskdefs.PathConvert.execute(PathConvert.java:363)
at org.apache.tools.ant.UnknownElement.execute(UnknownElement.java:292)
at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.tools.ant.dispatch.DispatchUtils.execute(DispatchUtils.java:106)
at org.apache.tools.ant.Task.perform(Task.java:346)
at org.apache.tools.ant.Target.execute(Target.java:448)
at org.apache.tools.ant.helper.ProjectHelper2.parse(ProjectHelper2.java:172)
at org.apache.tools.ant.taskdefs.ImportTask.importResource(ImportTask.java:221)
at org.apache.tools.ant.taskdefs.ImportTask.execute(ImportTask.java:165)
at org.apache.tools.ant.UnknownElement.execute(UnknownElement.java:292)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.tools.ant.dispatch.DispatchUtils.execute(DispatchUtils.java:106)
at org.apache.tools.ant.Task.perform(Task.java:346)
at org.apache.tools.ant.Target.execute(Target.java:448)
at org.apache.tools.ant.helper.ProjectHelper2.parse(ProjectHelper2.java:183)
at org.apache.tools.ant.ProjectHelper.configureProject(ProjectHelper.java:93)
at org.apache.tools.ant.Main.runBuild(Main.java:824)
at org.apache.tools.ant.Main.startAnt(Main.java:228)
at org.apache.tools.ant.launch.Launcher.run(Launcher.java:283)
at org.apache.tools.ant.launch.Launcher.main(Launcher.java:101)
Total time: 0 seconds
```
UPDATE 2
I have downloaded Solr from
https://builds.apache.org/job/Solr-Artifacts-7.3/lastSuccessfulBuild/artifact/solr/package/ and https://builds.apache.org/job/Solr-Artifacts-master/lastSuccessfulBuild/artifact/solr/package/
but in neither the 7.3 version nor the 8.0 (master) version do I see an opennlp dir in the contrib dir. Where can I find it?
UPDATE 3
I have run the version from the master branch, which I downloaded here: https://builds.apache.org/job/Solr-Artifacts-master/lastSuccessfulBuild/artifact/solr/package/ and I tried to run OpenNLP like the gentleman in this post:
Exception while integrating openNLP with Solr
But I get the same error as he did.
numberplate_shard1_replica_n1:
org.apache.solr.common.SolrException:org.apache.solr.common.SolrException: Could not load conf for core numberplate_shard1_replica_n1: Can't load schema managed-schema: Plugin init failure for [schema.xml] fieldType "text_opennlp_nvf": Plugin init failure for [schema.xml] analyzer/tokenizer: Error instantiating class: 'org.apache.lucene.analysis.opennlp.OpenNLPTokenizerFactory'
If patch LUCENE-2899 is merged into master, why do I have this error?
UPDATE 5
I restarted Solr and the errors were gone. But...
I was trying to add fields (to managed-schema) from the example (https://wiki.apache.org/solr/OpenNLP):
<fieldType name="text_opennlp" class="solr.TextField">
<analyzer>
<tokenizer class="solr.OpenNLPTokenizerFactory"
sentenceModel="opennlp/en-sent.bin"
tokenizerModel="opennlp/en-token.bin"
/>
</analyzer>
</fieldType>
<field name="content" type="text_opennlp" indexed="true" termOffsets="true" stored="true" termPayloads="true" termPositions="true" docValues="false" termVectors="true" multiValued="true" required="true"/>
But when I try to run Solr in Cloud mode I got this:
D:\utils\solr-7.3.0-7\solr-7.3.0-7\bin>solr -e cloud
Welcome to the SolrCloud example!
This interactive session will help you launch a SolrCloud cluster on your local workstation.
To begin, how many Solr nodes would you like to run in your local cluster? (specify 1-4 nodes) [2]:
1
Ok, let's start up 1 Solr nodes for your example SolrCloud cluster.
Please enter the port for node1 [8983]:
Solr home directory D:\utils\solr-7.3.0-7\solr-7.3.0-7\example\cloud\node1\solr already exists.
Starting up Solr on port 8983 using command:
"D:\utils\solr-7.3.0-7\solr-7.3.0-7\bin\solr.cmd" start -cloud -p 8983 -s "D:\utils\solr-7.3.0-7\solr-7.3.0-7\example\cloud\node1\solr"
Waiting up to 30 to see Solr running on port 8983
Started Solr server on port 8983. Happy searching!
INFO - 2018-03-26 14:42:26.961; org.apache.solr.client.solrj.impl.ZkClientClusterStateProvider; Cluster at localhost:9983 ready
Now let's create a new collection for indexing documents in your 1-node cluster.
Please provide a name for your new collection: [gettingstarted]
numberplate
Collection 'numberplate' already exists!
Do you want to re-use the existing collection or create a new one? Enter 1 to reuse, 2 to create new [1]:
1
Enabling auto soft-commits with maxTime 3 secs using the Config API
POSTing request to Config API: http://localhost:8983/solr/numberplate/config
{"set-property":{"updateHandler.autoSoftCommit.maxTime":"3000"}}
ERROR: Error from server at http://localhost:8983/solr: Expected mime type application/octet-stream but got text/html. <html>
<head>
<meta http-equiv="Content-Type" content="text/html;charset=utf-8"/>
<title>Error 404 Not Found</title>
</head>
<body><h2>HTTP ERROR 404</h2>
<p>Problem accessing /solr/numberplate/config. Reason:
<pre> Not Found</pre></p>
</body>
</html>
SolrCloud example running, please visit: http://localhost:8983/solr
D:\utils\solr-7.3.0-7\solr-7.3.0-7\bin>
UPDATE 6
I have created a new collection and I get a more precise error:
test_collection_shard1_replica_n1: org.apache.solr.common.SolrException:org.apache.solr.common.SolrException: Could not load conf for core test_collection_shard1_replica_n1: Can't load schema managed-schema: org.apache.solr.core.SolrResourceNotFoundException: Can't find resource 'opennlp/en-sent.bin' in classpath or '/configs/_default', cwd=D:\utils\solr-7.3.0-7\solr-7.3.0-7\server
Please check your logs for more information
Maybe I need to copy the OpenNLP models (http://opennlp.sourceforge.net/models-1.5/) somewhere?
But where should I put these models?
Can you help me? What am I doing wrong?
As you can see on LUCENE-2899, the patch is already applied to 8.0 (master), as well as 7.3.
You can find pre-built nightlies at Solr-Artifacts-master for (currently) 8.0 and at Solr-Artifacts-7.3 for 7.3.
The opennlp libraries are bundled inside the artifacts:
solr-8.0.0-3304$ find . -name '*nlp*'
[...]
./contrib/langid/lib/opennlp-tools-1.8.3.jar
./contrib/analysis-extras/lib/opennlp-maxent-3.0.3.jar
./contrib/analysis-extras/lib/opennlp-tools-1.8.3.jar
./contrib/analysis-extras/lucene-libs/lucene-analyzers-opennlp-8.0.0-3304.jar
You then have to tell Solr to load these jars, which you can do through solrconfig.xml.
<lib dir="../../../contrib/analysis-extras/lib/" regex="opennlp-.*\.jar" />
<lib dir="../../../contrib/analysis-extras/lucene-libs/lucene-analyzers-opennlp-.*\.jar" regex=".*\.jar" />
Confirm that the jars are loaded as you expect in Solr's log file.
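The models themselves (e.g. from http://opennlp.sourceforge.net/models-1.5/) also have to be resolvable. Since the UPDATE 6 error shows Solr searching '/configs/_default', one approach is to place them inside the configset and re-upload it; a sketch, with names and paths as assumptions:
```
# put the models inside the configset's conf dir under opennlp/
mkdir -p myconfig/conf/opennlp
cp en-sent.bin en-token.bin myconfig/conf/opennlp/
# re-upload the configset (9983 is the embedded ZK port from the transcript above)
server/scripts/cloud-scripts/zkcli.sh -zkhost localhost:9983 \
  -cmd upconfig -confname _default -confdir myconfig/conf
```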
Under a Solr 5.3.1 installation with /update working as expected, I tried to index a .tar.gz file with the update/extract query handler:
curl "http://localhost:8983/solr/#/myfirstcore/update/extract?literal.id=adocument&commit=true" -H 'Content-type:application/octet-stream' --data-binary "#encapsulate.tar.gz"
But I receive the following:
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>
<title>Error 405 HTTP method POST is not supported by this URL</title>
</head>
<body><h2>HTTP ERROR 405</h2>
<p>Problem accessing /solr/admin.html. Reason:
<pre> HTTP method POST is not supported by this URL</pre></p><hr><i><small>Powered by Jetty://</small></i><hr/>
</body>
</html>
Under the admin panel, the update/extract specification is
/update/extract
class:org.apache.solr.handler.extraction.ExtractingRequestHandler
version:5.3.1
description:Add/Update Rich document
src:null
And Solr was generally installed according to these directions: Digital Ocean: Installing Solr 5.2.1 on Ubuntu 14.4
Given the above error message, how can I configure Solr to index zipped files (including .tar.gz)? The use case is to associate content with taxonomy metadata stored in JSON format by zipping them together. This way Solr indexes documents and their associated taxonomy metadata together, and follow-on partial update commands are not needed.
Solution: change
curl "http://localhost:8983/solr/#/myfirstcore/update/extract?literal.id=adocument&commit=true" -H 'Content-type:application/octet-stream' --data-binary "#encapsulate.tar.gz"
to
curl "http://localhost:8983/solr/myfirstcore/update/extract?literal.id=adocument&commit=true" -H 'Content-type:application/octet-stream' --data-binary "#encapsulate.tar.gz"
And a query for id=adocument returns 1 hit. That it didn't pick up the fields is a separate issue.
I am trying to set up SolrCloud with an external ZooKeeper ensemble of 3 servers and replicated Solr on 2 servers.
Assuming that an external ZooKeeper should be independent of other storage, I can't figure out how to set the -solrhome parameter. Is ZooKeeper supposed to read data from the worker nodes?
How do you upload the config and link it with target collection?
We had a lot of problems using solr.home, so save yourself some stress and just keep your directories the way Solr likes them by default.
Example:
/example/solr/collection1/conf/schema.xml
/example/solr/collection1/conf/solrconfig.xml
/example/solr/collection1/core.properties
/example/start.jar
To get your configuration into ZooKeeper, get familiar with Solr's zkcli.sh script. You want to use this to manage your Solr configs. It will create/update the files in ZK under the /configs node.
./zkcli.sh -cmd upconfig -confdir /example/solr/collection1/conf -confname collection1 -z 127.0.0.1
After running the upconfig cmd above, the files in /example/solr/collection1/conf will be uploaded to ZK under /configs/collection1.
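To verify the upload, you can list the node with ZooKeeper's own shell (zkCli.sh, not to be confused with Solr's zkcli.sh):
```
# should list the uploaded files, e.g. schema.xml and solrconfig.xml
./zkCli.sh -server 127.0.0.1:2181 ls /configs/collection1
```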
You also need to link your config to your collection (this creates a node under the /collections node in ZK):
# only need to link the config once
./zkcli.sh -cmd linkconfig -collection collection1 -confname collection1 -z 127.0.0.1
Then you can just start Solr like this:
java -DzkHost=127.0.0.1 -jar start.jar
The other servers in your cloud will now get the configuration from ZooKeeper! There is more info in a pretty good blog post here: SolrCloud Cluster (Single Collection) Deployment
Note: 127.0.0.1 stands in for a comma-delimited list of your ZK servers, and collection1 is your collection.
You can specify the root of the Solr configuration as part of your ZooKeeper connection string: -zkhost host1,host2,hostN/solr
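If you use such a chroot, every client has to include it consistently, e.g. (host names are placeholders):
```
# upload configs and start Solr against the same chrooted connection string
./zkcli.sh -cmd upconfig -confdir /example/solr/collection1/conf -confname collection1 -z host1,host2,hostN/solr
java -DzkHost=host1,host2,hostN/solr -jar start.jar
```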
I am trying to crawl the web using Nutch, and I followed the documentation steps on Nutch's official web site (ran the crawl successfully, copied the schema-solr4.xml into the Solr directory). But when I run
bin/nutch solrindex http://localhost:8983/solr/ crawl/crawldb -linkdb crawl/linkdb crawl/segments/*
I get the following error:
Indexer: starting at 2013-08-25 09:17:35
Indexer: deleting gone documents: false
Indexer: URL filtering: false
Indexer: URL normalizing: false
Active IndexWriters :
SOLRIndexWriter
solr.server.url : URL of the SOLR instance (mandatory)
solr.commit.size : buffer size when sending to SOLR (default 1000)
solr.mapping.file : name of the mapping file for fields (default solrindex-mapping.xml)
solr.auth : use authentication (default false)
solr.auth.username : use authentication (default false)
solr.auth : username for authentication
solr.auth.password : password for authentication
Indexer: java.io.IOException: Job failed!
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1357)
at org.apache.nutch.indexer.IndexingJob.index(IndexingJob.java:123)
at org.apache.nutch.indexer.IndexingJob.run(IndexingJob.java:185)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.nutch.indexer.IndexingJob.main(IndexingJob.java:195)
I should mention that Solr is running, but I cannot browse http://localhost:8983/solr/admin (it redirects me to http://localhost:8983/solr/#).
On the other hand, when I stop Solr, I get the same error! Does anybody have any idea what is wrong with my setup?
P.S. the url that I crawl is: http://localhost/NORC
Check your configuration against: Solr and Nutch
Nutch's and Solr's schema files should be the same, or you may encounter problems, so make sure they match up.
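For example, with a default Nutch/Solr 4 layout the copy would look something like this (exact paths and versions are assumptions):
```
# replace the core's schema with the one shipped by Nutch, then restart Solr
cp apache-nutch-1.7/conf/schema-solr4.xml solr-4.x/example/solr/collection1/conf/schema.xml
```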
When I met the same problem in Nutch, Solr's log showed the error message "unknown field host".
After modifying the schema.xml in Solr, the Nutch error vanished.
You are missing the name of the core inside your command.
e.g.:
./bin/crawl -i -D solr.server.url=http://localhost:8983/solr/your_corename urls/ crawl 1