Nutch 1.11 crawl Issue

Nutch 1.11 crawl Issue - solr

I followed the tutorial and configured Nutch to run on Windows 7 using Cygwin, and I'm using Solr 5.4.0 to index the data.
But Nutch 1.11 fails when executing a crawl.
Crawl command:
$ bin/crawl -i -D solr.server.url=http://127.0.0.1:8983/solr /urls /TestCrawl 2
Error/Exception
Injecting seed URLs /apache-nutch-1.11/bin/nutch inject /TestCrawl/crawldb /urls
Injector: starting at 2016-01-19 17:11:06
Injector: crawlDb: /TestCrawl/crawldb
Injector: urlDir: /urls
Injector: Converting injected urls to crawl db entries.
Injector: java.lang.NullPointerException
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1012)
at org.apache.hadoop.util.Shell.runCommand(Shell.java:445)
at org.apache.hadoop.util.Shell.run(Shell.java:418)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:650)
at org.apache.hadoop.util.Shell.execCommand(Shell.java:739)
at org.apache.hadoop.util.Shell.execCommand(Shell.java:722)
at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:633)
at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:421)
at org.apache.hadoop.fs.FilterFileSystem.mkdirs(FilterFileSystem.java:281)
at org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:125)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:348)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1285)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1282)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1282)
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:562)
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:557)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:557)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:548)
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:833)
at org.apache.nutch.crawl.Injector.inject(Injector.java:323)
at org.apache.nutch.crawl.Injector.run(Injector.java:379)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.nutch.crawl.Injector.main(Injector.java:369)
Error running:
/home/apache-nutch-1.11/bin/nutch inject /TestCrawl/crawldb /urls
Failed with exit value 127.

I can see multiple problems with your command. Try this:
bin/crawl -i -Dsolr.server.url=http://127.0.0.1:8983/solr/core_name path_to_seed crawl 2
The first problem is the space between -D and solr.server.url when you pass the Solr parameter. The second problem is that the Solr URL should include the core name as well.

A hadoop-core jar file is needed when you are working with Nutch.
For Nutch 1.11, the compatible hadoop-core jar is 0.20.0.
Download the jar from this link:
http://www.java2s.com/Code/Jar/h/Downloadhadoop0200corejar.htm
Paste that jar into the "C:\cygwin64\home\apache-nutch-1.11\lib" folder and the crawl will run successfully.

Related

Can not apply patch LUCENE-2899.patch to SOLR on Windows

I am trying to apply the patch LUCENE-2899.patch to Solr.
Here is what I have done:
Cloned Solr from the official repo (I am on the master branch).
Downloaded and installed Ant and GNU patch, which I got from http://gnuwin32.sourceforge.net/packages/patch.htm
Added Ant and GNU patch to the PATH environment variable.
And I got this:
```
D:\utils\solr_master\lucene-solr>patch -p1 -i LUCENE-2899.patch --dry-run
patching file dev-tools/idea/.idea/ant.xml
Assertion failed: hunk, file ../patch-2.5.9-src/patch.c, line 354
This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.
```
UPDATE 1
I tried to compile, but the build failed.
D:\utils\solr_master\lucene-solr>ant compile
Buildfile: D:\utils\solr_master\lucene-solr\build.xml
BUILD FAILED
D:\utils\solr_master\lucene-solr\build.xml:21: The following error occurred while executing this line:
D:\utils\solr_master\lucene-solr\lucene\common-build.xml:623: java.lang.NullPointerException
at java.util.Arrays.stream(Arrays.java:5004)
at java.util.stream.Stream.of(Stream.java:1000)
at java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:267)
at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
at java.util.Spliterators$ArraySpliterator.forEachRemaining(Spliterators.java:948)
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:545)
at java.util.stream.AbstractPipeline.evaluateToArrayNode(AbstractPipeline.java:260)
at java.util.stream.ReferencePipeline.toArray(ReferencePipeline.java:438)
at org.apache.tools.ant.util.ChainedMapper.lambda$mapFileName$1(ChainedMapper.java:36)
at java.util.stream.ReduceOps$1ReducingSink.accept(ReduceOps.java:80)
at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1382)
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
at java.util.stream.ReferencePipeline.reduce(ReferencePipeline.java:484)
at org.apache.tools.ant.util.ChainedMapper.mapFileName(ChainedMapper.java:35)
at org.apache.tools.ant.util.CompositeMapper.lambda$mapFileName$0(CompositeMapper.java:32)
at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1382)
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:545)
at java.util.stream.AbstractPipeline.evaluateToArrayNode(AbstractPipeline.java:260)
at java.util.stream.ReferencePipeline.toArray(ReferencePipeline.java:438)
at org.apache.tools.ant.util.CompositeMapper.mapFileName(CompositeMapper.java:33)
at org.apache.tools.ant.taskdefs.PathConvert.execute(PathConvert.java:363)
at org.apache.tools.ant.UnknownElement.execute(UnknownElement.java:292)
at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.tools.ant.dispatch.DispatchUtils.execute(DispatchUtils.java:106)
at org.apache.tools.ant.Task.perform(Task.java:346)
at org.apache.tools.ant.Target.execute(Target.java:448)
at org.apache.tools.ant.helper.ProjectHelper2.parse(ProjectHelper2.java:172)
at org.apache.tools.ant.taskdefs.ImportTask.importResource(ImportTask.java:221)
at org.apache.tools.ant.taskdefs.ImportTask.execute(ImportTask.java:165)
at org.apache.tools.ant.UnknownElement.execute(UnknownElement.java:292)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.tools.ant.dispatch.DispatchUtils.execute(DispatchUtils.java:106)
at org.apache.tools.ant.Task.perform(Task.java:346)
at org.apache.tools.ant.Target.execute(Target.java:448)
at org.apache.tools.ant.helper.ProjectHelper2.parse(ProjectHelper2.java:183)
at org.apache.tools.ant.ProjectHelper.configureProject(ProjectHelper.java:93)
at org.apache.tools.ant.Main.runBuild(Main.java:824)
at org.apache.tools.ant.Main.startAnt(Main.java:228)
at org.apache.tools.ant.launch.Launcher.run(Launcher.java:283)
at org.apache.tools.ant.launch.Launcher.main(Launcher.java:101)
Total time: 0 seconds
UPDATE 2
I have downloaded Solr from
https://builds.apache.org/job/Solr-Artifacts-7.3/lastSuccessfulBuild/artifact/solr/package/ and https://builds.apache.org/job/Solr-Artifacts-master/lastSuccessfulBuild/artifact/solr/package/
but in neither the 7.3 version nor the 8.0 (master) version do I see an opennlp dir in the contrib dir. Where can I find it?
UPDATE 3
I ran the version from the master branch, which I downloaded from https://builds.apache.org/job/Solr-Artifacts-master/lastSuccessfulBuild/artifact/solr/package/, and I tried to run OpenNLP like the poster of this question:
Exception while integrating openNLP with Solr
But I get the same error as he did.
numberplate_shard1_replica_n1:
org.apache.solr.common.SolrException:org.apache.solr.common.SolrException: Could not load conf for core numberplate_shard1_replica_n1: Can't load schema managed-schema: Plugin init failure for [schema.xml] fieldType "text_opennlp_nvf": Plugin init failure for [schema.xml] analyzer/tokenizer: Error instantiating class: 'org.apache.lucene.analysis.opennlp.OpenNLPTokenizerFactory'
If the patch LUCENE-2899 is merged into master, why do I get this error?
UPDATE 5
I restarted Solr and the errors were gone. But...
I was trying to add fields (to managed-schema) from the example (https://wiki.apache.org/solr/OpenNLP):
<fieldType name="text_opennlp" class="solr.TextField">
<analyzer>
<tokenizer class="solr.OpenNLPTokenizerFactory"
sentenceModel="opennlp/en-sent.bin"
tokenizerModel="opennlp/en-token.bin"
/>
</analyzer>
</fieldType>
<field name="content" type="text_opennlp" indexed="true" termOffsets="true" stored="true" termPayloads="true" termPositions="true" docValues="false" termVectors="true" multiValued="true" required="true"/>
But when I try to run Solr in cloud mode, I get this:
D:\utils\solr-7.3.0-7\solr-7.3.0-7\bin>solr -e cloud
Welcome to the SolrCloud example!
This interactive session will help you launch a SolrCloud cluster on your local workstation.
To begin, how many Solr nodes would you like to run in your local cluster? (specify 1-4 nodes) [2]:
1
Ok, let's start up 1 Solr nodes for your example SolrCloud cluster.
Please enter the port for node1 [8983]:
Solr home directory D:\utils\solr-7.3.0-7\solr-7.3.0-7\example\cloud\node1\solr already exists.
Starting up Solr on port 8983 using command:
"D:\utils\solr-7.3.0-7\solr-7.3.0-7\bin\solr.cmd" start -cloud -p 8983 -s "D:\utils\solr-7.3.0-7\solr-7.3.0-7\example\cloud\node1\solr"
Waiting up to 30 to see Solr running on port 8983
Started Solr server on port 8983. Happy searching!
INFO - 2018-03-26 14:42:26.961; org.apache.solr.client.solrj.impl.ZkClientClusterStateProvider; Cluster at localhost:9983 ready
Now let's create a new collection for indexing documents in your 1-node cluster.
Please provide a name for your new collection: [gettingstarted]
numberplate
Collection 'numberplate' already exists!
Do you want to re-use the existing collection or create a new one? Enter 1 to reuse, 2 to create new [1]:
1
Enabling auto soft-commits with maxTime 3 secs using the Config API
POSTing request to Config API: http://localhost:8983/solr/numberplate/config
{"set-property":{"updateHandler.autoSoftCommit.maxTime":"3000"}}
ERROR: Error from server at http://localhost:8983/solr: Expected mime type application/octet-stream but got text/html. <html>
<head>
<meta http-equiv="Content-Type" content="text/html;charset=utf-8"/>
<title>Error 404 Not Found</title>
</head>
<body><h2>HTTP ERROR 404</h2>
<p>Problem accessing /solr/numberplate/config. Reason:
<pre> Not Found</pre></p>
</body>
</html>
SolrCloud example running, please visit: http://localhost:8983/solr
D:\utils\solr-7.3.0-7\solr-7.3.0-7\bin>
UPDATE 6
I created a new collection and I get a more precise error:
test_collection_shard1_replica_n1: org.apache.solr.common.SolrException:org.apache.solr.common.SolrException: Could not load conf for core test_collection_shard1_replica_n1: Can't load schema managed-schema: org.apache.solr.core.SolrResourceNotFoundException: Can't find resource 'opennlp/en-sent.bin' in classpath or '/configs/_default', cwd=D:\utils\solr-7.3.0-7\solr-7.3.0-7\server
Please check your logs for more information
Maybe I need to copy the OpenNLP models (http://opennlp.sourceforge.net/models-1.5/) somewhere.
But where should I put these models?
Can you help me? What am I doing wrong?

As you can see on LUCENE-2899, the patch is already applied to 8.0 (master), as well as 7.3.
You can find pre-built nightlies at Solr-Artifacts-master for (currently) 8.0 and at Solr-Artifacts-7.3 for 7.3.
The opennlp libraries are bundled inside the artifacts:
solr-8.0.0-3304$ find . -name '*nlp*'
[...]
./contrib/langid/lib/opennlp-tools-1.8.3.jar
./contrib/analysis-extras/lib/opennlp-maxent-3.0.3.jar
./contrib/analysis-extras/lib/opennlp-tools-1.8.3.jar
./contrib/analysis-extras/lucene-libs/lucene-analyzers-opennlp-8.0.0-3304.jar
You then have to tell Solr to load these jars, which you can do through solrconfig.xml:
<lib dir="../../../contrib/analysis-extras/lib/" regex="opennlp-.*\.jar" />
<lib dir="../../../contrib/analysis-extras/lucene-libs/" regex="lucene-analyzers-opennlp-.*\.jar" />
Confirm that the jars are loaded as you expect in Solr's log file.
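Before checking the log, it can help to confirm that the jars actually exist where the lib directives point. The following is only a sketch: `list_opennlp_jars` is a hypothetical helper name, and the install directory passed to it is a placeholder for wherever the nightly artifact was unpacked.

```shell
# Hypothetical helper (not part of Solr): lists the OpenNLP-related jars
# under a Solr install directory, using the same two directories that the
# <lib> directives refer to.
list_opennlp_jars() {
  solr_dir="$1"
  find "$solr_dir/contrib/analysis-extras/lib" -name 'opennlp-*.jar'
  find "$solr_dir/contrib/analysis-extras/lucene-libs" -name 'lucene-analyzers-opennlp-*.jar'
}

# Example call (the path is a placeholder):
# list_opennlp_jars /path/to/solr-8.0.0-3304
```

If the function prints nothing, the lib directives will have nothing to load, whatever the regex says.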

Solr Map reduce indexer tool not able to fetch aliases through zk

While working with MapReduceIndexerTool against a Solr 4.10 cloud, the code connects to ZooKeeper successfully, but it fails while fetching aliases.json. Below are the command and the stack trace:
command:
hadoop --config /etc/hadoop/conf jar target/search-mr-*-job.jar org.apache.solr.hadoop.MapReduceIndexerTool -D 'mapred.child.java.opts=-Xmx500m' --log4j src/test/resources/log4j.properties --morphline-file /home/impadmin/app_quotes_morphline.conf --output-dir hdfs://impetus-i0056.impetus.co.in:8020/user/impadmin/MapReduceIndexerTool/output2 --zk-host 172.26.45.69:9983/solr --collection app.quotes hdfs://impetus-i0056.impetus.co.in:8020/apps/hive/warehouse/kst
stack trace:
WARNING: Use "yarn jar" to launch YARN applications.
1 [main] INFO org.apache.solr.common.cloud.SolrZkClient - Using default ZkCredentialsProvider
87 [main] INFO org.apache.solr.common.cloud.ConnectionManager - Waiting for client to connect to ZooKeeper
114 [main-EventThread] INFO org.apache.solr.common.cloud.ConnectionManager - Watcher org.apache.solr.common.cloud.ConnectionManager#1568159 name:ZooKeeperConnection Watcher:172.26.45.69:9983/solr got event WatchedEvent state:SyncConnected type:None path:null path:null type:None
115 [main] INFO org.apache.solr.common.cloud.ConnectionManager - Client is connected to ZooKeeper
115 [main] INFO org.apache.solr.common.cloud.SolrZkClient - Using default ZkACLProvider
Exception in thread "main" net.sourceforge.argparse4j.inf.ArgumentParserException: java.lang.IllegalArgumentException: Cannot find expected information for SolrCloud in ZooKeeper: 172.26.45.69:9983/solr
at org.apache.solr.hadoop.MapReduceIndexerTool.verifyZKStructure(MapReduceIndexerTool.java:1418)
at org.apache.solr.hadoop.MapReduceIndexerTool.run(MapReduceIndexerTool.java:716)
at org.apache.solr.hadoop.MapReduceIndexerTool.run(MapReduceIndexerTool.java:681)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.solr.hadoop.MapReduceIndexerTool.main(MapReduceIndexerTool.java:668)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: java.lang.IllegalArgumentException: Cannot find expected information for SolrCloud in ZooKeeper: 172.26.45.69:9983/solr
at org.apache.solr.hadoop.ZooKeeperInspector.extractDocCollection(ZooKeeperInspector.java:88)
at org.apache.solr.hadoop.ZooKeeperInspector.extractShardUrls(ZooKeeperInspector.java:56)
at org.apache.solr.hadoop.MapReduceIndexerTool.verifyZKStructure(MapReduceIndexerTool.java:1415)
... 10 more
Caused by: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /aliases.json
at org.apache.zookeeper.KeeperException.create(KeeperException.java:111)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1155)
at org.apache.solr.common.cloud.SolrZkClient$7.execute(SolrZkClient.java:351)
at org.apache.solr.common.cloud.SolrZkClient$7.execute(SolrZkClient.java:348)
at org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation(ZkCmdExecutor.java:61)
at org.apache.solr.common.cloud.SolrZkClient.getData(SolrZkClient.java:348)
at org.apache.solr.hadoop.ZooKeeperInspector.checkForAlias(ZooKeeperInspector.java:164)
at org.apache.solr.hadoop.ZooKeeperInspector.extractDocCollection(ZooKeeperInspector.java:85)
... 12 more
Please help me to identify the root cause.

The issue was with the ZooKeeper URL used to access the Solr configs; correcting the URL solved the issue. With an embedded Solr ZooKeeper instance, the data is not stored under a /solr path in the URL; it sits directly under the ZooKeeper root.
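As a rough sketch of that correction, assuming the embedded ZooKeeper keeps Solr's znodes at the root: the helper below (`strip_zk_chroot` is a hypothetical name, not part of Solr or MapReduceIndexerTool) drops any chroot suffix from a `--zk-host` value.

```shell
# Hypothetical helper: removes a trailing chroot path (e.g. "/solr")
# from a ZooKeeper connect string.
strip_zk_chroot() {
  zk_host="$1"
  # Keep everything before the first "/"; a host:port list contains no slashes.
  printf '%s\n' "${zk_host%%/*}"
}

strip_zk_chroot "172.26.45.69:9983/solr"   # prints 172.26.45.69:9983
```

The resulting value would then be passed as `--zk-host` in place of the original one.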

I cannot create core in Solr 5.2.1

I have SolrCloud 5.2.1. I deployed Solr and ZooKeeper. When I try to create a core, these errors are thrown:
org.apache.solr.common.SolrException: Could not load conf for core contracts_shard1_replica1: Error loading solr config from solrconfig.xml
at org.apache.solr.core.ConfigSetService.getConfig(ConfigSetService.java:78)
at org.apache.solr.core.CoreContainer.create(CoreContainer.java:635)
at org.apache.solr.core.CoreContainer.create(CoreContainer.java:611)
at org.apache.solr.handler.admin.CoreAdminHandler.handleCreateAction(CoreAdminHandler.java:628)
at org.apache.solr.handler.admin.CoreAdminHandler.handleRequestInternal(CoreAdminHandler.java:213)
at org.apache.solr.handler.admin.CoreAdminHandler.handleRequestBody(CoreAdminHandler.java:193)
at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:143)
at org.apache.solr.servlet.HttpSolrCall.handleAdminRequest(HttpSolrCall.java:660)
at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:431)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:227)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:196)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:577)
org.apache.solr.common.SolrException: Error CREATEing SolrCore 'contracts_shard1_replica1': Unable to create core [contracts_shard1_replica1] Caused by: Can't find resource 'solrconfig.xml' in classpath or '/configs/contracts', cwd=C:\CM_10.1.0\INDEXSERVER\searchserver-distribution\target\searchserver\solr\server
at org.apache.solr.handler.admin.CoreAdminHandler.handleCreateAction(CoreAdminHandler.java:661)
at org.apache.solr.handler.admin.CoreAdminHandler.handleRequestInternal(CoreAdminHandler.java:213)
at org.apache.solr.handler.admin.CoreAdminHandler.handleRequestBody(CoreAdminHandler.java:193)
at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:143)
at org.apache.solr.servlet.HttpSolrCall.handleAdminRequest(HttpSolrCall.java:660)
at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:431)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:227)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:196)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
]
I created a contracts folder inside C:\CM_10.1.0\INDEXSERVER\searchserver-distribution\target\searchserver\solr\server and copied the "conf" folder from solr\configsets\basic_configs into contracts. But the problem was not solved.
I need help to solve this problem. Can anyone help me?
Thanks

Since you are using ZooKeeper, you must first upload the config files to ZooKeeper. I'm not sure how it works in Windows :P, but in Linux it would be:
cd /searchserver/solr/server/scripts/cloud-scripts
./zkcli.sh -cmd upconfig -confdir /searchserver/solr/server/solr/corename/conf -confname myconfname -z zoo1:2181,zoo2:2181,zoo3:2181
In Windows, use zkcli.bat in the same directory.
Another way to do this is by adding
SOLR_OPTS="$SOLR_OPTS -Dbootstrap_confdir=./solr/corename/conf/"
SOLR_OPTS="$SOLR_OPTS -Dcollection.configName=myconfname"
to the solr.in.sh file, then (re)starting Solr. In Windows, the file is solr.in.cmd, and you add the following lines:
set SOLR_OPTS=%SOLR_OPTS% -Dbootstrap_confdir=./solr/corename/conf/
set SOLR_OPTS=%SOLR_OPTS% -Dcollection.configName=myconfname
The solr.in.sh/solr.in.cmd file is included by the solr (solr.cmd) command that you use to start the Solr server. myconfname above (in both methods) is an arbitrary name that identifies the set of config files you've added to ZooKeeper. Then you can create the core using the Collections API:
http://localhost:8983/solr/admin/collections?action=CREATE&name=coreName&numShards=2&shards=shard1,shard2&collection.configName=myconfname&createNodeSet=localhost:8983_solr
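To avoid hand-editing that URL, the request can be assembled from its parts. This is a minimal sketch under stated assumptions: `collection_create_url` is a hypothetical helper, the host, collection name, and shard count are placeholders, and only the `name`, `numShards`, and `collection.configName` parameters from the URL above are included.

```shell
# Hypothetical helper: assembles a Collections API CREATE URL.
# Arguments: Solr base URL, collection name, number of shards, config name.
collection_create_url() {
  solr_base="$1"; name="$2"; num_shards="$3"; conf_name="$4"
  printf '%s/admin/collections?action=CREATE&name=%s&numShards=%s&collection.configName=%s\n' \
    "$solr_base" "$name" "$num_shards" "$conf_name"
}

url=$(collection_create_url "http://localhost:8983/solr" "contracts" 2 "myconfname")
echo "$url"
# To send the request (requires a running SolrCloud node):
# curl "$url"
```

The config name must match the one used when the config files were uploaded to ZooKeeper.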

Unable to access SOLR server admin page

I am new to SOLR. I am building SOLR from source using solr-5.0.0-src.tgz. After running
ant compile
at solr-5.0.0/, I run
bin/solr start
at solr-5.0.0/solr/. And it says
Waiting to see Solr listening on port 8983 [/]
Started Solr server on port 8983 (pid=20151). Happy searching!
However, when visiting http://localhost:8983/solr/, I receive an HTTP error:
HTTP ERROR: 503
Problem accessing /solr/. Reason:
Service Unavailable
Powered by Jetty://
And
bin/solr status
gives
Found 1 Solr nodes:
Solr process 20151 running on port 8983
Error: Could not find or load main class org.apache.solr.util.SolrCLI
I wonder if this is the reason the admin page is unavailable. If so, how can I solve the problem? If not, what is the cause?
Thanks.

Change to the solr directory and run:
ant server
Then restart the server:
bin/solr stop && bin/solr start
Check that everything is working:
bin/solr status

You have not included the full stack trace.
Here it is:
Exception in thread "main" java.lang.UnsupportedClassVersionError: org/apache/solr/util/SolrCLI : Unsupported major.minor version 51.0
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClassCond(ClassLoader.java:631)
at java.lang.ClassLoader.defineClass(ClassLoader.java:615)
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:141)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:283)
at java.net.URLClassLoader.access$000(URLClassLoader.java:58)
at java.net.URLClassLoader$1.run(URLClassLoader.java:197)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
Could not find the main class: org.apache.solr.util.SolrCLI. Program will exit.
To fix the problem, you need to upgrade Java to version 7 (J2SE 7); class file version 51.0 corresponds to Java 7.
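The number in an "Unsupported major.minor version" message can be translated into a Java release with simple arithmetic. The sketch below uses a hypothetical helper name, `java_release_for_major`, and relies on the fixed offset of 44 between class-file major versions and Java releases (50 is Java 6, 51 is Java 7, 52 is Java 8).

```shell
# Hypothetical helper: maps a class-file major version to the Java release
# that introduced it, using the fixed offset of 44.
java_release_for_major() {
  echo $(( $1 - 44 ))
}

java_release_for_major 51   # prints 7
```

Comparing that number with the output of `java -version` shows whether the installed runtime is too old for the compiled classes.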

Nutch 1.2 Solr 3.6 integration issue

I have crawled a site successfully using Nutch 1.2. Now I want to integrate this with Solr 3.6. The problem is that when I issue the command
$ bin/nutch solrindex //localhost:8080/solr/ crawl/crawldb crawl/linkdb crawl/segments/*
an error occurs:
SolrIndexer: starting at 2013-07-08 14:52:27
java.io.IOException: Job failed!
Please help me to solve this issue.
Here is my Nutch log:
java.lang.RuntimeException: Invalid version (expected 2, but 60) or the data in not in 'javabin' format
at org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCodec.java:99)
at org.apache.solr.client.solrj.impl.BinaryResponseParser.processResponse(BinaryResponseParser.java:41)
at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:469)
at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:249)
at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105)
at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:69)
at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:54)
at org.apache.nutch.indexer.solr.SolrWriter.close(SolrWriter.java:75)
at org.apache.nutch.indexer.IndexerOutputFormat$1.close(IndexerOutputFormat.java:48)
at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:474)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:411)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:216)
2013-07-08 15:17:39,539 ERROR solr.SolrIndexer - java.io.IOException: Job f

This is mainly a javabin incompatibility between the SolrJ jars used by Nutch and the Solr 3.6 instance you are trying to integrate with.
You need to update the SolrJ jars and regenerate the job.
Follow the steps as mentioned in the forum.
