Solr 6.x cannot create collection on one Debian Jessie 8.6

Trying to install Solr 6.2.1 on a Debian Jessie I hit a roadblock whereby I am not able to create a collection. The sequence of commands
unzip solr-6.2.1.zip
export JAVA_HOME=/opt/java64/jdk1.8.0_101
./solr-6.2.1/bin/solr start -c
./solr-6.2.1/bin/solr create_collection -c hktesting
could not be more innocent and works on another Debian Jessie as well as on Ubuntu 16.10. On this machine, however, create_collection runs into a timeout. I even used a freshly added user without any dotfile customizations.
The full log is quite long, so I try to pick the lines I think are relevant. It all starts nicely with:
2016-11-03 08:08:36.360 INFO (qtp110456297-20) [ ] o.a.s.h.a.CollectionsHandler Invoked Collection Action :create with params replicationFactor=1&maxShardsPerNode=1&collection.configName=hktesting&name=hktesting&action=CREATE&numShards=1&wt=json and sendToOCPQueue=true
Messages follow which all look OK, showing plenty of activity in creating the collection. It all comes to an intermediate halt with:
2016-11-03 08:08:38.399 WARN (qtp110456297-18) [c:hktesting s:shard1 r:core_node1 x:hktesting_shard1_replica1] o.a.s.c.SolrCore [hktesting_shard1_replica1] Solr index directory '/home/badsolr/tmp/solr-6.2.1/server/solr/hktesting_shard1_replica1/data/index' doesn't exist. Creating new index...
2016-11-03 08:08:38.408 INFO (qtp110456297-18) [c:hktesting s:shard1 r:core_node1 x:hktesting_shard1_replica1] o.a.s.c.CachingDirectoryFactory return new directory for /home/badsolr/tmp/solr-6.2.1/server/solr/hktesting_shard1_replica1/data/index
which still looks OK to me. Then comes a three-minute hole in the log in which nothing happens, after which a failure is reported:
2016-11-03 08:11:36.373 ERROR (qtp110456297-20) [ ] o.a.s.h.RequestHandlerBase org.apache.solr.common.SolrException: create the collection time out:180s
at org.apache.solr.handler.admin.CollectionsHandler.handleResponse(CollectionsHandler.java:289)
at org.apache.solr.handler.admin.CollectionsHandler.handleRequestBody(CollectionsHandler.java:215)
at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:154)
at org.apache.solr.servlet.HttpSolrCall.handleAdminRequest(HttpSolrCall.java:658)
at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:440)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:257)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:208)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1668)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:581)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1160)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:511)
at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1092)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
at org.eclipse.jetty.server.Server.handle(Server.java:518)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:308)
at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:244)
at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)
at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceAndRun(ExecuteProduceConsume.java:246)
at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:156)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:654)
at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:572)
at java.lang.Thread.run(Thread.java:745)
This machine has Kerberos authentication enabled, which is about the only potentially relevant difference from the other machines that I know of (but don't jump to conclusions).

It turned out that a mount on /media was unresponsive. Even an ls /media hung forever. With strace I could see that Solr got stuck when it tried to access /media. The strace log showed that Solr first read /etc/mtab and then stepped into /media. Not that I would know why Solr should care about mtab and this mount point, but after the mount was fixed, Solr started to work normally.
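This class of problem is easy to probe for generically. A small sketch (assumes GNU coreutils timeout; run it as the Solr user) that flags any mount point whose stat() blocks:

```shell
# Walk every mount point listed in /proc/mounts and give each one two
# seconds to answer a stat(); a hung network mount turns an eternal
# block into a reported failure instead.
while read -r _ mountpoint _; do
    if ! timeout 2 stat "$mountpoint" > /dev/null 2>&1; then
        echo "unresponsive mount: $mountpoint"
    fi
done < /proc/mounts
```

Any line printed here points at a mount that would also hang anything walking the mount table, as Solr apparently did in this case.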

Related

"CoreContainer is either not initialized or shutting down" error while trying to create new Solr collection

I've just downloaded the latest Solr version from the official website (8.9.0) and tried to create a collection with
solr create -c portal
But the command fails with the error:
Caused by: javax.servlet.ServletException: javax.servlet.UnavailableException: Error processing the request. CoreContainer is either not initialized or shutting down.
at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:162)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:322)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
at org.eclipse.jetty.server.Server.handle(Server.java:516)
at org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:388)
at org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:633)
There is no other (more specific) error message. Just that CoreContainer is either not initialized or shutting down.
For info, I can start Solr with solr start, which returns
Found 1 Solr nodes:
Solr process 81485 running on port 8983
But then a few seconds later I also get
javax.servlet.ServletException: javax.servlet.UnavailableException: Error processing the request. CoreContainer is either not initialized or shutting down.
at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:162)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:322)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
with no other specific error message. I have not changed anything from the default values in any config file. This is how I start my Solr:
solr start -c -s /path/to/server/solr -m 1g -z localhost:2181,server2:2181,server3:2181
where server2 and server3 are the IP addresses of the other two machines on which I also have Solr and ZooKeeper installed.
What can I do?
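One thing worth checking in a setup like this is whether every ZooKeeper node in the -z list is reachable at all; a sketch (host names taken from the command above; requires bash for /dev/tcp):

```shell
# Try a plain TCP connect to each ZooKeeper node; an unreachable
# ensemble member is one possible cause of a CoreContainer that
# never finishes initializing.
for zk in localhost server2 server3; do
    if (exec 3<>"/dev/tcp/$zk/2181") 2>/dev/null; then
        echo "$zk:2181 reachable"
    else
        echo "$zk:2181 NOT reachable"
    fi
done
```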

Solr 8.4 Getting Async exception during distributed update: java.io.IOException: Broken pipe when trying to post a document

Why am I seeing the error:
Async exception during distributed update: java.io.IOException: Broken pipe when trying to post a document to solr
Solr version: 8.4.1
Zookeeper: 3.4.14
OpenJDK 11
2 Solr nodes + 1 ZooKeeper (ZooKeeper hosted on one of the Solr nodes)
Using basic authentication on Solr, together with TLS 1.2.
I do not see any error when posting to the Solr node that is the leader; that works consistently. The issue occurs only when I post to the non-leader node, which throws the error below.
java.io.IOException: java.io.IOException: Broken pipe
at org.eclipse.jetty.client.util.DeferredContentProvider.flush(DeferredContentProvider.java:193)
at org.eclipse.jetty.client.util.OutputStreamContentProvider$DeferredOutputStream.flush(OutputStreamContentProvider.java:152)
at org.eclipse.jetty.client.util.OutputStreamContentProvider$DeferredOutputStream.write(OutputStreamContentProvider.java:146)
at org.apache.solr.common.util.FastOutputStream.flush(FastOutputStream.java:216)
at org.apache.solr.common.util.FastOutputStream.flushBuffer(FastOutputStream.java:209)
at org.apache.solr.common.util.JavaBinCodec.marshal(JavaBinCodec.java:172)
at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec.marshal(JavaBinUpdateRequestCodec.java:103)
at org.apache.solr.client.solrj.impl.BinaryRequestWriter.write(BinaryRequestWriter.java:83)
at org.apache.solr.client.solrj.impl.Http2SolrClient.send(Http2SolrClient.java:339)
at org.apache.solr.client.solrj.impl.ConcurrentUpdateHttp2SolrClient$Runner.sendUpdateStream(ConcurrentUpdateHttp2SolrClient.java:236)
at org.apache.solr.client.solrj.impl.ConcurrentUpdateHttp2SolrClient$Runner.run(ConcurrentUpdateHttp2SolrClient.java:181)
at com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:181)
at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:210)
at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$$Lambda$142/0000000000000000.run(Unknown Source)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:834)
Suppressed: java.io.IOException: java.io.IOException: Broken pipe
at org.eclipse.jetty.client.util.DeferredContentProvider.flush(DeferredContentProvider.java:193)
at org.eclipse.jetty.client.util.OutputStreamContentProvider$DeferredOutputStream.flush(OutputStreamContentProvider.java:152)
at org.eclipse.jetty.client.util.OutputStreamContentProvider$DeferredOutputStream.write(OutputStreamContentProvider.java:146)
at org.apache.solr.common.util.FastOutputStream.flush(FastOutputStream.java:216)
at org.apache.solr.common.util.FastOutputStream.flushBuffer(FastOutputStream.java:209)
at org.apache.solr.common.util.JavaBinCodec.close(JavaBinCodec.java:1269)
at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec.marshal(JavaBinUpdateRequestCodec.java:104)
... 10 more
Caused by: java.io.IOException: Broken pipe
at org.eclipse.jetty.io.ssl.SslConnection$DecryptedEndPoint.flush(SslConnection.java:927)
at org.eclipse.jetty.io.WriteFlusher.flush(WriteFlusher.java:393)
at org.eclipse.jetty.io.WriteFlusher.write(WriteFlusher.java:277)
at org.eclipse.jetty.io.AbstractEndPoint.write(AbstractEndPoint.java:380)
at org.eclipse.jetty.http2.HTTP2Flusher.process(HTTP2Flusher.java:247)
at org.eclipse.jetty.util.IteratingCallback.processing(IteratingCallback.java:241)
at org.eclipse.jetty.util.IteratingCallback.iterate(IteratingCallback.java:224)
at org.eclipse.jetty.http2.HTTP2Session.frame(HTTP2Session.java:755)
at org.eclipse.jetty.http2.HTTP2Session.frames(HTTP2Session.java:734)
at org.eclipse.jetty.http2.client.HTTP2ClientConnectionFactory$HTTP2ClientConnection.onOpen(HTTP2ClientConnectionFactory.java:130)
at org.eclipse.jetty.io.AbstractEndPoint.upgrade(AbstractEndPoint.java:441)
at org.eclipse.jetty.io.NegotiatingClientConnection.replaceConnection(NegotiatingClientConnection.java:115)
at org.eclipse.jetty.io.NegotiatingClientConnection.onFillable(NegotiatingClientConnection.java:85)
at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:305)
at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103)
at org.eclipse.jetty.io.ssl.SslConnection$DecryptedEndPoint.onFillable(SslConnection.java:427)
at org.eclipse.jetty.io.ssl.SslConnection.onFillable(SslConnection.java:321)
at org.eclipse.jetty.io.ssl.SslConnection$2.succeeded(SslConnection.java:159)
at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103)
at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:117)
When we had this issue, it was because we were running the wrong JDK: OpenJDK 16 was installed and still being used by Solr and ZooKeeper. Some incompatibility caused the broken-pipe error when the leader communicated with followers.
To fix this we had to install OpenJDK 8 and force Solr and ZooKeeper to use that.
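Pinning the JDK can be sketched like this (the JDK path is an assumption; point it at your actual Java 8 installation):

```shell
# bin/solr prefers SOLR_JAVA_HOME over JAVA_HOME, so exporting it here
# (or setting it in solr.in.sh) forces Solr onto a specific JDK.
export SOLR_JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64

# ZooKeeper's startup scripts fall back to JAVA_HOME, so set that too.
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64

# Confirm which java both services will pick up before restarting them.
"$JAVA_HOME/bin/java" -version
```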
The client disconnected because it had to wait too long on the server; raising the timeouts may cause performance issues. Another reason might be the OS cache: check the available disk space and try to increase it.

Apache Flink Kubernetes Job Arguments

I'm trying to set up a cluster (Apache Flink 1.6.1) with Kubernetes, and I get the following error when I run a job on it:
2018-10-09 14:29:43.212 [main] INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - --------------------------------------------------------------------------------
2018-10-09 14:29:43.214 [main] INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Registered UNIX signal handlers for [TERM, HUP, INT]
Exception in thread "main" java.lang.NoSuchMethodError: org.apache.flink.runtime.entrypoint.ClusterConfiguration.<init>(Ljava/lang/String;Ljava/util/Properties;[Ljava/lang/String;)V
at org.apache.flink.runtime.entrypoint.EntrypointClusterConfiguration.<init>(EntrypointClusterConfiguration.java:37)
at org.apache.flink.container.entrypoint.StandaloneJobClusterConfiguration.<init>(StandaloneJobClusterConfiguration.java:41)
at org.apache.flink.container.entrypoint.StandaloneJobClusterConfigurationParserFactory.createResult(StandaloneJobClusterConfigurationParserFactory.java:78)
at org.apache.flink.container.entrypoint.StandaloneJobClusterConfigurationParserFactory.createResult(StandaloneJobClusterConfigurationParserFactory.java:42)
at org.apache.flink.runtime.entrypoint.parser.CommandLineParser.parse(CommandLineParser.java:55)
at org.apache.flink.container.entrypoint.StandaloneJobClusterEntryPoint.main(StandaloneJobClusterEntryPoint.java:153)
My job takes a configuration file (file.properties) as a parameter. This works fine in standalone mode, but apparently the Kubernetes cluster cannot parse it.
job-cluster-job.yaml:
args: ["job-cluster", "--job-classname", "com.test.Abcd", "-Djobmanager.rpc.address=flink-job-cluster",
"-Dparallelism.default=1", "-Dblob.server.port=6124", "-Dquery.server.ports=6125", "file.properties"]
How can I fix this?
Update: the job was built for Apache Flink 1.4.2, which might be the issue; I'm looking into it.
The job was built for 1.4.2, and the class involved in the error (EntrypointClusterConfiguration.java) was apparently added in 1.6.1 (https://github.com/apache/flink/commit/ab9bd87e521d19db7c7d783268a3532d2e876a5d#diff-d1169e00afa40576ea8e4f3c472cf858), so this caused the issue.
We updated the job's dependencies to point to the new 1.6.1 release, and the arguments are now parsed correctly.
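The dependency bump can be sketched as a Maven fragment (the artifact id and Scala suffix are assumptions; align them with what the job's pom already declares):

```xml
<!-- Build the job against the same Flink version the cluster runs. -->
<properties>
  <flink.version>1.6.1</flink.version>
</properties>

<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-streaming-java_2.11</artifactId>
  <version>${flink.version}</version>
</dependency>
```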

Apache Flink: Registration name clash. KvState with name *** has already been registered by another operator

I am facing this issue when I run the Flink job:
Registration name clash. KvState with name 'XXXX' has already been registered by another operator (fab4c54085fa3ee85a6e1bb1062c20af).
Exception:
org.apache.flink.runtime.execution.SuppressRestartsException: Unrecoverable failure. This suppresses job restarts. Please check the stack trace for the root cause.
at org.apache.flink.runtime.query.KvStateLocationRegistry.notifyKvStateRegistered(KvStateLocationRegistry.java:120)
at org.apache.flink.runtime.jobmanager.JobManager.org$apache$flink$runtime$jobmanager$JobManager$$handleKvStateMessage(JobManager.scala:1517)
at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1.applyOrElse(JobManager.scala:740)
at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36)
at org.apache.flink.runtime.LeaderSessionMessageFilter$$anonfun$receive$1.applyOrElse(LeaderSessionMessageFilter.scala:49)
at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36)
at org.apache.flink.runtime.LogMessages$$anon$1.apply(LogMessages.scala:33)
at org.apache.flink.runtime.LogMessages$$anon$1.apply(LogMessages.scala:28)
at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123)
at org.apache.flink.runtime.LogMessages$$anon$1.applyOrElse(LogMessages.scala:28)
at akka.actor.Actor$class.aroundReceive(Actor.scala:502)
at org.apache.flink.runtime.jobmanager.JobManager.aroundReceive(JobManager.scala:122)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:526)
at akka.actor.ActorCell.invoke(ActorCell.scala:495)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:257)
at akka.dispatch.Mailbox.run(Mailbox.scala:224)
at akka.dispatch.Mailbox.exec(Mailbox.scala:234)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: java.lang.IllegalStateException: Registration name clash. KvState with name 'XXXX' has already been registered by another operator (fab4c54085fa3ee85a6e1bb1062c20af).
at org.apache.flink.runtime.query.KvStateLocationRegistry.notifyKvStateRegistered(KvStateLocationRegistry.java:116)
... 20 more
When I run the job from the IntelliJ IDE it works well, but when I run it using the following command I face this issue:
bin/flink run -c com.xxx.XXXJob
What could be the reason for this error?
I have restarted Flink and rerun the job, but I'm still getting this error.
I had configured the same queryable state name in two process functions.
@Override
public void open(Configuration parameters) {
    ValueStateDescriptor<Xyz> descriptor = new ValueStateDescriptor<>(queryableStateName, Xyz.class);
    descriptor.setQueryable("downloads");
    state = getRuntimeContext().getState(descriptor);
}
In the code above, descriptor.setQueryable("downloads") is the problem: I had used the name 'downloads' in multiple process functions.
As for my surprise that the job runs from the IDE but not through the bin/flink run command: the reason is flink-queryable-state-runtime_2.11-1.4.2.jar. When I run from the IDE, that jar is not on the classpath, so the queryable state feature is disabled, the job never complains, and it runs successfully. But when I use bin/flink run, the jar is found in Flink's lib directory, so the duplicate name raises an error and the job fails.
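Why the names collide is easy to model in isolation. The following is a self-contained sketch, not Flink's actual KvStateLocationRegistry code, but it imitates the rule that registry enforces: queryable state names are registered per job, so a second operator reusing 'downloads' is rejected.

```java
import java.util.HashMap;
import java.util.Map;

// Minimal model of the job-global queryable-name registration rule
// (a sketch; Flink's real registry lives in KvStateLocationRegistry).
public class QueryableNameRegistry {
    private final Map<String, String> nameToOperator = new HashMap<>();

    public void register(String queryableName, String operatorId) {
        // putIfAbsent returns the previous owner if the name is taken.
        String previous = nameToOperator.putIfAbsent(queryableName, operatorId);
        if (previous != null) {
            throw new IllegalStateException("Registration name clash. KvState with name '"
                    + queryableName + "' has already been registered by another operator ("
                    + previous + ").");
        }
    }

    public static void main(String[] args) {
        QueryableNameRegistry registry = new QueryableNameRegistry();
        registry.register("downloads", "operatorA"); // fine
        registry.register("uploads", "operatorB");   // fine: unique name
        try {
            registry.register("downloads", "operatorC"); // same name -> clash
        } catch (IllegalStateException e) {
            // Same "Registration name clash" message as in the trace above.
            System.out.println(e.getMessage());
        }
    }
}
```

Giving each process function a distinct name in setQueryable() is therefore the fix.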

HBase Indexer + Solr

I'm trying to integrate HBase with Solr. Everything seems to work; the HBase indexer communicates with ZooKeeper on a different HBase machine. But rows I put into HBase are not created under Solr.
HBase version: 1.1.2.2.5.5.0-157 (Hortonworks)
Solr : 5.2.1 (LucidWorks)
Tested HBase Indexer: default version, compiled with mvn clean install -DskipTests -Dhbase.api=1.1
Every time I start the HBase Indexer server, I get this exception:
17/06/16 19:02:43 INFO zookeeper.ClientCnxn: Session establishment complete on server sr1/172.20.21.15:2181, sessionid = 0x15cb1081862007e, negotiated timeout = 40000
17/06/16 19:02:44 INFO zookeeper.LeaderElection: Elected as leader for the position of Indexer Master
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/mapred/JobClient
at com.ngdata.hbaseindexer.master.IndexerMaster.getJobClient(IndexerMaster.java:181)
at com.ngdata.hbaseindexer.master.IndexerMaster.start(IndexerMaster.java:144)
at com.ngdata.hbaseindexer.Main.startServices(Main.java:124)
at com.ngdata.hbaseindexer.Main.run(Main.java:96)
at com.ngdata.hbaseindexer.Main.main(Main.java:84)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.mapred.JobClient
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 5 more
Does anybody know how to deal with this jar conflict problem?
Actually, it was a Solr version problem: Solr should be at least 5.5 to work with HBase 1.1.2.
