I performed shard splitting on 2 of my 3 shards, using an async request ID.
The shard splitting task failed while trying to attach the replica for the split shard, but I found that on both leaders of the split shards the documents were split correctly (the total numFound of both sub-shards equals the numFound of the parent shard).
So I proceeded to manually edit clusterstate.json: the split shards' status changed from construction to active, and the parent shards' status changed from active to inactive. I also manually attached the replicas for the split shards, and only then removed the parent shards by unloading their cores.
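For reference, this kind of clusterstate.json change looks roughly like the fragment below (an illustrative sketch only; sub-shard names follow the usual _0/_1 convention, and hash ranges and replica details are omitted):
"candidates": {
  "shards": {
    "candidates_shard2":   { "state": "inactive", "replicas": { ... } },
    "candidates_shard2_0": { "state": "active",   "replicas": { ... } },
    "candidates_shard2_1": { "state": "active",   "replicas": { ... } }
  }
}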
Here is where the problems start: when I issue a commit in SolrCloud, both parent shards come back in the SolrCloud cloud graph and in clusterstate.json with node status = down and shard status = active.
1) I brought the parent shards' node up again and tried another core unload, but every time I issue a commit they return to the graph and clusterstate with node status = down and shard status = active.
2) So my second attempt was to delete the shards with the Collections API, using /admin/collections?action=DELETESHARD
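For completeness, the full DELETESHARD call takes the collection and shard names, e.g.:
/admin/collections?action=DELETESHARD&collection=candidates&shard=candidates_shard2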
I got the error below:
org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: Cannot unload non-existent core
Operation deleteshard caused exception:
org.apache.solr.common.SolrException: Could not fully remove collection: candidates shard: candidates_shard2
at org.apache.solr.handler.admin.CollectionsHandler.handleResponse(CollectionsHandler.java:364)
at org.apache.solr.handler.admin.CollectionsHandler.handleResponse(CollectionsHandler.java:320)
at org.apache.solr.handler.admin.CollectionsHandler.handleDeleteShardAction(CollectionsHandler.java:563)
at org.apache.solr.handler.admin.CollectionsHandler.handleRequestBody(CollectionsHandler.java:176)
at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
at org.apache.solr.servlet.SolrDispatchFilter.handleAdminRequest(SolrDispatchFilter.java:729)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:267)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:207)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:220)
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:122)
at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:612)
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:170)
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:103)
at org.apache.catalina.valves.AutoLoginValve.invoke(AutoLoginValve.java:67)
at org.apache.catalina.valves.RequestFilterValve.process(RequestFilterValve.java:304)
at org.apache.catalina.valves.RemoteAddrValve.invoke(RemoteAddrValve.java:82)
at org.apache.catalina.valves.RemoteIpValve.invoke(RemoteIpValve.java:683)
at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:950)
at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:116)
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:421)
at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1070)
at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:611)
at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:316)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
at java.lang.Thread.run(Thread.java:745)
3) My third attempt was to delete the replica using the Collections API /admin/collections?action=DELETEREPLICA
I got the error below:
org.apache.solr.common.SolrException:org.apache.solr.common.SolrException: Could not remove replica
org.apache.solr.common.SolrException: Could not remove replica : candidates/candidates_shard2/core_node48
at org.apache.solr.handler.admin.CollectionsHandler.handleResponse(CollectionsHandler.java:364)
at org.apache.solr.handler.admin.CollectionsHandler.handleResponse(CollectionsHandler.java:320)
at org.apache.solr.handler.admin.CollectionsHandler.handleRemoveReplica(CollectionsHandler.java:495)
at org.apache.solr.handler.admin.CollectionsHandler.handleRequestBody(CollectionsHandler.java:184)
at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
at org.apache.solr.servlet.SolrDispatchFilter.handleAdminRequest(SolrDispatchFilter.java:729)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:267)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:207)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:220)
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:122)
at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:612)
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:170)
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:103)
at org.apache.catalina.valves.AutoLoginValve.invoke(AutoLoginValve.java:67)
at org.apache.catalina.valves.RequestFilterValve.process(RequestFilterValve.java:304)
at org.apache.catalina.valves.RemoteAddrValve.invoke(RemoteAddrValve.java:82)
at org.apache.catalina.valves.RemoteIpValve.invoke(RemoteIpValve.java:683)
at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:950)
at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:116)
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:421)
at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1070)
at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:611)
at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:314)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
at java.lang.Thread.run(Thread.java:745)
How should I remove those 2 unwanted parent shards from the cluster?
I'm getting random "connection is closed" errors in Teiid 13.1.0 against SQL Server:
2021-01-08 10:20:23,949 DEBUG [org.teiid.COMMAND_LOG.SOURCE] (Worker513_QueryProcessorQueue9800) Cz9nti5G/vUr ERROR SRC COMMAND: endTime=2021-01-08 10:20:23.949 requestID=Cz9nti5G/vUr.0 sourceCommandID=0 executionID=9632 txID=null modelName=customer translatorName=sqlserver sessionID=Cz9nti5G/vUr principal=sforce-app-user
2021-01-08 10:20:23,949 WARN [org.teiid.CONNECTOR] (Worker513_QueryProcessorQueue9800) Cz9nti5G/vUr Connector worker process failed for atomic-request=Cz9nti5G/vUr.0.0.9632: org.teiid.translator.jdbc.JDBCExecutionException: 0 TEIID11008:TEIID11004 Error executing statement(s): [Prepared Values: ['(111)111-1111'] SQL: SELECT g_0.Id AS c_0, g_0.Email AS c_1, g_0.Phone AS c_2, g_0.parent AS c_3 FROM Customer g_0 WHERE g_0.Phone = ? ORDER BY c_0 OFFSET 0 ROWS FETCH NEXT 2001 ROWS ONLY]
at org.teiid.translator.jdbc.JDBCQueryExecution.execute(JDBCQueryExecution.java:127)
at org.teiid.dqp.internal.datamgr.ConnectorWorkItem.execute(ConnectorWorkItem.java:402)
at sun.reflect.GeneratedMethodAccessor101.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.teiid.dqp.internal.datamgr.ConnectorManager$1.invoke(ConnectorManager.java:228)
at com.sun.proxy.$Proxy44.execute(Unknown Source)
at org.teiid.dqp.internal.process.DataTierTupleSource.getResults(DataTierTupleSource.java:302)
at org.teiid.dqp.internal.process.DataTierTupleSource$1.call(DataTierTupleSource.java:108)
at org.teiid.dqp.internal.process.DataTierTupleSource$1.call(DataTierTupleSource.java:104)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at org.teiid.dqp.internal.process.FutureWork.run(FutureWork.java:59)
at org.teiid.dqp.internal.process.DQPWorkContext.runInContext(DQPWorkContext.java:281)
at org.teiid.dqp.internal.process.ThreadReuseExecutor$RunnableWrapper.run(ThreadReuseExecutor.java:124)
at org.teiid.dqp.internal.process.ThreadReuseExecutor$2.run(ThreadReuseExecutor.java:212)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: com.microsoft.sqlserver.jdbc.SQLServerException: The connection is closed.
at com.microsoft.sqlserver.jdbc.SQLServerException.makeFromDriverError(SQLServerException.java:234)
at com.microsoft.sqlserver.jdbc.SQLServerConnection.checkClosed(SQLServerConnection.java:1130)
at com.microsoft.sqlserver.jdbc.SQLServerConnection.prepareStatement(SQLServerConnection.java:3536)
at org.jboss.jca.adapters.jdbc.BaseWrapperManagedConnection.doPrepareStatement(BaseWrapperManagedConnection.java:758)
at org.jboss.jca.adapters.jdbc.BaseWrapperManagedConnection.prepareStatement(BaseWrapperManagedConnection.java:744)
at org.jboss.jca.adapters.jdbc.WrappedConnection$4.produce(WrappedConnection.java:478)
at org.jboss.jca.adapters.jdbc.WrappedConnection$4.produce(WrappedConnection.java:476)
at org.jboss.jca.adapters.jdbc.SecurityActions.executeInTccl(SecurityActions.java:97)
at org.jboss.jca.adapters.jdbc.WrappedConnection.prepareStatement(WrappedConnection.java:476)
at org.teiid.translator.jdbc.JDBCBaseExecution.getPreparedStatement(JDBCBaseExecution.java:198)
at org.teiid.translator.jdbc.JDBCQueryExecution.execute(JDBCQueryExecution.java:117)
... 17 more
2021-01-08 10:20:23,949 DEBUG [jboss.jdbc.spy] (default task-88) Cz9nti5G/vUr java:/datasources/DATASOURCE [Connection] close()
2021-01-08 10:20:23,949 DEBUG [org.jboss.jca.core.connectionmanager.pool.strategy.OnePool] (default task-88) Cz9nti5G/vUr DATASOURCE: returnConnection(714b1b5a, false) [1/20]
Originally I was seeing this when the SQL Server was restarted: Teiid was not validating the connections in the pool and I'd have to restart Teiid to get connections back. To fix this I added
<pool>
<flush-strategy>EntirePool</flush-strategy>
</pool>
which I tested and confirmed to work. However, I am still getting "The connection is closed" errors at random times.
SQL Server marks the connections as idle after 10 minutes. I do not have an <idle-timeout-minutes> setting on my data source.
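For reference, the WildFly-side idle timeout is configured inside the <datasource> element; a minimal sketch (the 9-minute value is only an illustration, chosen to be below SQL Server's 10 minutes):
<timeout>
<idle-timeout-minutes>9</idle-timeout-minutes>
</timeout>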
My configuration is:
<datasource jta="true" jndi-name="java:/datasources/DATASOURCE" pool-name="DATASOURCE" enabled="true" spy="true" use-ccm="false" statistics-enabled="true">
<connection-url>jdbc:sqlserver://1.1.1.1:1433;DatabaseName=DATABASE</connection-url>
<driver-class>com.microsoft.sqlserver.jdbc.SQLServerDriver</driver-class>
<driver>mssql-jdbc-8.2.0.jre8.jar</driver>
<pool>
<flush-strategy>EntirePool</flush-strategy>
</pool>
<security>
<user-name>USERNAME</user-name>
<password>PASSWORD</password>
</security>
<validation>
<valid-connection-checker class-name="org.jboss.jca.adapters.jdbc.extensions.mssql.MSSQLValidConnectionChecker"/>
<background-validation>false</background-validation>
</validation>
</datasource>
Any idea why Teiid isn't validating and rebuilding the pool when this happens? If it can detect dead connections when I reboot the SQL Server, why can it not detect the dead connections when this random unknown event happens?
How can I investigate further? I'm blind to why the connections die randomly every few days and do not know if CCM would help debug this or if I should be monitoring with netstat.
Teiid does not maintain the connection pools; the WildFly server does. Teiid just requests a connection and uses whatever is returned, which could be a closed connection if the pool is not validated.
The validation settings above look correct. Alternatively, you can follow a similar validation technique, described here [1]:
<validation>
<check-valid-connection-sql>select 1</check-valid-connection-sql>
<validate-on-match>false</validate-on-match>
<background-validation>true</background-validation>
<background-validation-millis>10000</background-validation-millis>
</validation>
[1] http://www.mastertheboss.com/jboss-server/jboss-datasource/how-to-automatically-reconnect-to-the-database-in-wildfly
I have a 2-node SolrCloud setup, version 6.6.6. I have taken a Solr backup from another instance where the collection has 4 shards.
I used the following command to take the backup, which works fine:
http://10.11.31.11:8983/solr/admin/collections?action=BACKUP&name=hms&collection=collection1&location=/tmp/solr_backup&async=1001
After this, I copied the backup to one node of the SolrCloud cluster and executed the following command to restore:
http://10.11.31.12:8983/solr/admin/collections?action=RESTORE&name=hms&location=/home/hduser/Documents/search/data&collection=newCollection&maxShardsPerNode=4&replicationFactor=2&autoAddReplicas=true
I got the following exception when I executed the above command:
<response><lst name="responseHeader"><int name="status">500</int><int name="QTime">60</int></lst><str name="Operation restore caused exception:">org.apache.solr.common.SolrException:org.apache.solr.common.SolrException: Couldn't restore since doesn't exist: file:///home/user12/Documents/search/data/hms</str><lst name="exception"><str name="msg">Couldn't restore since doesn't exist: file:///home/user12/Documents/search/data/hms</str><int name="rspCode">500</int></lst><lst name="error"><lst name="metadata"><str name="error-class">org.apache.solr.common.SolrException</str><str name="root-error-class">org.apache.solr.common.SolrException</str></lst><str name="msg">Couldn't restore since doesn't exist: file:///home/user12/Documents/search/data/hms</str><str name="trace">org.apache.solr.common.SolrException: Couldn't restore since doesn't exist: file:///home/user12/Documents/search/data/hms
at org.apache.solr.handler.admin.CollectionsHandler.handleResponse(CollectionsHandler.java:300)
at org.apache.solr.handler.admin.CollectionsHandler.invokeAction(CollectionsHandler.java:237)
at org.apache.solr.handler.admin.CollectionsHandler.handleRequestBody(CollectionsHandler.java:215)
at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:173)
at org.apache.solr.servlet.HttpSolrCall.handleAdmin(HttpSolrCall.java:749)
at org.apache.solr.servlet.HttpSolrCall.handleAdminRequest(HttpSolrCall.java:730)
at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:510)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:361)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:305)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1691)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
at org.eclipse.jetty.server.Server.handle(Server.java:534)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)
at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)
at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)
at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
at java.lang.Thread.run(Thread.java:748)
</str><int name="code">500</int></lst></response>
Then I copied the backup data to the second node as well, at exactly the same path as on the first node, and re-executed the restore command. The job was successful.
Does this mean that for every backup it is necessary to copy the backup to all Solr nodes at exactly the same path, or is it a bug?
I was expecting that Solr would allow restoring the collection from a single node and then replicate the shards on its own. Isn't that the case?
The directory you're restoring from has to be available on all the servers at the same path. The path is assumed to be a network share available in a common location on the servers.
location: The location on the shared drive for the restore command to read from.
And from the example in that documentation:
..&name=myBackupName&location=/path/to/my/shared/drive&...
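In other words, both Solr nodes must see the backup at one common path, for example a directory exported over NFS and mounted identically on every node; the restore would then be pointed at that shared mount (the mount point below is illustrative, the other parameters are the ones from your command):
http://10.11.31.12:8983/solr/admin/collections?action=RESTORE&name=hms&location=/mnt/solr_backup&collection=newCollection&maxShardsPerNode=4&replicationFactor=2&autoAddReplicas=true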
I am using Flume 1.5.0 to migrate SQL Server data to Amazon S3.
I am migrating only incremental data to S3, so whenever a new record is inserted into my SQL Server it should be replicated to S3.
I can write SQL Server data to S3 in the North Virginia region, but when I create a bucket in the Mumbai region and write data there, it throws the error below:
19/09/07 13:01:08 WARN hdfs.HDFSEventSink: HDFS IO error
java.io.IOException: s3n://BUCKET-NAME : 400 : Bad Request
at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.processException(Jets3tNativeFileSystemStore.java:453)
at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.processException(Jets3tNativeFileSystemStore.java:427)
at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.handleException(Jets3tNativeFileSystemStore.java:411)
at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.retrieveMetadata(Jets3tNativeFileSystemStore.java:181)
at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at org.apache.hadoop.fs.s3native.$Proxy8.retrieveMetadata(Unknown Source)
at org.apache.hadoop.fs.s3native.NativeS3FileSystem.getFileStatus(NativeS3FileSystem.java:476)
at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1424)
at org.apache.hadoop.fs.s3native.NativeS3FileSystem.create(NativeS3FileSystem.java:403)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:909)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:890)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:787)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:776)
at org.apache.flume.sink.hdfs.HDFSDataStream.doOpen(HDFSDataStream.java:86)
at org.apache.flume.sink.hdfs.HDFSDataStream.open(HDFSDataStream.java:113)
at org.apache.flume.sink.hdfs.BucketWriter$1.call(BucketWriter.java:273)
at org.apache.flume.sink.hdfs.BucketWriter$1.call(BucketWriter.java:262)
at org.apache.flume.sink.hdfs.BucketWriter$9$1.run(BucketWriter.java:718)
at org.apache.flume.sink.hdfs.BucketWriter.runPrivileged(BucketWriter.java:183)
at org.apache.flume.sink.hdfs.BucketWriter.access$1700(BucketWriter.java:59)
at org.apache.flume.sink.hdfs.BucketWriter$9.call(BucketWriter.java:715)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.jets3t.service.impl.rest.HttpException: 400 Bad Request
My Flume conf sink properties are as below:
# SINK
agent.sinks = s3hdfs
agent.sinks.s3hdfs.type = hdfs
agent.sinks.s3hdfs.hdfs.path = s3n://ACCESS_KEY:SECRET_KEY#BUCKET-NAME/test
agent.sinks.s3hdfs.hdfs.fileType = DataStream
agent.sinks.s3hdfs.hdfs.filePrefix = test
agent.sinks.s3hdfs.hdfs.writeFormat = Text
agent.sinks.s3hdfs.hdfs.rollCount = 0
agent.sinks.s3hdfs.hdfs.rollSize = 67108864 #64Mb filesize
agent.sinks.s3hdfs.hdfs.batchSize = 100
agent.sinks.s3hdfs.hdfs.rollInterval = 1
agent.sinks.s3hdfs.channel = ch1
I tried s3a instead of s3n, but that did not work either.
Please suggest what changes I need to make so it can write to an S3 bucket in the Mumbai region. If I need to specify the region name in the Flume conf file, how would I do that?
Thanks in advance.
S3N is obsolete. S3A will work with it; look at the Hadoop docs on fs.s3a.endpoint and the signing algorithm. These are supported in Hadoop 2.8+.
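As a rough sketch of what that looks like (the property names are the standard Hadoop S3A ones, the endpoint is the ap-south-1/Mumbai S3 endpoint, and the credentials are placeholders), the core-site.xml used by the Flume agent's Hadoop libraries would carry something like:
<property>
<name>fs.s3a.endpoint</name>
<value>s3.ap-south-1.amazonaws.com</value>
</property>
<property>
<name>fs.s3a.access.key</name>
<value>ACCESS_KEY</value>
</property>
<property>
<name>fs.s3a.secret.key</name>
<value>SECRET_KEY</value>
</property>
The sink path would then use the s3a scheme, e.g. agent.sinks.s3hdfs.hdfs.path = s3a://BUCKET-NAME/test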
I am running a program that crawls the web and saves data into a Solr index. For mysterious reasons, the Solr server crashed, and now I have ended up with a corrupted index that has no segments file, so I risk losing all the data collected over 5 days.
The error message below appears when trying to search this index. The index folder definitely has data: it contains 182 files and is about 2 GB in size.
I have tried to use CheckIndex, but I get the same error about the missing segments file.
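(For reference, CheckIndex is run directly against the index directory, along these lines; the jar version is whatever ships with your Solr install, and the path is the one from the error below:
java -cp lucene-core-<version>.jar org.apache.lucene.index.CheckIndex /home/zqz/Work/chase/aws/data/solr/chase/data/index
It refuses to proceed without a segments_N file, which is exactly the problem here.)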
java.util.concurrent.ExecutionException: org.apache.solr.common.SolrException: Unable to create core [chase]
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:192)
at org.apache.solr.core.CoreContainer.lambda$load$6(CoreContainer.java:586)
at com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:176)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:229)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.solr.common.SolrException: Unable to create core [chase]
at org.apache.solr.core.CoreContainer.create(CoreContainer.java:935)
at org.apache.solr.core.CoreContainer.lambda$load$5(CoreContainer.java:558)
at com.codahale.metrics.InstrumentedExecutorService$InstrumentedCallable.call(InstrumentedExecutorService.java:197)
... 5 more
Caused by: org.apache.solr.common.SolrException: Error opening new searcher
at org.apache.solr.core.SolrCore.<init>(SolrCore.java:977)
at org.apache.solr.core.SolrCore.<init>(SolrCore.java:830)
at org.apache.solr.core.CoreContainer.create(CoreContainer.java:920)
... 7 more
Caused by: org.apache.solr.common.SolrException: Error opening new searcher
at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:2069)
at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:2189)
at org.apache.solr.core.SolrCore.initSearcher(SolrCore.java:1071)
at org.apache.solr.core.SolrCore.<init>(SolrCore.java:949)
... 9 more
Caused by: org.apache.lucene.index.IndexNotFoundException: no segments* file found in LockValidatingDirectoryWrapper(NRTCachingDirectory(MMapDirectory#/home/zqz/Work/chase/aws/data/solr/chase/data/index lockFactory=org.apache.lucene.store.NativeFSLockFactory#51b2fc7e; maxCacheMB=48.0 maxMergeSizeMB=4.0)): files: [_fh2.fdt, _fh2.fdx, _fh2.fnm, _fh2.nvd, _fh2.nvm, _fh2.si, _fh2_Lucene50_0.doc, _fh2_Lucene50_0.pos, _fh2_Lucene50_0.tim, _fh2_Lucene50_0.tip, write.lock]
at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:925)
at org.apache.solr.update.SolrIndexWriter.<init>(SolrIndexWriter.java:118)
at org.apache.solr.update.SolrIndexWriter.create(SolrIndexWriter.java:93)
at org.apache.solr.update.DefaultSolrCoreState.createMainIndexWriter(DefaultSolrCoreState.java:248)
at org.apache.solr.update.DefaultSolrCoreState.getIndexWriter(DefaultSolrCoreState.java:122)
at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:2030)
... 12 more
2017-06-20 14:38:52.428 INFO (qtp475266352-16) [ ] o.a.s.c.TransientSolrCoreCacheDefault Allocating transient cache for 2147483647 transient cores
2017-06-20 14:38:52.894 INFO (qtp475266352-13) [ ] o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/cores params={indexInfo=false&wt=json&_=1497969532681} status=0 QTime=11
2017-06-20 14:38:52.962 INFO (qtp475266352-20) [ ] o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/system params={wt=json&_=1497969532684} status=0 QTime=76
The error you mentioned is caused by a missing file:
segments* (e.g. segments_3)
in the index directory's file list:
files: [_fh2.fdt, _fh2.fdx, _fh2.fnm, _fh2.nvd, _fh2.nvm, _fh2.si, _fh2_Lucene50_0.doc, _fh2_Lucene50_0.pos, _fh2_Lucene50_0.tim, _fh2_Lucene50_0.tip, write.lock]
That file specifies the last commit point and the latest generation of segments to take into account, and apparently it is missing.
Check if that file is there and is readable.
If it is not (for example because the index writer was not closed properly due to the malfunction), do not despair.
Chances are the transaction log still contains the documents you indexed, so you can just replay it and get the documents back (clean out the index directory, start Solr, and it should take care of the replay).
Solr also offers backup functionality, so for the future you may want to configure it.
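For a single (non-cloud) core, a backup can be triggered through the replication handler, roughly like this (host and backup location are illustrative; the core name is taken from the error above):
http://localhost:8983/solr/chase/replication?command=backup&location=/path/to/backups&name=chase-backup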
I am running a single-node Cassandra instance (for dev purposes) and am looking to insert a row with an integer column into it. My keyspace and column family are already created in Cassandra.
I am using Cassandra 1.0 with Hector 1.0.5 (Jar version). My code is as follows:
Cluster cluster = HFactory.getOrCreateCluster("Test Cluster", "10.40.14.93:9160");
Keyspace keyspaceOperator = HFactory.createKeyspace("mykeyspace", cluster);
Mutator intM = HFactory.createMutator(keyspaceOperator, IntegerSerializer.get());
for each elem in my list {
intM.insert(doc.document_id ,
"mycolfamily",
me.prettyprint.hector.api.factory.HFactory.createColumn("numAdults", doc.numAdults))
}
I get TimedOutException on my client, and in the Cassandra logs, I see a bunch of the following:
ERROR [MutationStage:357] 2012-07-20 08:15:02,106 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[MutationStage:357,5,main]
java.lang.RuntimeException: java.lang.NumberFormatException: For input string: ""
at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1228)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.NumberFormatException: For input string: ""
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
at java.lang.Long.parseLong(Long.java:410)
at java.lang.Long.parseLong(Long.java:468)
at org.apache.solr.schema.TrieField.createField(TrieField.java:508)
at org.apache.solr.schema.FieldType.createFields(FieldType.java:292)
at org.apache.solr.schema.SchemaField.createFields(SchemaField.java:106)
at com.datastax.bdp.cassandra.index.solr.SolrSecondaryIndex.addFieldToDocument(SolrSecondaryIndex.java:382)
at com.datastax.bdp.cassandra.index.solr.SolrSecondaryIndex.populateDocument(SolrSecondaryIndex.java:280)
at com.datastax.bdp.cassandra.index.solr.SolrSecondaryIndex.applyIndexUpdates(SolrSecondaryIndex.java:164)
at org.apache.cassandra.db.index.SecondaryIndexManager.applyIndexUpdates(SecondaryIndexManager.java:419)
at org.apache.cassandra.db.Table.apply(Table.java:448)
at org.apache.cassandra.db.RowMutation.apply(RowMutation.java:256)
at org.apache.cassandra.service.StorageProxy$6.runMayThrow(StorageProxy.java:415)
at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1224)
... 3 more
I am trialling DataStax Enterprise (DSE), which packages Cassandra, Hadoop, Solr, etc. I created my Cassandra CF via Solr configuration (you can post Solr config and schema XMLs to a DataStax instance to create the keyspace and CF; it's a feature of DSE).
Could someone please help?
Try adding explicit serializers to your createColumn call, like so:
me.prettyprint.hector.api.factory.HFactory.createColumn("numAdults", doc.numAdults, StringSerializer.get(), IntegerSerializer.get())
Also, on another note, I see you're doing inserts in a loop. Doing intM.addInsertion inside the loop and then intM.execute() once it's done is more efficient.
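A minimal sketch of that batching pattern, assuming doc.document_id is an Integer (matching the IntegerSerializer used for the mutator) and that docs is the list being iterated in the question (Doc is a stand-in for whatever type holds document_id and numAdults):
import me.prettyprint.cassandra.serializers.IntegerSerializer;
import me.prettyprint.cassandra.serializers.StringSerializer;
import me.prettyprint.hector.api.Cluster;
import me.prettyprint.hector.api.Keyspace;
import me.prettyprint.hector.api.factory.HFactory;
import me.prettyprint.hector.api.mutation.Mutator;

Cluster cluster = HFactory.getOrCreateCluster("Test Cluster", "10.40.14.93:9160");
Keyspace keyspace = HFactory.createKeyspace("mykeyspace", cluster);
Mutator<Integer> mutator = HFactory.createMutator(keyspace, IntegerSerializer.get());

// Queue one insertion per document; nothing is sent to Cassandra yet.
for (Doc doc : docs) {
    mutator.addInsertion(doc.document_id, "mycolfamily",
            HFactory.createColumn("numAdults", doc.numAdults,
                    StringSerializer.get(), IntegerSerializer.get()));
}

// One round trip (batch_mutate) for the whole batch.
mutator.execute();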