I shut down my Neo4J instance every night to do a backup. This morning I found that it failed to start up again:
2015-12-05 03:38:49.326+0000 INFO Successfully shutdown Neo4j Server
2015-12-05 03:38:49.330+0000 ERROR Failed to start Neo4j: Starting Neo4j failed: Component 'org.neo4j.server.database.LifecycleManagingDatabase#7728902c' was successfully initialized, but failed to start. Please see attached cause exception. Starting Neo4j failed: Component 'org.neo4j.server.database.LifecycleManagingDatabase#7728902c' was successfully initialized, but failed to start. Please see attached cause exception.
org.neo4j.server.ServerStartupException: Starting Neo4j failed: Component 'org.neo4j.server.database.LifecycleManagingDatabase#7728902c' was successfully initialized, but failed to start. Please see attached cause exception.
at org.neo4j.server.exception.ServerStartupErrors.translateToServerStartupError(ServerStartupErrors.java:67)
at org.neo4j.server.AbstractNeoServer.start(AbstractNeoServer.java:234)
at org.neo4j.server.Bootstrapper.start(Bootstrapper.java:97)
at org.neo4j.server.CommunityBootstrapper.start(CommunityBootstrapper.java:48)
at org.neo4j.server.CommunityBootstrapper.main(CommunityBootstrapper.java:35)
Caused by: org.neo4j.kernel.lifecycle.LifecycleException: Component 'org.neo4j.server.database.LifecycleManagingDatabase#7728902c' was successfully initialized, but failed to start. Please see attached cause exception.
at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.start(LifeSupport.java:462)
at org.neo4j.kernel.lifecycle.LifeSupport.start(LifeSupport.java:111)
at org.neo4j.server.AbstractNeoServer.start(AbstractNeoServer.java:194)
... 3 more
Caused by: java.lang.RuntimeException: Error starting org.neo4j.kernel.impl.factory.CommunityFacadeFactory, /lustre/scratch116/vr/vrpipe/neo4j/production/db
at org.neo4j.kernel.impl.factory.GraphDatabaseFacadeFactory.newFacade(GraphDatabaseFacadeFactory.java:143)
at org.neo4j.kernel.impl.factory.CommunityFacadeFactory.newFacade(CommunityFacadeFactory.java:43)
at org.neo4j.kernel.impl.factory.GraphDatabaseFacadeFactory.newFacade(GraphDatabaseFacadeFactory.java:108)
at org.neo4j.server.CommunityNeoServer$1.newGraphDatabase(CommunityNeoServer.java:66)
at org.neo4j.server.database.LifecycleManagingDatabase.start(LifecycleManagingDatabase.java:95)
at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.start(LifeSupport.java:452)
... 5 more
Caused by: org.neo4j.kernel.lifecycle.LifecycleException: Component 'org.neo4j.kernel.api.impl.index.LuceneLabelScanStore#28c94a12' failed to initialize. Please see attached cause exception.
at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.init(LifeSupport.java:434)
at org.neo4j.kernel.lifecycle.LifeSupport.init(LifeSupport.java:66)
at org.neo4j.kernel.lifecycle.LifeSupport.start(LifeSupport.java:102)
at org.neo4j.kernel.NeoStoreDataSource.start(NeoStoreDataSource.java:600)
at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.start(LifeSupport.java:452)
at org.neo4j.kernel.lifecycle.LifeSupport.start(LifeSupport.java:111)
at org.neo4j.kernel.impl.transaction.state.DataSourceManager.start(DataSourceManager.java:112)
at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.start(LifeSupport.java:452)
at org.neo4j.kernel.lifecycle.LifeSupport.start(LifeSupport.java:111)
at org.neo4j.kernel.impl.factory.GraphDatabaseFacadeFactory.newFacade(GraphDatabaseFacadeFactory.java:139)
... 10 more
Caused by: java.io.IOException: Label scan store could not be read, and needs to be rebuilt. To trigger a rebuild, ensure the database is stopped, delete the files in '/lustre/scratch116/vr/vrpipe/neo4j/production/db/schema/label/lucene', and then start the database again.
at org.neo4j.kernel.api.impl.index.LuceneLabelScanStore.init(LuceneLabelScanStore.java:259)
at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.init(LifeSupport.java:424)
... 19 more
I followed its advice to delete db/schema/label/lucene/*, and the database started up fine, but I can't query any existing nodes or relationships. The web front end says I have no node labels or relationship types. I tried doing match (n)-[r]-() return n,r, but that returns nothing.
How do I get my database back? Perhaps I need to force rebuilding of the lucene indexes somehow?
You took a backup before you deleted it?
You only deleted that directory?
What does the new startup log look like?
How much data do you have in your db?
What does this return? match (n) return count(*)
Related
I am getting database error while creating tasks in nexus repository management, in the logs it showing as follows.
Error Log
2022-09-23 09:57:34,637+0000 ERROR [status-delayed-tasks-2-thread-1] *SYSTEM com.orientechnologies.orient.core.db.OPartitionedDatabasePool$DatabaseDocumentTxPooled - $ANSI{green {db=component}} Error on transaction commit `52E3D568`
com.orientechnologies.orient.core.exception.OLowDiskSpaceException: Error occurred while executing a write operation to database 'component' due to limited free space on the disk (1751 MB). The database is now working in read-only mode. Please close the database (or stop OrientDB), make room on your hard drive and then reopen the database. The minimal required space is 4096 MB. Required space is now set to 4096MB (you can change it by setting parameter storage.diskCache.diskFreeSpaceLimit) .
DB name="component"
at com.orientechnologies.orient.core.storage.impl.local.OAbstractPaginatedStorage.checkLowDiskSpaceRequestsAndReadOnlyConditions(OAbstractPaginatedStorage.java:5073)
at com.orientechnologies.orient.core.storage.impl.local.OAbstractPaginatedStorage.commit(OAbstractPaginatedStorage.java:1729)
at com.orientechnologies.orient.core.tx.OTransactionOptimistic.doCommit(OTransactionOptimistic.java:541)
at com.orientechnologies.orient.core.tx.OTransactionOptimistic.commit(OTransactionOptimistic.java:99)
at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx.commit(ODatabaseDocumentTx.java:2908)
at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx.commit(ODatabaseDocumentTx.java:2870)
at org.sonatype.nexus.orient.transaction.OrientTransaction.commit(OrientTransaction.java:74)
at org.sonatype.nexus.transaction.TransactionalWrapper.proceedWithTransaction(TransactionalWrapper.java:69)
at org.sonatype.nexus.transaction.Operations.proceedWithTransaction(Operations.java:232)
at org.sonatype.nexus.transaction.Operations.transactional(Operations.java:223)
at org.sonatype.nexus.transaction.Operations.run(Operations.java:175)
at org.sonatype.nexus.orient.transaction.OrientOperations.run(OrientOperations.java:62)
at org.sonatype.nexus.orient.internal.status.OrientStatusHealthCheckStore.checkWritable(OrientStatusHealthCheckStore.java:82)
at org.sonatype.nexus.orient.internal.status.OrientStatusHealthCheckStore$$EnhancerByGuice$$180293120.GUICE$TRAMPOLINE(<generated>)
at com.google.inject.internal.InterceptorStackCallback$InterceptedMethodInvocation.proceed(InterceptorStackCallback.java:74)
at org.sonatype.nexus.common.stateguard.MethodInvocationAction.run(MethodInvocationAction.java:39)
at org.sonatype.nexus.common.stateguard.StateGuard$GuardImpl.run(StateGuard.java:272)
at org.sonatype.nexus.common.stateguard.GuardedInterceptor.invoke(GuardedInterceptor.java:54)
at com.google.inject.internal.InterceptorStackCallback$InterceptedMethodInvocation.proceed(InterceptorStackCallback.java:75)
at com.google.inject.internal.InterceptorStackCallback.invoke(InterceptorStackCallback.java:55)
at org.sonatype.nexus.orient.internal.status.OrientStatusHealthCheckStore$$EnhancerByGuice$$180293120.checkWritable(<generated>)
at org.sonatype.nexus.orient.internal.freeze.OrientFreezeService.checkWritable(OrientFreezeService.java:119)
at org.sonatype.nexus.thread.DatabaseStatusDelayedExecutor.lambda$1(DatabaseStatusDelayedExecutor.java:103)
at org.sonatype.nexus.thread.DatabaseStatusDelayedExecutor.lambda$0(DatabaseStatusDelayedExecutor.java:90)
at org.sonatype.nexus.thread.internal.MDCAwareRunnable.run(MDCAwareRunnable.java:40)
at org.apache.shiro.subject.support.SubjectRunnable.doRun(SubjectRunnable.java:120)
at org.apache.shiro.subject.support.SubjectRunnable.run(SubjectRunnable.java:108)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
so anyone knows what's the wrong with it or what may cause this problem?
I'm getting started with Kapacitor and have been trying to run the first guide in the Kapacitor documentation, but with data I already have. I managed to define a task, but I can neither enable it nor can I run a backfill. I came across this question, which is similar to my problem, but the answer there didn't help. In contrast to the error message there I get empty strings for database, retention policy, and/or measurement.
In Kapacitor config I set an InfluxDB connection to the local host instance with the name localhost (which has a database mydb and the measurements weather.current.clouds and weather.current.visibility with default retention policy autogen) and created the following weathertest.tick script:
dbrp "mydb"."autogen"
var clouds = batch
|query('select mean(value) / 100.0 as val from "mydb"."autogen"."weather.current.clouds"')
.period(1h)
.every(1h)
.groupBy(time(1m), *)
.fill(0)
var vis = batch
|query('select mean(value) / 10000.0 as val from "mydb"."autogen"."weather.current.visibility"')
.period(1h)
.every(1h)
.groupBy(time(1m), *)
.fill(0)
clouds
|join(vis)
.as('c', 'v')
|eval(lambda: 100 * (1 - "c.val") * "v.val")
.as('pcent')
|influxDBOut()
.cluster('localhost')
.database('mydb')
.retentionPolicy('autogen')
.measurement('testmetric')
.tag('host', 'myhost.local')
.tag('key', 'weather.current.lightidx')
This is what I came up with after hours of trial and (especially) error. As given in the title, when I try to enable my task with kapacitor enable weathertest, I get the error message enabling task weathertest: batch query is not allowed to request data from ""."". Same thing when I try to record as in the "Backfill" example. Also, in that example there is a start and a stop date for limiting the time frame. The time format given there is wrong and is not understood by Kapacitor. Instead of e. g. 2015-10-01 I have to put in 2015-10-01T00:00Z to make it at least pass the error message regarding time format error.
In the Kapacitor logs there is not a single line regarding these errors, only when I try to remove a record, I get something like remove /var/lib/kapacitor/replay/1f5...750.brpl: no such file or directory and this can be found in the logs. There are lots of info lines in the logs showing successful POSTs to/from InfluxDB for the _internal database with HTTP response result 204.
Has anyone an Idea what I may be doing wrong?
OK, after the weekend I tried again. Without any change it accepted my script now in the failing steps, however, now I was able to find error messages in the log. The node mentioned there was the eval node and pointed towards a type mismatch. When I changed the line
|eval(lambda: 100 * (1 - "c.val") * "v.val")
to
|eval(lambda: 100.0 * (1.0 - "c.val") * "v.val")
the error messages were gone and the command kapacitor show weathertest showed a rather sane content now.
Furthermore, I redefined, recorded, replayed and deleted the tasks and recordings during my tests over and over again and I may have forgotten to redefine tasks after making changes to the tick script (not really sure). After changing the above, redefining the task and replaying it I finally found the expected data in the InfluxDB instance.
I have just set up a WSO2 Message Broker 3.0.0 connecting to a SQL Server DB.
The DB for the Carbon MB component has been created successfully as well.
The DB for the Message Broker Data store is created and contains the table MB_QUEUE_MAPPING.
However when adding a Queue via the MB UI I see the following error in the stack trace:
[2015-12-16 15:00:41,472] ERROR {org.wso2.andes.store.rdbms.RDBMSMessageStoreImpl} - Error occurred while retrieving destination queue id for destina
tion queue TestQ
java.sql.SQLException: Invalid object name 'MB_QUEUE_MAPPING'.
at net.sourceforge.jtds.jdbc.SQLDiagnostic.addDiagnostic(SQLDiagnostic.java:372)
at net.sourceforge.jtds.jdbc.TdsCore.tdsErrorToken(TdsCore.java:2988)
at net.sourceforge.jtds.jdbc.TdsCore.nextToken(TdsCore.java:2421)
at net.sourceforge.jtds.jdbc.TdsCore.getMoreResults(TdsCore.java:671)
at net.sourceforge.jtds.jdbc.JtdsStatement.executeSQLQuery(JtdsStatement.java:505)
at net.sourceforge.jtds.jdbc.JtdsPreparedStatement.executeQuery(JtdsPreparedStatement.java:1029)
at org.wso2.andes.store.rdbms.RDBMSMessageStoreImpl.getQueueID(RDBMSMessageStoreImpl.java:1324)
at org.wso2.andes.store.rdbms.RDBMSMessageStoreImpl.getCachedQueueID(RDBMSMessageStoreImpl.java:1298)
at org.wso2.andes.store.rdbms.RDBMSMessageStoreImpl.addQueue(RDBMSMessageStoreImpl.java:1634)
at org.wso2.andes.store.FailureObservingMessageStore.addQueue(FailureObservingMessageStore.java:445)
at org.wso2.andes.kernel.AMQPConstructStore.addQueue(AMQPConstructStore.java:116)
at org.wso2.andes.kernel.AndesContextInformationManager.createQueue(AndesContextInformationManager.java:154)
at org.wso2.andes.kernel.disruptor.inbound.InboundQueueEvent.updateState(InboundQueueEvent.java:151)
at org.wso2.andes.kernel.disruptor.inbound.InboundEventContainer.updateState(InboundEventContainer.java:167)
at org.wso2.andes.kernel.disruptor.inbound.StateEventHandler.onEvent(StateEventHandler.java:67)
at org.wso2.andes.kernel.disruptor.inbound.StateEventHandler.onEvent(StateEventHandler.java:41)
at com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:128)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
The "Add Queue" screen does not go away however the Queue does get added to the MB_QUEUE table just fine in the DB. Both tables MB_QUEUE_MAPPING & MB_QUEUE_COUNTER are blank.
The "List Queues" screen does blank despite a number of Queues in the MB_QUEUE table. Stack trace also shows errors but is not included as its not relevant to the error above.
I can create a Topic just fine however.
I want to know why MB would say the table MB_QUEUE_MAPPING is an Invalid object name when the table clearly exists ?
I suspect the way you have configure the mysql database is incorrect.So you can better try out one of these below two scenarios to make sure about this issue.
1) starting the server for the first time with the -Dsetup parameter or
2) you can refer the documentation(https://docs.wso2.com/display/MB300/Configuring+MySQL) "Configuring MySQL" and follow step by step instructions given in order.
I have tried out the second scenario and I did not get any exception when I am adding queue.And the document I have mentioned will have to be update as below.
you can see this command in the step 3.
mysql -u <db_user_name> -p -D<database_name> < '<WSO2MB_HOME>/dbscripts/mb-store/mysql-mb.sql ';
db_user_name - username of db.
database_name - database name that you have created in the step 1.
WSO2MB_HOME - home directory path for MB.
Hope this could help you to resolve this issue.
It seems user connecting to MSSQL database not having correct permission. Most probably SELECT permission. Reason why I am saying is, when you adding queue, it does get added. This means user has INSERT permission. Once queue added, page redirected to Queue List page. User must have SELECT permission to retrieve queue list. Topic are not getting added to database, it keeps in registry. You can verify user who connecting to MSSQL from configuration like below in wso2mb-3.0.0/repository/conf/datasources/master-datasources.xml.
<datasource>
<name>WSO2_MB_STORE_DB</name>
<jndiConfig>
<name>WSO2MBStoreDB</name>
</jndiConfig>
<definition type="RDBMS">
<configuration>
<url>jdbc:jtds:sqlserver://localhost:1433/wso2_mb</url>
<username>sa</username>
<password>sa</password>
<driverClassName>net.sourceforge.jtds.jdbc.Driver</driverClassName>
<maxActive>200</maxActive>
<maxWait>60000</maxWait>
<minIdle>5</minIdle>
<testOnBorrow>true</testOnBorrow>
<validationQuery>SELECT 1</validationQuery>
<validationInterval>30000</validationInterval>
<defaultAutoCommit>false</defaultAutoCommit>
</configuration>
</definition>
</datasource>
I am using a TITAN-0.4.3, REXSTER 2.4 over Cassandra & Elasticsearch.
I am calling rexpro from Python. In a single gremlin-request, I am trying to add 100 vertices and commit. I am able to successfully add 40000+ vertices, in 400+ gremlin-requests. However after that , I am getting exception :
Encountered a RexProScriptException: An error occurred while processing the script for language [groov
y]. All transactions across all graphs in the session have been concluded with failure: java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: PermGen space
Rexster.sh [JVM heap size]
I tried to increase heap memory, but still throws the exception, after insertion of few more batches of vertices.
# Set Java options
if [ "$JAVA_OPTIONS" = "" ] ; then
JAVA_OPTIONS="-Xms256m -Xmx1024m"
fi
Please advice
Just a guess based on the information you provided, but.....PermGen errors usually show up in Rexster if you are not parameterizing the scripts you are sending. Most of the python libraries out there that I know of support that feature. You can read more about this issue here:
https://github.com/tinkerpop/rexster/issues/143
and other places in the gremlin users mailing list if you search around. If for some reason you can't parameterize then you can alter this JVM setting:
-XX:+CMSClassUnloadingEnabled -XX:MaxPermSize=512M
but I'd consider that a last resort. Parameterization should not only get rid of your problem but will also greatly speed up your data loading process.
I've been having these issues for quite a while already but I ignored them initially because I can still start my nodes. However, one of these issues became more serious recently that it now takes me a lot of tries in order to successfully start a node.
Issue #1: Unable to start DSE server / Plugin activation failed / Cannot find core
ERROR [main] 2015-01-28 03:30:40,058 DseDaemon.java (line 492) Unable to start DSE server.
java.lang.RuntimeException: com.datastax.bdp.plugin.PluginManager$PluginActivationException: Plugin activation failed
at com.datastax.bdp.plugin.PluginManager.activate(PluginManager.java:135)
at com.datastax.bdp.server.DseDaemon.start(DseDaemon.java:480)
at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:509)
at com.datastax.bdp.server.DseDaemon.main(DseDaemon.java:659)
Caused by: com.datastax.bdp.plugin.PluginManager$PluginActivationException: Plugin activation failed
at com.datastax.bdp.plugin.PluginManager.activate(PluginManager.java:284)
at com.datastax.bdp.plugin.PluginManager.activate(PluginManager.java:128)
... 3 more
Caused by: java.lang.IllegalStateException: Cannot find core: myks.mycf
at com.datastax.bdp.search.solr.core.SolrCoreResourceManager.doWaitForCore(SolrCoreResourceManager.java:742)
at com.datastax.bdp.search.solr.core.SolrCoreResourceManager.waitForCore(SolrCoreResourceManager.java:478)
at com.datastax.bdp.plugin.SolrContainerPlugin.waitForSecondaryIndexesLoading(SolrContainerPlugin.java:237)
at com.datastax.bdp.plugin.SolrContainerPlugin.onActivate(SolrContainerPlugin.java:98)
at com.datastax.bdp.plugin.PluginManager.initialize(PluginManager.java:334)
at com.datastax.bdp.plugin.PluginManager.activate(PluginManager.java:263)
... 4 more
INFO [Thread-3] 2015-01-28 03:30:40,059 DseDaemon.java (line 505) DSE shutting down...
INFO [StorageServiceShutdownHook] 2015-01-28 03:30:40,164 Gossiper.java (line 1307) Announcing shutdown
INFO [Thread-3] 2015-01-28 03:30:40,620 PluginManager.java (line 356) All plugins are stopped.
INFO [Thread-3] 2015-01-28 03:30:40,620 CassandraDaemon.java (line 463) Cassandra shutting down...
INFO [StorageServiceShutdownHook] 2015-01-28 03:30:42,165 MessagingService.java (line 701) Waiting for messaging service to quiesce
INFO [ACCEPT-/144.76.201.233] 2015-01-28 03:30:42,814 MessagingService.java (line 941) MessagingService has terminated the accept() thread
This exception started as a "mild" issue - mild because although it prevents a node from starting up when it happens, it usually takes me 1 more try to successfully start the affected node. However, about two weeks ago, after having not restarted any of my nodes for quite a while, I discovered that I now need a lot more attempts (20+) in order to start a node.
From the stack trace, it looks like a timeout issue (in doWaitForCore()); but I cannot find a setting to increase the amount of time that DSE would wait for a core to load during startup before giving up. The core that is mentioned in the stack trace is always the same, and I assume that this is because it is my biggest core (~1.4 billions records) and it takes the longest time to load. But when I manage to start the node successfully, there are no signs of errors - I can query the core like any other core.
--
There are two other issues that may or may not be related to the one above. Both of them always appear during startup; and unlike the first one, they do not cause a startup failure (i.e. they also appear when a node starts successfully)
Issue #2: Invalid Number: static
ERROR [searcherExecutor-67-thread-1] 2015-01-28 04:26:49,691 SolrException.java (line 124) org.apache.solr.common.SolrException: Invalid Number: static
at org.apache.solr.schema.TrieField.readableToIndexed(TrieField.java:396)
at org.apache.solr.schema.FieldType.getFieldQuery(FieldType.java:697)
at org.apache.solr.schema.TrieField.getFieldQuery(TrieField.java:343)
at org.apache.solr.parser.SolrQueryParserBase.getFieldQuery(SolrQueryParserBase.java:741)
at org.apache.solr.parser.SolrQueryParserBase.handleBareTokenQuery(SolrQueryParserBase.java:545)
at org.apache.solr.parser.QueryParser.Term(QueryParser.java:300)
at org.apache.solr.parser.QueryParser.Clause(QueryParser.java:186)
at org.apache.solr.parser.QueryParser.Query(QueryParser.java:108)
at org.apache.solr.parser.QueryParser.TopLevelQuery(QueryParser.java:97)
at org.apache.solr.parser.SolrQueryParserBase.parse(SolrQueryParserBase.java:153)
at org.apache.solr.search.LuceneQParser.parse(LuceneQParser.java:50)
at org.apache.solr.search.QParser.getQuery(QParser.java:143)
at org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:135)
at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:183)
I looked at the data that I imported and I couldn't find a supposedly-numeric value that was incorrectly supplied as "static". In the java application that I wrote to convert CSVs to SSTables, I cast all numeric values to int/long/double depending on the field type so I honestly don't think that it has something to do with my data.
Issue #3: Could not getStatistics on info bean com.datastax.bdp.search.solr.FilterCacheMBean
WARN [SolrSecondaryIndex myks.mycf2 index initializer.] 2015-01-28 04:26:51,770 JmxMonitoredMap.java (line 256) Could not getStatistics on info bean com.datastax.bdp.search.solr.FilterCacheMBean
java.lang.RuntimeException: java.lang.ClassCastException: org.apache.lucene.search.FieldCache$CreationPlaceholder cannot be cast to org.apache.solr.search.SolrCache
at com.datastax.bdp.search.solr.FilterCacheMBean.getStatistics(FilterCacheMBean.java:185)
at org.apache.solr.core.JmxMonitoredMap$SolrDynamicMBean.getMBeanInfo(JmxMonitoredMap.java:236)
at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getNewMBeanClassName(DefaultMBeanServerInterceptor.java:333)
at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerMBean(DefaultMBeanServerInterceptor.java:319)
at com.sun.jmx.mbeanserver.JmxMBeanServer.registerMBean(JmxMBeanServer.java:522)
at org.apache.solr.core.JmxMonitoredMap.put(JmxMonitoredMap.java:140)
at org.apache.solr.core.JmxMonitoredMap.put(JmxMonitoredMap.java:51)
at com.datastax.bdp.search.solr.core.CassandraCoreContainer.registerExtraMBeans(CassandraCoreContainer.java:679)
at com.datastax.bdp.search.solr.core.CassandraCoreContainer.register(CassandraCoreContainer.java:427)
at com.datastax.bdp.search.solr.core.CassandraCoreContainer.doLoad(CassandraCoreContainer.java:757)
at com.datastax.bdp.search.solr.core.CassandraCoreContainer.load(CassandraCoreContainer.java:162)
at com.datastax.bdp.search.solr.AbstractSolrSecondaryIndex$2.run(AbstractSolrSecondaryIndex.java:882)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ClassCastException: org.apache.lucene.search.FieldCache$CreationPlaceholder cannot be cast to org.apache.solr.search.SolrCache
at com.datastax.bdp.search.solr.FilterCacheMBean.getStatistics(FilterCacheMBean.java:174)
... 16 more
I have absolutely no idea what this is.
--
Has anyone encountered these errors/exceptions/warnings before? What did you do?
Issue #1: The max waiting time to load a core was hard-coded at 1 min. So, your assumption is right: a very large core or hundreds of cores could prevent the node starting due to the excessive time to load this particular core. In the next patch release (4.5.6, 4.6.1) we address this issue by creating a new option load_max_time_per_core in dse.yaml. This option allows you to increase the max waiting time for core loading, starting at 1 min. For 500 cores you would need to increase load_max_time_per_core to about 3 minutes, for example.
Issue #2: Unfortunately, I don't know what could be causing this. We would need further info about this to see why it's happening.
Issue #3: We have currently investigating what this can be.
Regarding issue #2, are you sure you don't have a QuerySenderListener with a wrong warmup query in your solrconfig?