dashDB out of memory on Cloudant sync

I got the following error while trying to warehouse a 7.7 GB database from Cloudant. How can I resolve this?
Exception thrown by application class 'org.apache.wink.server.internal.RequestProcessor.handleRequest:195'
javax.servlet.ServletException: java.lang.OutOfMemoryError: GC overhead limit exceeded
at org.apache.wink.server.internal.RequestProcessor.handleRequest(RequestProcessor.java:195)
at org.apache.wink.server.internal.servlet.RestServlet.service(RestServlet.java:124)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:668)
at com.ibm.ws.webcontainer.servlet.ServletWrapper.service(ServletWrapper.java:1274)
at [internal classes]
Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded
... 1 more

Where do you see this error? Are you triggering an API call to create the warehouse in Cloudant?
Can you check whether a _warehouser database was created on the Cloudant side, and whether a document in it reports an error if the transformation failed?
The other thing to try is increasing the Java heap size, for example "-Xmx2048M -Xms512M".
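If you do raise the heap, it is worth confirming that the JVM actually picked up the new -Xmx value. A minimal sketch, assuming you can run a small class under the same JVM options; the class name HeapCheck is hypothetical and not part of dashDB or Cloudant:

```java
public class HeapCheck {
    // Returns the maximum amount of heap the JVM will attempt to use, in megabytes.
    static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        // The printed value depends on the -Xmx setting the JVM was started with.
        System.out.println("Max heap (MB): " + maxHeapMegabytes());
    }
}
```

Compile it with javac and run it as `java -Xmx2048M -Xms512M HeapCheck`. If the number does not change when you change -Xmx, the options are probably not reaching the process.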

Related

Unable to query with Stargate after adding custom Cql3SolrSecondaryIndex

I have Stargate 1.0.38 running fine on my DEV server. I am able to use the Stargate REST API to get an auth_token and to run insert and select queries.
Yesterday I created a Cql3SolrSecondaryIndex index on a table in my Cassandra DSE 6.8 cluster. I then saw the error below in the Stargate log, so I dropped the index. Even after dropping it, I still see the same error in the Stargate log. I also tried stopping and restarting Stargate, but the error persists.
ERROR [MigrationStage:1] 2021-10-15 00:47:13,593 PullRequestScheduler.java:245 - Configuration exception merging remote schema
org.apache.cassandra.exceptions.ConfigurationException: Unable to find custom indexer class 'com.datastax.bdp.search.solr.Cql3SolrSecondaryIndex'
at org.apache.cassandra.utils.FBUtilities.classForName(FBUtilities.java:493)
at org.apache.cassandra.schema.IndexMetadata.getCustomIndexClass(IndexMetadata.java:190)
at org.apache.cassandra.schema.IndexMetadata.validate(IndexMetadata.java:131)
at org.apache.cassandra.schema.Indexes.lambda$validate$2(Indexes.java:168)
at java.lang.Iterable.forEach(Iterable.java:75)
at org.apache.cassandra.schema.Indexes.validate(Indexes.java:168)
at org.apache.cassandra.schema.TableMetadata.validate(TableMetadata.java:512)
at java.lang.Iterable.forEach(Iterable.java:75)
at org.apache.cassandra.schema.KeyspaceMetadata.validate(KeyspaceMetadata.java:112)
at org.apache.cassandra.schema.KeyspaceMetadata.<init>(KeyspaceMetadata.java:85)
at org.apache.cassandra.schema.KeyspaceMetadata.create(KeyspaceMetadata.java:167)
at org.apache.cassandra.schema.SchemaKeyspace.fetchKeyspace(SchemaKeyspace.java:1154)
at org.apache.cassandra.schema.SchemaKeyspace.fetchKeyspaces(SchemaKeyspace.java:1769)
at org.apache.cassandra.schema.SchemaManager.merge(SchemaManager.java:893)
at org.apache.cassandra.schema.SchemaManager.mergeAndAnnounceVersion(SchemaManager.java:877)
at org.apache.cassandra.schema.PullRequestScheduler.lambda$sendPullRequest$2(PullRequestScheduler.java:240)
at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774)
at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750)
at java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:456)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:88)
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.lang.Thread.run(Thread.java:748)
at org.apache.cassandra.utils.concurrent.InlinedThreadLocalThread.run(InlinedThreadLocalThread.java:251)
Caused by: java.lang.ClassNotFoundException: com.datastax.bdp.search.solr.Cql3SolrSecondaryIndex not found by io.stargate.db.dse [1]
at org.apache.felix.framework.BundleWiringImpl.findClassOrResourceByDelegation(BundleWiringImpl.java:1597)
at org.apache.felix.framework.BundleWiringImpl.access$300(BundleWiringImpl.java:79)
at org.apache.felix.framework.BundleWiringImpl$BundleClassLoader.loadClass(BundleWiringImpl.java:1982)
at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:264)
at org.apache.cassandra.utils.FBUtilities.classForName(FBUtilities.java:489)
... 24 common frames omitted
Despite this error I can still get an auth_token, but every select query returns this error:
{
"description": "Resource not found: keyspace 'test' not found",
"code": 404
}
Please help me to fix this issue.
Stargate does not currently support advanced workloads like Search and Graph. I think you might have to drop and recreate that keyspace without the Solr index for it to work again since the schema still exists on the other nodes.
This issue has been documented here. There has also been a request made to support Solr here.
@David, I was able to drop the DSE Search index and perform a rolling restart of my DSE node(s) and then my Stargate node(s), and Stargate started up fine without any errors. I confirmed that all my prior data was still there, and I validated basic CRUD operations using the Stargate REST, GraphQL and Document APIs without any issues after that.

Flink - Why does Flink throw an error when I submit a job with 'flink run' after running Flink stand-alone for more than 1 month?

I'm running Flink in stand-alone mode on 1 host (JobManager and TaskManager on the same host). At first I was able to submit and cancel jobs normally; the jobs showed up in the web UI and ran.
However, after about 1 month, when I canceled the old job and submitted a new one, I got org.apache.flink.client.program.ProgramInvocationException: Could not retrieve the execution result.
At that point I was still able to run flink list to list the current jobs and flink cancel to cancel a job, but flink run failed. An exception was thrown and the job was not shown in the web UI.
When I tried to stop the current stand-alone cluster using stop-cluster, it said 'no cluster was found'. I then had to find the PIDs of the Flink processes and stop them manually. After that, when I ran start-cluster to create a new stand-alone cluster, I was able to submit jobs normally again.
The shortened stack-trace: (full stack-trace at google docs link)
org.apache.flink.client.program.ProgramInvocationException: Could not retrieve the execution result. (JobID: 7ef1cbddb744cd5769297f4059f7c531)
at org.apache.flink.client.program.rest.RestClusterClient.submitJob (RestClusterClient.java:261)
Caused by: org.apache.flink.runtime.client.JobSubmissionException: Failed to submit JobGraph.
Caused by: org.apache.flink.runtime.concurrent.FutureUtils$RetryException: Could not complete the operation. Number of retries has been exhausted.
Caused by: java.util.concurrent.CompletionException: org.apache.flink.runtime.rest.ConnectionClosedException: Channel became inactive.
Caused by: org.apache.flink.runtime.rest.ConnectionClosedException: Channel became inactive.
... 37 more
The error is consistent. It always happens after I let Flink run for a while (usually more than 1 month). Why am I unable to submit jobs to Flink after a while? What happened here?

SolrCloud Error: No live SolrServers available to handle this request

I get the exception below on the SolrCloud client side during data ingestion:
ERROR com.aexp.ims.atworks.ingestion.service.impl.IngestionServiceImpl - Solr Exception
org.apache.solr.client.solrj.SolrServerException: No live SolrServers available to handle this request:[http://tsnet1:8888/solr/TSEACH_shard1_replica2, http://tsnet2:8888/solr/TSEACH_shard2_replica2, http://tsnet3:8888/solr/TSEACH_shard1_replica1, http://tsnet4:8888/solr/TSEACH_shard2_replica1]
Caused by: org.apache.solr.client.solrj.SolrServerException: Server refused connection at: http://tsnet5:8888/solr/TSEACH_shard2_replica2
Caused by: org.apache.http.conn.HttpHostConnectException: Connection to http://tsnet2:8888 refused
Caused by: java.net.ConnectException: Connection refused
Solr log error:
ERROR update.UpdateLog - Error inspecting tlog tlog{file=/data/tsearch/solr/TSEACH_shard2_replica1/data/tlog/tlog.0000000000000005405 refcount=2}
ERROR update.UpdateLog - Error inspecting tlog tlog{file=/data/tsearch/solr/TSEACH_shard2_replica1/data/tlog/tlog.0000000000000005406 refcount=2}
ERROR update.UpdateLog - Error inspecting tlog tlog{file=/data/tsearch/solr/TSEACH_shard4_replica1/data/tlog/tlog.0000000000000005433 refcount=2}
ERROR update.UpdateLog - Error inspecting tlog tlog{file=/data/tsearch/solr/TSEACH_shard4_replica1/data/tlog/tlog.0000000000000005434 refcount=2}
Any help resolving the above error would be greatly appreciated.
I had a similar issue.
There can be many causes for this exception. It is a general error message that masks the underlying details.
While I can't offer a concrete solution to this problem, here is how I fixed my issue.
1. Check logs of all nodes.
Note that in cloud mode server logs are spread across nodes.
I was running SolrCloud on a single node, so I could do this:
tail -100 example/cloud/node*/logs/solr.log
2. Find the inner cause and its fix.
Do you see an exception in the logs? Great.
Now search for that underlying exception.
In my case, a group-by query with group.ngroups=0 was the issue. It was fixed in 6.3: https://issues.apache.org/jira/browse/SOLR-4164
So I upgraded Solr, and that solved it.
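Step 2 above largely comes down to pulling the "Caused by:" lines out of each node's log. A rough sketch, assuming the logs use standard Java stack-trace formatting; RootCauseFinder and its method are hypothetical helpers, not part of Solr:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class RootCauseFinder {
    // Collects every "Caused by:" line; the last one in a given trace is usually the root cause.
    static List<String> causedByLines(List<String> logLines) {
        List<String> causes = new ArrayList<>();
        for (String line : logLines) {
            String trimmed = line.trim();
            if (trimmed.startsWith("Caused by:")) {
                causes.add(trimmed);
            }
        }
        return causes;
    }

    public static void main(String[] args) throws IOException {
        // Pass one or more solr.log paths as command-line arguments.
        for (String path : args) {
            List<String> lines = Files.readAllLines(Paths.get(path));
            for (String cause : causedByLines(lines)) {
                System.out.println(path + ": " + cause);
            }
        }
    }
}
```

For example, `java RootCauseFinder example/cloud/node1/logs/solr.log` would print the nested causes from that node's log, which you can then search for.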
I had a similar issue when my Solr server was accidentally stopped.
I restarted it from bin/platform using the following command, and it worked:
ant startSolrServer
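Because the client trace in the question bottoms out in java.net.ConnectException: Connection refused, a quick way to spot a stopped node is to check whether anything is accepting TCP connections on the Solr port. A minimal sketch; the NodeProbe class is hypothetical, and the host names and port are the ones from the question's URLs:

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class NodeProbe {
    // Returns true if a TCP connection to host:port succeeds within the timeout.
    static boolean isListening(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Covers connection refused, timeouts and unresolvable host names.
            return false;
        }
    }

    public static void main(String[] args) {
        // Hosts and port taken from the replica URLs in the question.
        String[] hosts = {"tsnet1", "tsnet2", "tsnet3", "tsnet4"};
        for (String host : hosts) {
            System.out.println(host + ":8888 listening = " + isListening(host, 8888, 2000));
        }
    }
}
```

A node that reports false is either down or unreachable from the client. A node that reports true can still be unhealthy, so this only rules out the simplest case.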

Solr 4.6.1 Streaming Solr Servers Error

After moving from Solr 4.4 to Solr 4.6.1, I am getting the below Exception while updating my Indexes using the Data Import Handler. Does anybody have any ideas on why this is happening?
ERROR - 2014-02-18 09:39:35.232; org.apache.solr.update.StreamingSolrServers$1; error
org.apache.solr.common.SolrException: Bad Request
request: http://10.200.131.174:8080/solr/collection1/update?update.distrib=FROMLEADER&distrib.from=http%3A%2F%2F10.200.131.173%3A8080%2Fsolr%2Fcollection1%2F&wt=javabin&version=2
at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer$Runner.run(ConcurrentUpdateSolrServer.java:240)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:724)
ERROR - 2014-02-18 09:39:35.244; org.apache.solr.update.StreamingSolrServers$1; error
org.apache.solr.common.SolrException: Bad Request
This is a known issue for Solr versions greater than 4.5. I too ran into this while running SolrCloud with 2 shards. It seems that the update works fine on the server running DIH, but fails when it is forwarded to another shard or replica.
Here's the related JIRA issue. And here's another unlucky soul like you and me who also faced this error.
As far as I know, the only workaround currently is to go back to a Solr version 4.5 or below...

Solr connection timeout during indexing?

I have a SolrJ client with infinite timeouts (Solr 4):
server.server.setSoTimeout(0)
server.server.setConnectionTimeout(0)
When I index my data, I get many timeouts on the server side.
Where can I change the server-side timeouts: in solrconfig.xml, or possibly in the Tomcat config?
Client side exception:
Caused by: java.net.SocketException: Broken pipe
at java.net.SocketOutputStream.socketWrite0(Native Method)
at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
Server side exception:
Jan 31, 2013 8:55:54 PM org.apache.solr.common.SolrException log
SEVERE: org.apache.solr.common.SolrException: Read timed out
at org.apache.solr.handler.loader.XMLLoader.load(XMLLoader.java:159)
at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:92)
at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1699)
Caused by: java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:129)
at org.apache.coyote.http11.InternalInputBuffer.fill(InternalInputBuffer.java:751)
We had the same problem with Solr 4. We solved it after reading a blog post by Uwe Schindler (a Solr committer).
With Solr 4 and several Solr 3 versions, you have to leave a significant share of your RAM free so that the operating system can make proper use of the mmap system call. This can be subtle depending on your system configuration (the blog post gives plenty of information on that point). In our case this solved the problem: we could finally index without any further timeouts.
The Tomcat server.xml configuration described at the link below will solve this. We got the same stack trace, and it fixed the problem for us:
http://forums.alfresco.com/ja/node/8458
