how to handle execution timeout in flink - apache-flink

Connected to JobManager at Actor[akka.tcp://flink#localhost:6123/user/jobmanager#-1119198862] with leader session id 00000000-0000-0000-0000-000000000000.
org.apache.flink.client.program.ProgramInvocationException: The program execution failed: Couldn't retrieve the JobExecutionResult from the JobManager.
at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:478)
at org.apache.flink.client.program.StandaloneClusterClient.submitJob(StandaloneClusterClient.java:105)
at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:442)
at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:429)
at org.apache.flink.client.program.ContextEnvironment.execute(ContextEnvironment.java:62)
at FileSetWordCount.main(FileSetWordCount.java:178)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:528)
at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:419)
at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:381)
at org.apache.flink.client.CliFrontend.executeProgram(CliFrontend.java:838)
at org.apache.flink.client.CliFrontend.run(CliFrontend.java:259)
at org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:1086)
at org.apache.flink.client.CliFrontend$2.call(CliFrontend.java:1133)
at org.apache.flink.client.CliFrontend$2.call(CliFrontend.java:1130)
at org.apache.flink.runtime.security.HadoopSecurityContext$1.run(HadoopSecurityContext.java:43)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:40)
at org.apache.flink.client.CliFrontend.main(CliFrontend.java:1130)
Caused by: org.apache.flink.runtime.client.JobExecutionException: Couldn't retrieve the JobExecutionResult from the JobManager.
at org.apache.flink.runtime.client.JobClient.awaitJobResult(JobClient.java:309)
at org.apache.flink.runtime.client.JobClient.submitJobAndWait(JobClient.java:396)
at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:467)
... 23 more
Caused by: org.apache.flink.runtime.client.JobClientActorSubmissionTimeoutException: Job submission to the JobManager timed out. You may increase 'akka.client.timeout' in case the JobManager needs more time to configure and confirm the job submission.
at org.apache.flink.runtime.client.JobSubmissionClientActor.handleCustomMessage(JobSubmissionClientActor.java:119)
at org.apache.flink.runtime.client.JobClientActor.handleMessage(JobClientActor.java:251)
at org.apache.flink.runtime.akka.FlinkUntypedActor.handleLeaderSessionID(FlinkUntypedActor.java:89)
at org.apache.flink.runtime.akka.FlinkUntypedActor.onReceive(FlinkUntypedActor.java:68)
at akka.actor.UntypedActor$$anonfun$receive$1.applyOrElse(UntypedActor.scala:167)
at akka.actor.Actor$class.aroundReceive(Actor.scala:467)
at akka.actor.UntypedActor.aroundReceive(UntypedActor.scala:97)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
at akka.actor.ActorCell.invoke(ActorCell.scala:487)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:238)
at akka.dispatch.Mailbox.run(Mailbox.scala:220)
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
No JobSubmissionResult returned, please make sure you called ExecutionEnvironment.execute()
This is an exception threw by Flink. How to handle the client timeout exception? This Flink application is going to run in the local envirionment. The application is used for about 1TB files processing.

I have figure it out. It is necessary to modify the 'flink-conf.yaml' file with adding a new line "akka.client.timeout: xx s"(e.g. "akka.client.timeout: 600 s").

Related

org.apache.flink.client.program.ProgramInvocationException: Could not retrieve the execution result

I am trying to start a Flink batch job on an AWS EMR cluster and am getting:
The program finished with the following exception:
org.apache.flink.client.program.ProgramInvocationException: Could not retrieve the execution result. (JobID: c7362754c49bdae8e9a46748d47bc180)
at org.apache.flink.client.program.rest.RestClusterClient.submitJob(RestClusterClient.java:260)
at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:486)
at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:474)
at org.apache.flink.client.program.ContextEnvironment.execute(ContextEnvironment.java:62)
at com.stolencamerafinder.flink.BatchJob.main(BatchJob.java:69)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:529)
at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:421)
at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:426)
at org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:804)
at org.apache.flink.client.cli.CliFrontend.runProgram(CliFrontend.java:280)
at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:215)
at org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:1044)
at org.apache.flink.client.cli.CliFrontend.lambda$main$11(CliFrontend.java:1120)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1844)
at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1120)
Caused by: org.apache.flink.runtime.client.JobSubmissionException: Failed to submit JobGraph.
at org.apache.flink.client.program.rest.RestClusterClient.lambda$submitJob$8(RestClusterClient.java:379)
at java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:870)
at java.util.concurrent.CompletableFuture$UniExceptionally.tryFire(CompletableFuture.java:852)
at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977)
at org.apache.flink.runtime.concurrent.FutureUtils.lambda$retryOperationWithDelay$5(FutureUtils.java:213)
at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760)
at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:736)
at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
at java.util.concurrent.CompletableFuture.postFire(CompletableFuture.java:561)
at java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:929)
at java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:442)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.util.concurrent.CompletionException: org.apache.flink.runtime.concurrent.FutureUtils$RetryException: Could not complete the operation. Exception is not retryable.
at java.util.concurrent.CompletableFuture.encodeRelay(CompletableFuture.java:326)
at java.util.concurrent.CompletableFuture.completeRelay(CompletableFuture.java:338)
at java.util.concurrent.CompletableFuture.uniRelay(CompletableFuture.java:911)
at java.util.concurrent.CompletableFuture$UniRelay.tryFire(CompletableFuture.java:899)
... 12 more
Caused by: org.apache.flink.runtime.concurrent.FutureUtils$RetryException: Could not complete the operation. Exception is not retryable.
... 10 more
Caused by: java.util.concurrent.CompletionException: org.apache.flink.runtime.rest.util.RestClientException: [Job submission failed.]
at java.util.concurrent.CompletableFuture.encodeRelay(CompletableFuture.java:326)
at java.util.concurrent.CompletableFuture.completeRelay(CompletableFuture.java:338)
at java.util.concurrent.CompletableFuture.uniRelay(CompletableFuture.java:911)
at java.util.concurrent.CompletableFuture.uniCompose(CompletableFuture.java:953)
at java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:926)
... 4 more
Caused by: org.apache.flink.runtime.rest.util.RestClientException: [Job submission failed.]
at org.apache.flink.runtime.rest.RestClient.parseResponse(RestClient.java:310)
at org.apache.flink.runtime.rest.RestClient.lambda$submitRequest$3(RestClient.java:294)
at java.util.concurrent.CompletableFuture.uniCompose(CompletableFuture.java:952)
... 5 more
I have no idea of what the underlying cause is. Where can I find more details on why it went wrong?
On a similar note, I can't find a way of finding the logs for a successful job with Flink / EMR. Any idea?
To find more details, you need more information of the JobManager or ResourceManager. If you are using YARN as the cluster resource management framework, which many developers do, you can access to the ApplciationMatser's logs, which are the logs from JobManager in Flink.
I met the same problem, the reason is that I don't have start the flink cluster.
./bin/start-cluster.sh
i met this problem in my local mac pc, in local mode.
it runs well in maven project, but throws exception above when submitted to flink local cluster.
it was fixed after restarting my macbook

Fail to run flink-gelly-examples

I'm running Apache Flink 1.4.2 (Without bundled hadoop) on Mac, and I've managed to run the WordCount example. However, when I try to run the Gelly example following instructions in Runnung Gelly Examples , I stumble into the following error:
java.lang.ClassNotFoundException: org.apache.flink.graph.generator.random.BlockInfo
May I know how can I fix it?
I first started the cluster by
./bin/start-cluster.sh
Starting cluster.
[INFO] 1 instance(s) of jobmanager are already running on myhostname.
Starting jobmanager daemon on host myhostname.
[INFO] 1 instance(s) of taskmanager are already running on myhostname.
Starting taskmanager daemon on host myhostname.
Then in another terminal, I run
./bin/flink run examples/gelly/flink-gelly-examples_2.11-1.4.2.jar --algorithm GraphMetrics --order directed --input RMatGraph --type integer --scale 20 --simplify directed --output print
and the following errors
Cluster configuration: Standalone cluster with JobManager at localhost/127.0.0.1:6123
Using address localhost:6123 to connect to JobManager.
JobManager web interface address http://localhost:8081
Starting execution of program
Submitting job with JobID: 099acb25fa34be1fe63fb47296605f69. Waiting for job completion.
Connected to JobManager at Actor[akka.tcp://flink#localhost:6123/user/jobmanager#57507119] with leader session id 00000000-0000-0000-0000-000000000000.
------------------------------------------------------------
The program finished with the following exception:
org.apache.flink.client.program.ProgramInvocationException: The program execution failed: Failed to submit job 099acb25fa34be1fe63fb47296605f69 (RMatGraph (s20e16d) ⇨ GraphMetrics ⇨ Hash [integer])
at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:492)
at org.apache.flink.client.program.StandaloneClusterClient.submitJob(StandaloneClusterClient.java:105)
at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:456)
at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:444)
at org.apache.flink.client.program.ContextEnvironment.execute(ContextEnvironment.java:62)
at org.apache.flink.graph.Runner.execute(Runner.java:452)
at org.apache.flink.graph.Runner.main(Runner.java:507)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:525)
at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:417)
at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:396)
at org.apache.flink.client.CliFrontend.executeProgram(CliFrontend.java:802)
at org.apache.flink.client.CliFrontend.run(CliFrontend.java:282)
at org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:1054)
at org.apache.flink.client.CliFrontend$1.call(CliFrontend.java:1101)
at org.apache.flink.client.CliFrontend$1.call(CliFrontend.java:1098)
at org.apache.flink.runtime.security.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:30)
at org.apache.flink.client.CliFrontend.main(CliFrontend.java:1098)
Caused by: org.apache.flink.runtime.client.JobExecutionException: Failed to submit job 099acb25fa34be1fe63fb47296605f69 (RMatGraph (s20e16d) ⇨ GraphMetrics ⇨ Hash [integer])
at org.apache.flink.runtime.jobmanager.JobManager.org$apache$flink$runtime$jobmanager$JobManager$$submitJob(JobManager.scala:1325)
at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1.applyOrElse(JobManager.scala:447)
at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36)
at org.apache.flink.runtime.LeaderSessionMessageFilter$$anonfun$receive$1.applyOrElse(LeaderSessionMessageFilter.scala:38)
at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36)
at org.apache.flink.runtime.LogMessages$$anon$1.apply(LogMessages.scala:33)
at org.apache.flink.runtime.LogMessages$$anon$1.apply(LogMessages.scala:28)
at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123)
at org.apache.flink.runtime.LogMessages$$anon$1.applyOrElse(LogMessages.scala:28)
at akka.actor.Actor$class.aroundReceive(Actor.scala:502)
at org.apache.flink.runtime.jobmanager.JobManager.aroundReceive(JobManager.scala:122)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:526)
at akka.actor.ActorCell.invoke(ActorCell.scala:495)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:257)
at akka.dispatch.Mailbox.run(Mailbox.scala:224)
at akka.dispatch.Mailbox.exec(Mailbox.scala:234)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: java.lang.ClassNotFoundException: org.apache.flink.graph.generator.random.BlockInfo
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at org.apache.flink.runtime.execution.librarycache.FlinkUserCodeClassLoaders$ChildFirstClassLoader.loadClass(FlinkUserCodeClassLoaders.java:115)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at org.apache.flink.util.InstantiationUtil$ClassLoaderObjectInputStream.resolveClass(InstantiationUtil.java:73)
at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1859)
at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1745)
at java.io.ObjectInputStream.readClass(ObjectInputStream.java:1710)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1550)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:427)
at java.util.HashSet.readObject(HashSet.java:341)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1158)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2169)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2060)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1567)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2278)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2202)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2060)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1567)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:427)
at org.apache.flink.util.InstantiationUtil.deserializeObject(InstantiationUtil.java:437)
at org.apache.flink.util.InstantiationUtil.deserializeObject(InstantiationUtil.java:424)
at org.apache.flink.util.InstantiationUtil.deserializeObject(InstantiationUtil.java:412)
at org.apache.flink.util.SerializedValue.deserializeValue(SerializedValue.java:58)
at org.apache.flink.runtime.jobmanager.JobManager.org$apache$flink$runtime$jobmanager$JobManager$$submitJob(JobManager.scala:1253)
... 19 more
This particular error shows up because when you have copied the jar files to the lib/ folder, but didn't stop and start the flink cluster against after doing the copy (i.e. you had the cluster running before you performed the copy).
If you stop and start the cluster using ./bin/stop-cluster.sh and ./bin/start-cluster.sh after copying the jar files to the lib folder as mentioned on this page the problem gets fixed.
Also that's probably why you didn't get the error with new installation because you copied the jar files to the lib folder before starting the flink cluster.

Flink Python API java.io.EOFException

I'm coding three Batch Applications, with the Python API, however, my third application is dealing with some Exceptions, especially when I increase the parallelism. This application has a Cross transformation inside it.
The cluster has 4 VMs, with 4 cpu cores and 7GB RAM each machine. So, the max parallelism setted 16.
The exception is:
org.apache.flink.client.program.ProgramInvocationException: The program execution failed: Job execution failed.
atorg.apache.flink.client.program.ClusterClient.run(ClusterClient.java:427)
at org.apache.flink.client.program.StandaloneClusterClient.submitJob(StandaloneClusterClient.java:101)
at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:400)
at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:387)
at org.apache.flink.client.program.ContextEnvironment.execute(ContextEnvironment.java:62)
at org.apache.flink.api.java.ExecutionEnvironment.execute(ExecutionEnvironment.java:926)
at org.apache.flink.python.api.PythonPlanBinder.runPlan(PythonPlanBinder.java:149)
at org.apache.flink.python.api.PythonPlanBinder.main(PythonPlanBinder.java:114)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:528)
at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:419)
at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:339)
at org.apache.flink.client.CliFrontend.executeProgram(CliFrontend.java:831)
at org.apache.flink.client.CliFrontend.run(CliFrontend.java:256)
at org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:1073)
at org.apache.flink.client.CliFrontend$2.call(CliFrontend.java:1120)
at org.apache.flink.client.CliFrontend$2.call(CliFrontend.java:1117)
at org.apache.flink.runtime.security.HadoopSecurityContext$1.run(HadoopSecurityContext.java:43)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:40)
at org.apache.flink.client.CliFrontend.main(CliFrontend.java:1116)
Caused by: org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply$mcV$sp(JobManager.scala:900)
at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply(JobManager.scala:843)
at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply(JobManager.scala:843)
at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: java.io.EOFException
at java.io.DataInputStream.readInt(DataInputStream.java:392)
at org.apache.flink.python.api.streaming.data.PythonStreamer.streamBufferWithoutGroups(PythonStreamer.java:252)
at org.apache.flink.python.api.functions.PythonMapPartition.mapPartition(PythonMapPartition.java:54)
at org.apache.flink.runtime.operators.MapPartitionDriver.run(MapPartitionDriver.java:103)
at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:490)
at org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:355)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:655)
at java.lang.Thread.run(Thread.java:745)
I've got some evidences:
1) It's not at all deterministic, eventually, the application finished without errors.
2) With parallelism <= 4 the algorithm finish flawlessly, exception often happens with higher parallelism degrees.
3) Even entering small inputs (~20MB) the exception occurr.
4) I didn't find any error at the slaves logs (jobmanager), only at the Master Flink Taskmanager
Can you help to find out the responsible for this? I can get more logs (namenode, datanode, jobmanager, taskmanager) if necessary.

Suddenly getting Error in server build process (Codename One)

I cannot remember changing anything on my conputer but since around 18:40 CET I have not been able anymore to send any build to CN1. Here is the error it shows :
java.net.ConnectException: Connexion terminée par expiration du délai d'attente (Connection timed out)
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:589)
at sun.security.ssl.SSLSocketImpl.connect(SSLSocketImpl.java:668)
at sun.security.ssl.BaseSSLSocketImpl.connect(BaseSSLSocketImpl.java:173)
at sun.net.NetworkClient.doConnect(NetworkClient.java:180)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:432)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:527)
at sun.net.www.protocol.https.HttpsClient.<init>(HttpsClient.java:264)
at sun.net.www.protocol.https.HttpsClient.New(HttpsClient.java:367)
at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.getNewHttpClient(AbstractDelegateHttpsURLConnection.java:191)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1138)
at sun.net.www.protocol.http.HttpURLConnection$6.run(HttpURLConnection.java:1022)
at sun.net.www.protocol.http.HttpURLConnection$6.run(HttpURLConnection.java:1020)
at java.security.AccessController.doPrivileged(Native Method)
at java.security.AccessController.doPrivilegedWithCombiner(AccessController.java:782)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:1019)
at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(AbstractDelegateHttpsURLConnection.java:177)
at sun.net.www.protocol.http.HttpURLConnection.getOutputStream0(HttpURLConnection.java:1316)
at sun.net.www.protocol.http.HttpURLConnection.access$100(HttpURLConnection.java:91)
at sun.net.www.protocol.http.HttpURLConnection$8.run(HttpURLConnection.java:1283)
at sun.net.www.protocol.http.HttpURLConnection$8.run(HttpURLConnection.java:1281)
at java.security.AccessController.doPrivileged(Native Method)
at java.security.AccessController.doPrivilegedWithCombiner(AccessController.java:782)
at sun.net.www.protocol.http.HttpURLConnection.getOutputStream(HttpURLConnection.java:1280)
at sun.net.www.protocol.https.HttpsURLConnectionImpl.getOutputStream(HttpsURLConnectionImpl.java:250)
at com.codename1.build.client.BuildProcess.uploadToS3(BuildProcess.java:305)
at com.codename1.build.client.BuildProcess.sendS3Build(BuildProcess.java:366)
at com.codename1.build.client.BuildProcess.sendRequestToServer(BuildProcess.java:432)
at com.codename1.build.client.CodeNameOneBuildTask.execute(CodeNameOneBuildTask.java:507)
at org.apache.tools.ant.UnknownElement.execute(UnknownElement.java:293)
at sun.reflect.GeneratedMethodAccessor92.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.tools.ant.dispatch.DispatchUtils.execute(DispatchUtils.java:106)
at org.apache.tools.ant.Task.perform(Task.java:348)
at org.apache.tools.ant.Target.execute(Target.java:435)
at org.apache.tools.ant.Target.performTasks(Target.java:456)
at org.apache.tools.ant.Project.executeSortedTargets(Project.java:1405)
at org.apache.tools.ant.Project.executeTarget(Project.java:1376)
at org.apache.tools.ant.helper.DefaultExecutor.executeTargets(DefaultExecutor.java:41)
at org.apache.tools.ant.Project.executeTargets(Project.java:1260)
at org.apache.tools.ant.module.bridge.impl.BridgeImpl.run(BridgeImpl.java:286)
at org.apache.tools.ant.module.run.TargetExecutor.run(TargetExecutor.java:555)
at org.netbeans.core.execution.RunClassThread.run(RunClassThread.java:153)
/home/path/to/myProject/build.xml:338: Error in server build process
BUILD FAILED (total time: 2 minutes 15 seconds)
I did not change the way I am sending the build either, what should I do ?
Any help appreciated,
Regards,
I believe this is related to the Amazon outage people are experiencing world wide. Try again after the issues are figured out at Amazon. I am unable to download my previous builds as well.
You could follow the status here:
https://status.aws.amazon.com/
https://twitter.com/awscloud

Gatling fetching from csv file

I have created a Gatling simulation, by creating a ".har" file from Google Chrome. And I have changed input parameters & trying to fetch data from a ".csv" file.
Now when I run the simulation, how can I check if Gatling simulation has fetched data fields from ".csv" file?
Temporarily lower logging level to DEBUG in conf/logback.xml, you'll see the requests that are being generated.
Gatling has quite good error handling/logging connected to feeders.
When you provide wrong file name or path of your feeder file you will get following error (java.lang.IllegalArgumentException: Could not locate feeder file). Note that tests will not run.
java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at io.gatling.mojo.MainWithArgsInFile.runMain(MainWithArgsInFile.java:50)
at io.gatling.mojo.MainWithArgsInFile.main(MainWithArgsInFile.java:33)
Caused by: java.lang.IllegalArgumentException: Could not locate feeder file: file gatling/src/test/resources/data/wrongFile.csv doesn't exist
at io.gatling.core.feeder.FeederSupport$class.feederBuilder(FeederSupport.scala:53)
at io.gatling.core.Predef$.feederBuilder(Predef.scala:22)
at io.gatling.core.feeder.FeederSupport$class.separatedValues(FeederSupport.scala:44)
at io.gatling.core.Predef$.separatedValues(Predef.scala:22)
at io.gatling.core.feeder.FeederSupport$class.separatedValues(FeederSupport.scala:41)
at io.gatling.core.Predef$.separatedValues(Predef.scala:22)
at io.gatling.core.feeder.FeederSupport$class.csv(FeederSupport.scala:34)
at io.gatling.core.Predef$.csv(Predef.scala:22)
at com.scenario.Scenario.<init>(Scenario.scala:10)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at java.lang.Class.newInstance(Class.java:442)
at io.gatling.app.Gatling$.io$gatling$app$Gatling$$$anonfun$1(Gatling.scala:41)
at io.gatling.app.Gatling.run(Gatling.scala:92)
at io.gatling.app.Gatling.runIfNecessary(Gatling.scala:75)
at io.gatling.app.Gatling.start(Gatling.scala:65)
at io.gatling.app.Gatling$.start(Gatling.scala:57)
at io.gatling.app.Gatling$.fromArgs(Gatling.scala:49)
at io.gatling.app.Gatling$.main(Gatling.scala:43)
at io.gatling.app.Gatling.main(Gatling.scala)
When your attribute will not matched column name from feeder you will get following error. All tests will end with error.
00:10:07.937 [ERROR] i.g.h.a.s.HttpRequestAction - 'httpRequest-1' failed to execute: No attribute named 'wrongAttributeName' is defined
When you have empty file with feeder data you will get following error.
java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at io.gatling.mojo.MainWithArgsInFile.runMain(MainWithArgsInFile.java:50)
at io.gatling.mojo.MainWithArgsInFile.main(MainWithArgsInFile.java:33)
Caused by: java.lang.IllegalStateException: Feeder is now empty, stopping engine
at io.gatling.core.action.SingletonFeed$$anonfun$receive$1.applyOrElse(SingletonFeed.scala:61)
at akka.actor.Actor$class.aroundReceive(Actor.scala:482)
at io.gatling.core.akka.BaseActor.aroundReceive(BaseActor.scala:23)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:526)
at akka.actor.ActorCell.invoke(ActorCell.scala:495)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:257)
at akka.dispatch.Mailbox.run(Mailbox.scala:224)
at akka.dispatch.Mailbox.exec(Mailbox.scala:234)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
When non of this appear you can be almost sure that everything is ok. But to verify parameters send by Gatling you can check
Gatling logs (change logging level to DEBUG) in logback.xml
<root level="DEBUG">
<appender-ref ref="CONSOLE" />
</root>
And then you will see log similar to
00:13:53.233 [DEBUG] i.g.h.a.s.HttpTx$ - Sending request=Action name uri=http://your-rest-service?param1=value1&param2=value2: scenario=com.scenario.Scenario Scenario name, userId=1
Access logs on your server. In Tomcat instruction you can get for example here

Resources