Issue Activating Jenkins File Leak Detector PlugIn - jenkins-plugins

We are having issues activating the Jenkins File Leak Detector plugin. The error message is below. Any thoughts?
Here is some information that might be relevant:
RHEL 6.0
/tmp permissions: drwxrwxrwt. 17 root root 28672 Aug 31 08:42 tmp
[root@XXXX tmp]# java -version
java version "1.6.0_39"
Java(TM) SE Runtime Environment (build 1.6.0_39-b04)
Java HotSpot(TM) 64-Bit Server VM (build 20.14-b01, mixed mode)
Status Code: 500
Exception: java.lang.Error: Failed to activate file leak detector: Connecting to 0
Stacktrace:
javax.servlet.ServletException: java.lang.Error: Failed to activate file leak detector: Connecting to 0
2013-08-31 08:42:18
Full thread dump Java HotSpot(TM) 64-Bit Server VM (20.14-b01 mixed mode):
"Low Memory Detector" daemon prio=10 tid=0x00007f3d400b7000 nid=0x380a runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C2 CompilerThread1" daemon prio=10 tid=0x00007f3d400b4800 nid=0x3809 runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C2 CompilerThread0" daemon prio=10 tid=0x00007f3d400b1800 nid=0x3808 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"Signal Dispatcher" daemon prio=10 tid=0x00007f3d400af800 nid=0x3806 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"Finalizer" daemon prio=10 tid=0x00007f3d40094000 nid=0x37b4 in Object.wait() [0x00007f3d3f17b000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x00000007c1eb1300> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:118)
- locked <0x00000007c1eb1300> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:134)
at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:159)
"Reference Handler" daemon prio=10 tid=0x00007f3d40092000 nid=0x37b3 in Object.wait() [0x00007f3d3f27c000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x00000007c1eb11d8> (a java.lang.ref.Reference$Lock)
at java.lang.Object.wait(Object.java:485)
at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:116)
- locked <0x00000007c1eb11d8> (a java.lang.ref.Reference$Lock)
"main" prio=10 tid=0x00007f3d40006800 nid=0x37a8 runnable [0x00007f3d4802b000]
java.lang.Thread.State: RUNNABLE
at sun.tools.attach.LinuxVirtualMachine.sendQuitTo(Native Method)
at sun.tools.attach.LinuxVirtualMachine.<init>(LinuxVirtualMachine.java:67)
at sun.tools.attach.LinuxAttachProvider.attachVirtualMachine(LinuxAttachProvider.java:46)
at com.sun.tools.attach.VirtualMachine.attach(VirtualMachine.java:195)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.kohsuke.file_leak_detector.Main.run(Main.java:50)
at org.kohsuke.file_leak_detector.Main.main(Main.java:35)
"VM Thread" prio=10 tid=0x00007f3d4008b800 nid=0x37ad runnable
"GC task thread#0 (ParallelGC)" prio=10 tid=0x00007f3d40019800 nid=0x37a9 runnable
"GC task thread#1 (ParallelGC)" prio=10 tid=0x00007f3d4001b800 nid=0x37aa runnable
"GC task thread#2 (ParallelGC)" prio=10 tid=0x00007f3d4001d000 nid=0x37ab runnable
"GC task thread#3 (ParallelGC)" prio=10 tid=0x00007f3d4001f000 nid=0x37ac runnable
"VM Periodic Task Thread" prio=10 tid=0x00007f3d400c9800 nid=0x380e waiting on condition
JNI global references: 1129
Heap
PSYoungGen total 55616K, used 2868K [0x00000007c1eb0000, 0x00000007c5cc0000, 0x0000000800000000)
eden space 47680K, 6% used [0x00000007c1eb0000,0x00000007c217d398,0x00000007c4d40000)
from space 7936K, 0% used [0x00000007c5500000,0x00000007c5500000,0x00000007c5cc0000)
to space 7936K, 0% used [0x00000007c4d40000,0x00000007c4d40000,0x00000007c5500000)
PSOldGen total 127104K, used 0K [0x0000000745c00000, 0x000000074d820000, 0x00000007c1eb0000)
object space 127104K, 0% used [0x0000000745c00000,0x0000000745c00000,0x000000074d820000)
PSPermGen total 21248K, used 4333K [0x0000000740a00000, 0x0000000741ec0000, 0x0000000745c00000)
object space 21248K, 20% used [0x0000000740a00000,0x0000000740e3b460,0x0000000741ec0000)
Exception in thread "main" java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.kohsuke.file_leak_detector.Main.run(Main.java:50)
at org.kohsuke.file_leak_detector.Main.main(Main.java:35)
Caused by: com.sun.tools.attach.AttachNotSupportedException: Unable to open socket file: target process not responding or HotSpot VM not loaded
at sun.tools.attach.LinuxVirtualMachine.<init>(LinuxVirtualMachine.java:82)
at sun.tools.attach.LinuxAttachProvider.attachVirtualMachine(LinuxAttachProvider.java:46)
at com.sun.tools.attach.VirtualMachine.attach(VirtualMachine.java:195)
... 6 more
at org.kohsuke.stapler.Stapler.tryInvoke(Stapler.java:725)
at org.kohsuke.stapler.Stapler.invoke(Stapler.java:776)
at org.kohsuke.stapler.MetaClass$12.dispatch(MetaClass.java:381)
at org.kohsuke.stapler.Stapler.tryInvoke(Stapler.java:683)
at org.kohsuke.stapler.Stapler.invoke(Stapler.java:776)
at org.kohsuke.stapler.Stapler.invoke(Stapler.java:585)
at org.kohsuke.stapler.Stapler.service(Stapler.java:216)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:45)
at winstone.ServletConfiguration.execute(ServletConfiguration.java:248)
at winstone.RequestDispatcher.forward(RequestDispatcher.java:333)
at winstone.RequestDispatcher.doFilter(RequestDispatcher.java:376)
at hudson.util.PluginServletFilter$1.doFilter(PluginServletFilter.java:95)
at hudson.util.PluginServletFilter.doFilter(PluginServletFilter.java:87)
at winstone.FilterConfiguration.execute(FilterConfiguration.java:194)
at winstone.RequestDispatcher.doFilter(RequestDispatcher.java:366)
at hudson.security.csrf.CrumbFilter.doFilter(CrumbFilter.java:48)
at winstone.FilterConfiguration.execute(FilterConfiguration.java:194)
at winstone.RequestDispatcher.doFilter(RequestDispatcher.java:366)
at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:84)
at hudson.security.UnwrapSecurityExceptionFilter.doFilter(UnwrapSecurityExceptionFilter.java:51)
at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:87)
at org.acegisecurity.ui.ExceptionTranslationFilter.doFilter(ExceptionTranslationFilter.java:124)
at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:87)
at org.acegisecurity.providers.anonymous.AnonymousProcessingFilter.doFilter(AnonymousProcessingFilter.java:125)
at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:87)
at org.acegisecurity.ui.rememberme.RememberMeProcessingFilter.doFilter(RememberMeProcessingFilter.java:142)
at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:87)
at org.acegisecurity.ui.AbstractProcessingFilter.doFilter(AbstractProcessingFilter.java:271)
at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:87)
at org.acegisecurity.ui.basicauth.BasicProcessingFilter.doFilter(BasicProcessingFilter.java:174)
at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:87)
at jenkins.security.ApiTokenFilter.doFilter(ApiTokenFilter.java:64)
at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:87)
at org.acegisecurity.context.HttpSessionContextIntegrationFilter.doFilter(HttpSessionContextIntegrationFilter.java:249)
at hudson.security.HttpSessionContextIntegrationFilter2.doFilter(HttpSessionContextIntegrationFilter2.java:67)
at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:87)
at hudson.security.ChainedServletFilter.doFilter(ChainedServletFilter.java:76)
at hudson.security.HudsonFilter.doFilter(HudsonFilter.java:164)
at winstone.FilterConfiguration.execute(FilterConfiguration.java:194)
at winstone.RequestDispatcher.doFilter(RequestDispatcher.java:366)
at org.kohsuke.stapler.compression.CompressionFilter.doFilter(CompressionFilter.java:49)
at winstone.FilterConfiguration.execute(FilterConfiguration.java:194)
at winstone.RequestDispatcher.doFilter(RequestDispatcher.java:366)
at hudson.util.CharacterEncodingFilter.doFilter(CharacterEncodingFilter.java:81)
at winstone.FilterConfiguration.execute(FilterConfiguration.java:194)
at winstone.RequestDispatcher.doFilter(RequestDispatcher.java:366)
at winstone.RequestDispatcher.forward(RequestDispatcher.java:331)
at winstone.RequestHandlerThread.processRequest(RequestHandlerThread.java:227)
at winstone.RequestHandlerThread.run(RequestHandlerThread.java:150)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at winstone.BoundedExecutorService$1.run(BoundedExecutorService.java:77)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.Error: Failed to activate file leak detector: Connecting to 0
2013-08-31 08:42:18
Full thread dump Java HotSpot(TM) 64-Bit Server VM (20.14-b01 mixed mode):
"Low Memory Detector" daemon prio=10 tid=0x00007f3d400b7000 nid=0x380a runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C2 CompilerThread1" daemon prio=10 tid=0x00007f3d400b4800 nid=0x3809 runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C2 CompilerThread0" daemon prio=10 tid=0x00007f3d400b1800 nid=0x3808 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"Signal Dispatcher" daemon prio=10 tid=0x00007f3d400af800 nid=0x3806 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"Finalizer" daemon prio=10 tid=0x00007f3d40094000 nid=0x37b4 in Object.wait() [0x00007f3d3f17b000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x00000007c1eb1300> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:118)
- locked <0x00000007c1eb1300> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:134)
at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:159)
"Reference Handler" daemon prio=10 tid=0x00007f3d40092000 nid=0x37b3 in Object.wait() [0x00007f3d3f27c000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x00000007c1eb11d8> (a java.lang.ref.Reference$Lock)
at java.lang.Object.wait(Object.java:485)
at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:116)
- locked <0x00000007c1eb11d8> (a java.lang.ref.Reference$Lock)
"main" prio=10 tid=0x00007f3d40006800 nid=0x37a8 runnable [0x00007f3d4802b000]
java.lang.Thread.State: RUNNABLE
at sun.tools.attach.LinuxVirtualMachine.sendQuitTo(Native Method)
at sun.tools.attach.LinuxVirtualMachine.<init>(LinuxVirtualMachine.java:67)
at sun.tools.attach.LinuxAttachProvider.attachVirtualMachine(LinuxAttachProvider.java:46)
at com.sun.tools.attach.VirtualMachine.attach(VirtualMachine.java:195)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.kohsuke.file_leak_detector.Main.run(Main.java:50)
at org.kohsuke.file_leak_detector.Main.main(Main.java:35)
"VM Thread" prio=10 tid=0x00007f3d4008b800 nid=0x37ad runnable
"GC task thread#0 (ParallelGC)" prio=10 tid=0x00007f3d40019800 nid=0x37a9 runnable
"GC task thread#1 (ParallelGC)" prio=10 tid=0x00007f3d4001b800 nid=0x37aa runnable
"GC task thread#2 (ParallelGC)" prio=10 tid=0x00007f3d4001d000 nid=0x37ab runnable
"GC task thread#3 (ParallelGC)" prio=10 tid=0x00007f3d4001f000 nid=0x37ac runnable
"VM Periodic Task Thread" prio=10 tid=0x00007f3d400c9800 nid=0x380e waiting on condition
JNI global references: 1129
Heap
PSYoungGen total 55616K, used 2868K [0x00000007c1eb0000, 0x00000007c5cc0000, 0x0000000800000000)
eden space 47680K, 6% used [0x00000007c1eb0000,0x00000007c217d398,0x00000007c4d40000)
from space 7936K, 0% used [0x00000007c5500000,0x00000007c5500000,0x00000007c5cc0000)
to space 7936K, 0% used [0x00000007c4d40000,0x00000007c4d40000,0x00000007c5500000)
PSOldGen total 127104K, used 0K [0x0000000745c00000, 0x000000074d820000, 0x00000007c1eb0000)
object space 127104K, 0% used [0x0000000745c00000,0x0000000745c00000,0x000000074d820000)
PSPermGen total 21248K, used 4333K [0x0000000740a00000, 0x0000000741ec0000, 0x0000000745c00000)
object space 21248K, 20% used [0x0000000740a00000,0x0000000740e3b460,0x0000000741ec0000)
Exception in thread "main" java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.kohsuke.file_leak_detector.Main.run(Main.java:50)
at org.kohsuke.file_leak_detector.Main.main(Main.java:35)
Caused by: com.sun.tools.attach.AttachNotSupportedException: Unable to open socket file: target process not responding or HotSpot VM not loaded
at sun.tools.attach.LinuxVirtualMachine.<init>(LinuxVirtualMachine.java:82)
at sun.tools.attach.LinuxAttachProvider.attachVirtualMachine(LinuxAttachProvider.java:46)
at com.sun.tools.attach.VirtualMachine.attach(VirtualMachine.java:195)
... 6 more
at com.cloudbees.jenkins.plugins.file_leak_detector.FileHandleDump.doActivate(FileHandleDump.java:101)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.kohsuke.stapler.Function$InstanceFunction.invoke(Function.java:297)
at org.kohsuke.stapler.Function.bindAndInvoke(Function.java:160)
at org.kohsuke.stapler.Function.bindAndInvokeAndServeResponse(Function.java:95)
at org.kohsuke.stapler.MetaClass$1.doDispatch(MetaClass.java:111)
at org.kohsuke.stapler.NameBasedDispatcher.dispatch(NameBasedDispatcher.java:53)
at org.kohsuke.stapler.Stapler.tryInvoke(Stapler.java:683)
... 55 more
Generated by Stapler at Sat Aug 31 08:42:23 EDT 2013

The JVM reports an error while attaching to the other JVM:
com.sun.tools.attach.AttachNotSupportedException: Unable to open socket file: target process not responding or HotSpot VM not loaded
It seems that attaching to the running process does not work. Either you have specified an invalid PID, or the running JVM does not match the one you use to start file-leak-detector.
As a workaround, you could try to inject the Java agent directly into the application at startup instead of attaching to it. See http://file-leak-detector.kohsuke.org/ for details.
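The two causes named above can be ruled out from a shell before retrying. The following is a minimal sketch, assuming a Linux host with /proc mounted and GNU stat. The function name check_attach_target is hypothetical and not part of file-leak-detector, and the same-user check reflects a general restriction of the HotSpot attach mechanism rather than anything confirmed for this particular setup.

```shell
#!/bin/sh
# Hypothetical helper (not part of file-leak-detector): sanity-check a PID
# before trying to attach to it. Assumes Linux with /proc and GNU stat.
check_attach_target() {
    pid="$1"

    # "Connecting to 0" in the error above may mean the PID was never resolved.
    case "$pid" in
        ''|*[!0-9]*) echo "invalid pid: '$pid'"; return 1 ;;
    esac
    if [ "$pid" -le 0 ]; then
        echo "invalid pid: $pid"
        return 1
    fi

    # The target process must exist.
    if [ ! -d "/proc/$pid" ]; then
        echo "no such process: $pid"
        return 1
    fi

    # Attaching generally requires the same user on both sides.
    owner=$(stat -c %u "/proc/$pid")
    if [ "$owner" != "$(id -u)" ]; then
        echo "pid $pid is owned by uid $owner, current uid is $(id -u)"
        return 1
    fi

    echo "pid $pid looks attachable"
}
```

Calling check_attach_target with the PID of the target JVM, as the user who will run the attach, reports the first problem it finds. If attaching still fails, the workaround route is to start the application with the agent already loaded, along the lines of java -javaagent:path/to/file-leak-detector.jar followed by your usual arguments.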

Related

Apache Flink TaskManager stuck on terminating

The TaskManager could not terminate and remained in a hung state, so systemd considered it still running and did not restart it:
{"time":"2022-05-23 12:46:42.189","loglevel":"ERROR","class":"org.apache.flink.runtime.taskexecutor.TaskManagerRunner","message":"Fatal error occurred while executing the TaskManager. Shutting it down...","host":"flink15"}
org.apache.flink.util.FlinkRuntimeException: Task did not exit gracefully within 764 + seconds.
at org.apache.flink.runtime.taskmanager.Task$TaskCancelerWatchDog.run(Task.java:1791)
at java.lang.Thread.run(Thread.java:750)
{"time":"2022-05-23 12:46:52.201","loglevel":"ERROR","class":"org.apache.flink.runtime.taskexecutor.TaskManagerRunner","message":"Terminating TaskManagerRunner with exit code 1.","host":"flink15"}
org.apache.flink.util.FlinkException: Unexpected failure during runtime of TaskManagerRunner.
at org.apache.flink.runtime.taskexecutor.TaskManagerRunner.runTaskManager(TaskManagerRunner.java:394)
at org.apache.flink.runtime.taskexecutor.TaskManagerRunner.lambda$runTaskManagerProcessSecurely$3(TaskManagerRunner.java:428)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1762)
at org.apache.flink.runtime.security.contexts.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
at org.apache.flink.runtime.taskexecutor.TaskManagerRunner.runTaskManagerProcessSecurely(TaskManagerRunner.java:428)
at org.apache.flink.runtime.taskexecutor.TaskManagerRunner.runTaskManagerProcessSecurely(TaskManagerRunner.java:408)
at org.apache.flink.runtime.taskexecutor.TaskManagerRunner.main(TaskManagerRunner.java:366)
Caused by: java.util.concurrent.TimeoutException: null
at org.apache.flink.util.concurrent.FutureUtils$Timeout.run(FutureUtils.java:1237)
at org.apache.flink.util.concurrent.DirectExecutorService.execute(DirectExecutorService.java:217)
at org.apache.flink.util.concurrent.FutureUtils.lambda$orTimeout$15(FutureUtils.java:591)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
{"time":"2022-05-23 12:46:52.203","loglevel":"INFO","class":"org.apache.flink.runtime.blob.PermanentBlobCache","message":"Shutting down BLOB cache","host":"flink15"}
{"time":"2022-05-23 12:46:52.204","loglevel":"INFO","class":"org.apache.flink.runtime.blob.TransientBlobCache","message":"Shutting down BLOB cache","host":"flink15"}
{"time":"2022-05-23 12:46:52.205","loglevel":"INFO","class":"org.apache.flink.runtime.state.TaskExecutorLocalStateStoresManager","message":"Shutting down TaskExecutorLocalStateStoresManager.","host":"flink15"}
{"time":"2022-05-23 12:46:52.205","loglevel":"INFO","class":"org.apache.flink.runtime.filecache.FileCache","message":"removed file cache directory /var/lib/flink/tmp/flink-dist-cache-436fc230-8c11-45fd-83b2-43f0105dd524","host":"flink15"}
{"time":"2022-05-23 12:46:52.206","loglevel":"INFO","class":"org.apache.flink.runtime.io.disk.FileChannelManagerImpl","message":"FileChannelManager removed spill file directory /var/lib/flink/tmp/flink-io-f2b13865-538e-4369-9143-186d783aca9a","host":"flink15"}
{"time":"2022-05-23 12:46:52.206","loglevel":"INFO","class":"org.apache.flink.runtime.io.disk.FileChannelManagerImpl","message":"FileChannelManager removed spill file directory /var/lib/flink/tmp/flink-netty-shuffle-de83c8a0-9806-4a4e-b40e-b9b481a31f40","host":"flink15"}
However, the Java process continued to run on the system, and when we manually restarted the TaskManager, it had to be forcibly killed:
May 23 15:20:48 flink15 systemd[1]: Stopping Apache Flink Taskmanager...
May 23 15:20:48 flink15 taskmanager.sh[10486]: Stopping taskexecutor daemon (pid: 60193) on host flink15.
May 23 15:20:58 flink15 taskmanager.sh[10486]: Daemon taskexecutor didn't stop within 10 seconds. Killing it.
May 23 15:21:01 flink15 systemd[1]: flink-taskmanager.service: Main process exited, code=killed, status=9/KILL
May 23 15:21:01 flink15 systemd[1]: Stopped Apache Flink Taskmanager.
May 23 15:21:01 flink15 systemd[1]: flink-taskmanager.service: Unit entered failed state.
Is there any way to avoid this problem, so that the TaskManager is guaranteed to exit in case of errors?

Fail to run flink-gelly-examples

I'm running Apache Flink 1.4.2 (without bundled Hadoop) on Mac, and I've managed to run the WordCount example. However, when I try to run the Gelly example following the instructions in Running Gelly Examples, I run into the following error:
java.lang.ClassNotFoundException: org.apache.flink.graph.generator.random.BlockInfo
How can I fix it?
I first started the cluster with:
./bin/start-cluster.sh
Starting cluster.
[INFO] 1 instance(s) of jobmanager are already running on myhostname.
Starting jobmanager daemon on host myhostname.
[INFO] 1 instance(s) of taskmanager are already running on myhostname.
Starting taskmanager daemon on host myhostname.
Then, in another terminal, I ran:
./bin/flink run examples/gelly/flink-gelly-examples_2.11-1.4.2.jar --algorithm GraphMetrics --order directed --input RMatGraph --type integer --scale 20 --simplify directed --output print
and got the following errors:
Cluster configuration: Standalone cluster with JobManager at localhost/127.0.0.1:6123
Using address localhost:6123 to connect to JobManager.
JobManager web interface address http://localhost:8081
Starting execution of program
Submitting job with JobID: 099acb25fa34be1fe63fb47296605f69. Waiting for job completion.
Connected to JobManager at Actor[akka.tcp://flink@localhost:6123/user/jobmanager#57507119] with leader session id 00000000-0000-0000-0000-000000000000.
------------------------------------------------------------
The program finished with the following exception:
org.apache.flink.client.program.ProgramInvocationException: The program execution failed: Failed to submit job 099acb25fa34be1fe63fb47296605f69 (RMatGraph (s20e16d) ⇨ GraphMetrics ⇨ Hash [integer])
at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:492)
at org.apache.flink.client.program.StandaloneClusterClient.submitJob(StandaloneClusterClient.java:105)
at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:456)
at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:444)
at org.apache.flink.client.program.ContextEnvironment.execute(ContextEnvironment.java:62)
at org.apache.flink.graph.Runner.execute(Runner.java:452)
at org.apache.flink.graph.Runner.main(Runner.java:507)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:525)
at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:417)
at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:396)
at org.apache.flink.client.CliFrontend.executeProgram(CliFrontend.java:802)
at org.apache.flink.client.CliFrontend.run(CliFrontend.java:282)
at org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:1054)
at org.apache.flink.client.CliFrontend$1.call(CliFrontend.java:1101)
at org.apache.flink.client.CliFrontend$1.call(CliFrontend.java:1098)
at org.apache.flink.runtime.security.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:30)
at org.apache.flink.client.CliFrontend.main(CliFrontend.java:1098)
Caused by: org.apache.flink.runtime.client.JobExecutionException: Failed to submit job 099acb25fa34be1fe63fb47296605f69 (RMatGraph (s20e16d) ⇨ GraphMetrics ⇨ Hash [integer])
at org.apache.flink.runtime.jobmanager.JobManager.org$apache$flink$runtime$jobmanager$JobManager$$submitJob(JobManager.scala:1325)
at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1.applyOrElse(JobManager.scala:447)
at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36)
at org.apache.flink.runtime.LeaderSessionMessageFilter$$anonfun$receive$1.applyOrElse(LeaderSessionMessageFilter.scala:38)
at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36)
at org.apache.flink.runtime.LogMessages$$anon$1.apply(LogMessages.scala:33)
at org.apache.flink.runtime.LogMessages$$anon$1.apply(LogMessages.scala:28)
at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123)
at org.apache.flink.runtime.LogMessages$$anon$1.applyOrElse(LogMessages.scala:28)
at akka.actor.Actor$class.aroundReceive(Actor.scala:502)
at org.apache.flink.runtime.jobmanager.JobManager.aroundReceive(JobManager.scala:122)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:526)
at akka.actor.ActorCell.invoke(ActorCell.scala:495)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:257)
at akka.dispatch.Mailbox.run(Mailbox.scala:224)
at akka.dispatch.Mailbox.exec(Mailbox.scala:234)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: java.lang.ClassNotFoundException: org.apache.flink.graph.generator.random.BlockInfo
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at org.apache.flink.runtime.execution.librarycache.FlinkUserCodeClassLoaders$ChildFirstClassLoader.loadClass(FlinkUserCodeClassLoaders.java:115)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at org.apache.flink.util.InstantiationUtil$ClassLoaderObjectInputStream.resolveClass(InstantiationUtil.java:73)
at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1859)
at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1745)
at java.io.ObjectInputStream.readClass(ObjectInputStream.java:1710)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1550)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:427)
at java.util.HashSet.readObject(HashSet.java:341)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1158)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2169)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2060)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1567)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2278)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2202)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2060)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1567)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:427)
at org.apache.flink.util.InstantiationUtil.deserializeObject(InstantiationUtil.java:437)
at org.apache.flink.util.InstantiationUtil.deserializeObject(InstantiationUtil.java:424)
at org.apache.flink.util.InstantiationUtil.deserializeObject(InstantiationUtil.java:412)
at org.apache.flink.util.SerializedValue.deserializeValue(SerializedValue.java:58)
at org.apache.flink.runtime.jobmanager.JobManager.org$apache$flink$runtime$jobmanager$JobManager$$submitJob(JobManager.scala:1253)
... 19 more
This particular error shows up when you have copied the jar files to the lib/ folder but didn't stop and start the Flink cluster again after doing the copy (i.e. the cluster was already running when you performed the copy).
If you stop and start the cluster using ./bin/stop-cluster.sh and ./bin/start-cluster.sh after copying the jar files to the lib folder, as mentioned on this page, the problem is fixed.
That is also probably why you didn't get the error with a new installation: you copied the jar files to the lib folder before starting the Flink cluster.
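The copy-then-restart order can be scripted so the restart is never forgotten. The following is only a minimal sketch: the function name install_jars_and_restart and the injectable run parameter are made up for illustration, and the only things taken from the answer are the lib folder and the two scripts under bin/.

```python
import shutil
import subprocess
from pathlib import Path


def install_jars_and_restart(flink_home, jars, run=subprocess.run):
    """Copy jars into <flink_home>/lib, then stop and start the cluster.

    `run` is injectable (an assumption of this sketch) so the sequence
    can be exercised without a real Flink installation.
    """
    home = Path(flink_home)
    lib_dir = home / "lib"
    for jar in jars:
        # A cluster that is already running will not pick this jar up yet.
        shutil.copy(jar, lib_dir)

    # Restart so the cluster processes start with the new jars in lib/.
    for script in ("stop-cluster.sh", "start-cluster.sh"):
        run([str(home / "bin" / script)], check=True)
```

With a real installation it would be called as install_jars_and_restart("/path/to/flink", ["/path/to/some.jar"]), where both paths are placeholders for your own.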

How to handle execution timeout in Flink

Connected to JobManager at Actor[akka.tcp://flink@localhost:6123/user/jobmanager#-1119198862] with leader session id 00000000-0000-0000-0000-000000000000.
org.apache.flink.client.program.ProgramInvocationException: The program execution failed: Couldn't retrieve the JobExecutionResult from the JobManager.
at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:478)
at org.apache.flink.client.program.StandaloneClusterClient.submitJob(StandaloneClusterClient.java:105)
at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:442)
at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:429)
at org.apache.flink.client.program.ContextEnvironment.execute(ContextEnvironment.java:62)
at FileSetWordCount.main(FileSetWordCount.java:178)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:528)
at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:419)
at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:381)
at org.apache.flink.client.CliFrontend.executeProgram(CliFrontend.java:838)
at org.apache.flink.client.CliFrontend.run(CliFrontend.java:259)
at org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:1086)
at org.apache.flink.client.CliFrontend$2.call(CliFrontend.java:1133)
at org.apache.flink.client.CliFrontend$2.call(CliFrontend.java:1130)
at org.apache.flink.runtime.security.HadoopSecurityContext$1.run(HadoopSecurityContext.java:43)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:40)
at org.apache.flink.client.CliFrontend.main(CliFrontend.java:1130)
Caused by: org.apache.flink.runtime.client.JobExecutionException: Couldn't retrieve the JobExecutionResult from the JobManager.
at org.apache.flink.runtime.client.JobClient.awaitJobResult(JobClient.java:309)
at org.apache.flink.runtime.client.JobClient.submitJobAndWait(JobClient.java:396)
at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:467)
... 23 more
Caused by: org.apache.flink.runtime.client.JobClientActorSubmissionTimeoutException: Job submission to the JobManager timed out. You may increase 'akka.client.timeout' in case the JobManager needs more time to configure and confirm the job submission.
at org.apache.flink.runtime.client.JobSubmissionClientActor.handleCustomMessage(JobSubmissionClientActor.java:119)
at org.apache.flink.runtime.client.JobClientActor.handleMessage(JobClientActor.java:251)
at org.apache.flink.runtime.akka.FlinkUntypedActor.handleLeaderSessionID(FlinkUntypedActor.java:89)
at org.apache.flink.runtime.akka.FlinkUntypedActor.onReceive(FlinkUntypedActor.java:68)
at akka.actor.UntypedActor$$anonfun$receive$1.applyOrElse(UntypedActor.scala:167)
at akka.actor.Actor$class.aroundReceive(Actor.scala:467)
at akka.actor.UntypedActor.aroundReceive(UntypedActor.scala:97)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
at akka.actor.ActorCell.invoke(ActorCell.scala:487)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:238)
at akka.dispatch.Mailbox.run(Mailbox.scala:220)
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
No JobSubmissionResult returned, please make sure you called ExecutionEnvironment.execute()
This exception is thrown by Flink. How can I handle the client timeout exception? The Flink application is going to run in the local environment and is used to process about 1 TB of files.
I have figured it out. It is necessary to modify the 'flink-conf.yaml' file by adding a new line "akka.client.timeout: xx s" (e.g. "akka.client.timeout: 600 s").
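If the configuration file is managed by a script, the same change can be applied programmatically. This is a rough sketch, not part of Flink: set_flink_option is a hypothetical helper that treats the file as plain "key: value" lines, replaces an active entry if one exists, and appends the line otherwise. Only the key name and the "600 s" example value come from the answer above.

```python
def set_flink_option(conf_text, key, value):
    """Return conf_text with `key: value` set.

    Replaces the first active (non-comment) entry for `key`,
    or appends a new line if none exists.
    """
    new_line = f"{key}: {value}"
    lines = conf_text.splitlines()
    for i, line in enumerate(lines):
        stripped = line.strip()
        # Ignore commented-out entries; compare the text before the first colon.
        if not stripped.startswith("#") and stripped.split(":", 1)[0].strip() == key:
            lines[i] = new_line
            break
    else:
        lines.append(new_line)
    return "\n".join(lines) + "\n"
```

To use it, read the text of flink-conf.yaml, call set_flink_option(text, "akka.client.timeout", "600 s"), and write the result back to the same file.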

Flink Python API java.io.EOFException

I'm coding three batch applications with the Python API. However, my third application is running into exceptions, especially when I increase the parallelism. This application has a Cross transformation inside it.
The cluster has 4 VMs, each with 4 CPU cores and 7 GB of RAM, so the maximum parallelism is set to 16.
The exception is:
org.apache.flink.client.program.ProgramInvocationException: The program execution failed: Job execution failed.
at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:427)
at org.apache.flink.client.program.StandaloneClusterClient.submitJob(StandaloneClusterClient.java:101)
at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:400)
at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:387)
at org.apache.flink.client.program.ContextEnvironment.execute(ContextEnvironment.java:62)
at org.apache.flink.api.java.ExecutionEnvironment.execute(ExecutionEnvironment.java:926)
at org.apache.flink.python.api.PythonPlanBinder.runPlan(PythonPlanBinder.java:149)
at org.apache.flink.python.api.PythonPlanBinder.main(PythonPlanBinder.java:114)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:528)
at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:419)
at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:339)
at org.apache.flink.client.CliFrontend.executeProgram(CliFrontend.java:831)
at org.apache.flink.client.CliFrontend.run(CliFrontend.java:256)
at org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:1073)
at org.apache.flink.client.CliFrontend$2.call(CliFrontend.java:1120)
at org.apache.flink.client.CliFrontend$2.call(CliFrontend.java:1117)
at org.apache.flink.runtime.security.HadoopSecurityContext$1.run(HadoopSecurityContext.java:43)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:40)
at org.apache.flink.client.CliFrontend.main(CliFrontend.java:1116)
Caused by: org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply$mcV$sp(JobManager.scala:900)
at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply(JobManager.scala:843)
at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply(JobManager.scala:843)
at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: java.io.EOFException
at java.io.DataInputStream.readInt(DataInputStream.java:392)
at org.apache.flink.python.api.streaming.data.PythonStreamer.streamBufferWithoutGroups(PythonStreamer.java:252)
at org.apache.flink.python.api.functions.PythonMapPartition.mapPartition(PythonMapPartition.java:54)
at org.apache.flink.runtime.operators.MapPartitionDriver.run(MapPartitionDriver.java:103)
at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:490)
at org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:355)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:655)
at java.lang.Thread.run(Thread.java:745)
Here is what I have observed:
1) It is not deterministic at all; sometimes the application finishes without errors.
2) With parallelism <= 4 the algorithm finishes flawlessly; the exception often happens with higher degrees of parallelism.
3) The exception occurs even with small inputs (~20 MB).
4) I didn't find any errors in the slaves' logs (jobmanager), only in the master's Flink taskmanager log.
Can you help me find out what is responsible for this? I can provide more logs (namenode, datanode, jobmanager, taskmanager) if necessary.

mvn gcloud:run Hangs

I have a Google App Engine application which I'm trying to test locally. Whenever I run the following command:
mvn gcloud:run
The application never fully starts up. I have the following output written to my console:
mvn gcloud:run
[INFO] Scanning for projects...
[INFO]
[INFO] ------------------------------------------------------------------------
[INFO] Building Wildstar Service Desk 88
[INFO] ------------------------------------------------------------------------
[INFO]
[INFO] >>> gcloud-maven-plugin:2.0.9.95.v20160203:run (default-cli) > package @ servicedesk >>>
[INFO]
[INFO] --- maven-resources-plugin:2.6:resources (default-resources) @ servicedesk ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] Copying 0 resource
[INFO]
[INFO] --- maven-compiler-plugin:3.3:compile (default-compile) @ servicedesk ---
[INFO] Nothing to compile - all classes are up to date
[INFO]
[INFO] --- maven-resources-plugin:2.6:testResources (default-testResources) @ servicedesk ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory /Users/derekberube/Documents/Programming/Java/Wildstar ServiceDesk/src/test/resources
[INFO]
[INFO] --- maven-compiler-plugin:3.3:testCompile (default-testCompile) @ servicedesk ---
[INFO] No sources to compile
[INFO]
[INFO] --- maven-surefire-plugin:2.12.4:test (default-test) @ servicedesk ---
[INFO] No tests to run.
[INFO]
[INFO] --- maven-war-plugin:2.6:war (default-war) @ servicedesk ---
[INFO] Packaging webapp
[INFO] Assembling webapp [servicedesk] in [/Users/derekberube/Documents/Programming/Java/Wildstar ServiceDesk/target/servicedesk-88]
[INFO] Processing war project
[INFO] Copying webapp resources [/Users/derekberube/Documents/Programming/Java/Wildstar ServiceDesk/src/main/webapp]
[INFO] Webapp assembled in [26 msecs]
[INFO] Building war: /Users/derekberube/Documents/Programming/Java/Wildstar ServiceDesk/target/servicedesk-88.war
[INFO]
[INFO] <<< gcloud-maven-plugin:2.0.9.95.v20160203:run (default-cli) < package @ servicedesk <<<
[INFO]
[INFO] --- gcloud-maven-plugin:2.0.9.95.v20160203:run (default-cli) @ servicedesk ---
The following is the content of the log file generated by the gcloud process.
derekberube$ cat 22.54.23.587718.log
2016-03-07 22:54:23,593 DEBUG root Loaded Command Group: ['gcloud', 'info']
2016-03-07 22:54:23,594 DEBUG root Running gcloud.info with Namespace(__calliope_internal_deepest_parser=ArgumentParser(prog='gcloud.info', usage=None, description='This command displays information about the current gcloud environment.', version=None, formatter_class=<class 'argparse.HelpFormatter'>, conflict_handler='error', add_help=False), account=None, authority_selector=None, authorization_token_file=None, cmd_func=<bound method Command.Run of <googlecloudsdk.calliope.backend.Command object at 0x10cd2a590>>, command_path=['gcloud', 'info'], configuration=None, credential_file_override=None, document=None, format=None, h=None, help=None, http_timeout=None, log_http=None, project=None, quiet=None, show_log=False, trace_email=None, trace_log=False, trace_token=None, user_output_enabled=None, verbosity=None, version=None).
2016-03-07 22:54:23,687 INFO root Explict Display.
2016-03-07 22:54:23,687 INFO ___FILE_ONLY___ Google Cloud SDK [99.0.0]
Platform: [Mac OS X, x86_64]
Python Version: [2.7.10 (default, Oct 23 2015, 19:19:21) [GCC 4.2.1 Compatible Apple LLVM 7.0.0 (clang-700.0.59.5)]]
Python Location: [/usr/bin/python]
Site Packages: [Disabled]
Installation Root: [/Users/derekberube/google-cloud-sdk]
Installed Components:
core: [2016.02.26]
app-engine-python: [1.9.33]
core-nix: [2016.02.05]
pubsub-emulator: [2016.02.22]
kubectl: []
app-engine-java: [1.9.32]
gcloud: []
gsutil-nix: [4.15]
app-engine-python-extras: [1.9.21]
beta: [2016.01.12]
gsutil: [4.17]
bq: [2.0.18]
alpha: [2016.01.12]
gcd-emulator: [v1beta3-1.0.0]
bq-nix: [2.0.18]
kubectl-darwin-x86_64: [1.1.7]
System PATH: [/Library/Java/JavaVirtualMachines/jdk1.7.0.jdk/Contents/Home/bin:/Applications/appengine-java-sdk-1.9.32/bin:/Applications/Java Libraries/Metro/2.3.1//bin:/Users/derekberube/google-cloud-sdk/bin:/Library/Java/JavaVirtualMachines/jdk1.8.0.jdk/Contents/Home/bin:/usr/local/apache-maven/apache-maven-3.3.3/bin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin:/opt/X11/bin]
Cloud SDK on PATH: [True]
Installation Properties: [/Users/derekberube/google-cloud-sdk/properties]
User Config Directory: [/Users/derekberube/.config/gcloud]
User Properties: [/Users/derekberube/.config/gcloud/properties]
Account: [wildstarservicedesk-hrd@appspot.gserviceaccount.com]
Project: [wildstarservicedesk-hrd]
Current Properties:
[core]
project: [wildstarservicedesk-hrd]
account: [wildstarservicedesk-hrd@appspot.gserviceaccount.com]
disable_usage_reporting: [False]
[app]
suppress_change_warning: [true]
Logs Directory: [/Users/derekberube/.config/gcloud/logs]
Last Log File: [/Users/derekberube/.config/gcloud/logs/2016.03.07/21.02.55.937592.log]
2016-03-07 22:54:23,695 DEBUG root Metrics reporting process started.
The following is a copy of my pom.xml
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<packaging>war</packaging>
<version>88</version>
<groupId>com.wildstartech</groupId>
<artifactId>servicedesk</artifactId>
<name>Wildstar Service Desk</name>
<url>http://servicedesk.wildstartech.com/</url>
<properties>
<appengine.app.version>${project.version}</appengine.app.version>
<appengine.target.version>1.9.32</appengine.target.version>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
</properties>
<prerequisites>
<maven>3.1.0</maven>
</prerequisites>
<build>
<!-- for hot reload of the web application -->
<outputDirectory>${project.build.directory}/${project.build.finalName}/WEB-INF/classes</outputDirectory>
<plugins>
<plugin>
<groupId>org.eclipse.jetty</groupId>
<artifactId>jetty-maven-plugin</artifactId>
<version>9.3.7.v20160115</version>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<version>3.3</version>
<artifactId>maven-compiler-plugin</artifactId>
<configuration>
<source>1.7</source>
<target>1.7</target>
</configuration>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-war-plugin</artifactId>
<version>2.6</version>
</plugin>
<plugin>
<groupId>org.codehaus.mojo</groupId>
<artifactId>versions-maven-plugin</artifactId>
<version>2.2</version>
</plugin>
<plugin>
<groupId>com.google.appengine</groupId>
<artifactId>gcloud-maven-plugin</artifactId>
<version>2.0.9.95.v20160203</version>
<configuration>
<log_level>debug</log_level>
<quiet>false</quiet>
</configuration>
</plugin>
</plugins>
</build>
</project>
The following is a tree structure of my project.
|____pom.xml
|____src
| |____main
| | |____java
| | |____resources
| | |____webapp
| | | |____index.html
| | | |____WEB-INF
| | | | |____appengine-web.xml
| | | | |____web.xml
The following is my appengine-web.xml configuration file.
<appengine-web-app xmlns="http://appengine.google.com/ns/1.0">
<application>wildstarservicedesk-hrd</application>
<beta-settings>
<setting name="java_quickstart" value="true"/>
</beta-settings>
<inbound-services>
<service>mail</service>
</inbound-services>
<sessions-enabled>true</sessions-enabled>
<!-- Configure java.util.logging -->
<system-properties>
<property name="java.util.logging.config.file"
value="WEB-INF/logging.properties" />
</system-properties>
<threadsafe>true</threadsafe>
<version>88</version>
<vm>true</vm>
</appengine-web-app>
The following is my web.xml file.
<?xml version="1.0" encoding="utf-8"?>
<web-app xmlns="http://xmlns.jcp.org/xml/ns/javaee"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://xmlns.jcp.org/xml/ns/javaee
http://xmlns.jcp.org/xml/ns/javaee/web-app_3_1.xsd" version="3.1">
<welcome-file-list>
<welcome-file>index.jsp</welcome-file>
<welcome-file>index.xhtml</welcome-file>
<welcome-file>index.html</welcome-file>
</welcome-file-list>
</web-app>
And last, but not least, the following is the content of the index.html
<!DOCTYPE html>
<html>
<head>
<title>Test JSP</title>
</head>
<body>
<p>This is a test.</p>
</body>
</html>
The following is the stack output.
jstack 5155
2016-03-07 23:58:29
Full thread dump Java HotSpot(TM) 64-Bit Server VM (24.80-b11 mixed mode):
"Attach Listener" daemon prio=5 tid=0x00007ff4a500b000 nid=0x3d0b waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"Service Thread" daemon prio=5 tid=0x00007ff4a281d800 nid=0x4e03 runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C2 CompilerThread1" daemon prio=5 tid=0x00007ff4a1819000 nid=0x4c03 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C2 CompilerThread0" daemon prio=5 tid=0x00007ff4a406a000 nid=0x4a03 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"Signal Dispatcher" daemon prio=5 tid=0x00007ff4a4064800 nid=0x3e0f runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"Finalizer" daemon prio=5 tid=0x00007ff4a4041800 nid=0x3803 in Object.wait() [0x0000700000d3a000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x00000007aeb20070> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:135)
- locked <0x00000007aeb20070> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:151)
at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:209)
"Reference Handler" daemon prio=5 tid=0x00007ff4a1813800 nid=0x3603 in Object.wait() [0x0000700000c37000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x00000007aeb10278> (a java.lang.ref.Reference$Lock)
at java.lang.Object.wait(Object.java:503)
at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:133)
- locked <0x00000007aeb10278> (a java.lang.ref.Reference$Lock)
"main" prio=5 tid=0x00007ff4a4001000 nid=0x1703 runnable [0x0000700000218000]
java.lang.Thread.State: RUNNABLE
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:152)
at java.net.SocketInputStream.read(SocketInputStream.java:122)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
- locked <0x00000007abefec10> (a java.io.BufferedInputStream)
at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:690)
at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:633)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1324)
- locked <0x00000007abee63d0> (a sun.net.www.protocol.http.HttpURLConnection)
at com.google.appengine.gcloudapp.GCloudAppRun.stopDevAppServer(GCloudAppRun.java:486)
at com.google.appengine.gcloudapp.GCloudAppRun.execute(GCloudAppRun.java:287)
at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:134)
at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:208)
at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:153)
at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:145)
at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:116)
at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:80)
at org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build(SingleThreadedBuilder.java:51)
at org.apache.maven.lifecycle.internal.LifecycleStarter.execute(LifecycleStarter.java:128)
at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:307)
at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:193)
at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:106)
at org.apache.maven.cli.MavenCli.execute(MavenCli.java:862)
at org.apache.maven.cli.MavenCli.doMain(MavenCli.java:286)
at org.apache.maven.cli.MavenCli.main(MavenCli.java:197)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced(Launcher.java:289)
at org.codehaus.plexus.classworlds.launcher.Launcher.launch(Launcher.java:229)
at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode(Launcher.java:415)
at org.codehaus.plexus.classworlds.launcher.Launcher.main(Launcher.java:356)
"VM Thread" prio=5 tid=0x00007ff4a4039800 nid=0x3403 runnable
"GC task thread#0 (ParallelGC)" prio=5 tid=0x00007ff4a400d000 nid=0x2403 runnable
"GC task thread#1 (ParallelGC)" prio=5 tid=0x00007ff4a400d800 nid=0x2603 runnable
"GC task thread#2 (ParallelGC)" prio=5 tid=0x00007ff4a180a800 nid=0x2803 runnable
"GC task thread#3 (ParallelGC)" prio=5 tid=0x00007ff4a400e800 nid=0x2a03 runnable
"GC task thread#4 (ParallelGC)" prio=5 tid=0x00007ff4a180f800 nid=0x2c03 runnable
"GC task thread#5 (ParallelGC)" prio=5 tid=0x00007ff4a1810800 nid=0x2e03 runnable
"GC task thread#6 (ParallelGC)" prio=5 tid=0x00007ff4a1811000 nid=0x3003 runnable
"GC task thread#7 (ParallelGC)" prio=5 tid=0x00007ff4a400f000 nid=0x3203 runnable
"VM Periodic Task Thread" prio=5 tid=0x00007ff4a281e800 nid=0x5003 waiting on condition
JNI global references: 236
From a terminal window, I ran the command ps -eaf | grep java, and one of the processes listed in the output showed only (java) as its command. This happens whenever the ps command is NOT able to read a value for the CMD column.
I used the kill command to terminate that (java) process.
After doing that, the mvn gcloud:run process continued execution.
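The jstack output shows the main thread blocked on a socket read inside GCloudAppRun.stopDevAppServer, which fits the idea that a leftover process was keeping the plugin waiting. Finding such a process can be automated. This is a minimal sketch with assumptions: find_unnamed_java_pids is a hypothetical helper, and it assumes the usual eight-column ps -eaf layout (UID PID PPID C STIME TTY TIME CMD) with no spaces inside the first seven columns.

```python
def find_unnamed_java_pids(ps_output):
    """Return PIDs of rows in `ps -eaf` output whose CMD column is exactly "(java)"."""
    pids = []
    for line in ps_output.splitlines()[1:]:  # skip the header row
        # Split into at most 8 fields so CMD keeps its embedded spaces.
        fields = line.split(None, 7)  # UID PID PPID C STIME TTY TIME CMD
        if len(fields) == 8 and fields[7].strip() == "(java)":
            pids.append(int(fields[1]))
    return pids
```

The input would be the text printed by ps -eaf (for example, captured with subprocess.run(["ps", "-eaf"], capture_output=True, text=True).stdout). Each returned PID can then be checked by hand before it is passed to kill.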

Resources