Recently we got "(Too many open files)" error in production and this leads to very high CPU spike(98%) on Linux machine and finally brought machine down. We have to reboot the machine to bring it back.
Flow of code is like this :-
Consume message from one of IBM-MQ queue -> start processing it -> make some update entries into SQL-Server DB and commit the transaction.
We are using Hikari for connection pool management and value is set as 30
hikariConfig.setMaximumPoolSize(30);
While checking the logs, i found following stack trace :-
2021-03-24T20:00:22.404 WARNING org.apache.catalina.loader.WebappClassLoaderBase Failed to open JAR [null]
java.io.FileNotFoundException: /opt/apache-tomcat-7.0.88/webapps/ROOT/WEB-INF/lib/HikariCP-2.5.1.jar (Too many open files)
at java.util.zip.ZipFile.open(Native Method)
at java.util.zip.ZipFile.<init>(ZipFile.java:219)
at java.util.zip.ZipFile.<init>(ZipFile.java:149)
at java.util.jar.JarFile.<init>(JarFile.java:166)
at java.util.jar.JarFile.<init>(JarFile.java:130)
at org.apache.tomcat.util.compat.JreCompat.jarFileNewInstance(JreCompat.java:221)
at org.apache.catalina.loader.WebappClassLoaderBase.openJARs(WebappClassLoaderBase.java:3102)
at org.apache.catalina.loader.WebappClassLoaderBase.findResourceInternal(WebappClassLoaderBase.java:3414)
at org.apache.catalina.loader.WebappClassLoaderBase.findResource(WebappClassLoaderBase.java:1494)
at org.apache.catalina.loader.WebappClassLoaderBase.getResourceAsStream(WebappClassLoaderBase.java:1722)
at java.util.ResourceBundle$Control$1.run(ResourceBundle.java:2677)
at java.util.ResourceBundle$Control$1.run(ResourceBundle.java:2662)
at java.security.AccessController.doPrivileged(Native Method)
at java.util.ResourceBundle$Control.newBundle(ResourceBundle.java:2661)
at java.util.ResourceBundle.loadBundle(ResourceBundle.java:1501)
at java.util.ResourceBundle.findBundle(ResourceBundle.java:1465)
at java.util.ResourceBundle.getBundleImpl(ResourceBundle.java:1361)
at java.util.ResourceBundle.getBundle(ResourceBundle.java:773)
at com.ibm.mq.jmqi.JmqiException.buildMessage(JmqiException.java:428)
at com.ibm.mq.jmqi.JmqiException.getMessage(JmqiException.java:543)
at com.ibm.mq.jmqi.JmqiException.getMessage(JmqiException.java:515)
at java.lang.Throwable.getLocalizedMessage(Throwable.java:391)
at com.ibm.mq.jmqi.internal.JmqiTools.getExSumm(JmqiTools.java:947)
at com.ibm.mq.jmqi.remote.api.RemoteFAP.jmqiConnect(RemoteFAP.java:2282)
at com.ibm.mq.jmqi.remote.api.RemoteFAP.jmqiConnect(RemoteFAP.java:1294)
at com.ibm.msg.client.wmq.internal.WMQSession.connect(WMQSession.java:387)
at com.ibm.msg.client.wmq.internal.WMQSession.<init>(WMQSession.java:331)
at com.ibm.msg.client.wmq.internal.WMQConnection.createSession(WMQConnection.java:909)
at com.ibm.msg.client.jms.internal.JmsConnectionImpl.createSession(JmsConnectionImpl.java:896)
at com.ibm.mq.jms.MQConnection.createSession(MQConnection.java:337)
at org.springframework.jms.connection.SingleConnectionFactory.createSession(SingleConnectionFactory.java:437)
at org.springframework.jms.connection.CachingConnectionFactory.getSession(CachingConnectionFactory.java:236)
at org.springframework.jms.connection.SingleConnectionFactory$SharedConnectionInvocationHandler.invoke(SingleConnectionFactory.java:604)
at com.sun.proxy.$Proxy195.createSession(Unknown Source)
at org.springframework.jms.support.JmsAccessor.createSession(JmsAccessor.java:192)
at org.springframework.jms.core.JmsTemplate.execute(JmsTemplate.java:475)
at org.springframework.jms.core.JmsTemplate.execute(JmsTemplate.java:526)
at com.test.config.messaging.MainframeMessageProducer.lambda$sendMessageToQueueWithHeaders$0(MainframeMessageProducer.java:120)
at org.springframework.retry.support.RetryTemplate.doExecute(RetryTemplate.java:287)
at org.springframework.retry.support.RetryTemplate.execute(RetryTemplate.java:164)
at com.test.config.messaging.MainframeMessageProducer.sendMessageToQueueWithHeaders(MainframeMessageProducer.java:118)
at com.test.config.messaging.MainframeMessageProducer.sendMessageToMainframeQueue(MainframeMessageProducer.java:101)
at com.test.config.messaging.MainframeMessageProducer$$FastClassBySpringCGLIB$$f2d092d2.invoke(<generated>)
at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:204)
at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:721)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:157)
at org.springframework.aop.aspectj.MethodInvocationProceedingJoinPoint.proceed(MethodInvocationProceedingJoinPoint.java:85)
at com.test.common.aop.RecordTimingAspect.recordTimingAspect(RecordTimingAspect.java:28)
at sun.reflect.GeneratedMethodAccessor505.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.springframework.aop.aspectj.AbstractAspectJAdvice.invokeAdviceMethodWithGivenArgs(AbstractAspectJAdvice.java:629)
at org.springframework.aop.aspectj.AbstractAspectJAdvice.invokeAdviceMethod(AbstractAspectJAdvice.java:618)
at org.springframework.aop.aspectj.AspectJAroundAdvice.invoke(AspectJAroundAdvice.java:70)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:168)
at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:92)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)
at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:656)
at com.test.config.messaging.MainframeMessageProducer$$EnhancerBySpringCGLIB$$dc5ae691.sendMessageToMainframeQueue(<generated>)
at com.test.posting.GDISPurchasePostServiceImpl.postMessage(GDISPurchasePostServiceImpl.java:315)
at com.test.posting.GDISPostingService.processMessage(GDISPostingService.java:56)
at com.test.posting.GDISPostingService$$FastClassBySpringCGLIB$$3c0c40af.invoke(<generated>)
at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:204)
at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:721)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:157)
at org.springframework.aop.aspectj.MethodInvocationProceedingJoinPoint.proceed(MethodInvocationProceedingJoinPoint.java:85)
at com.test.common.aop.CorrelationIdAspect.correlationIdAdvice(CorrelationIdAspect.java:40)
at sun.reflect.GeneratedMethodAccessor554.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.springframework.aop.aspectj.AbstractAspectJAdvice.invokeAdviceMethodWithGivenArgs(AbstractAspectJAdvice.java:629)
at org.springframework.aop.aspectj.AbstractAspectJAdvice.invokeAdviceMethod(AbstractAspectJAdvice.java:618)
at org.springframework.aop.aspectj.AspectJAroundAdvice.invoke(AspectJAroundAdvice.java:70)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:168)
at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:92)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)
at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:656)
at com.test.posting.GDISPostingService$$EnhancerBySpringCGLIB$$dd870248.processMessage(<generated>)
at com.test.posting.GDISPostingMessageListener.onMessage(GDISPostingMessageListener.java:167)
at org.springframework.jms.listener.AbstractMessageListenerContainer.doInvokeListener(AbstractMessageListenerContainer.java:746)
at org.springframework.jms.listener.AbstractMessageListenerContainer.invokeListener(AbstractMessageListenerContainer.java:684)
at org.springframework.jms.listener.AbstractMessageListenerContainer.doExecuteListener(AbstractMessageListenerContainer.java:651)
at org.springframework.jms.listener.AbstractPollingMessageListenerContainer.doReceiveAndExecute(AbstractPollingMessageListenerContainer.java:317)
at org.springframework.jms.listener.AbstractPollingMessageListenerContainer.receiveAndExecute(AbstractPollingMessageListenerContainer.java:255)
at org.springframework.jms.listener.DefaultMessageListenerContainer$AsyncMessageListenerInvoker.invokeListener(DefaultMessageListenerContainer.java:1166)
at org.springframework.jms.listener.DefaultMessageListenerContainer$AsyncMessageListenerInvoker.executeOngoingLoop(DefaultMessageListenerContainer.java:1158)
at org.springframework.jms.listener.DefaultMessageListenerContainer$AsyncMessageListenerInvoker.run(DefaultMessageListenerContainer.java:1055)
at java.lang.Thread.run(Thread.java:748)
I am confused here what leads to max open files error here :-
is it due to Hikari CP as this is first line we got as an error (java.io.FileNotFoundException: /opt/apache-tomcat-7.0.88/webapps/ROOT/WEB-INF/lib/HikariCP-2.5.1.jar (Too many open files) ) ?
Is it due to IBM-MQ connection not available ?
You could use the lsof command to LiSt Open Files.
It produces lots of output, so you'll want to capture and post process it ( eg grep or sort)
You can list open files by pid, userid, disk etc.
You might try lsof > before wait... lsof > after then use diff to see the differences
Related
I'm using flink mysql connector with a single executor of 32Gb RAM, 16vCPU with 32 slots. If I run a job with parallelism 32 (job parallelism 224) that is doing temporal lookup joins with 10 MySQL tables, it starts to fail after 2-3 successful runs with below error.
org.apache.flink.runtime.JobException: Recovery is suppressed by NoRestartBackoffTimeStrategy
at org.apache.flink.runtime.executiongraph.failover.flip1.ExecutionFailureHandler.handleFailure(ExecutionFailureHandler.java:138)
at org.apache.flink.runtime.executiongraph.failover.flip1.ExecutionFailureHandler.getFailureHandlingResult(ExecutionFailureHandler.java:82)
at org.apache.flink.runtime.scheduler.DefaultScheduler.handleTaskFailure(DefaultScheduler.java:228)
at org.apache.flink.runtime.scheduler.DefaultScheduler.maybeHandleTaskFailure(DefaultScheduler.java:218)
at org.apache.flink.runtime.scheduler.DefaultScheduler.updateTaskExecutionStateInternal(DefaultScheduler.java:209)
at org.apache.flink.runtime.scheduler.SchedulerBase.updateTaskExecutionState(SchedulerBase.java:679)
at org.apache.flink.runtime.scheduler.SchedulerNG.updateTaskExecutionState(SchedulerNG.java:79)
at org.apache.flink.runtime.jobmaster.JobMaster.updateTaskExecutionState(JobMaster.java:444)
at sun.reflect.GeneratedMethodAccessor62.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.lambda$handleRpcInvocation$1(AkkaRpcActor.java:316)
at org.apache.flink.runtime.concurrent.akka.ClassLoadingUtils.runWithContextClassLoader(ClassLoadingUtils.java:83)
at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcInvocation(AkkaRpcActor.java:314)
at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:217)
at org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:78)
at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:163)
at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:24)
at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:20)
at scala.PartialFunction.applyOrElse(PartialFunction.scala:123)
at scala.PartialFunction.applyOrElse$(PartialFunction.scala:122)
at akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:20)
at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:172)
at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:172)
at akka.actor.Actor.aroundReceive(Actor.scala:537)
at akka.actor.Actor.aroundReceive$(Actor.scala:535)
at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:220)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:580)
at akka.actor.ActorCell.invoke(ActorCell.scala:548)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:270)
at akka.dispatch.Mailbox.run(Mailbox.scala:231)
at akka.dispatch.Mailbox.exec(Mailbox.scala:243)
at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157)
Caused by: java.lang.IllegalArgumentException: open() failed.
at org.apache.flink.connector.jdbc.table.JdbcRowDataLookupFunction.open(JdbcRowDataLookupFunction.java:138)
at LookupFunction$55178.open(Unknown Source)
at org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:34)
at org.apache.flink.table.runtime.operators.join.lookup.LookupJoinRunner.open(LookupJoinRunner.java:67)
at org.apache.flink.table.runtime.operators.join.lookup.LookupJoinWithCalcRunner.open(LookupJoinWithCalcRunner.java:51)
at org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:34)
at org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.open(AbstractUdfStreamOperator.java:100)
at org.apache.flink.streaming.api.operators.ProcessOperator.open(ProcessOperator.java:56)
at org.apache.flink.streaming.runtime.tasks.RegularOperatorChain.initializeStateAndOpenOperators(RegularOperatorChain.java:110)
at org.apache.flink.streaming.runtime.tasks.StreamTask.restoreGates(StreamTask.java:711)
at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.call(StreamTaskActionExecutor.java:55)
at org.apache.flink.streaming.runtime.tasks.StreamTask.restoreInternal(StreamTask.java:687)
at org.apache.flink.streaming.runtime.tasks.StreamTask.restore(StreamTask.java:654)
at org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:958)
at org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:927)
at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:766)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:575)
at java.lang.Thread.run(Thread.java:748)
Caused by: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure
The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at com.mysql.jdbc.Util.handleNewInstance(Util.java:403)
at com.mysql.jdbc.SQLError.createCommunicationsException(SQLError.java:990)
at com.mysql.jdbc.MysqlIO.<init>(MysqlIO.java:335)
at com.mysql.jdbc.ConnectionImpl.coreConnect(ConnectionImpl.java:2187)
at com.mysql.jdbc.ConnectionImpl.connectOneTryOnly(ConnectionImpl.java:2220)
at com.mysql.jdbc.ConnectionImpl.createNewIO(ConnectionImpl.java:2015)
at com.mysql.jdbc.ConnectionImpl.<init>(ConnectionImpl.java:768)
at com.mysql.jdbc.JDBC4Connection.<init>(JDBC4Connection.java:47)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at com.mysql.jdbc.Util.handleNewInstance(Util.java:403)
at com.mysql.jdbc.ConnectionImpl.getInstance(ConnectionImpl.java:385)
at com.mysql.jdbc.NonRegisteringDriver.connect(NonRegisteringDriver.java:323)
at org.apache.flink.connector.jdbc.internal.connection.SimpleJdbcConnectionProvider.getOrEstablishConnection(SimpleJdbcConnectionProvider.java:121)
at org.apache.flink.connector.jdbc.table.JdbcRowDataLookupFunction.establishConnectionAndStatement(JdbcRowDataLookupFunction.java:211)
at org.apache.flink.connector.jdbc.table.JdbcRowDataLookupFunction.open(JdbcRowDataLookupFunction.java:129)
... 17 more
Caused by: java.net.SocketException: Too many open files
at java.net.Socket.createImpl(Socket.java:478)
at java.net.Socket.getImpl(Socket.java:538)
at java.net.Socket.setTcpNoDelay(Socket.java:998)
at com.mysql.jdbc.StandardSocketFactory.configureSocket(StandardSocketFactory.java:132)
at com.mysql.jdbc.StandardSocketFactory.connect(StandardSocketFactory.java:203)
at com.mysql.jdbc.MysqlIO.<init>(MysqlIO.java:299)
... 32 more
Did Some debugging, the process list on MySQL shows ~ 2* (total job parallelism) connections, i.e. 448 connections from Task Manager IP. The output of lsof | grep mysql-cj- | wc -l on task manager also reached to 12k from 3k. But after cancelling job, sometime this number doesn't go down. Am I missing something ?
The error is mainly because there are too many connections requesting mysql at the same time. Provide several optimization ideas for reference
Consider reducing the total concurrency of tasks
By default, lookup cache is not enabled. You can enable it by setting both lookup.cache.max-rows and lookup.cache.ttl, refer to https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/connectors/table/jdbc/
I have the next flow. I need to reverse engineer a txt file in ODI. The file it's for unix (lf line terminator), but when I try to reverse engineer the file I have the next error:
com.sunopsis.tools.core.exception.SnpsSimpleMessageException:
ODI-15116: There is not enough data in the target file to proceed with
the reverse-engineering. Please make sure the file has at least one
non-empty line and number of rows to skip is correct. at
com.sunopsis.dwg.dbobj.SnpTable.fileDataStoreReversePhase1(SnpTable.java:1059)
at
com.sunopsis.graphical.frame.edit.EditFrameSnpTable.doReverseFromTable(EditFrameSnpTable.java:804)
at
com.sunopsis.graphical.frame.edit.EditFrameSnpTable.jButtonReverseDelimiter_ActionPerformed(EditFrameSnpTable.java:3035)
at
com.sunopsis.graphical.frame.edit.EditFrameSnpTable$IvjEventHandler.actionPerformed(EditFrameSnpTable.java:354)
at
javax.swing.AbstractButton.fireActionPerformed(AbstractButton.java:2022)
at
javax.swing.AbstractButton$Handler.actionPerformed(AbstractButton.java:2348)
at
javax.swing.DefaultButtonModel.fireActionPerformed(DefaultButtonModel.java:402)
at
javax.swing.DefaultButtonModel.setPressed(DefaultButtonModel.java:259)
at
javax.swing.plaf.basic.BasicButtonListener.mouseReleased(BasicButtonListener.java:252)
at
java.awt.AWTEventMulticaster.mouseReleased(AWTEventMulticaster.java:290)
at java.awt.Component.processMouseEvent(Component.java:6539) at
javax.swing.JComponent.processMouseEvent(JComponent.java:3324) at
java.awt.Component.processEvent(Component.java:6304) at
java.awt.Container.processEvent(Container.java:2239) at
java.awt.Component.dispatchEventImpl(Component.java:4889) at
java.awt.Container.dispatchEventImpl(Container.java:2297) at
java.awt.Component.dispatchEvent(Component.java:4711) at
java.awt.LightweightDispatcher.retargetMouseEvent(Container.java:4904)
at
java.awt.LightweightDispatcher.processMouseEvent(Container.java:4535)
at java.awt.LightweightDispatcher.dispatchEvent(Container.java:4476)
at java.awt.Container.dispatchEventImpl(Container.java:2283) at
java.awt.Window.dispatchEventImpl(Window.java:2746) at
java.awt.Component.dispatchEvent(Component.java:4711) at
java.awt.EventQueue.dispatchEventImpl(EventQueue.java:760) at
java.awt.EventQueue.access$500(EventQueue.java:97) at
java.awt.EventQueue$3.run(EventQueue.java:709) at
java.awt.EventQueue$3.run(EventQueue.java:703) at
java.security.AccessController.doPrivileged(Native Method) at
java.security.ProtectionDomain$JavaSecurityAccessImpl.doIntersectionPrivilege(ProtectionDomain.java:74)
at
java.security.ProtectionDomain$JavaSecurityAccessImpl.doIntersectionPrivilege(ProtectionDomain.java:84)
at java.awt.EventQueue$4.run(EventQueue.java:733) at
java.awt.EventQueue$4.run(EventQueue.java:731) at
java.security.AccessController.doPrivileged(Native Method) at
java.security.ProtectionDomain$JavaSecurityAccessImpl.doIntersectionPrivilege(ProtectionDomain.java:74)
at java.awt.EventQueue.dispatchEvent(EventQueue.java:730) at
oracle.javatools.internal.ui.EventQueueWrapper._dispatchEvent(EventQueueWrapper.java:169)
at
oracle.javatools.internal.ui.EventQueueWrapper.dispatchEvent(EventQueueWrapper.java:151)
at
java.awt.EventDispatchThread.pumpOneEventForFilters(EventDispatchThread.java:205)
at
java.awt.EventDispatchThread.pumpEventsForFilter(EventDispatchThread.java:116)
at
java.awt.EventDispatchThread.pumpEventsForHierarchy(EventDispatchThread.java:105)
at
java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:101)
at
java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:93)
at java.awt.EventDispatchThread.run(EventDispatchThread.java:82)
After I open the file in notepad and do the next steps (change file from linux (LF) to windows (CRLF)): notepad ++ > edit > EOL Conversion Windows) it works.
Does someone knows why it works to reverse only windows files, even if ODI is installed on linux?
I am trying to schedule a job using a .bat file. When i run it it gets stuck on start of job execution. Does anyone know how I can fix this issue so I can run my jobs?
Does this have anything to do with Java? I do not have the variable PENTAHO_JAVA_HOME. I only have JAVA_HOME. Not sure if it has to do with my kitchen.bat file. I only using Spoon.bat.
I think its a problem with my Kitchen.bat (.kjb) because if i run the batch file with Pan.bat (.ktr) my file runs fine.
C:\Users\bxt0\Desktop>c:
C:\Users\bxt\Desktop>cd /d "C:\data-integration"
C:\data-integration>call Kitchen.bat
/file:C:\Users\bxt\Desktop\CCMStatsJob.
kjb "-param:TABLE_NAME=region" -logfile=C:\Users\bxt058y\Documents\Pentaho
Jobs\
ccmjob.txt
DEBUG: Using JAVA_HOME
DEBUG: _PENTAHO_JAVA_HOME=C:\Program Files\Java\jre1.8.0_74
DEBUG: _PENTAHO_JAVA=C:\Program Files\Java\jre1.8.0_74\bin\java.exe
C:\data-integration>"C:\Program Files\Java\jre1.8.0_74\bin\java.exe" "-
Xms1024m
" "-Xmx2048m" "-XX:MaxPermSize=256m" "-
Dhttps.protocols=TLSv1,TLSv1.1,TLSv1.2" "
-Djava.library.path=libswt\win64" "-DKETTLE_HOME=" "-DKETTLE_REPOSITORY=" "-
DKET
TLE_USER=" "-DKETTLE_PASSWORD=" "-DKETTLE_PLUGIN_PACKAGES=" "-
DKETTLE_LOG_SIZE_L
IMIT=" "-DKETTLE_JNDI_ROOT=" -jar launcher\pentaho-application-launcher-
7.1.0.0-
12.jar -lib ..\libswt\win64 -main org.pentaho.di.kitchen.Kitchen -initialDir
"C
:\data-integration"\ /file:C:\Users\bxt058y\Desktop\CCMStatsJob.kjb "-
param:TABL
E_NAME=region" -logfile C:\Users\bxt058y\Documents\Pentaho Jobs\ccmjob.txt
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=256m;
sup
port was removed in 8.0
log4j:ERROR Could not create an Appender. Reported error follows.
java.lang.ClassCastException: org.apache.log4j.ConsoleAppender cannot be
cast to
org.apache.log4j.Appender
at org.apache.log4j.xml.DOMConfigurator.parseAppender(DOMConfigurator.ja
va:248)
at org.apache.log4j.xml.DOMConfigurator.findAppenderByName(DOMConfigurat
or.java:176)
at org.apache.log4j.xml.DOMConfigurator.findAppenderByReference(DOMConfi
gurator.java:191)
at org.apache.log4j.xml.DOMConfigurator.parseChildrenOfLoggerElement(DOM
Configurator.java:523)
at org.apache.log4j.xml.DOMConfigurator.parseCategory(DOMConfigurator.ja
va:436)
at org.apache.log4j.xml.DOMConfigurator.parse(DOMConfigurator.java:1004)
at org.apache.log4j.xml.DOMConfigurator.doConfigure(DOMConfigurator.java
:872)
at org.apache.log4j.xml.DOMConfigurator.doConfigure(DOMConfigurator.java
:755)
at org.apache.log4j.xml.DOMConfigurator.configure(DOMConfigurator.java:8
96)
at org.pentaho.di.core.logging.log4j.Log4jLogging.applyLog4jConfiguratio
n(Log4jLogging.java:81)
at org.pentaho.di.core.logging.log4j.Log4jLogging.createLogger(Log4jLogg
ing.java:89)
at org.pentaho.di.core.logging.log4j.Log4jLogging.init(Log4jLogging.java
:68)
at org.pentaho.di.core.KettleClientEnvironment.initLogginPlugins(KettleC
lientEnvironment.java:155)
at org.pentaho.di.core.KettleClientEnvironment.init(KettleClientEnvironm
ent.java:118)
at org.pentaho.di.core.KettleClientEnvironment.init(KettleClientEnvironm
ent.java:79)
at org.pentaho.di.kitchen.Kitchen$1.call(Kitchen.java:91)
at org.pentaho.di.kitchen.Kitchen$1.call(Kitchen.java:84)
at java.util.concurrent.FutureTask.run(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
log4j:ERROR Could not create an Appender. Reported error follows.
java.lang.ClassCastException: org.apache.log4j.ConsoleAppender cannot be
cast to
org.apache.log4j.Appender
at org.apache.log4j.xml.DOMConfigurator.parseAppender(DOMConfigurator.ja
va:248)
at org.apache.log4j.xml.DOMConfigurator.findAppenderByName(DOMConfigurat
or.java:176)
at org.apache.log4j.xml.DOMConfigurator.findAppenderByReference(DOMConfi
gurator.java:191)
at org.apache.log4j.xml.DOMConfigurator.parseChildrenOfLoggerElement(DOM
Configurator.java:523)
at org.apache.log4j.xml.DOMConfigurator.parseRoot(DOMConfigurator.java:4
92)
at org.apache.log4j.xml.DOMConfigurator.parse(DOMConfigurator.java:1006)
at org.apache.log4j.xml.DOMConfigurator.doConfigure(DOMConfigurator.java
:872)
at org.apache.log4j.xml.DOMConfigurator.doConfigure(DOMConfigurator.java
:755)
at org.apache.log4j.xml.DOMConfigurator.configure(DOMConfigurator.java:8
96)
at org.pentaho.di.core.logging.log4j.Log4jLogging.applyLog4jConfiguratio
n(Log4jLogging.java:81)
at org.pentaho.di.core.logging.log4j.Log4jLogging.createLogger(Log4jLogg
ing.java:89)
at org.pentaho.di.core.logging.log4j.Log4jLogging.init(Log4jLogging.java
:68)
at org.pentaho.di.core.KettleClientEnvironment.initLogginPlugins(KettleC
lientEnvironment.java:155)
at org.pentaho.di.core.KettleClientEnvironment.init(KettleClientEnvironm
ent.java:118)
at org.pentaho.di.core.KettleClientEnvironment.init(KettleClientEnvironm
ent.java:79)
at org.pentaho.di.kitchen.Kitchen$1.call(Kitchen.java:91)
at org.pentaho.di.kitchen.Kitchen$1.call(Kitchen.java:84)
at java.util.concurrent.FutureTask.run(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
2018/06/28 09:35:32 - Kitchen - Start of run.
2018/06/28 09:35:34 - CCMStatsJob - Start of job execution
2018/06/28 09:35:34 - CCMStatsJob - CCMStatsJob
I think its a problem with my Kitchen.bat (.kjb) because if i run the batch file with Pan.bat (.ktr) my file runs fine.
Windows does not like white spaces in file names (linux neither). Correct the -logfile.
The parameters must not be between quotes. Remove.
In Window the arguments are introduced by "/" not by "-" (as in linux). Change.
The command call executes the process in the background, so you will see nothing, no log not even a signal that the process has stopped. To check it did run correctly you need to type the logfile.
kitchen.bat /file:C:\Users\bxt058y\Desktop\CCMStatsJob.kjb /param:TABLE_NAME=region /logfile:"C:\Users\bxt058y\Documents\Pentaho Jobs\ccmjob.txt"
i just configure my flink cluster (1.0.3 version) and i whanted to run tersort with the flink framework, but its me returned the next erreor, i used this comamnde to run my programm:
/bin/flink run -c --class es.udc.gac.flinkbench.ScalaTeraSort flinkbench-assembly-1.0.jar hdfs://192.168.3.89:8020/teragen/10mo hdfs://192.168.3.89:8020/teragen/rstflink
and its me return this :
Cluster configuration: Standalone cluster with JobManager at localhost/127.0.0.1:6123
Using address localhost:6123 to connect to JobManager.
JobManager web interface address http://localhost:8081
Starting execution of program
2017-06-23 10:28:43,692 INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 2
Spent 2278ms computing base-splits.
Spent 2ms computing TeraScheduler splits.
Computing input splits took 2281ms
Sampling 2 splits of 2
Making -1 from 100000 sampled records
------------------------------------------------------------
The program finished with the following exception:
org.apache.flink.client.program.ProgramInvocationException: The main method caused an error.
at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:545)
at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:419)
at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:339)
at org.apache.flink.client.CliFrontend.executeProgram(CliFrontend.java:831)
at org.apache.flink.client.CliFrontend.run(CliFrontend.java:256)
at org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:1073)
at org.apache.flink.client.CliFrontend$2.call(CliFrontend.java:1120)
at org.apache.flink.client.CliFrontend$2.call(CliFrontend.java:1117)
at org.apache.flink.runtime.security.HadoopSecurityContext$1.run(HadoopSecurityContext.java:43)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1656)
at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:40)
at org.apache.flink.client.CliFrontend.main(CliFrontend.java:1116)
Caused by: java.lang.NegativeArraySizeException
at es.udc.gac.flinkbench.terasort.TeraInputFormat$TextSampler.createPartitions(TeraInputFormat.java:103)
at es.udc.gac.flinkbench.terasort.TeraInputFormat.writePartitionFile(TeraInputFormat.java:183)
at es.udc.gac.flinkbench.ScalaTeraSort$.main(ScalaTeraSort.scala:49)
at es.udc.gac.flinkbench.ScalaTeraSort.main(ScalaTeraSort.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:528)
... 13 more
I have a very simple ftp route that should recursively download files from an URL. It would be very important to get it running still today.
There are no proxies, no extra authentication, no firewall. However, it only downloads the first file and then the socket will be closed. I have experimented with the different timeouts but they did not solve the issue. Either error code 226 is retuned if I don't use any timeout or extra config option or if I use the commented out things, error code 221 is returned. The 226 does not seem to be an error as it just counts as an indication that the server completed the transfer. Stack traces I copied below. I would appreciate the responses and thanks also in advance.
The route I am using in pollEnrich as I have to start it depending on a timer.
The code:
[...]
from("direct:ftp").routeId("ftp")
.log("### FTP is in progress ")
.pollEnrich().simple(ConfigData.getConfigData().getFtpUrl()
+ "?binary=true&"
+ "recursive=true"
/* + "soTimeout=300000&"
+ "stepwise=true&"
+ "ignoreFileNotFoundOrPermissionError=true&"
+ "ftpClient.dataTimeout=30000&"
+ "disconnect=true&"
+ "consumer.delay=1000" */
)
.to("file:modelFiles")
.end();
[...]
Update 1
I removed pollEnrich() and without it it works fine. However, I cannot start it from another route e.g. from a timer. So this is a quick hack to get FTP running. I also copy the code here as I did it with an idempotent consumer, so that only those files will be downloaded that are not yet on the disk. It can maybe be useful also to other people. The idempotent consumer example you can find here (Apache Camel ftp consumer loads the same files again and again) for more information.
Any further comments?
from(ConfigData.getConfigData().getFtpUrl()
+ "?binary=true&"
+ "recursive=true&"
+ "passiveMode=true&"
+ "ftpClient.bufferSize=10000000&"
+ "localWorkDirectory=" + ConfigData.getConfigData().getLocalTmpDirectory())
.idempotentConsumer(header("CamelFileName"), FileIdempotentRepository.fileIdempotentRepository(new File("data", "repo.dat")))
.to("file:modelFiles")
.log("Downloaded file ${file:name} complete.")
.end();
Stack Trace
2016-03-19 15:27:53,921 [WARN|org.apache.camel.component.file.remote.FtpConsumer|MarkerIgnoringBase] Error processing file RemoteFile[2009-03-25/BioModels_Database-r13-sbml_files.tar.gz] due to File operation failed: null socket closed. Code: 221. Caused by: [org.apache.camel.component.file.GenericFileOperationFailedException - File operation failed: null socket closed. Code: 221]
org.apache.camel.component.file.GenericFileOperationFailedException: File **operation failed: null socket closed. Code: 221**
[...]
Caused by: java.net.SocketException: socket closed
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(Unknown Source)
at java.net.SocketInputStream.read(Unknown Source)
at sun.nio.cs.StreamDecoder.readBytes(Unknown Source)
at sun.nio.cs.StreamDecoder.implRead(Unknown Source)
at sun.nio.cs.StreamDecoder.read(Unknown Source)
at java.io.InputStreamReader.read(Unknown Source)
at java.io.BufferedReader.fill(Unknown Source)
at java.io.BufferedReader.read(Unknown Source)
at org.apache.commons.net.io.CRLFLineReader.readLine(CRLFLineReader.java:58)
at org.apache.commons.net.ftp.FTP.__getReply(FTP.java:314)
at org.apache.commons.net.ftp.FTP.__getReply(FTP.java:294)
at org.apache.commons.net.ftp.FTP.sendCommand(FTP.java:483)
at org.apache.commons.net.ftp.FTP.sendCommand(FTP.java:608)
at org.apache.commons.net.ftp.FTP.cwd(FTP.java:828)
at org.apache.commons.net.ftp.FTPClient.changeWorkingDirectory(FTPClient.java:1128)
at org.apache.camel.component.file.remote.FtpOperations.doChangeDirectory(FtpOperations.java:769)
2016-03-19 12:42:16,068 [WARN|org.apache.camel.component.file.remote.FtpConsumer|MarkerIgnoringBase] Error processing file RemoteFile[2008-12-03/BioModels_Database-r12-sbml_files.tar.gz] due to File operation failed: null Socket Closed. Code: 226. Caused by: [org.apache.camel.component.file.GenericFileOperationFailedException - **File operation failed: null Socket Closed. Code: 226]**
[...]
Caused by: java.net.SocketException: Socket Closed
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(Unknown Source)
at java.net.SocketInputStream.read(Unknown Source)
at sun.nio.cs.StreamDecoder.readBytes(Unknown Source)
at sun.nio.cs.StreamDecoder.implRead(Unknown Source)
at sun.nio.cs.StreamDecoder.read(Unknown Source)
at java.io.InputStreamReader.read(Unknown Source)
at java.io.BufferedReader.fill(Unknown Source)
at java.io.BufferedReader.read(Unknown Source)
at org.apache.commons.net.io.CRLFLineReader.readLine(CRLFLineReader.java:58)
at org.apache.commons.net.ftp.FTP.__getReply(FTP.java:314)
at org.apache.commons.net.ftp.FTP.__getReply(FTP.java:294)
at org.apache.commons.net.ftp.FTP.getReply(FTP.java:692)
at org.apache.commons.net.ftp.FTPClient.completePendingCommand(FTPClient.java:1813)
at org.apache.commons.net.ftp.FTPClient._retrieveFile(FTPClient.java:1885)
at org.apache.commons.net.ftp.FTPClient.retrieveFile(FTPClient.java:1845)
at org.apache.camel.component.file.remote.FtpOperations.retrieveFileToStreamInBody(FtpOperations.java:367)
Solution:
As added in Upadete 1, I do not use pollEnrich(). The route cannot be started from a timer now but it works. So I will close the question. I really like the idea of the idempotent consumer (unrelated to the original question).
Don't forget that a pollEnrich delete the oldExchange and keep the newExchange.
So you don't want to do a from("ftp").pollEnrich("timer") but a from("timer").pollEnrich("ftp").
See the Solution at the end of the question.