Flink: Could not find a suitable table factory for 'org.apache.flink.table.factories.DeserializationSchemaFactory' in the classpath - flink-streaming

I am using flink's table api, I receive data from kafka, then register it as
a table, then I use sql statement to process, and finally convert the result
back to a stream, write to a directory, the code looks like this:
def main(args: Array[String]): Unit = {
val sEnv = StreamExecutionEnvironment.getExecutionEnvironment
sEnv.setStreamTimeCharacteristic(TimeCharacteristic.EventTime)
val tEnv = TableEnvironment.getTableEnvironment(sEnv)
tEnv.connect(
new Kafka()
.version("0.11")
.topic("user")
.startFromEarliest()
.property("zookeeper.connect", "")
.property("bootstrap.servers", "")
)
.withFormat(
new Json()
.failOnMissingField(false)
.deriveSchema() //使用表的 schema
)
.withSchema(
new Schema()
.field("username_skey", Types.STRING)
)
.inAppendMode()
.registerTableSource("user")
val userTest: Table = tEnv.sqlQuery(
"""
select ** form ** join **"".stripMargin)
val endStream = tEnv.toRetractStream[Row](userTest)
endStream.writeAsText("/tmp/sqlres",WriteMode.OVERWRITE)
sEnv.execute("Test_New_Sign_Student")
}
I was successful in the local test, but when I submit the following command
in the cluster, I get the following error:
=======================================================
org.apache.flink.client.program.ProgramInvocationException: The main method
caused an error.
at
org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:546)
at
org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:421)
at
org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:426)
at
org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:804)
at
org.apache.flink.client.cli.CliFrontend.runProgram(CliFrontend.java:280)
at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:215)
at
org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:1044)
at
org.apache.flink.client.cli.CliFrontend.lambda$main$11(CliFrontend.java:1120)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1692)
at
org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
at
org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1120)
Caused by: org.apache.flink.table.api.NoMatchingTableFactoryException: Could
not find a suitable table factory for
'org.apache.flink.table.factories.DeserializationSchemaFactory' in
the classpath.
Reason: No factory implements
'org.apache.flink.table.factories.DeserializationSchemaFactory'.
The following properties are requested:
connector.properties.0.key=zookeeper.connect
....
schema.9.name=roles
schema.9.type=VARCHAR
update-mode=append
The following factories have been considered:
org.apache.flink.table.sources.CsvBatchTableSourceFactory
org.apache.flink.table.sources.CsvAppendTableSourceFactory
org.apache.flink.table.sinks.CsvBatchTableSinkFactory
org.apache.flink.table.sinks.CsvAppendTableSinkFactory
org.apache.flink.streaming.connectors.kafka.Kafka011TableSourceSinkFactory
at
org.apache.flink.table.factories.TableFactoryService$.filterByFactoryClass(TableFactoryService.scala:176)
at
org.apache.flink.table.factories.TableFactoryService$.findInternal(TableFactoryService.scala:125)
at
org.apache.flink.table.factories.TableFactoryService$.find(TableFactoryService.scala:100)
at
org.apache.flink.table.factories.TableFactoryService.find(TableFactoryService.scala)
at
org.apache.flink.streaming.connectors.kafka.KafkaTableSourceSinkFactoryBase.getDeserializationSchema(KafkaTableSourceSinkFactoryBase.java:259)
at
org.apache.flink.streaming.connectors.kafka.KafkaTableSourceSinkFactoryBase.createStreamTableSource(KafkaTableSourceSinkFactoryBase.java:144)
at
org.apache.flink.table.factories.TableFactoryUtil$.findAndCreateTableSource(TableFactoryUtil.scala:50)
at
org.apache.flink.table.descriptors.ConnectTableDescriptor.registerTableSource(ConnectTableDescriptor.scala:44)
at
org.clay.test.Test_New_Sign_Student$.main(Test_New_Sign_Student.scala:64)
at
org.clay.test.Test_New_Sign_Student.main(Test_New_Sign_Student.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at
org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:529)
===================================
Can someone tell me what caused this? I am very confused about this........

if you are using maven-shade-plugin, make sure SPI transformer is placed.
Flink uses java Service Provider to discover Source/Sink connector.
Without this transformer, you will 100% encoutner "org.apache.flink.table.api.NoMatchingTableFactoryException: Could
not find a suitable table factory", which happened on me.
https://ci.apache.org/projects/flink/flink-docs-release-1.9/dev/table/connect.html#update-mode
flink officially points out this, search "SPI" on this page
<transformers>
<transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
</transformers>

You have to add the JAR dependencies of the connectors (Kafka) and formats (JSON) that you are using to the classpath of your program, i.e., either build a fat JAR that includes them or provide them to the classpath of the Flink cluster by copying them in the ./lib folder.
Check the Flink documentation for links to download the respective dependencies.

I have the met the same problem, just add parameters --connector.type kafka when you run your application will solve this. see enter link description here

Related

Unable to use extended_choice_parameter.ExtendedChoiceParameterValue for security reasons in jenkins

please find the error while trying to pass the parameters during build -
java.lang.UnsupportedOperationException: Refusing to marshal com.cwctravel.hudson.plugins.extended_choice_parameter.ExtendedChoiceParameterValue for security reasons; see https://jenkins.io/redirect/class-filter/
at hudson.util.XStream2$BlacklistedTypesConverter.marshal(XStream2.java:541)
at com.thoughtworks.xstream.core.AbstractReferenceMarshaller.convert(AbstractReferenceMarshaller.java:69)
at com.thoughtworks.xstream.core.TreeMarshaller.convertAnother(TreeMarshaller.java:58)
at com.thoughtworks.xstream.core.TreeMarshaller.convertAnother(TreeMarshaller.java:43)
at com.thoughtworks.xstream.core.AbstractReferenceMarshaller$1.convertAnother(AbstractReferenceMarshaller.java:88)
at com.thoughtworks.xstream.converters.collections.AbstractCollectionConverter.writeItem(AbstractCollectionConverter.java:64)
at com.thoughtworks.xstream.converters.collections.CollectionConverter.marshal(CollectionConverter.java:74)
at com.thoughtworks.xstream.core.AbstractReferenceMarshaller.convert(AbstractReferenceMarshaller.java:69)
at com.thoughtworks.xstream.core.TreeMarshaller.convertAnother(TreeMarshaller.java:58)
at com.thoughtworks.xstream.core.AbstractReferenceMarshaller$1.convertAnother(AbstractReferenceMarshaller.java:84)
at hudson.util.RobustReflectionConverter.marshallField(RobustReflectionConverter.java:264)
at hudson.util.RobustReflectionConverter$2.writeField(RobustReflectionConverter.java:251)
Caused: java.lang.RuntimeException: Failed to serialize hudson.model.ParametersAction#parameters for class hudson.model.ParametersAction
at hudson.util.RobustReflectionConverter$2.writeField(RobustReflectionConverter.java:255)
at hudson.util.RobustReflectionConverter$2.visit(RobustReflectionConverter.java:223)
at com.thoughtworks.xstream.converters.reflection.PureJavaReflectionProvider.visitSerializableFields(PureJavaReflectionProvider.java:138)
at hudson.util.RobustReflectionConverter.doMarshal(RobustReflectionConverter.java:209)
at hudson.util.RobustReflectionConverter.marshal(RobustReflectionConverter.java:150)
at com.thoughtworks.xstream.core.AbstractReferenceMarshaller.convert(AbstractReferenceMarshaller.java:69)
at com.thoughtworks.xstream.core.TreeMarshaller.convertAnother(TreeMarshaller.java:58)
at com.thoughtworks.xstream.core.TreeMarshaller.convertAnother(TreeMarshaller.java:43)
at com.thoughtworks.xstream.core.AbstractReferenceMarshaller$1.convertAnother(AbstractReferenceMarshaller.java:88)
at com.thoughtworks.xstream.converters.collections.AbstractCollectionConverter.writeItem(AbstractCollectionConverter.java:64)
at com.thoughtworks.xstream.converters.collections.CollectionConverter.marshal(CollectionConverter.java:74)
at com.thoughtworks.xstream.core.AbstractReferenceMarshaller.convert(AbstractReferenceMarshaller.java:69)
at com.thoughtworks.xstream.core.TreeMarshaller.convertAnother(TreeMarshaller.java:58)
at com.thoughtworks.xstream.core.AbstractReferenceMarshaller$1.convertAnother(AbstractReferenceMarshaller.java:84)
at hudson.util.RobustReflectionConverter.marshallField(RobustReflectionConverter.java:264)
at hudson.util.RobustReflectionConverter$2.writeField(RobustReflectionConverter.java:251)
Caused: java.lang.RuntimeException: Failed to serialize hudson.model.Actionable#actions for class org.jenkinsci.plugins.workflow.job.WorkflowRun
at hudson.util.RobustReflectionConverter$2.writeField(RobustReflectionConverter.java:255)
at hudson.util.RobustReflectionConverter$2.visit(RobustReflectionConverter.java:223)
at com.thoughtworks.xstream.converters.reflection.PureJavaReflectionProvider.visitSerializableFields(PureJavaReflectionProvider.java:138)
at hudson.util.RobustReflectionConverter.doMarshal(RobustReflectionConverter.java:209)
at hudson.util.RobustReflectionConverter.marshal(RobustReflectionConverter.java:150)
at com.thoughtworks.xstream.core.AbstractReferenceMarshaller.convert(AbstractReferenceMarshaller.java:69)
at com.thoughtworks.xstream.core.TreeMarshaller.convertAnother(TreeMarshaller.java:58)
at com.thoughtworks.xstream.core.TreeMarshaller.convertAnother(TreeMarshaller.java:43)
at com.thoughtworks.xstream.core.TreeMarshaller.start(TreeMarshaller.java:82)
at com.thoughtworks.xstream.core.AbstractTreeMarshallingStrategy.marshal(AbstractTreeMarshallingStrategy.java:37)
at com.thoughtworks.xstream.XStream.marshal(XStream.java:1026)
at com.thoughtworks.xstream.XStream.marshal(XStream.java:1015)
at com.thoughtworks.xstream.XStream.toXML(XStream.java:988)
at hudson.XmlFile.write(XmlFile.java:195)
at hudson.model.Run.save(Run.java:2077)
at org.jenkinsci.plugins.workflow.cps.EnvActionImpl.forRun(EnvActionImpl.java:136)
at org.jenkinsci.plugins.workflow.cps.EnvActionImpl$Binder.getValue(EnvActionImpl.java:149)
at org.jenkinsci.plugins.workflow.cps.EnvActionImpl$Binder.getValue(EnvActionImpl.java:142)
at org.jenkinsci.plugins.workflow.cps.CpsScript.getProperty(CpsScript.java:121)
at org.codehaus.groovy.runtime.InvokerHelper.getProperty(InvokerHelper.java:174)
at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.getProperty(ScriptBytecodeAdapter.java:456)
at org.kohsuke.groovy.sandbox.impl.Checker$7.call(Checker.java:355)
at org.kohsuke.groovy.sandbox.GroovyInterceptor.onGetProperty(GroovyInterceptor.java:68)
at org.jenkinsci.plugins.scriptsecurity.sandbox.groovy.SandboxInterceptor.onGetProperty(SandboxInterceptor.java:354)
at org.kohsuke.groovy.sandbox.impl.Checker$7.call(Checker.java:353)
at org.kohsuke.groovy.sandbox.impl.Checker.checkedGetProperty(Checker.java:357)
at org.kohsuke.groovy.sandbox.impl.Checker.checkedGetProperty(Checker.java:333)
at org.kohsuke.groovy.sandbox.impl.Checker.checkedGetProperty(Checker.java:333)
at com.cloudbees.groovy.cps.sandbox.SandboxInvoker.getProperty(SandboxInvoker.java:29)
at com.cloudbees.groovy.cps.impl.PropertyAccessBlock.rawGet(PropertyAccessBlock.java:20)
Caused: java.io.IOException
at hudson.XmlFile.write(XmlFile.java:202)
at hudson.model.Run.save(Run.java:2077)
at org.jenkinsci.plugins.workflow.cps.EnvActionImpl.forRun(EnvActionImpl.java:136)
at org.jenkinsci.plugins.workflow.cps.EnvActionImpl$Binder.getValue(EnvActionImpl.java:149)
at org.jenkinsci.plugins.workflow.cps.EnvActionImpl$Binder.getValue(EnvActionImpl.java:142)
at org.jenkinsci.plugins.workflow.cps.CpsScript.getProperty(CpsScript.java:121)
at org.codehaus.groovy.runtime.InvokerHelper.getProperty(InvokerHelper.java:174)
at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.getProperty(ScriptBytecodeAdapter.java:456)
at org.kohsuke.groovy.sandbox.impl.Checker$7.call(Checker.java:355)
at org.kohsuke.groovy.sandbox.GroovyInterceptor.onGetProperty(GroovyInterceptor.java:68)
at org.jenkinsci.plugins.scriptsecurity.sandbox.groovy.SandboxInterceptor.onGetProperty(SandboxInterceptor.java:354)
at org.kohsuke.groovy.sandbox.impl.Checker$7.call(Checker.java:353)
at org.kohsuke.groovy.sandbox.impl.Checker.checkedGetProperty(Checker.java:357)
at org.kohsuke.groovy.sandbox.impl.Checker.checkedGetProperty(Checker.java:333)
at org.kohsuke.groovy.sandbox.impl.Checker.checkedGetProperty(Checker.java:333)
at com.cloudbees.groovy.cps.sandbox.SandboxInvoker.getProperty(SandboxInvoker.java:29)
at com.cloudbees.groovy.cps.impl.PropertyAccessBlock.rawGet(PropertyAccessBlock.java:20)
at WorkflowScript.run(WorkflowScript:8)
at ___cps.transform___(Native Method)
at com.cloudbees.groovy.cps.impl.PropertyishBlock$ContinuationImpl.get(PropertyishBlock.java:74)
at com.cloudbees.groovy.cps.LValueBlock$GetAdapter.receive(LValueBlock.java:30)
at com.cloudbees.groovy.cps.impl.PropertyishBlock$ContinuationImpl.fixName(PropertyishBlock.java:66)
at jdk.internal.reflect.GeneratedMethodAccessor629.invoke(Unknown Source)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at com.cloudbees.groovy.cps.impl.ContinuationPtr$ContinuationImpl.receive(ContinuationPtr.java:72)
at com.cloudbees.groovy.cps.impl.ConstantBlock.eval(ConstantBlock.java:21)
at com.cloudbees.groovy.cps.Next.step(Next.java:83)
at com.cloudbees.groovy.cps.Continuable$1.call(Continuable.java:174)
at com.cloudbees.groovy.cps.Continuable$1.call(Continuable.java:163)
at org.codehaus.groovy.runtime.GroovyCategorySupport$ThreadCategoryInfo.use(GroovyCategorySupport.java:129)
at org.codehaus.groovy.runtime.GroovyCategorySupport.use(GroovyCategorySupport.java:268)
at com.cloudbees.groovy.cps.Continuable.run0(Continuable.java:163)
at org.jenkinsci.plugins.workflow.cps.SandboxContinuable.access$001(SandboxContinuable.java:19)
at org.jenkinsci.plugins.workflow.cps.SandboxContinuable$1.call(SandboxContinuable.java:35)
at org.jenkinsci.plugins.workflow.cps.SandboxContinuable$1.call(SandboxContinuable.java:32)
at org.jenkinsci.plugins.scriptsecurity.sandbox.groovy.GroovySandbox.runInSandbox(GroovySandbox.java:237)
at org.jenkinsci.plugins.workflow.cps.SandboxContinuable.run0(SandboxContinuable.java:32)
at org.jenkinsci.plugins.workflow.cps.CpsThread.runNextChunk(CpsThread.java:174)
at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup.run(CpsThreadGroup.java:331)
at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup.access$100(CpsThreadGroup.java:82)
at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup$2.call(CpsThreadGroup.java:243)
at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup$2.call(CpsThreadGroup.java:231)
at org.jenkinsci.plugins.workflow.cps.CpsVmExecutorService$2.call(CpsVmExecutorService.java:64)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at hudson.remoting.SingleLaneExecutorService$1.run(SingleLaneExecutorService.java:136)
at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
at jenkins.security.ImpersonatingExecutorService$1.run(ImpersonatingExecutorService.java:59)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:834)
Finished: FAILURE
We are stated to see security issue related to extended choice parameters in jenkins ,which gets json array as input. Please find the error . Can you please help on this. FYI There was no jenkins or plugins update done recently. This stared happening all of the sudden (2 weeks back) and all the "extended choice parameters" got deleted without any clue in jenkins build.
extended choice parameters VERSION - 0.78
jenkins VERSION - 2.263.1
Issue has been fixed by replacing the extended choice parameter jar file with the latest version which had fix to extended_choice_parameter.ExtendedChoiceParameterValue parameters. Also, make sure to White list "export JDK_JAVA_OPTIONS="Dhudson.remoting.ClassFilter=com.cwctravel.hudson.plugins.extended_choice_parameter.ExtendedChoiceParameterValue" in .bash_profile of your jenkins machine

flink org.apache.flink.table.api.NoMatchingTableFactoryException

I'm using flink table api, using kafka as input source and json as table schema. I got this error when submit my program:
```
The program finished with the following exception:
org.apache.flink.client.program.ProgramInvocationException: The main method caused an error.
at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:546)
at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:421)
at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:426)
at org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:804)
at org.apache.flink.client.cli.CliFrontend.runProgram(CliFrontend.java:280)
at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:215)
at org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:1044)
at org.apache.flink.client.cli.CliFrontend.lambda$main$11(CliFrontend.java:1120)
at org.apache.flink.runtime.security.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:30)
at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1120)
Caused by: org.apache.flink.table.api.NoMatchingTableFactoryException:
Could not find a suitable table factory for 'org.apache.flink.table.factories.StreamTableSourceFactory' in
the classpath.
Reason: No context matches.
The following properties are requested:
connector.properties.0.key=zookeeper.connect
connector.properties.0.value=localhost:2181
connector.properties.1.key=bootstrap.servers
connector.properties.1.value=localhost:9092
connector.property-version=1
connector.startup-mode=latest-offset
connector.topic=flink_table_test
connector.type=kafka
connector.version=0.10
format.fail-on-missing-field=true
format.json-schema={\t'type': 'object',\t'properties': {\t\t'uid': {\t\t\t'type': 'number'\t\t},\t\t'name': {\t\t\t'type': 'string'\t\t},\t\t'age': {\t\t\t'type': 'number'\t\t},\t\t'timestamp': {\t\t\t'type': 'long'\t\t}\t}}
format.property-version=1
format.type=json
schema.0.name=uid
schema.0.type=BIGINT
schema.1.name=name
schema.1.type=VARCHAR
schema.2.name=age
schema.2.type=INT
schema.3.name=ts
schema.3.type=TIMESTAMP
update-mode=append
The following factories have been considered:
org.apache.flink.formats.json.JsonRowFormatFactory
org.apache.flink.table.sources.CsvBatchTableSourceFactory
org.apache.flink.table.sources.CsvAppendTableSourceFactory
org.apache.flink.table.sinks.CsvBatchTableSinkFactory
org.apache.flink.table.sinks.CsvAppendTableSinkFactory
at org.apache.flink.table.factories.TableFactoryService$.filterByContext(TableFactoryService.scala:214)
at org.apache.flink.table.factories.TableFactoryService$.findInternal(TableFactoryService.scala:130)
at org.apache.flink.table.factories.TableFactoryService$.find(TableFactoryService.scala:81)
at org.apache.flink.table.factories.TableFactoryUtil$.findAndCreateTableSource(TableFactoryUtil.scala:49)
at org.apache.flink.table.descriptors.ConnectTableDescriptor.registerTableSource(ConnectTableDescriptor.scala:44)
at com.akulaku.data.flink.QueryTable$.main(QueryTable.scala:53)
at com.akulaku.data.flink.QueryTable.main(QueryTable.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:529)
... 9 more
```
I already add the flink-json-1.6.1.jar into flink lib path.
Could anyone helps?
Try adding the kafka connector sql jar (e.g. flink-connector-kafka-0.11_2.11-1.6.1-sql-jar.jar). You can find the whole list of connector dependencies here.

Loading user classes in flink RichFilterFunction using User code classloader

I am trying to use the flink's UserCodeClassLoader but since I am new to Flink I could not understand exactly how to use it.
Scenario:
In the open() method of RichFilterFunction(), I want to load an external jar.
To do so, I do the following in open():
#Override
public void open(Configuration parameters) throws Exception {
ClassLoader userClassLoader = getRuntimeContext().getUserCodeClassLoader();
URL url = userClassLoader.getResource("/tmp/rohit/FilterTest/FilterTest.jar");
klazz = userClassLoader.loadClass("FilterTest");
Constructor<?> ctor = klazz.getConstructor();
Object obj = ctor.newInstance(new Object[] {});
control = (MyRichFilterInterface)obj;
... etc
However, I am getting ClassNotFoundException:
Caused by: java.lang.ClassNotFoundException: FilterTest
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at org.apache.flink.runtime.execution.librarycache.FlinkUserCodeClassLoaders$ChildFirstClassLoader.loadClass(FlinkUserCodeClassLoaders.java:128)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at MyRichFilterFunction.open(MyRichFilterFunction.java:24)
at org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:36)
at org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.open(AbstractUdfStreamOperator.java:102)
at org.apache.flink.streaming.runtime.tasks.StreamTask.openAllOperators(StreamTask.java:393)
at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:254)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:718)
at java.lang.Thread.run(Thread.java:748)
My flink version is 1.4.0 & it is installed in /root/flink-1.4.0/
I have not changed any parameters in flink config specific to this issue.
If anyone knows what I am missing, that would be great!!!
I have used URLClassLoader earlier, but I am not sure how to use Flink's provided UserCodeClassLoader.

Spark - jdbc write fails in Yarn cluster mode but works in spark-shell

I am using Spark 1.6.2, Hadoop 2.6, Scala 2.10.5 and Java 1.7
I am using JDBC to read data from MSSQL and this works without any problem:
val hqlContext = new HiveContext(sc)
val url = "jdbc:sqlserver://1.1.1.1:1111;database=CIQOwnershipProcessing;user=OwnershipUser;password=Ownership123"
val driver = "com.microsoft.sqlserver.jdbc.SQLServerDriver";
val df1 = hqlContext.read.format("jdbc").options(
Map("url" -> url, "driver" -> driver,
"dbtable" -> "(select * from OwnershipStandardization_PositionSequence_tbl) as ps")).load()
And, while writing back dataframe to MSSQL, I am using the JDBC write as shown below. This works fine in Spark-shell but fails when I do spark-submit in Yarn-Cluster mode. What am I missing ?
val prop = new java.util.Properties
df1.write.mode("Overwrite").jdbc(url, "CIQOwnershipProcessing.dbo.df_sparkop",prop)
This is how my spark-submit command looks like. As you can see, I am passing the SQLJDBC jar path too. And, I have also specified the jdbc jar path in "spark.executor.extraClassPath" property in spark-defaults.conf on all nodes of the cluster. Since the JDBC read is working, I doubt if it has anything to do with the classpaths.
spark-submit --class com.spgmi.csd.OshpStdCarryOver --master yarn --deploy-mode cluster --conf spark.yarn.executor.memoryOverhead=2048 --num-executors 1 --executor-cores 2 --driver-memory 3g --executor-memory 8g --jars $SPARK_HOME/lib/datanucleus-api-jdo-3.2.6.jar,$SPARK_HOME/lib/datanucleus-core-3.2.10.jar,$SPARK_HOME/lib/datanucleus-rdbms-3.2.9.jar,/usr/share/java/sqljdbc_4.1/enu/sqljdbc41.jar --files $SPARK_HOME/conf/hive-site.xml $SPARK_HOME/lib/spark-poc2-17.1.0.jar
The error thrown in the Yarn-Cluster mode is:
17/01/05 10:21:31 ERROR yarn.ApplicationMaster: User class threw
exception: java.lang.InstantiationException:
org.apache.spark.sql.execution.datasources.jdbc.DriverWrapper
java.lang.InstantiationException:
org.apache.spark.sql.execution.datasources.jdbc.DriverWrapper
at java.lang.Class.newInstance(Class.java:368)
at org.apache.spark.sql.execution.datasources.jdbc.DriverRegistry$.register(DriverRegistry.scala:46)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$createConnectionFactory$2.apply(JdbcUtils.scala:53)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$createConnectionFactory$2.apply(JdbcUtils.scala:52)
at org.apache.spark.sql.DataFrameWriter.jdbc(DataFrameWriter.scala:278)
at com.spgmi.csd.OshpStdCarryOver$.main(SparkOshpStdCarryOver.scala:175)
at com.spgmi.csd.OshpStdCarryOver.main(SparkOshpStdCarryOver.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:558)
I was facing the same issue. I resolved it by setting connection property in prop.
val prop = new java.util.Properties
prop.setProperty("driver","com.mysql.jdbc.Driver")
now pass this prop in
df1.write.mode("Overwrite").jdbc(url, "CIQOwnershipProcessing.dbo.df_sparkop",prop)
Your problem feels very similar to SPARK-14204 and SPARK-14162 -- although that bug was supposed to be fixed in Spark 1.6.2 (?!)
With a Type 4 JDBC driver you should not have to explicitly mention the "driver" property; the JAR should automatically register the URL prefix that it supports (here jdbc:sqlserver:).
But because of the bug, the Spark JDBC module may not use that "registration" to find the driver that implicitly matches the URL.
In other words: for reading, you force the "driver" property and the connection works; for writing, you don't force it, and it does not work. Aha!

Eclipselink library java.lang.ClassNotFoundException

I'm trying to resolve the problem with ClassNotFoundException for 3 days, and I can't find the solution, so I'm asking for help, the problem happens when i try to use every method except find or findAll on EJB Entity Beans.
For example when I try to use method remove():
ctx = new InitialContext();
remote = (CategoriesRemote) ctx.lookup("CategoriesFacade/remote");
remote.remove(category);
I get not nice looking Exception:
Exception in thread "Thread-7" java.lang.RuntimeException: java.lang.ClassNotFoundException: org.eclipse.persistence.indirection.IndirectList
at org.jboss.aop.joinpoint.MethodInvocation.getArguments(MethodInvocation.java:318)
at org.jboss.ejb3.stateless.StatelessContainer.dynamicInvoke(StatelessContainer.java:386)
at org.jboss.ejb3.session.InvokableContextClassProxyHack._dynamicInvoke(InvokableContextClassProxyHack.java:53)
at org.jboss.aop.Dispatcher.invoke(Dispatcher.java:91)
at org.jboss.aspects.remoting.AOPRemotingInvocationHandler.invoke(AOPRemotingInvocationHandler.java:82)
at org.jboss.remoting.ServerInvoker.invoke(ServerInvoker.java:898)
at org.jboss.remoting.transport.socket.ServerThread.completeInvocation(ServerThread.java:791)
at org.jboss.remoting.transport.socket.ServerThread.processInvocation(ServerThread.java:744)
at org.jboss.remoting.transport.socket.ServerThread.dorun(ServerThread.java:548)
at org.jboss.remoting.transport.socket.ServerThread.run(ServerThread.java:234)
Caused by: java.lang.ClassNotFoundException: org.eclipse.persistence.indirection.IndirectList
at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:247)
at sun.rmi.server.LoaderHandler.loadClass(LoaderHandler.java:434)
at sun.rmi.server.LoaderHandler.loadClass(LoaderHandler.java:165)
at java.rmi.server.RMIClassLoader$2.loadClass(RMIClassLoader.java:620)
at org.jboss.system.JBossRMIClassLoader.loadClass(JBossRMIClassLoader.java:91)
at java.rmi.server.RMIClassLoader.loadClass(RMIClassLoader.java:247)
at sun.rmi.server.MarshalInputStream.resolveClass(MarshalInputStream.java:197)
at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1574)
at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1495)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1731)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1328)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1946)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1870)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1752)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1328)
at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1666)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1322)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:350)
at java.rmi.MarshalledObject.get(MarshalledObject.java:142)
at org.jboss.aop.joinpoint.MethodInvocation.getArguments(MethodInvocation.java:309)
at org.jboss.ejb3.stateless.StatelessContainer.dynamicInvoke(StatelessContainer.java:386)
at org.jboss.ejb3.session.InvokableContextClassProxyHack._dynamicInvoke(InvokableContextClassProxyHack.java:53)
at org.jboss.aop.Dispatcher.invoke(Dispatcher.java:91)
at org.jboss.aspects.remoting.AOPRemotingInvocationHandler.invoke(AOPRemotingInvocationHandler.java:82)
at org.jboss.remoting.ServerInvoker.invoke(ServerInvoker.java:898)
at org.jboss.remoting.transport.socket.ServerThread.completeInvocation(ServerThread.java:791)
at org.jboss.remoting.transport.socket.ServerThread.processInvocation(ServerThread.java:744)
at org.jboss.remoting.transport.socket.ServerThread.dorun(ServerThread.java:548)
at org.jboss.remoting.transport.socket.ServerThread.run(ServerThread.java:234)
at org.jboss.remoting.MicroRemoteClientInvoker.invoke(MicroRemoteClientInvoker.java:216)
at org.jboss.remoting.Client.invoke(Client.java:1961)
at org.jboss.remoting.Client.invoke(Client.java:804)
at org.jboss.aspects.remoting.InvokeRemoteInterceptor.invoke(InvokeRemoteInterceptor.java:60)
at org.jboss.aop.joinpoint.MethodInvocation.invokeNext(MethodInvocation.java:102)
at org.jboss.aspects.tx.ClientTxPropagationInterceptor.invoke(ClientTxPropagationInterceptor.java:61)
at org.jboss.aop.joinpoint.MethodInvocation.invokeNext(MethodInvocation.java:102)
at org.jboss.ejb3.security.client.SecurityClientInterceptor.invoke(SecurityClientInterceptor.java:65)
at org.jboss.aop.joinpoint.MethodInvocation.invokeNext(MethodInvocation.java:102)
at org.jboss.ejb3.remoting.IsLocalInterceptor.invoke(IsLocalInterceptor.java:77)
at org.jboss.aop.joinpoint.MethodInvocation.invokeNext(MethodInvocation.java:102)
at org.jboss.ejb3.async.impl.interceptor.AsynchronousClientInterceptor.invoke(AsynchronousClientInterceptor.java:143)
at org.jboss.aop.joinpoint.MethodInvocation.invokeNext(MethodInvocation.java:102)
at org.jboss.aspects.remoting.PojiProxy.invoke(PojiProxy.java:62)
at $Proxy9.invoke(Unknown Source)
at org.jboss.ejb3.proxy.impl.handler.session.SessionProxyInvocationHandlerBase.invoke(SessionProxyInvocationHandlerBase.java:185)
at $Proxy18.remove(Unknown Source)
at nowyskos.modules.cms.CategorySender.run(CategorySender.java:88)
at java.lang.Thread.run(Thread.java:662)
at org.jboss.aspects.remoting.InvokeRemoteInterceptor.invoke(InvokeRemoteInterceptor.java:72)
at org.jboss.aop.joinpoint.MethodInvocation.invokeNext(MethodInvocation.java:102)
at org.jboss.aspects.tx.ClientTxPropagationInterceptor.invoke(ClientTxPropagationInterceptor.java:61)
at org.jboss.aop.joinpoint.MethodInvocation.invokeNext(MethodInvocation.java:102)
at org.jboss.ejb3.security.client.SecurityClientInterceptor.invoke(SecurityClientInterceptor.java:65)
at org.jboss.aop.joinpoint.MethodInvocation.invokeNext(MethodInvocation.java:102)
at org.jboss.ejb3.remoting.IsLocalInterceptor.invoke(IsLocalInterceptor.java:77)
at org.jboss.aop.joinpoint.MethodInvocation.invokeNext(MethodInvocation.java:102)
at org.jboss.ejb3.async.impl.interceptor.AsynchronousClientInterceptor.invoke(AsynchronousClientInterceptor.java:143)
at org.jboss.aop.joinpoint.MethodInvocation.invokeNext(MethodInvocation.java:102)
at org.jboss.aspects.remoting.PojiProxy.invoke(PojiProxy.java:62)
at $Proxy9.invoke(Unknown Source)
at org.jboss.ejb3.proxy.impl.handler.session.SessionProxyInvocationHandlerBase.invoke(SessionProxyInvocationHandlerBase.java:185)
at $Proxy18.remove(Unknown Source)
at nowyskos.modules.cms.CategorySender.run(CategorySender.java:88)
at java.lang.Thread.run(Thread.java:662)
Previously I got others exceptions regarding no security manager, I resolved it adding
System.setSecuritymanager(new RMISecurityManager);
and adding client.all file to VM Options.
But I can't find the solution for this problem.
I think that in this case the problem is not because of ejb or security manager, but because worng classpath, but in my Netbeans project I added Eclipselink library, additionaly I set the global classpath with this folder containing jar file of eclipselink which have IndirectList class.
I'm getting this error in my client application. EJB Module works on JBoss Server.
Situation for me looks weird, I don't know why find or findAll methods works perfect and remove or create crashes.
I'm really exhaused because of the problem, I don't know how to move on, please help.
Well, according to what I see here, this class "org.eclipse.persistence.indirection.IndirectList" which is missing is present in glassfish distributions. So, I guess there might be a wrong distribution of eclipselink that you are using for your project that works with JBoss AS. A quick (but not be the right solution), would be to add one of the jars (from here) which best matches your environment to your classpath. My advice would be to double check the version, distribution of eclipselink.

Resources