Adding openllet 2.6.3 to a Maven project that includes ONT-API throws the following error:
Exception in thread "main" java.lang.NoSuchFieldError: TSV
at ru.avicomp.ontapi.OntFormat.<clinit>(OntFormat.java:61)
at ru.avicomp.ontapi.OntologyFactoryImpl$ONTLoaderImpl.guessFormat(OntologyFactoryImpl.java:752)
at ru.avicomp.ontapi.OntologyFactoryImpl$ONTLoaderImpl.getSupportedFormats(OntologyFactoryImpl.java:774)
at ru.avicomp.ontapi.OntologyFactoryImpl$ONTLoaderImpl.read(OntologyFactoryImpl.java:795)
at ru.avicomp.ontapi.OntologyFactoryImpl$ONTLoaderImpl.readGraph(OntologyFactoryImpl.java:725)
at ru.avicomp.ontapi.OntologyFactoryImpl$ONTLoaderImpl.loadGraph(OntologyFactoryImpl.java:580)
at ru.avicomp.ontapi.OntologyFactoryImpl$ONTLoaderImpl.load(OntologyFactoryImpl.java:286)
at ru.avicomp.ontapi.OntologyFactoryImpl.loadOWLOntology(OntologyFactoryImpl.java:109)
at ru.avicomp.ontapi.OntologyFactoryImpl.loadOWLOntology(OntologyFactoryImpl.java:58)
at ru.avicomp.ontapi.OntologyManagerImpl.load(OntologyManagerImpl.java:1678)
at ru.avicomp.ontapi.OntologyManagerImpl.load(OntologyManagerImpl.java:1644)
at ru.avicomp.ontapi.OntologyManagerImpl.loadOntologyFromOntologyDocument(OntologyManagerImpl.java:1587)
at ru.avicomp.ontapi.OntologyManager.loadOntologyFromOntologyDocument(OntologyManager.java:243)
at ru.avicomp.ontapi.OntologyManager.loadOntologyFromOntologyDocument(OntologyManager.java:259)
at ru.avicomp.ontapi.OntologyManager.loadOntologyFromOntologyDocument(OntologyManager.java:58)
The code tested is the following:
OWLOntologyManager manager = OntManagers.createONT();
OWLDataFactory factory = manager.getOWLDataFactory();
OWLOntology ontology = manager.loadOntologyFromOntologyDocument(
        new File("ontologies/E1G1.owl"));
My pom file contains the following dependencies: ontapi 1.1.0, jena-arq 3.6.0, openllet-pellint 2.6.3.
Ensure you only have one OWL-API version in the classpath. The stack trace implies there are at least two versions.
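If it is not obvious where the second copy comes from, running mvn dependency:tree shows which artifact drags it in, and the transitive copy can then be excluded. A minimal sketch, assuming the extra OWL-API arrives through the openllet dependency (the exact transitive groupId/artifactId may differ in your tree, so check dependency:tree first):

<dependency>
    <groupId>com.github.galigator.openllet</groupId>
    <artifactId>openllet-pellint</artifactId>
    <version>2.6.3</version>
    <exclusions>
        <!-- Assumed coordinates of the transitive OWL-API copy; verify against your dependency tree -->
        <exclusion>
            <groupId>net.sourceforge.owlapi</groupId>
            <artifactId>owlapi-distribution</artifactId>
        </exclusion>
    </exclusions>
</dependency>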
After upgrading maven-plugin-plugin from 3.6.0 to 3.6.4, I am getting the following exception while the build creates the maven-plugin descriptor:
Caused by: org.apache.maven.plugin.PluginExecutionException: Execution default-descriptor of goal org.apache.maven.plugins:maven-plugin-plugin:3.6.4:descriptor failed: syntax error #[60,84] in file:/xyz/Foo.java
at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo (DefaultBuildPluginManager.java:148)
...
Caused by: com.thoughtworks.qdox.parser.ParseException: syntax error #[60,84] in file:/xyz/Foo.java
at com.thoughtworks.qdox.parser.impl.Parser.yyerror (Parser.java:1963)
at com.thoughtworks.qdox.parser.impl.Parser.yyparse (Parser.java:2085)
at com.thoughtworks.qdox.parser.impl.Parser.parse (Parser.java:1944)
at com.thoughtworks.qdox.library.SourceLibrary.parse (SourceLibrary.java:232)
This is running with Maven 3.8.6.
The code compiles fine - it's only when running through the org.apache.maven.plugins:maven-plugin-plugin:3.6.4:descriptor goal that the error appears.
Turns out this was related to using a restricted identifier (var, yield, record) as a method parameter name. The underlying qdox parser seems to be more picky than the Java compiler in this regard.
Running Sonarlint on the file helped identify the issue. They provide a nice description in their java:S6213 rule.
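For illustration, a hypothetical minimal case (not the actual Foo.java): javac accepts a parameter named record, but the qdox source scanner used by the descriptor goal may reject it.

// Hypothetical example: compiles with javac, yet the restricted identifier used as a
// parameter name can trip qdox when maven-plugin-plugin scans the sources for the descriptor.
public class Foo {
    public void process(String record) {  // renaming the parameter (e.g. to "entry") avoids the parse error
        System.out.println(record);
    }
}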
When I upgrade my Flink Java app from 1.12.2 to 1.12.3, I get a new runtime error. I can strip my Flink app down to this two-liner:
package simple;

import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;

public class TableEnvOnly {
    public static void main(String[] args) throws Exception {
        final StreamExecutionEnvironment streamEnv = StreamExecutionEnvironment.getExecutionEnvironment();
        StreamTableEnvironment tableEnv = StreamTableEnvironment.create(streamEnv);
    }
}
This works and doesn't trigger any errors with Flink version 1.12.2. When I upgrade the Maven Flink dependencies to 1.12.3, the same simple app throws the error:
Exception in thread "main" java.lang.NoSuchMethodError: 'scala.collection.mutable.ArrayOps scala.Predef$.refArrayOps(java.lang.Object[])'
at org.apache.flink.table.planner.delegation.PlannerBase.<init>(PlannerBase.scala:118)
at org.apache.flink.table.planner.delegation.StreamPlanner.<init>(StreamPlanner.scala:47)
at org.apache.flink.table.planner.delegation.BlinkPlannerFactory.create(BlinkPlannerFactory.java:48)
at org.apache.flink.table.api.bridge.java.internal.StreamTableEnvironmentImpl.create(StreamTableEnvironmentImpl.java:143)
at org.apache.flink.table.api.bridge.java.StreamTableEnvironment.create(StreamTableEnvironment.java:113)
at org.apache.flink.table.api.bridge.java.StreamTableEnvironment.create(StreamTableEnvironment.java:85)
at simple.TableEnvOnly.main(TableEnvOnly.java:12)
FYI, I'm not using Scala directly. My Gradle dependencies are:
implementation("org.apache.flink:flink-table-planner-blink_2.12:1.12.3")
implementation("org.apache.flink:flink-clients_2.12:1.12.3")
implementation("org.apache.flink:flink-connector-kafka_2.12:1.12.3")
implementation("org.apache.flink:flink-connector-jdbc_2.12:1.12.3")
TL;DR: After upgrading to Flink 1.12.4 the problem magically disappears.
Details
After upgrading from Flink 1.12.2 to Flink 1.12.3, the following code stopped compiling:
import scala.collection.JavaConverters._
val input = new DataStream[String](env.fromCollection(Seq("a", "b", "c").asJava))
val res = input.map(_.toUpperCase)
The Scala compiler reports the error:
could not find implicit value for evidence parameter of type org.apache.flink.api.common.typeinfo.TypeInformation[String]
The version of scala-compiler and scala-library is 2.12.7 - exactly as used by Flink.
To overcome the compilation problem, we provide an implicit instance of TypeInformation:
implicit val typeInfo: TypeInformation[String] = TypeInformation.of(classOf[String])
Then the code compiles. Nevertheless, we face the runtime failure described above:
java.lang.NoSuchMethodError: scala.Predef$.refArrayOps([Ljava/lang/Object;)Lscala/collection/mutable/ArrayOps;
at org.apache.flink.api.scala.ClosureCleaner$.getSerializedLambda(ClosureCleaner.scala:184)
at org.apache.flink.api.scala.ClosureCleaner$.org$apache$flink$api$scala$ClosureCleaner$$clean(ClosureCleaner.scala:257)
at org.apache.flink.api.scala.ClosureCleaner$.clean(ClosureCleaner.scala:168)
at org.apache.flink.streaming.api.scala.StreamExecutionEnvironment.scalaClean(StreamExecutionEnvironment.scala:859)
at org.apache.flink.streaming.api.scala.DataStream.clean(DataStream.scala:1189)
at org.apache.flink.streaming.api.scala.DataStream.map(DataStream.scala:623)
As mentioned, the upgrade to Flink 1.12.4 helps - both the compilation and the runtime failures disappear.
My guess is that some Flink 1.12.3 jars were accidentally compiled against the wrong Scala version, and that the subsequent release 1.12.4 was compiled with the correct one.
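Applied to the Gradle dependencies from the question, the fix amounts to a version bump (a sketch, assuming you stay on the Scala 2.12 builds and keep the same connectors):

implementation("org.apache.flink:flink-table-planner-blink_2.12:1.12.4")
implementation("org.apache.flink:flink-clients_2.12:1.12.4")
implementation("org.apache.flink:flink-connector-kafka_2.12:1.12.4")
implementation("org.apache.flink:flink-connector-jdbc_2.12:1.12.4")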
I am using Flink's Table API. I receive data from Kafka, register it as a table, process it with a SQL statement, and finally convert the result back to a stream and write it to a directory. The code looks like this:
def main(args: Array[String]): Unit = {
  val sEnv = StreamExecutionEnvironment.getExecutionEnvironment
  sEnv.setStreamTimeCharacteristic(TimeCharacteristic.EventTime)
  val tEnv = TableEnvironment.getTableEnvironment(sEnv)

  tEnv.connect(
      new Kafka()
        .version("0.11")
        .topic("user")
        .startFromEarliest()
        .property("zookeeper.connect", "")
        .property("bootstrap.servers", "")
    )
    .withFormat(
      new Json()
        .failOnMissingField(false)
        .deriveSchema() // use the table's schema
    )
    .withSchema(
      new Schema()
        .field("username_skey", Types.STRING)
    )
    .inAppendMode()
    .registerTableSource("user")

  val userTest: Table = tEnv.sqlQuery(
    """
      select ** from ** join **""".stripMargin)
  val endStream = tEnv.toRetractStream[Row](userTest)
  endStream.writeAsText("/tmp/sqlres", WriteMode.OVERWRITE)
  sEnv.execute("Test_New_Sign_Student")
}
I was successful in the local test, but when I submit the job to the cluster, I get the following error:
=======================================================
org.apache.flink.client.program.ProgramInvocationException: The main method caused an error.
at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:546)
at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:421)
at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:426)
at org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:804)
at org.apache.flink.client.cli.CliFrontend.runProgram(CliFrontend.java:280)
at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:215)
at org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:1044)
at org.apache.flink.client.cli.CliFrontend.lambda$main$11(CliFrontend.java:1120)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1692)
at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1120)
Caused by: org.apache.flink.table.api.NoMatchingTableFactoryException: Could not find a suitable table factory for 'org.apache.flink.table.factories.DeserializationSchemaFactory' in the classpath.
Reason: No factory implements 'org.apache.flink.table.factories.DeserializationSchemaFactory'.
The following properties are requested:
connector.properties.0.key=zookeeper.connect
....
schema.9.name=roles
schema.9.type=VARCHAR
update-mode=append
The following factories have been considered:
org.apache.flink.table.sources.CsvBatchTableSourceFactory
org.apache.flink.table.sources.CsvAppendTableSourceFactory
org.apache.flink.table.sinks.CsvBatchTableSinkFactory
org.apache.flink.table.sinks.CsvAppendTableSinkFactory
org.apache.flink.streaming.connectors.kafka.Kafka011TableSourceSinkFactory
at org.apache.flink.table.factories.TableFactoryService$.filterByFactoryClass(TableFactoryService.scala:176)
at org.apache.flink.table.factories.TableFactoryService$.findInternal(TableFactoryService.scala:125)
at org.apache.flink.table.factories.TableFactoryService$.find(TableFactoryService.scala:100)
at org.apache.flink.table.factories.TableFactoryService.find(TableFactoryService.scala)
at org.apache.flink.streaming.connectors.kafka.KafkaTableSourceSinkFactoryBase.getDeserializationSchema(KafkaTableSourceSinkFactoryBase.java:259)
at org.apache.flink.streaming.connectors.kafka.KafkaTableSourceSinkFactoryBase.createStreamTableSource(KafkaTableSourceSinkFactoryBase.java:144)
at org.apache.flink.table.factories.TableFactoryUtil$.findAndCreateTableSource(TableFactoryUtil.scala:50)
at org.apache.flink.table.descriptors.ConnectTableDescriptor.registerTableSource(ConnectTableDescriptor.scala:44)
at org.clay.test.Test_New_Sign_Student$.main(Test_New_Sign_Student.scala:64)
at org.clay.test.Test_New_Sign_Student.main(Test_New_Sign_Student.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:529)
===================================
Can someone tell me what caused this? I am very confused.
If you are using the maven-shade-plugin, make sure the SPI transformer is in place.
Flink uses the Java Service Provider Interface (SPI) to discover source/sink connectors.
Without this transformer, you will 100% encounter "org.apache.flink.table.api.NoMatchingTableFactoryException: Could not find a suitable table factory", which is what happened to me.
Flink officially points this out; search for "SPI" on this page:
https://ci.apache.org/projects/flink/flink-docs-release-1.9/dev/table/connect.html#update-mode
<transformers>
    <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
</transformers>
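For context, this is roughly where that transformer sits inside the shade plugin configuration (a sketch; pick the plugin version that matches your build):

<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-shade-plugin</artifactId>
    <executions>
        <execution>
            <phase>package</phase>
            <goals>
                <goal>shade</goal>
            </goals>
            <configuration>
                <transformers>
                    <!-- Merges META-INF/services entries so Flink's TableFactory SPI files survive shading -->
                    <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
                </transformers>
            </configuration>
        </execution>
    </executions>
</plugin>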
You have to add the JAR dependencies of the connectors (Kafka) and formats (JSON) that you are using to the classpath of your program, i.e., either build a fat JAR that includes them or provide them to the classpath of the Flink cluster by copying them into its ./lib folder.
Check the Flink documentation for links to download the respective dependencies.
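As a sketch, for a job using the Kafka 0.11 connector together with the JSON format, the extra Maven dependencies would look roughly like this; the Scala suffix and the flink.version property are assumptions to adapt to your setup:

<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-connector-kafka-0.11_2.11</artifactId>
    <version>${flink.version}</version>
</dependency>
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-json</artifactId>
    <version>${flink.version}</version>
</dependency>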
I have met the same problem; adding the parameter --connector.type kafka when you run your application will solve it.
If I add the artifact com.knockdata:spark-highcharts:0.6.4 to Zeppelin, it gives the error org.apache.thrift.transport.TTransportException.
Even a simple example like this causes the error:
val x = Array(1,2,3,4)
val rdd = sc.parallelize(x)
The problem is definitely related to %spark as %md and %sh work. I have Spark version spark-2.1.0-bin-hadoop2.6.
There are no messages in the Spark logs. In zeppelin-interpreter-spark-root-(hostname).log it says:
com.google.gson.JsonSyntaxException: java.lang.IllegalStateException: Expected BEGIN_ARRAY but was BEGIN_OBJECT at line 1 column 2
at com.google.gson.Gson.fromJson(Gson.java:802)
at com.google.gson.Gson.fromJson(Gson.java:757)
at com.google.gson.Gson.fromJson(Gson.java:706)
at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer.convert(RemoteInterpreterServer.java:425)
org.apache.zeppelin.interpreter.InterpreterException: Exception in thread "main" java.lang.NoSuchMethodError: scala.Predef$.wrapRefArray([Ljava/lang/Object;)Lscala/collection/mutable/WrappedArray;
spark-highcharts:0.6.4 does not support zeppelin:0.7.2. spark-highcharts has a dependency which clearly states which Zeppelin version it targets, and it is not binary compatible. That is why the error is reported.
The version has been bumped to spark-highcharts:0.6.5 to support zeppelin:0.7.2 (spark:2.1).
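A sketch of pulling in the fixed artifact from a Zeppelin dependency paragraph (assuming the %dep interpreter is enabled; adding the artifact in the Spark interpreter settings works as well):

%dep
z.load("com.knockdata:spark-highcharts:0.6.5")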
I am using Spark 1.6.2, Hadoop 2.6, Scala 2.10.5 and Java 1.7
I am using JDBC to read data from MSSQL and this works without any problem:
val hqlContext = new HiveContext(sc)
val url = "jdbc:sqlserver://1.1.1.1:1111;database=CIQOwnershipProcessing;user=OwnershipUser;password=Ownership123"
val driver = "com.microsoft.sqlserver.jdbc.SQLServerDriver";
val df1 = hqlContext.read.format("jdbc").options(
Map("url" -> url, "driver" -> driver,
"dbtable" -> "(select * from OwnershipStandardization_PositionSequence_tbl) as ps")).load()
And, while writing the dataframe back to MSSQL, I am using the JDBC write as shown below. This works fine in spark-shell but fails when I do spark-submit in yarn-cluster mode. What am I missing?
val prop = new java.util.Properties
df1.write.mode("Overwrite").jdbc(url, "CIQOwnershipProcessing.dbo.df_sparkop",prop)
This is what my spark-submit command looks like. As you can see, I am passing the sqljdbc jar path too, and I have also specified the JDBC jar path in the "spark.executor.extraClassPath" property in spark-defaults.conf on all nodes of the cluster. Since the JDBC read is working, I doubt it has anything to do with the classpath.
spark-submit --class com.spgmi.csd.OshpStdCarryOver --master yarn --deploy-mode cluster --conf spark.yarn.executor.memoryOverhead=2048 --num-executors 1 --executor-cores 2 --driver-memory 3g --executor-memory 8g --jars $SPARK_HOME/lib/datanucleus-api-jdo-3.2.6.jar,$SPARK_HOME/lib/datanucleus-core-3.2.10.jar,$SPARK_HOME/lib/datanucleus-rdbms-3.2.9.jar,/usr/share/java/sqljdbc_4.1/enu/sqljdbc41.jar --files $SPARK_HOME/conf/hive-site.xml $SPARK_HOME/lib/spark-poc2-17.1.0.jar
The error thrown in the Yarn-Cluster mode is:
17/01/05 10:21:31 ERROR yarn.ApplicationMaster: User class threw exception: java.lang.InstantiationException: org.apache.spark.sql.execution.datasources.jdbc.DriverWrapper
java.lang.InstantiationException: org.apache.spark.sql.execution.datasources.jdbc.DriverWrapper
at java.lang.Class.newInstance(Class.java:368)
at org.apache.spark.sql.execution.datasources.jdbc.DriverRegistry$.register(DriverRegistry.scala:46)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$createConnectionFactory$2.apply(JdbcUtils.scala:53)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$createConnectionFactory$2.apply(JdbcUtils.scala:52)
at org.apache.spark.sql.DataFrameWriter.jdbc(DataFrameWriter.scala:278)
at com.spgmi.csd.OshpStdCarryOver$.main(SparkOshpStdCarryOver.scala:175)
at com.spgmi.csd.OshpStdCarryOver.main(SparkOshpStdCarryOver.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:558)
I was facing the same issue. I resolved it by setting the driver connection property in prop:
val prop = new java.util.Properties
prop.setProperty("driver","com.mysql.jdbc.Driver")
Now pass this prop in:
df1.write.mode("Overwrite").jdbc(url, "CIQOwnershipProcessing.dbo.df_sparkop",prop)
Your problem feels very similar to SPARK-14204 and SPARK-14162 -- although that bug was supposed to be fixed in Spark 1.6.2 (?!)
With a Type 4 JDBC driver you should not have to explicitly mention the "driver" property; the JAR should automatically register the URL prefix that it supports (here jdbc:sqlserver:).
But because of the bug, the Spark JDBC module may not use that "registration" to find the driver that implicitly matches the URL.
In other words: for reading, you force the "driver" property and the connection works; for writing, you don't force it, and it does not work. Aha!
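So a sketch of the workaround for the MSSQL case in the question, reusing the same url and driver class that already work for the read path, is to force the "driver" property on the write as well:

import java.util.Properties

// Force the driver class so Spark's JDBC DriverRegistry does not have to guess it from the URL
val prop = new Properties()
prop.setProperty("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver")

df1.write.mode("Overwrite").jdbc(url, "CIQOwnershipProcessing.dbo.df_sparkop", prop)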