Flink 1.4 throws errors - apache-flink

I'm trying to migrate from Flink 1.3 to 1.4 and getting the exception below on a
Linux machine (it does not reproduce on Windows).
I've also imported this package:
// https://mvnrepository.com/artifact/org.apache.flink/flink-shaded-hadoop2
compile group: 'org.apache.flink', name: 'flink-shaded-hadoop2', version: '1.4.0'
Any help would be appreciated. This is what the Flink console shows:
TriggerWindow(TumblingProcessingTimeWindows(10000), ReducingStateDescriptor{serializer=org.apache.flink.api.java.typeutils.runtime.TupleSerializer@cb6c5dba, reduceFunction=com.clicktale.reducers.MetricsReducer@4e406694}, ProcessingTimeTrigger(), WindowedStream.reduce(WindowedStream.java:241)) -> Sink: Unnamed (1/1)
java.util.ServiceConfigurationError: org.apache.hadoop.fs.FileSystem: Provider org.apache.hadoop.fs.LocalFileSystem not a subtype
at java.util.ServiceLoader.fail(ServiceLoader.java:239)
at java.util.ServiceLoader.access$300(ServiceLoader.java:185)
at java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:376)
at java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:404)
at java.util.ServiceLoader$1.next(ServiceLoader.java:480)
at org.apache.hadoop.fs.FileSystem.loadFileSystems(FileSystem.java:2364)
at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2375)
at org.apache.flink.runtime.fs.hdfs.HadoopFsFactory.create(HadoopFsFactory.java:99)
at org.apache.flink.core.fs.FileSystem.getUnguardedFileSystem(FileSystem.java:401)
at org.apache.flink.streaming.connectors.fs.bucketing.BucketingSink.createHadoopFileSystem(BucketingSink.java:1154)
at org.apache.flink.streaming.connectors.fs.bucketing.BucketingSink.initFileSystem(BucketingSink.java:411)
at org.apache.flink.streaming.connectors.fs.bucketing.BucketingSink.initializeState(BucketingSink.java:355)
at org.apache.flink.streaming.util.functions.StreamingFunctionUtils.tryRestoreFunction(StreamingFunctionUtils.java:178)
at org.apache.flink.streaming.util.functions.StreamingFunctionUtils.restoreFunctionState(StreamingFunctionUtils.java:160)
at org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.initializeState(AbstractUdfStreamOperator.java:96)
at org.apache.flink.streaming.api.operators.AbstractStreamOperator.initializeState(AbstractStreamOperator.java:259)
at org.apache.flink.streaming.runtime.tasks.StreamTask.initializeOperators(StreamTask.java:694)
at org.apache.flink.streaming.runtime.tasks.StreamTask.initializeState(StreamTask.java:682)
at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:253)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:718)
at java.lang.Thread.run(Thread.java:748)

I faced similar (not this exact error, but dependency-related) issues migrating from 1.3 to 1.4.
In my case, I had to re-generate a fresh POM file using the Maven archetype and then add the needed dependencies one by one.
See the Java Quickstart or Scala Quickstart.
The reason is that there has been a major rework of the dependency structure. See the release notes for more information.
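For anyone hitting the same thing, re-generating the project looks roughly like this (a sketch; these are the standard Flink quickstart archetype coordinates, and the version should match your target Flink release):
mvn archetype:generate -DarchetypeGroupId=org.apache.flink -DarchetypeArtifactId=flink-quickstart-java -DarchetypeVersion=1.4.0
Then re-add your own dependencies to the generated pom.xml one at a time.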

Note that Flink 1.4 will load any Hadoop jars found via the "hadoop classpath" shell command, and these will be first on the classpath. So if you have an incompatible version of Hadoop installed that the "hadoop" command points at, you can run into this kind of problem.
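If you suspect this, you can inspect what that command would contribute before starting the cluster (just an illustration; both commands are read-only):
hadoop classpath
echo $HADOOP_CLASSPATH
The first prints the jars the local Hadoop installation would put first on Flink's classpath; the second shows any explicitly configured Hadoop classpath.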

Related

FlinkKafkaConsumer fails to read from an LZ4-compressed topic

We've got several flink applications reading from Kafka topics, and they work fine. But recently we've added a new topic to the existing flink job and it started failing immediately on startup with the following root error:
Caused by: org.apache.kafka.common.KafkaException: java.lang.NoClassDefFoundError: net/jpountz/lz4/LZ4Exception
at org.apache.kafka.common.record.CompressionType$4.wrapForInput(CompressionType.java:113)
at org.apache.kafka.common.record.DefaultRecordBatch.compressedIterator(DefaultRecordBatch.java:256)
at org.apache.kafka.common.record.DefaultRecordBatch.streamingIterator(DefaultRecordBatch.java:334)
at org.apache.kafka.clients.consumer.internals.Fetcher$PartitionRecords.nextFetchedRecord(Fetcher.java:1208)
at org.apache.kafka.clients.consumer.internals.Fetcher$PartitionRecords.fetchRecords(Fetcher.java:1245)
... 7 more
I found out that this topic uses lz4 compression and my guess is that Flink for some reason is unable to work with it. Adding lz4 dependencies directly to the app didn't work, and what's strange is that it runs fine locally but fails on the remote cluster.
The flink runtime version is 1.9.1, and we have the same version of all other dependencies in our application:
flink-streaming-java_2.11, flink-connector-kafka_2.11, flink-java and flink-clients_2.11
Could this be happening because Flink does not ship a dependency on the lz4 library internally?
Found the solution. No version upgrade was needed, nor were additional dependencies needed in the application itself. What worked for us was adding the lz4 library jar directly to the Flink lib folder in the Docker image. After that, the error with lz4 compression disappeared.
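For reference, the change amounted to something like this in the image build (a sketch; the lz4-java version and the /opt/flink path are assumptions, not the exact values from the original setup):
COPY lz4-java-1.6.0.jar /opt/flink/lib/
Flink puts everything in its lib folder on the classpath at startup, which is why this works without touching the application jar.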

Flink 1.9.1 No FileSystem for scheme "file" error when submit jobs to cluster

We recently upgraded our Flink cluster to version 1.9.1, and an error related to Hadoop s3a occurs. The message is below.
2020-01-16 08:39:49,283 ERROR org.apache.flink.runtime.blob.BlobServerConnection - PUT operation failed
org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.UnsupportedFileSystemException: No FileSystem for scheme "file"
at org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:3332)
at org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3352)
at org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:124)
at org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3403)
at org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3371)
at org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.FileSystem.get(FileSystem.java:477)
at org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.FileSystem.getLocal(FileSystem.java:433)
at org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.confChanged(LocalDirAllocator.java:301)
at org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:378)
at org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.createTmpFileForWrite(LocalDirAllocator.java:456)
at org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.LocalDirAllocator.createTmpFileForWrite(LocalDirAllocator.java:200)
at org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.s3a.S3AFileSystem.createTmpFileForWrite(S3AFileSystem.java:572)
at org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.s3a.S3ADataBlocks$DiskBlockFactory.create(S3ADataBlocks.java:811)
at org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.s3a.S3ABlockOutputStream.createBlockIfNeeded(S3ABlockOutputStream.java:190)
at org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.s3a.S3ABlockOutputStream.<init>(S3ABlockOutputStream.java:168)
at org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.s3a.S3AFileSystem.create(S3AFileSystem.java:778)
at org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1169)
at org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1149)
at org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1038)
at org.apache.flink.fs.s3.common.hadoop.HadoopFileSystem.create(HadoopFileSystem.java:141)
at org.apache.flink.fs.s3.common.hadoop.HadoopFileSystem.create(HadoopFileSystem.java:37)
at org.apache.flink.runtime.blob.FileSystemBlobStore.put(FileSystemBlobStore.java:73)
at org.apache.flink.runtime.blob.FileSystemBlobStore.put(FileSystemBlobStore.java:69)
at org.apache.flink.runtime.blob.BlobUtils.moveTempFileToStore(BlobUtils.java:444)
at org.apache.flink.runtime.blob.BlobServer.moveTempFileToStore(BlobServer.java:694)
at org.apache.flink.runtime.blob.BlobServerConnection.put(BlobServerConnection.java:351)
at org.apache.flink.runtime.blob.BlobServerConnection.run(BlobServerConnection.java:114)
I guess the s3 Hadoop filesystem is trying to create local files but cannot find the 'file' filesystem. Can anyone advise on the potential problem here?
Thanks
The plugin loader had a shortcoming in 1.9.0 and 1.9.1 that prevented plugins from lazily loading new classes. It's fixed in the upcoming 1.9.2 and 1.10 releases.
For the time being, you could simply add the jar to the lib folder as a workaround. Note, however, that in 1.10 you can only use s3 through plugins, so keep that in mind when you upgrade.
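Concretely, the workaround looks something like this from the Flink distribution root (a sketch; the jar name assumes a standard Flink 1.9.1 download with the bundled s3 filesystem under opt/):
cp opt/flink-s3-fs-hadoop-1.9.1.jar lib/
From 1.10 onwards, the same jar would instead go under plugins/ in its own subdirectory.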

Error while deploying flink application on EMR

I am getting this error when I deploy my flink application on EMR
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/flink/api/common/serialization/DeserializationSchema
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at org.apache.hadoop.util.RunJar.run(RunJar.java:232)
It works fine, however, when I deploy it on a local cluster. I am using Flink 1.9.0 on EMR version 5.28.0.
This issue can be caused by several different things. Things to check are:
A version mismatch between the Flink version in your dependencies and the Flink version on EMR.
The core dependencies of Flink should be marked as `provided`, so that they do not clash with the dependencies that are already available on the cluster (see the pom.xml sketch below).
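As an illustration of the second point (a sketch assuming a Maven build and Flink 1.9.0; apply the same scope to whichever core artifacts you depend on):
<!-- core Flink APIs are supplied by the cluster at runtime, so do not bundle them -->
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-streaming-java_2.11</artifactId>
    <version>1.9.0</version>
    <scope>provided</scope>
</dependency>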
What is your JDK version? Is it possible that there is a problem with the environment? I think it is very likely that the JDK version does not match.

Installing Zeppelin on CentOS 6 with Lens gets a compilation error

I cloned Zeppelin from https://github.com/apache/incubator-zeppelin.git and built the project by running:
mvn clean package -Pspark-1.5 -Dhadoop.version=2.2.0 -Phadoop-2.2 -Ppyspark -DskipTests
but I always get a compilation error.
Most probably this indicates a dependency version mismatch, in this case with Apache Lens.
The best way is to try re-building Apache Zeppelin from the latest master, and if the issue persists, file an issue on the official project JIRA.

The App Engine Maven plugin 1.8.3 requires Maven version 3.1.0

Could someone compile this pom.xml from this tutorial:
https://code.google.com/p/appengine-maven-plugin/
I also tried this one, but I'm not familiar with the Maven invoker plugin configuration and settings.xml:
https://code.google.com/p/appengine-maven-plugin/source/browse/pom.xml
The error I'm getting is the following:
[ERROR] Failed to execute goal com.google.appengine:appengine-maven-plugin:1.8.3:devserver (default-cli) on project nerinorestaurante: The plugin com.google.appengine:appengine-maven-plugin:1.8.3 requires Maven version 3.1.0 -> [Help 1]
I think your intention is to use the appengine-maven-plugin?
If so, you need to use version 3.1 of Maven. Download it and install it.
It is all explained here: http://maven.apache.org/
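A quick sanity check before re-running the build (the 3.1.0 minimum comes from the error message above):
mvn -version
The output should report Apache Maven 3.1.0 or newer; if it still shows an older version, the new installation is not the one on your PATH.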
Appengine requires JDK 7 (not 8 or 9, for example).
Set JAVA_HOME, JDK_HOME environment variables. E.g.:
export JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.7.0_51.jdk/Contents/Home
