The RemoteEnvironment cannot be used when submitting a program through a client, or running in a TestEnvironment context - apache-flink

I was trying to execute the Apache Beam word count with Kafka as input and output. But on submitting the jar to the Flink cluster, I got this error:
The RemoteEnvironment cannot be used when submitting a program through a client, or running in a TestEnvironment context.
org.apache.flink.streaming.api.environment.RemoteStreamEnvironment.<init>(RemoteStreamEnvironment.java:174)
org.apache.flink.streaming.api.environment.RemoteStreamEnvironment.<init>(RemoteStreamEnvironment.java:142)
org.apache.beam.runners.flink.FlinkExecutionEnvironments$BeamFlinkRemoteStreamEnvironment.<init>(FlinkExecutionEnvironments.java:331)
org.apache.beam.runners.flink.FlinkExecutionEnvironments.createStreamExecutionEnvironment(FlinkExecutionEnvironments.java:180)
org.apache.beam.runners.flink.FlinkExecutionEnvironments.createStreamExecutionEnvironment(FlinkExecutionEnvironments.java:141)
org.apache.beam.runners.flink.FlinkPipelineExecutionEnvironment.translate(FlinkPipelineExecutionEnvironment.java:98)
org.apache.beam.runners.flink.FlinkRunner.run(FlinkRunner.java:110)
org.apache.beam.sdk.Pipeline.run(Pipeline.java:315)
org.apache.beam.sdk.Pipeline.run(Pipeline.java:301)
org.apache.beam.examples.WordCount.runWordCount(WordCount.java:295)
org.apache.beam.examples.WordCount.main(WordCount.java:406)
The command I used to submit the jar:
./flink run -m localhost:8081 --class org.apache.beam.examples.WordCount /users/word-count-beam/target/word-count-beam-bundled-0.1.jar --runner=FlinkRunner --flinkMaster=localhost --parallelism=2 --checkpointingInterval=10000 --checkpointTimeoutMillis=5000 --minPauseBetweenCheckpoints=500

I guess you use StreamExecutionEnvironment.createRemoteEnvironment; that's why you cannot submit your jar with 'flink run' and have to run it as a regular Java jar (java -jar ...).
If you want to submit it to your cluster, you should use StreamExecutionEnvironment.getExecutionEnvironment instead; it returns the execution environment of the cluster the job is submitted to.
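For illustration, a minimal sketch of the difference in plain Flink (the host, port, and jar path are placeholders, not values from the question):

```java
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class EnvironmentChoice {
    public static void main(String[] args) throws Exception {
        // Hard-codes a specific cluster; a job built this way can only be
        // launched as a plain Java program (java -jar ...), never via `flink run`:
        //   StreamExecutionEnvironment.createRemoteEnvironment(
        //           "localhost", 8081, "/path/to/job.jar");

        // Resolves to whatever context the program runs in: the cluster's
        // environment under `flink run`, a local mini-cluster in the IDE.
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

        env.fromElements("to", "be", "or", "not", "to", "be")
           .map(String::toUpperCase)
           .print();

        env.execute("environment-choice-demo");
    }
}
```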

Related

run apache beam on apache flink

I want to run Python code using Apache Beam on Apache Flink. The command that the Apache Beam site gives for launching Python code on Apache Flink is as follows:
docker run --net=host apachebeam/flink1.9_job_server:latest --flink-master=localhost:8081
The following is a discussion of different methods of executing code using Apache Beam on Apache Flink, but I haven't seen an example of launching it.
https://flink.apache.org/ecosystem/2020/02/22/apache-beam-how-beam-runs-on-top-of-flink.html
I want to run this without Docker. What command should I use?
You can spin up the Flink job server directly from the Beam source code. Note that you'll need Java installed.
1) Clone the beam source code:
git clone https://github.com/apache/beam.git
2) Start the job server
cd beam
./gradlew -p runners/flink/1.8/job-server runShadow -PflinkMasterUrl=localhost:8081
Some helpful tips:
This is not Flink itself! You'll need to spin up Flink separately.
The Flink job service actually spins up a few services:
Expansion Service (port 8097): This service allows you to use ExternalTransforms within your pipeline that exist within the Java SDK. For example, the transforms found within the Python SDK's apache_beam.io.external.* hit this expansion service.
Artifact Service (port 8098): This is where the pipeline uploads your Python artifacts (e.g. pickle files) to be used by the Flink taskmanager when it executes your Python code. From what I recall, you must share the artifact staging area (defaults to /tmp/beam-artifact-staging) between the Flink taskworker and this artifact service.
Job Service (port 8099): This is what you submit your pipeline to. It translates your pipeline into a Flink job and submits it; a sketch of pointing a pipeline at it follows below.
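The question asks about Python, but as a rough Java sketch (the SDK used elsewhere on this page), a pipeline can be pointed at the job service through the portable runner options. The Python SDK takes the equivalent --job_endpoint flag. Treat the option names and defaults below as assumptions based on the Beam Java SDK, not a verified recipe:

```java
import org.apache.beam.runners.portability.PortableRunner;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.options.PortablePipelineOptions;
import org.apache.beam.sdk.transforms.Create;

public class JobServiceSubmit {
    public static void main(String[] args) {
        PortablePipelineOptions options =
                PipelineOptionsFactory.create().as(PortablePipelineOptions.class);
        options.setJobEndpoint("localhost:8099"); // the job service started above
        options.setRunner(PortableRunner.class);  // portable runner talks to that endpoint

        Pipeline p = Pipeline.create(options);
        p.apply(Create.of("hello", "beam"));      // trivial pipeline, just to submit something
        p.run().waitUntilFinish();
    }
}
```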

How to "Run a single Flink job on YARN" by REST API?

From the official Flink documentation we know that we can "Run a single Flink job on YARN" with the command below. My question is: can we "Run a single Flink job on YARN" via the REST API, and get the application ID?
./bin/flink run -m yarn-cluster -yn 2 ./examples/batch/WordCount.jar
See the (somewhat deceptively named) Monitoring REST API. You can use the /jars/upload request to send your (fat/uber) jar to the cluster. This returns an id that you can use with the /jars/:jarid/run request to start your job.
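As a rough sketch of that two-request flow with Java 11's built-in HttpClient (the host, jar path, and the way the jar id is obtained are placeholders; the id has to be parsed out of the JSON upload response):

```java
import java.io.ByteArrayOutputStream;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.file.Files;
import java.nio.file.Path;

public class FlinkRestSubmit {
    public static void main(String[] args) throws Exception {
        String base = "http://localhost:8081";                 // Flink REST endpoint (placeholder)
        Path jar = Path.of("./examples/batch/WordCount.jar");  // jar to submit (placeholder)
        HttpClient client = HttpClient.newHttpClient();

        // 1) POST /jars/upload: multipart/form-data, field name "jarfile".
        String boundary = "----flink-" + System.nanoTime();
        ByteArrayOutputStream body = new ByteArrayOutputStream();
        body.write(("--" + boundary + "\r\n"
                + "Content-Disposition: form-data; name=\"jarfile\"; filename=\""
                + jar.getFileName() + "\"\r\n"
                + "Content-Type: application/x-java-archive\r\n\r\n").getBytes());
        body.write(Files.readAllBytes(jar));
        body.write(("\r\n--" + boundary + "--\r\n").getBytes());

        HttpRequest upload = HttpRequest.newBuilder(URI.create(base + "/jars/upload"))
                .header("Content-Type", "multipart/form-data; boundary=" + boundary)
                .POST(HttpRequest.BodyPublishers.ofByteArray(body.toByteArray()))
                .build();
        HttpResponse<String> resp = client.send(upload, HttpResponse.BodyHandlers.ofString());
        System.out.println(resp.body()); // JSON; the uploaded jar's id is in "filename"

        // 2) POST /jars/:jarid/run to start the job (parse the id out of resp first).
        String jarId = "PARSE-FROM-UPLOAD-RESPONSE";            // placeholder
        HttpRequest run = HttpRequest.newBuilder(
                        URI.create(base + "/jars/" + jarId + "/run?parallelism=2"))
                .POST(HttpRequest.BodyPublishers.noBody())
                .build();
        System.out.println(client.send(run, HttpResponse.BodyHandlers.ofString()).body());
    }
}
```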
If you also need to start up the cluster, then you're currently (AFAIK) going to need to write some Java code to start a cluster on YARN. There are two source files in Flink that do the same thing:
ProgramDeployer.java, used by the Flink Table API.
CliFrontend.java, used by the command-line tool.

Kafka Flink logging issue

I am working on the Kafka Flink integration and I am actually done with it. I wrote a simple word count program in Java using the Flink API. When I ran it with java -jar myjarname it worked fine, but when I tried to run it with the ./bin/flink run myjarname command, it gave me the following error:
NoSuchMethodError:org.apache.flink.streaming.api.operators.isCheckpointingEnabled
The respective jar is there, but it still gives me the above error.

solr not writing logs when it runs not from its main folder

When I run Solr using
java -jar "C:\solr\example\start.jar"
It writes logs to C:\solr\example\logs.
When I run it using
java -Dsolr.solr.home="C:\solr\example\solr" -Djetty.home="C:\solr\example" -Djetty.logs="C:\solr\example\logs" -jar "C:\solr\example\start.jar"
it writes logs only if I run it from C:\solr\example; from any other folder, logs are not written.
This is important as I need to run it as a service later (using NSSM).
What should I change?
As you have discovered, the Jetty-hosted example distributed with Solr must be started in the example directory to function properly. Try creating a batch file that changes to the directory then invokes Java, like this:
C:
cd C:\solr\example\
java -Dsolr.solr.home="C:\solr\example\solr" -Djetty.home="C:\solr\example" -Djetty.logs="C:\solr\example\logs" -jar "C:\solr\example\start.jar"
Then have NSSM run the batch file instead of java.
Both answers should work for you.
Alternatively, you could set it up using Apache Tomcat as opposed to the Jetty instance Solr comes with; Tomcat comes standard with a startup.bat batch file that you use to start your server.

Jenkins commit a file after successful build

I am using Jenkins, Ant, Flex and Java for my web application.
Currently I update a build version file in the Flex src and commit it before starting a Jenkins build.
I want to avoid this manual process and let a script do it for me.
Contents of file:
Build=01_01_2013_10:43
Release=2.01
Question 1:
I want to update this file's contents, compile my code, and then commit the file back to SVN, so that SVN has the latest build version number.
How do I commit this changed file to SVN? It would be great if the commit happens after a successful build.
Question 2: I want to send an email to all developers an hour before the build starts: "Please commit your changes. Build will start in 1 hr." Can I set up a delay between the email and the actual svn export + ant build, or do I have to schedule two jobs an hour apart, one to send the email and one to do the build?
You can use the Subclipse svn Ant integration to commit changed files to SVN, including authentication:
<svnSetting
svnkit="true"
username="bingo"
password="bongo"
id="svn.settings"
/>
<svn refid="svn.settings">
<commit file="your.file" />
</svn>
To get the username and password into the build file you have different options. One would be to use a parametrized build, where you define the user name and password as build parameters which can be evaluated in the build file:
username="${parameter.svn.username}"
password="${parameter.svn.password}"
A second option is using the Jenkins Config File Provider plugin. With this you can also use the parameters like in the parametrized build, but you import the credentials from the provided config file; e.g. a properties file can be imported via
<property file="config.file" />
Actually, you can also use Ant's exec task to run the svn commit of the file.
For sending an e-mail one hour before actually building, you should set up two jobs scheduled one hour apart. But I don't think it is good practice to notify before building; consider building more often, maybe even per commit to SVN.
You can also use the Post build Task plugin (https://wiki.jenkins-ci.org/display/JENKINS/Post+build+task) to execute svn as a shell script (svn must be installed and authenticated from the shell once for the user that runs Jenkins).
Then the svn commit runs as a post build action. The plugin has an option (checkbox) to run the script only if the previous build/steps were successful.
The plugin is also mentioned here: Execute Shell Script after post build in Jenkins
