How to submit Flink programs to a cluster from another Flink program? - apache-flink

I want to run Flink programs on demand, submitting them when a certain condition occurs. How can I run Flink jobs from Java code with Flink 1.3.0?

You can use Flink's REST API to submit a job from another running Flink job. For more details see the REST API documentation.
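In case it helps, here is a minimal Java sketch of that approach. It assumes the job jar has already been uploaded to the JobManager's web monitor (e.g., via POST /jars/upload) and that you know the resulting jar id; endpoint names and parameters can differ between Flink versions, so check the REST API documentation for your release — this is a sketch, not a drop-in solution:

    import java.io.InputStream;
    import java.net.HttpURLConnection;
    import java.net.URL;
    import java.nio.charset.StandardCharsets;

    public class RestJobSubmitter {

        // Starts a previously uploaded jar via the web monitor's REST endpoint.
        // jobManagerUrl is e.g. "http://localhost:8081"; jarId is the id
        // returned by the /jars/upload endpoint.
        public static String runUploadedJar(String jobManagerUrl, String jarId) throws Exception {
            URL url = new URL(jobManagerUrl + "/jars/" + jarId + "/run");
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            conn.setRequestMethod("POST");
            conn.setDoOutput(true);
            conn.getOutputStream().close(); // empty body: entry class etc. come from the jar manifest
            try (InputStream in = conn.getInputStream()) {
                // the response body contains the id of the newly started job
                return new String(in.readAllBytes(), StandardCharsets.UTF_8);
            }
        }
    }

Because this is plain HTTP, it can be called from inside another running Flink job (for example from a sink or a ProcessFunction) when your trigger condition fires.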

Related

Flink CLI vs Flink Web Console

We have a requirement to replace the Flink web UI and enable all of its functionality through CLI utilities. For some of the functionality, like starting jobs, taking savepoints, etc., we are using the Flink CLI.
My questions are:
Does the Flink CLI have parity with the Flink web UI?
If not, are there alternate ways to do, without the UI, what is possible via the web UI (like checking/monitoring the backpressure of a job, etc.)?
I am trying to find a solution where an on-call engineer can completely monitor and operate Flink using the command line / terminal, without needing to go to the web UI.
Thanks in advance
In theory, the Flink CLI plus the REST API provide a superset of the functionality available via the web UI. But some things, like identifying a busy task that's causing backpressure, can be done much more quickly with the web UI. For monitoring and troubleshooting, I think you'll need to build some tooling and/or set up a metrics dashboard (e.g., using Grafana in combination with your preferred metrics reporter).
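If you do script this against the REST API, backpressure is one of the things you can query. A minimal Java sketch, assuming the /jobs/<jobId>/vertices/<vertexId>/backpressure endpoint available in recent Flink versions (the exact path and response fields depend on your release):

    import java.io.InputStream;
    import java.net.HttpURLConnection;
    import java.net.URL;
    import java.nio.charset.StandardCharsets;

    public class BackpressureCheck {

        // Returns the raw JSON backpressure sample for one job vertex.
        // The first request may only trigger sampling, so callers typically
        // poll until the response contains a result.
        public static String backpressure(String restUrl, String jobId, String vertexId) throws Exception {
            URL url = new URL(restUrl + "/jobs/" + jobId + "/vertices/" + vertexId + "/backpressure");
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            try (InputStream in = conn.getInputStream()) {
                return new String(in.readAllBytes(), StandardCharsets.UTF_8);
            }
        }
    }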

Use OpenTelemetry with Apache Flink

I have been trying to use OpenTelemetry (https://opentelemetry.io/) in an Apache Flink job. I am sending the traces to a Kafka topic in order to view them in Jaeger.
Tracing works when I execute the job inside my IntelliJ IDE, but once I build the package and try to execute it on the cluster, I cannot make it work.
Is there any blocker on the Apache Flink side that I am not aware of?
I have accomplished this using an environment variable:
export FLINK_ENV_JAVA_OPTS=-javaagent:./lib/opentelemetry-javaagent-all.jar
But this only works when I am setting up the Flink cluster myself. The problem is that the cluster I am using is inside AWS (Kinesis Analytics), and I am not able to set this variable.
Is there a way to use OpenTelemetry with Flink?
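(Not an answer for Kinesis Analytics, where the configuration is not under your control, but for reference: on a self-managed cluster the same javaagent flag can also be set in flink-conf.yaml instead of through the environment variable, via the env.java.opts option. A sketch, assuming the agent jar is shipped in the lib/ directory:)

    # flink-conf.yaml: JVM options passed to the JobManager and TaskManager processes
    env.java.opts: -javaagent:./lib/opentelemetry-javaagent-all.jar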

Questions regarding Flink streaming with Kafka

I have a Java application that launches a Flink job to process a Kafka stream.
The application blocks at the job submission, at flinkEnv.execute("flink job name"), since the job runs forever on the stream coming in from Kafka.
In this case, how can I get the job id returned by the execution? I can see the job id printed in the console; I just wonder how to get it programmatically, given that flinkEnv.execute never returns.
And how can I cancel a Flink job, given its job name, from a remote server in Java?
As far as I know, there is currently no nice programmatic way to control Flink. But since Flink is written in Java, everything you can do with the console can also be done with the internal class org.apache.flink.client.CliFrontend, which is invoked by the console scripts.
An alternative would be using the REST API of the Flink JobManager.
You can use the REST API to monitor a Flink job's progress.
See this link: https://ci.apache.org/projects/flink/flink-docs-master/monitoring/rest_api.html
You can request http://host:port/jobs/overview to get information about all jobs, including each job's name and id, such as:
{"jobs":[{"jid":"d6e7b76f728d6d3715bd1b95883f8465","name":"Flink Streaming Job","state":"RUNNING","start-time":1628502261163,"end-time":-1,"duration":494208,"last-modification":1628502353963,"tasks":{"total":6,"created":0,"scheduled":0,"deploying":0,"running":6,"finished":0,"canceling":0,"canceled":0,"failed":0,"reconciling":0,"initializing":0}}]}
I hope this helps.
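To tie both answers together, here is a hedged Java sketch that looks up a job id by name via /jobs/overview and then cancels it. It assumes the PATCH /jobs/<jobId>?mode=cancel endpoint of the newer REST API (older versions used different cancel endpoints), and it uses naive string matching instead of a proper JSON parser purely for brevity:

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    public class JobCanceller {

        private static final HttpClient HTTP = HttpClient.newHttpClient();

        // Scans the /jobs/overview response for "name":"<jobName>" and reads
        // back the preceding "jid" value; use a JSON library (e.g., Jackson)
        // in real code.
        public static String findJobIdByName(String restUrl, String jobName) throws Exception {
            HttpRequest req = HttpRequest.newBuilder(URI.create(restUrl + "/jobs/overview")).GET().build();
            String body = HTTP.send(req, HttpResponse.BodyHandlers.ofString()).body();
            int namePos = body.indexOf("\"name\":\"" + jobName + "\"");
            if (namePos < 0) {
                return null; // no job with that name
            }
            int jidPos = body.lastIndexOf("\"jid\":\"", namePos) + "\"jid\":\"".length();
            return body.substring(jidPos, body.indexOf('"', jidPos));
        }

        // Cancels the job with the given id.
        public static void cancel(String restUrl, String jobId) throws Exception {
            HttpRequest req = HttpRequest.newBuilder(URI.create(restUrl + "/jobs/" + jobId + "?mode=cancel"))
                    .method("PATCH", HttpRequest.BodyPublishers.noBody())
                    .build();
            HTTP.send(req, HttpResponse.BodyHandlers.ofString());
        }
    }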

How to deploy new changes of my flow to Apache Flink cluster?

For example, I uploaded a JAR with my flow and ran it through the Apache Flink dashboard. Then I implemented some changes in the flow and want to deploy them.
Can anybody explain to me, step by step, how to deploy a new version of my flow to an Apache Flink cluster correctly (without downtime, losing state, etc.)? I didn't find a description of the deploy process in the official documentation.
What you want to use are savepoints in Flink.
The steps are as follows:
Prepare the new jar for your job
Take a savepoint of the currently running job using flink savepoint <JobID>
Stop the job
Start the new jar from the just-created savepoint: flink run -s <pathToSavepoint> <jobJar> ...
See also: https://www.ververica.com/blog/how-apache-flink-enables-new-streaming-applications-part-1
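If you prefer to script this redeployment flow instead of using the CLI, recent Flink versions also expose savepoints via the REST API. A minimal Java sketch, assuming the POST /jobs/<jobId>/savepoints endpoint (verify the path and payload against the REST docs for your release):

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    public class SavepointTrigger {

        // Triggers a savepoint and cancels the job in one step. Flink answers
        // with a trigger id; poll /jobs/<jobId>/savepoints/<triggerId> until it
        // reports the final savepoint path, then start the new jar with
        // flink run -s <thatPath> <newJar>.
        public static String triggerSavepointAndCancel(String restUrl, String jobId, String targetDir) throws Exception {
            String json = "{\"target-directory\":\"" + targetDir + "\",\"cancel-job\":true}";
            HttpRequest req = HttpRequest.newBuilder(URI.create(restUrl + "/jobs/" + jobId + "/savepoints"))
                    .header("Content-Type", "application/json")
                    .POST(HttpRequest.BodyPublishers.ofString(json))
                    .build();
            return HttpClient.newHttpClient().send(req, HttpResponse.BodyHandlers.ofString()).body();
        }
    }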

Why "Configuration" section of running job is empty?

Can anybody explain to me why the "Configuration" section of a running job in the Apache Flink Dashboard is empty?
How can I use this job configuration in my flow? This doesn't seem to be described in the documentation.
The configuration tab of a running job shows the values of the ExecutionConfig. Depending on the version of Flink, you might experience different behaviour.
Flink <= 1.0
The ExecutionConfig is only accessible for finished jobs. For running jobs, it is not possible to access it. Once the job has finished or has been stopped/cancelled, you should be able to see the ExecutionConfig.
Flink > 1.0
The ExecutionConfig can also be accessed for running jobs.
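As for using the configuration in your flow: the values shown in that tab include the ExecutionConfig's global job parameters, so a common pattern is to register your own parameters there. A minimal sketch using ParameterTool (the setGlobalJobParameters call is part of Flink's public API; the parameter names are just examples):

    import org.apache.flink.api.java.utils.ParameterTool;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class GlobalParamsExample {

        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

            // e.g. --input /tmp/in --output /tmp/out on the command line
            ParameterTool params = ParameterTool.fromArgs(args);

            // Registering the parameters on the ExecutionConfig makes them show
            // up in the web UI's "Configuration" tab and makes them readable in
            // rich functions via
            // getRuntimeContext().getExecutionConfig().getGlobalJobParameters().
            env.getConfig().setGlobalJobParameters(params);

            // ... define sources, transformations, and sinks here ...

            env.execute("job with visible configuration");
        }
    }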
