Set a Job Name for a Flink job using the Table API - apache-flink

I want to set a job name for my Flink application written with the Table API, the same way I do it with the Streaming API via env.execute(jobName).
I can't find a way in the documentation, except for setting it while submitting the job from a jar, which is what I would like to replace:
bin/flink run -d -yD pipeline.name=MyPipelineName-v1.0 ...
Flink: 1.14.5
Environment: YARN
Update:
In case someone faces the same situation: Table API pipelines can be added to the DataStream API (see "Adding Table API Pipelines to DataStream API" in the docs), which allows setting the desired job name programmatically.
Example:
// Assumes: `env` is the StreamExecutionEnvironment, `tEnv` the StreamTableEnvironment
// created from it, and `schema` / `tA` (the source table) are defined elsewhere.
val sinkDescriptor = TableDescriptor.forConnector("kafka")
  .option("topic", "topic_out")
  .option("properties.bootstrap.servers", "localhost:9092")
  .schema(schema)
  .format(FormatDescriptor.forFormat("avro").build())
  .build()

tEnv.createTemporaryTable("OutputTable", sinkDescriptor)

val statementSet = tEnv.createStatementSet()   // StreamStatementSet
statementSet.addInsert(sinkDescriptor, tA)
statementSet.attachAsDataStream()              // translates the pipeline into the DataStream job

env.execute(jobName)                           // the job name is honoured here

Only StreamExecutionEnvironment calls setJobName on the stream graph, which is why attaching the Table API pipeline to the DataStream environment and calling env.execute(jobName) works here.
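A related option that is sometimes suggested (not part of the answer above, so treat it as an assumption to verify on your Flink version): the job name can also be set through the Table API configuration by putting pipeline.name into the TableConfig before the pipeline is executed. A minimal sketch:

// Sketch: set the job name via configuration instead of env.execute(jobName).
// Assumes tEnv is the (Stream)TableEnvironment used to build the pipeline.
tEnv.getConfig.getConfiguration
  .setString("pipeline.name", "MyPipelineName-v1.0")
// Jobs submitted from this environment afterwards (executeSql, StatementSet.execute, ...)
// should pick up the configured name.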

Related

Pass Flink Job Manager Configuration via Flink Submit Job Rest API

We are using the Flink REST API to submit jobs to Flink EMR clusters. These clusters are already running in session mode. We want to know if there is any way to pass the following Flink JobManager configuration parameters while submitting the job via the Flink REST API call.
s3.connection.maximum: 1000
state.backend.local-recovery: true
state.checkpoints.dir: hdfs://ha-nn-uri/flink/checkpoints
state.savepoints.dir: hdfs://ha-nn-uri/flink/savepoints
I figured out that the Flink submit-job request has a "programArgs" field and I tried using it, but the Flink JobManager configuration didn't pick up these settings:
"programArgs": f" --s3.connection.maximum 1000 state.backend.local-recovery true --stage '{ddb_config}' --cell-name '{cluster_name}'"

Query on automating Flink Job submission

I am trying to use the Flink REST APIs to automate the Flink job submission process via a pipeline. To call any Flink REST endpoint we need to know the JobManager web interface IP. For my POC, I got the IP after running the flink-yarn-session command on the CLI, but what is the way to get it from code?
For automation, I am planning to call the following REST APIs in sequence:
requests.get('http://ip-10-0-127-59.ec2.internal:8081/jobs/overview')  # get running job id
requests.post('http://ip-10-0-127-59.ec2.internal:8081/jobs/:jobId/savepoints/')  # cancel job with savepoint
requests.get('http://ip-10-0-127-59.ec2.internal:8081/jobs/:jobId/savepoints/:savepointId')  # get savepoint status
requests.post('http://ip-10-0-127-59.ec2.internal:8081/jars/upload')  # upload jar for new job
requests.post('http://ip-10-0-127-59.ec2.internal:8081/jars/de05ced9-03b7-4f8a-bff9-4d26542c853f_ATVPlaybackStateMachineFlinkJob-1.0-super-2.3.3.jar/run')  # submit new job
requests.get('http://ip-10-0-116-99.ec2.internal:35497/jobs/:jobId')  # get status of new job
If you have the flexibility to run on Kubernetes instead of YARN (from your hostnames it looks like you are on AWS, so you could use EKS), then I would recommend the official Flink Kubernetes Operator; it is built for exactly this purpose by the community.
If YARN is a given for your use case, then you can follow the code Flink itself uses to talk to the YARN ResourceManager in the flink-yarn package, especially the following:
https://github.com/apache/flink/blob/master/flink-yarn/src/main/java/org/apache/flink/yarn/YarnClusterDescriptor.java#L384
https://github.com/apache/flink/blob/master/flink-yarn/src/main/java/org/apache/flink/yarn/YarnResourceManagerDriver.java#L258
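If all you need is the JobManager address, a lighter-weight option along the same lines is to ask the YARN ResourceManager for the running Flink application and read its tracking URL, which points at the JobManager web UI / REST endpoint. A minimal sketch using the Hadoop YarnClient (the object name is illustrative; it assumes the YARN client dependencies and a resolvable yarn-site.xml are on the classpath):

import org.apache.hadoop.yarn.client.api.YarnClient
import org.apache.hadoop.yarn.conf.YarnConfiguration
import scala.collection.JavaConverters._

object FindFlinkJobManager {
  def main(args: Array[String]): Unit = {
    val yarnClient = YarnClient.createYarnClient()
    yarnClient.init(new YarnConfiguration())   // picks up yarn-site.xml from the classpath
    yarnClient.start()
    try {
      // Flink-on-YARN sessions register with the application type "Apache Flink"
      val apps = yarnClient.getApplications(java.util.Collections.singleton("Apache Flink")).asScala
      apps.foreach { app =>
        // The tracking URL is the JobManager web interface the REST calls above need
        println(s"${app.getApplicationId} -> ${app.getTrackingUrl}")
      }
    } finally {
      yarnClient.stop()
    }
  }
}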

Prometheus alert for flink failed job?

I'm trying to monitor the availability of my Flink jobs using Prometheus alerts.
I have tried the flink_jobmanager_job_uptime/downtime metrics, but they don't seem to fit, since they simply stop being emitted after the job has failed or finished.
I have already been pointed to the numRunningJobs metric as a way to alert on a missing job. I don't want to use that solution, since I would have to update my Prometheus config each time I want to deploy a new job.
Has anyone managed to create this alert for a failed Flink job using Prometheus?
Prometheus has an absent() function that returns 1 if the metric doesn't exist, so you can set the alert expression to something like
absent(flink_jobmanager_job_uptime) == 1

Spark job callback

Maybe you can help me with my problem.
I start a Spark job on google-dataproc through the API. This job writes its results to Google Cloud Storage.
When it is finished I want to get a callback to my application.
Do you know any way to get it? I don't want to poll the job status through the API each time.
Thanks in advance!
I'll agree that it would be nice if there were a way to either wait for, or get a callback on, operations such as VM creation, cluster creation, job completion, etc. Out of curiosity, are you using one of the API clients (like google-cloud-java), or are you using the REST API directly?
In the meantime, there are a couple of workarounds that come to mind:
1) Google Cloud Storage (GCS) callbacks
GCS can trigger callbacks (either Cloud Functions or Pub/Sub notifications) when you create files. You can create a file at the end of your Spark job, which will then trigger a notification (see the sketch after this list). Or, just add a trigger for when you put an output file on GCS.
If you're modifying the job anyway, you could also just have the Spark job call back directly to your application when it's done.
2) Use the gcloud command line tool (probably not the best choice for web servers)
gcloud already waits for jobs to complete. You can either use gcloud dataproc jobs submit spark ... to submit and wait for a new job to finish, or gcloud dataproc jobs wait <jobid> to wait for an in-progress job to finish.
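For the first workaround, here is a minimal sketch of what the end of the Spark job could look like (bucket names and paths are made-up placeholders; it assumes the GCS connector is configured so gs:// paths resolve): write the real output first, then create an empty marker object that the GCS notification or Cloud Function is set up to watch for.

import java.net.URI
import org.apache.hadoop.fs.{FileSystem, Path}
import org.apache.spark.sql.SparkSession

object JobWithCompletionMarker {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("job-with-marker").getOrCreate()

    // Write the actual job output to GCS.
    val results = spark.read.parquet("gs://my-bucket/input/")
    results.write.parquet("gs://my-bucket/output/run-001/")

    // Create an empty "done" marker; the GCS notification / Cloud Function is
    // configured on this prefix and fires when the object appears.
    val fs = FileSystem.get(new URI("gs://my-bucket/"), spark.sparkContext.hadoopConfiguration)
    fs.create(new Path("gs://my-bucket/output/run-001/_JOB_DONE")).close()

    spark.stop()
  }
}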
That being said, if you're purely looking for a callback for choosing whether to run another job, consider using Apache Airflow + Cloud Composer.
In general, the more you tell us about what you're trying to accomplish, the better we can help you :)

How to expose Hystrix jmx for Prometheus

I'm new to Hystrix and I just created my first Hystrix commands. The commands are being created and executed in a loop, so metrics data should have been registered. I am using the Servo metrics publisher as follows:
HystrixPlugins.getInstance()
.registerMetricsPublisher(HystrixServoMetricsPublisher.getInstance());
EDIT:
Looking at JConsole, I found the related metrics definitions (jconsole screenshot).
I am not using Spring, Eureka, or Servo to read the data and run the app.
I would like to know how to expose this data in a way that Prometheus can read. I tried hystrix-prometheus, but the documentation is not helpful about where the metrics are exposed, how to get them, or how to check them.
In order to retrieve Hystrix metrics, you'll first need to get the Prometheus Java simple client up and running. The setup depends on your environment, but independent of it the result should be a URL where you can retrieve, for example, basic JVM metrics.
Once that is up and running, you can use the line
HystrixPrometheusMetricsPublisher.register("application_name");
to register the additional Hystrix metrics. They will be served at the same URL. Please note that you will only see Hystrix metrics after the first call of a Hystrix-enabled command.
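Putting the two steps together, a minimal sketch (shown in Scala; it assumes the simpleclient_httpserver, simpleclient_hotspot and prometheus-hystrix dependencies are on the classpath, and the package name of HystrixPrometheusMetricsPublisher is taken from the prometheus-hystrix project, so verify it against the version you use):

import io.prometheus.client.exporter.HTTPServer
import io.prometheus.client.hotspot.DefaultExports
import com.soundcloud.prometheus.hystrix.HystrixPrometheusMetricsPublisher

object MetricsBootstrap {
  def main(args: Array[String]): Unit = {
    // Step 1: expose the default Prometheus registry over HTTP (port is arbitrary);
    // Prometheus then scrapes http://<host>:9091/metrics.
    DefaultExports.initialize()          // optional JVM metrics
    new HTTPServer(9091)

    // Step 2: route Hystrix metrics into the same registry; they show up on the
    // same endpoint after the first Hystrix-enabled command has run.
    HystrixPrometheusMetricsPublisher.register("application_name")

    // ... create and execute Hystrix commands as usual ...
  }
}

Note that, as far as I know, Hystrix accepts only one metrics publisher per JVM, so the HystrixServoMetricsPublisher registration from the question would have to be removed (or the plugins reset) before registering the Prometheus publisher.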
