Running a Flink batch job continuously - apache-flink

I have a Flink batch job. What is the best way to run it continuously? (It needs to restart whenever it finishes, because a streaming job keeps providing new data.)
I want to restart the job immediately after it finishes.
An infinite loop that calls the tasks from inside?
A bash script that keeps pushing the job to the JobManager? (I think that wastes a lot of resources.)
Thanks

In a similar use case, where we run a Flink job against the same collection, we trigger a new job at periodic intervals (daily, hourly, etc.). https://azkaban.github.io/ can be used for scheduling. This is NOT really what you mentioned, but a close match that might be sufficient for your use case.
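For the "infinite loop" option, here is a minimal driver sketch (the paths and pipeline are assumed placeholders, not from the question): env.execute() blocks until a batch run completes, so a plain loop in the client resubmits the job as soon as it finishes.

```java
import org.apache.flink.api.java.ExecutionEnvironment;

public class LoopingBatchDriver {
    public static void main(String[] args) throws Exception {
        while (true) {
            ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
            env.readTextFile("hdfs:///incoming")   // assumed input written by the streaming job
               .writeAsText("hdfs:///out/run-" + System.currentTimeMillis());
            env.execute("batch-pass");  // blocks until this pass finishes; the loop then resubmits
        }
    }
}
```

Note that this keeps the client process alive, and each iteration still pays the full job-submission and startup cost, which is the resource concern raised in the question.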

Related

How to reduce the time between Flink intra-jobs and avoid repeated tasks

I have run a Flink bounded job in a standalone cluster. Flink breaks it down into 3 jobs.
It takes around 10 seconds to start the next job after the previous one finishes. How can I reduce the time between jobs? Also, when observing the task flow in detail, I noticed that the 2nd job repeats the same tasks the 1st job already performed, plus new ones, and so on with the 3rd job. For example, every job reads the data from the files again and then joins it. Why does this happen? I am a new Flink user. AFAIK, we can't cache a DataSet in Flink. I really need help understanding how this works. Thank you.
Here is the code
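One common cause of this pattern: in the DataSet API, each eager sink (count(), collect(), print()) submits its own job, and every job re-executes the plan from the sources, because a DataSet cannot be cached across jobs. A hypothetical sketch (not the original code; the input path is assumed) that would produce three jobs over the same input:

```java
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;

public class ThreeJobsOneProgram {
    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
        DataSet<String> lines = env.readTextFile("hdfs:///input");   // assumed input path

        System.out.println(lines.count());    // job 1: reads the files to count the records
        lines.print();                         // job 2: reads the files again to print them
        lines.writeAsText("hdfs:///copy");     // lazy sink, executed by the next call
        env.execute("third pass");             // job 3: reads the files a third time to write the copy
    }
}
```

Folding everything behind a single execute(), or writing an intermediate result out once and reading it back, avoids the repeated reads.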

How to fail the whole Flink application if one job fails?

There are two jobs running in Flink, shown in the image below. If one fails, I need to fail the whole Flink application. How can I do that? Suppose the job with parallelism:1 fails due to some exception; how do I fail the job with parallelism:4?
The details of how you should go about this depend a bit on the type of infrastructure you are using to run Flink, and how you are submitting the jobs. But if you look at ClusterClient and JobClient and their associated classes, you should be able to find a way forward.
If you aren't already, you may want to take advantage of application mode, which was added in Flink 1.11. It makes it possible for a single main() method to launch multiple jobs, and env.executeAsync() was added for non-blocking job submission.
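A minimal sketch of that approach, assuming Flink 1.12+ in application mode; the two pipelines here are placeholders standing in for the parallelism:1 and parallelism:4 jobs:

```java
import org.apache.flink.core.execution.JobClient;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class FailTogether {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Placeholder for the parallelism:1 job.
        env.fromSequence(0, Long.MAX_VALUE).map(x -> x).setParallelism(1).print();
        JobClient job1 = env.executeAsync("job-with-parallelism-1");

        // Placeholder for the parallelism:4 job.
        env.fromSequence(0, Long.MAX_VALUE).map(x -> x * 2).setParallelism(4).print();
        JobClient job2 = env.executeAsync("job-with-parallelism-4");

        // If job 1 terminates with a failure, cancel job 2 as well.
        job1.getJobExecutionResult().whenComplete((result, failure) -> {
            if (failure != null) {
                job2.cancel();
            }
        });
    }
}
```

job1.getJobExecutionResult() completes exceptionally when the first job fails, at which point the callback cancels the second job; add the mirror-image callback on job2 if either failure should bring both down.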

How to deploy a new job without downtime

I have an Apache Flink application that reads from a single Kafka topic.
I would like to update the application from time to time without experiencing downtime. For now, the Flink application executes some simple operators such as map, plus some synchronous IO to external systems via HTTP REST APIs.
I have tried to use the stop command, but I get "Job termination (STOP) failed: This job is not stoppable." I understand that the Kafka connector does not support the stop behavior (link).
A simple solution would be to cancel with a savepoint and redeploy the new jar from that savepoint, but then we get downtime.
Another solution would be to control the deployment from the outside, for example, by switching to a new topic.
What would be a good practice?
If you don't need exactly-once output (i.e., you can tolerate some duplicates), you can take a savepoint without cancelling the running job. Once the savepoint is complete, you start a second job. The second job could write to a different topic, but it doesn't have to. Once the second job is up, you can cancel the first job.
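Sketched with the Flink CLI (the job id, savepoint directory, and jar name are all assumed):

```
# 1. Take a savepoint WITHOUT cancelling the running job:
flink savepoint <jobId> hdfs:///savepoints

# 2. Start the new version from that savepoint, in detached mode:
flink run -d -s hdfs:///savepoints/savepoint-<id> new-version.jar

# 3. Once the new job is consuming, cancel the old one:
flink cancel <jobId>
```

Between steps 2 and 3 both jobs read the topic in parallel, which is where the duplicate output comes from.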

How to schedule a job in apache-flink

I want to write a task that Apache Flink triggers every 24 hours and then processes. Is there a way to do this? Does Flink provide any job-scheduling functionality?
Apache Flink is not a job scheduler but an event processing engine, which is a different paradigm: Flink jobs are meant to run continuously rather than be triggered by a schedule.
That said, you could achieve this with an off-the-shelf scheduler (e.g. cron) that starts a job on your Flink cluster and then stops it after you receive some sort of notification that the work is done (e.g. through a Kafka topic), or simply after a timeout beyond which you assume the job is finished. But again, precisely because Flink is not designed for this kind of use case, you would most likely run into edge cases it does not support.
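For example, a crontab entry that submits the job daily in detached mode (paths assumed); stopping the job afterwards would need extra scripting around flink cancel:

```
# run the Flink job every day at midnight
0 0 * * * /opt/flink/bin/flink run -d /opt/jobs/daily-job.jar
```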
Alternatively, you can use a 24-hour tumbling window and run your task in the corresponding trigger function. See https://flink.apache.org/news/2015/12/04/Introducing-windows.html for details.
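A minimal sketch of that variant (the source is an assumed placeholder): a 24-hour processing-time tumbling window fires once per day, and the daily task runs inside the window function.

```java
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.windowing.ProcessAllWindowFunction;
import org.apache.flink.streaming.api.windowing.assigners.TumblingProcessingTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;
import org.apache.flink.streaming.api.windowing.windows.TimeWindow;
import org.apache.flink.util.Collector;

public class DailyTask {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        env.socketTextStream("localhost", 9999)   // assumed placeholder source
           .windowAll(TumblingProcessingTimeWindows.of(Time.days(1)))
           .process(new ProcessAllWindowFunction<String, String, TimeWindow>() {
               @Override
               public void process(Context ctx, Iterable<String> elements, Collector<String> out) {
                   // fires once per 24-hour window: run the daily task over the collected records
                   out.collect("window ended at " + ctx.window().getEnd());
               }
           })
           .print();

        env.execute("daily-tumbling-window");
    }
}
```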

Strategy for interleaving jobs on AppEngine?

Let's say I have thousands of jobs to perform repeatedly; how would you propose I architect my system on Google AppEngine?
I need to be able to add more jobs while scaling the system effectively. Scheduled tasks are of course part of the solution, as are task queues, but I am looking for more insight into how best to use these resources.
NOTE: There are no dependencies between "jobs".
Based on the little description you've provided, it's hard to say. You probably want to use the Task Queue, and perhaps the deferred library if you're using Python. All that's required is to use the API to enqueue a task.
If you're talking about having many repeating tasks, you have a couple of options:
Start off the first task on the task queue manually, and use 'chaining' to have each invocation enqueue the next one with the appropriate interval (see the sketch after this list).
Store each schedule in the datastore. Have a cron job regularly scan for any tasks that have reached their ETA; fire off a task queue task for each, updating the ETA for the next run.
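A minimal Java sketch of the first option, the chaining pattern, assuming a servlet mapped to /tasks/run-job and an hourly interval (both placeholders):

```java
import com.google.appengine.api.taskqueue.Queue;
import com.google.appengine.api.taskqueue.QueueFactory;
import com.google.appengine.api.taskqueue.TaskOptions;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class RunJobServlet extends HttpServlet {
    @Override
    protected void doPost(HttpServletRequest req, HttpServletResponse resp) {
        String jobId = req.getParameter("jobId");
        // ... perform this job's work here ...

        // Chain: enqueue the next run of the same task with the desired delay.
        Queue queue = QueueFactory.getDefaultQueue();
        queue.add(TaskOptions.Builder
                .withUrl("/tasks/run-job")
                .param("jobId", jobId)
                .countdownMillis(60 * 60 * 1000L));  // assumed: run again in one hour
    }
}
```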
I think you could use Cron Jobs.
Regards.
