Passing custom parameters to docker when running Flink on Mesos/Marathon

My team is trying to set up an Apache Flink (v1.4) cluster on Mesos/Marathon. We are using the Docker image provided by Mesosphere. It works really well!
Because of a new requirement, the task managers have to be launched with extended runtime privileges. We can easily enable these runtime privileges for the app manager via the Marathon web UI. However, we cannot find a way to enable the privileges for the task managers.
In Apache Spark, we can set spark.mesos.executor.docker.parameters privileged=true in Spark's configuration file, and Spark passes this parameter to the docker run command. I am wondering whether Apache Flink allows us to pass a custom parameter to docker run when launching task managers. If not, how can we start task managers with extended runtime privileges?
Thanks

There is a new parameter, mesos.resourcemanager.tasks.container.docker.parameters, introduced in this commit, which allows passing arbitrary parameters to Docker.
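A minimal sketch of how that would look, assuming you are on a Flink version that includes the option: the value is a comma-separated list of key=value pairs passed through to docker run, so the equivalent of the Spark setting above would be a flink-conf.yaml entry like
mesos.resourcemanager.tasks.container.docker.parameters: privileged=true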

Unfortunately, this is not possible as of right now (or only for the framework scheduler as Tobi pointed out).
I went ahead and created a Jira for this feature so you can keep track/add details/contribute it yourself: https://issues.apache.org/jira/browse/FLINK-8490

You should be able to tweak the parameters in the ContainerInfo of https://github.com/mesoshq/flink-framework/blob/master/index.js to support this. I’ll eventually update the Flink version in the Docker image...

Related

Use OpenTelemetry with Apache Flink

I have been trying to use OpenTelemetry (https://opentelemetry.io/) in an Apache Flink job. I am sending the traces to a Kafka topic in order to view them in Jaeger.
Tracing works when I execute the job inside my IntelliJ IDE, but once I create the package and try to execute it on the cluster, I am not able to make it work.
Is there any blocker in that sense for Apache Flink that I am not aware of?
I have accomplished this using a variable:
export FLINK_ENV_JAVA_OPTS=-javaagent:./lib/opentelemetry-javaagent-all.jar
But this only works when I am setting up the Flink cluster myself. The problem is that the cluster I am using is inside AWS (Kinesis Analytics), and I am not able to set this variable.
Is there a way to use OpenTelemetry with Flink?

Using Flink LocalEnvironment for Production

I wanted to understand the limitations of LocalExecutionEnvironment and whether it can be used to run in production.
Appreciate any help/insight. Thanks
LocalExecutionEnvironment spins up a Flink MiniCluster, which runs the entire Flink system (JobManager, TaskManager) in a single JVM. So you're limited to CPU cores and memory available on that one machine. You also don't have HA from multiple JobManagers. I haven't looked at other limitations of the MiniCluster environment, but I'm sure more exist.
A LocalExecutionEnvironment doesn't load a config file on startup, so you have to do all of the configuration in the application. By default it also doesn't offer a REST endpoint. You can solve both these issues by doing something like this:
// Requires org.apache.flink.configuration.{Configuration, GlobalConfiguration},
// java.nio.file.Paths, and org.apache.flink.streaming.api.environment.StreamExecutionEnvironment
String cwd = Paths.get(".").toAbsolutePath().normalize().toString();
Configuration conf = GlobalConfiguration.loadConfiguration(cwd); // loads flink-conf.yaml from the working directory
StreamExecutionEnvironment env = StreamExecutionEnvironment.createLocalEnvironmentWithWebUI(conf);
Logging may be another issue that will require a workaround.
I don't believe you'll be able to use the Flink CLI to control the job, but if you create the Web UI (as shown above) you can at least use the REST API to do things like triggering savepoints (after first using the REST API to get the job ID).
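As a rough sketch against the local REST endpoint (default port 8081; the paths below assume a reasonably recent Flink REST API, and <job-id> is a placeholder):
# List jobs to obtain the job ID
curl http://localhost:8081/jobs
# Trigger a savepoint for that job (asynchronous; the response contains a trigger ID you can poll)
curl -X POST -H "Content-Type: application/json" \
  -d '{"target-directory": "file:///tmp/savepoints", "cancel-job": false}' \
  http://localhost:8081/jobs/<job-id>/savepoints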

Can't set parallelism using Flink's CLI or Web-UI when using Apache Beam

I am using Flink 1.2.1 running on Docker, with Task Managers distributed across different VMs as part of a Docker Swarm.
Uploading an Apache Beam application using the Flink Web UI and trying to set the parallelism at job submission doesn't work. Neither does submitting the job using the Flink CLI.
It seems like the parallelism doesn't get picked up at the client level; it ends up defaulting to 1.
When I set the parallelism programmatically within the Apache Beam code, it works: flinkPipelineOptions.setParallelism(4);
I suspect the root of the problem may be in the org.apache.beam.runners.flink.DefaultParallelismFactory class, as it checks for Flink's GlobalConfiguration, which may not pick up runtime values passed to Flink.
Any ideas on how this could be fixed or worked around? I need to be able to change the parallelism dynamically, so the programmatic approach won't work, nor will setting the Flink configuration at system level.
I am using the following documentation:
https://ci.apache.org/projects/flink/flink-docs-release-1.2/dev/parallel.html
https://beam.apache.org/documentation/sdks/javadoc/2.0.0/org/apache/beam/runners/flink/DefaultParallelismFactory.html
This should probably be fixed in the Beam Flink runner, but as a workaround you can try setting the parallelism to -1 programmatically. This should make the translation pick up the parallelism that is specified when submitting the job.
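A minimal sketch of that workaround, assuming a typical Beam pipeline setup with FlinkPipelineOptions (the surrounding pipeline skeleton is illustrative):
import org.apache.beam.runners.flink.FlinkPipelineOptions;
import org.apache.beam.runners.flink.FlinkRunner;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.options.PipelineOptionsFactory;

FlinkPipelineOptions options =
    PipelineOptionsFactory.fromArgs(args).withValidation().as(FlinkPipelineOptions.class);
options.setRunner(FlinkRunner.class);
// -1 means "no explicit parallelism", so the value given at submission time (-p or the Web UI field) is used
options.setParallelism(-1);
Pipeline pipeline = Pipeline.create(options);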

Flink dynamic scaling

I am currently studying scalability on Flink. Dynamic rescaling was introduced starting from version 1.2.0. I am looking at scaling a long-running job which reads data from a Kafka source.
My questions regarding dynamic rescaling:
To scale out my Flink application, for example by adding new task managers, must I restart the job / YARN session to use the newly added resources?
I think it's possible to write a YARN client to deploy new task managers and make them talk to the job manager; is that already available in the existing Flink YARN client application?
Pardon me if these questions are too basic. I did go through the documentation, and I have to admit I have not been able to put the concepts together with some test deployments on YARN recently.
Currently (Flink 1.2), dynamic scaling means the capability to update an operator's parallelism, either for keyed state or for non-keyed state.
To scale out my Flink application, for example by adding new task managers, must I restart the job / YARN session to use the newly added resources? - Yes, the job has to be stopped first, the parallelism updated, and the job restarted. You do not have to worry about the state; Flink will handle it, including repartitioning (see the CLI sketch at the end of this answer).
I think it's possible to write a YARN client to deploy new task managers and make them talk to the job manager; is that already available in the existing Flink YARN client application? - No, you cannot. This feature may be added in the future; currently it is not possible.
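As a rough sketch of that stop-update-restart cycle using the Flink CLI (the job ID, savepoint path, jar name, and new parallelism of 8 are placeholders):
# Take a savepoint of the running job
flink savepoint <jobId>
# Stop the job
flink cancel <jobId>
# Resubmit from the savepoint with the new parallelism
flink run -s <savepointPath> -p 8 my-job.jar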

Deploying AngularJs + Sinatra to AWS

I have an AngularJS site consuming an API written in Sinatra.
I'm simply trying to deploy these 2 components together on an AWS EC2 instance.
How would one go about doing that? What tools do you recommend? What structure do you think is most suitable?
Cheers
This is based upon my experience of using the HashiCorp line of tools.
Manual: Launch an Ubuntu image, gem install sinatra, and deploy your code. Take a snapshot for safekeeping. This one-off approach is good for a development box to iron out the configuration process. Write down the commands you run and any options you may need.
Automated: Use the Packer EC2 Builder and Shell Provisioner to automate the commands from the previous manual approach. This will give you a configured AMI that can be launched (a minimal template sketch follows below).
You can apply different methods of getting to an AMI using different toolsets. However, in the end, you want a single immutable image that can be deployed repeatedly.
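A minimal Packer template sketch for the automated approach above (the AMI ID, region, and provisioning commands are placeholders to adapt to your own stack):
{
  "builders": [{
    "type": "amazon-ebs",
    "region": "us-east-1",
    "source_ami": "<ubuntu-ami-id>",
    "instance_type": "t2.micro",
    "ssh_username": "ubuntu",
    "ami_name": "sinatra-angular-app-{{timestamp}}"
  }],
  "provisioners": [{
    "type": "shell",
    "inline": [
      "sudo apt-get update",
      "sudo apt-get install -y ruby ruby-dev build-essential",
      "sudo gem install sinatra"
    ]
  }]
}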
