Apache Flink as a Service using REST API


I am trying to create a web application with Apache Flink as the backend. Flink needs to talk to the application layer (generally the UI) so that results from Flink can be sent to the frontend.
In short, I am looking for Flink counterparts of projects like:
Spark Job Server https://github.com/spark-jobserver/spark-jobserver
Mist https://github.com/Hydrospheredata/mist

The (deceptively titled) Monitoring REST API described in the Flink documentation seems like the way to go. Despite the "Monitoring" in its name, it can also upload, start, and stop jobs. In fact, based on the documentation, it should be able to do everything the web UI does.
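As a rough illustration of driving jobs through that REST API from an application layer, here is a minimal Python sketch that only builds the HTTP requests. The endpoint paths (/jars/<jarid>/run, /jobs/<jobid>?mode=cancel, /jobs/overview) follow recent Flink REST API documentation and may differ in older releases; the base URL, the jar ID, and the helper names are assumptions made for this example. Uploading the jar (a multipart POST to /jars/upload) is left out to keep the sketch short.

```python
import json
import urllib.request

# Assumption: a Flink cluster whose REST endpoint is reachable here
# (8081 is Flink's default REST port).
BASE_URL = "http://localhost:8081"


def run_jar_request(jar_id, entry_class=None, savepoint_path=None, base_url=BASE_URL):
    """Build a POST /jars/<jar_id>/run request for a jar that was already uploaded."""
    body = {}
    if entry_class:
        body["entryClass"] = entry_class
    if savepoint_path:
        body["savepointPath"] = savepoint_path
    return urllib.request.Request(
        f"{base_url}/jars/{jar_id}/run",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def cancel_job_request(job_id, base_url=BASE_URL):
    """Build a PATCH /jobs/<job_id>?mode=cancel request to stop a running job."""
    return urllib.request.Request(f"{base_url}/jobs/{job_id}?mode=cancel", method="PATCH")


def list_jobs_request(base_url=BASE_URL):
    """Build a GET /jobs/overview request listing jobs and their states."""
    return urllib.request.Request(f"{base_url}/jobs/overview", method="GET")


def send(request):
    """Send a request to the cluster and decode the JSON response (needs a live cluster)."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8") or "{}")
```

With a cluster running, something like send(run_jar_request("<jar id returned by the upload>", entry_class="com.example.MyJob")) would submit the job; the class name here is purely hypothetical.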

Related

Integrate Django and ReactJS with Kafka to generate some analytical data for users?

I'm implementing a Django web service that will serve apps on several platforms: ReactJS for desktop computers, a Swift app for iOS, and a Kotlin app for Android devices. The protocol is a REST API, and if a chat feature is included, Django Channels will be used as well. The data format is JSON. For deployment, I intend to use Docker, with Django, Celery, and the ReactJS app. The database is PostgreSQL on a separate server. I want to collect some user activity data and history logs to show users what they have done so far. After hours of searching, I came across Kafka. Unfortunately, I have no idea how to use Kafka, how to integrate all of this, or how to deploy it. I wish there were a system schema for this specific kind of system that shows what is what and where is what.

Kafka will only integrate your database and Django, with some effort and ideally a separate Kafka Connect service.
From React (or other clients), you'll need to query Django API routes, which will then query your database. Kafka won't help with your frontend, and it isn't what exposes the history/activity you want to display. In other words, you could simply write that data to the database and skip Kafka entirely.
If you do use Kafka and properly separate Kafka writes from end-user / UI reads, you're essentially following the CQRS design pattern.
As for a schema that "shows what is what and where is what": it is unclear what this means, but data lineage and metadata tools are a whole separate topic. For example, LinkedIn DataHub collects this kind of information.
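To make the "write it to the database and skip Kafka" option concrete, here is a minimal Python sketch with a separate write path and read path. It uses the standard library's sqlite3 as a stand-in for PostgreSQL and the Django ORM; the table name, column names, and function names are all assumptions made for this example.

```python
import sqlite3
from datetime import datetime, timezone


def open_store(path=":memory:"):
    """Open the activity store (sqlite3 here, standing in for PostgreSQL)."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS activity ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "user_id INTEGER NOT NULL, "
        "action TEXT NOT NULL, "
        "created_at TEXT NOT NULL)"
    )
    return conn


def record_activity(conn, user_id, action):
    """Write path: called from an API view whenever the user does something."""
    conn.execute(
        "INSERT INTO activity (user_id, action, created_at) VALUES (?, ?, ?)",
        (user_id, action, datetime.now(timezone.utc).isoformat()),
    )
    conn.commit()


def activity_history(conn, user_id, limit=50):
    """Read path: what a history endpoint could return to the client, newest first."""
    rows = conn.execute(
        "SELECT action, created_at FROM activity "
        "WHERE user_id = ? ORDER BY id DESC LIMIT ?",
        (user_id, limit),
    ).fetchall()
    return [{"action": action, "created_at": created_at} for action, created_at in rows]
```

If Kafka were introduced later, record_activity is the piece that would publish to a topic instead, while activity_history would keep reading from the database.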

How to configure graphite metrics reporter for kinesis data analytics application

I am running a Flink application as part of the AWS Kinesis Data Analytics service. Flink has built-in support for metrics, and I have set up a simple counter that I can see working in the Flink dashboard.
Now I want to use Graphite to collect my metrics. According to the Flink documentation, this is possible: https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/metric_reporters/#graphite
My problem is that I cannot get the Flink application to read my configuration.
I have tried:
Creating the file conf/flink-conf.yaml alongside the Java code, but it seems to be ignored.
Passing a configuration override to StreamExecutionEnvironment.getExecutionEnvironment(configuration), but this also seems to be ignored.
How do I get metrics reported to Graphite?

I got this response from Amazon Support:
Since this is a managed service, it is not possible to change the configuration at this time.

Flink REST Monitor API

After reading this documentation: https://ci.apache.org/projects/flink/flink-docs-release-1.2/monitoring/rest_api.html, I know that I can send a POST request to start a Flink job from a savepoint.
The problem is that this REST interface is hosted in the JobManager, which is only alive if a job is already running there (I run this locally in a JVM). This seems contradictory: if I want to start a job, there is no job already running, right?
Does anyone have any clues?

Flink's monitoring API web server and the web dashboard web server are currently the same and run together on the same port. This means that the instructions for starting the web UI from inside an IDE should do the job.

angular.js integration with apache kafka

I am new to Apache Kafka and Apache Spark. I want to integrate Kafka with my AngularJS code. Basically, when a user clicks a link or searches for anything on my website, I want those searches and clicks to be captured as events and sent to the Kafka data pipeline for analytics.
My question is: how can I integrate frontend code written in AngularJS with Apache Kafka?
Can I send the search and clickstream data directly to Apache Spark through the Kafka pipeline, or do I need to send the data to Kafka and have Apache Spark poll the Kafka server and receive the data in batches?

I don't think there is a Kafka client for front-end JavaScript (at least I cannot find one at a glance). I also cannot imagine a stable setup in which millions of producers (each client's browser) write to the same Kafka topic.
What you need to do in Angular is call a server-side function that logs your events to Kafka.
The server-side code may be written in many languages, including JavaScript for Node.js.
Please take a look at the available clients in the Kafka documentation.
Update 2019: There are several projects that implement a REST proxy over HTTP(S) for the producer and consumer interfaces, for example the Kafka REST project. I have never tried these myself, though.
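The server-side step described above could look roughly like the following Python sketch: the browser posts an event to your backend, which validates it and hands it to a Kafka producer. The event fields, the allowed event types, and the function names are assumptions made for this example. The producer is passed in as a plain callable so the sketch runs without a broker; with the kafka-python package, a callable wrapping KafkaProducer.send(topic, value) would be one way to plug in a real producer.

```python
import json
import time

# Hypothetical set of event types the backend accepts from the browser.
ALLOWED_TYPES = {"click", "search"}


def build_event(event_type, payload, user_id=None, now=time.time):
    """Validate a frontend event and attach a server-side timestamp."""
    if event_type not in ALLOWED_TYPES:
        raise ValueError(f"unsupported event type: {event_type!r}")
    return {"type": event_type, "payload": payload, "user_id": user_id, "ts": now()}


def log_event(send, topic, event):
    """Serialize the event as JSON bytes and pass it to the producer callable.

    `send(topic, value)` is an assumption of this sketch. With kafka-python it
    could be wired up as:
        producer = KafkaProducer(bootstrap_servers="localhost:9092")
        send = lambda topic, value: producer.send(topic, value)
    """
    send(topic, json.dumps(event, sort_keys=True).encode("utf-8"))


if __name__ == "__main__":
    # Stand-in producer that prints instead of talking to Kafka.
    def fake_send(topic, value):
        print(topic, value.decode("utf-8"))

    log_event(fake_send, "clickstream", build_event("search", {"query": "flink"}, user_id=7))
```

Spark (or any other consumer) would then read from the topic on its own schedule; the browser never talks to Kafka directly.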

Best/Correct way to create a client-server constant listener

I am creating an app that involves sending and receiving settings. The desktop application constantly sends information to a hosted MySQL database, and the Android app queries this same information. It is similar to WhatsApp Web (but in this case, I'll be using a desktop app instead of web pages).
Up to this point, everything works as I need. However, the same Android app will also be used to send settings to the desktop app, and the desktop app will read them and change its settings accordingly.
If I have to constantly query the hosted MySQL database to check whether any changes were sent from the Android app, I believe performance will drop: on every pass of the loop I would have to query, check for modifications, and so on.
Is there a better or more correct way to do this kind of integration between two apps? I've read something about WebSockets, but I don't have much technical information about them, nor examples that I can use in this case.
Thank you very much for sharing your knowledge.

Here are some useful sites on WebSocket:
http://websocket.org
http://blog.kaazing.com/ [some useful blog posts]
http://www.html5rocks.com/en/tutorials/websockets/basics/
https://goo.gl/5OaJff [Mozilla site]
You may want to consider the Observer/Observable pattern. The MySQL database is the Observable, and your desktop app and Android app are Observers (and you can add other Observers in the future). It's a common pattern with lots of examples out there. But you'll need a centralized WebSocket server and an Observer/Observable coordination subsystem. You can set up a pub/sub message broker that uses WebSocket and offers a nice API (JMS, MQTT, etc.) to make your life easier. ActiveMQ, IBM MQ Lite, Kaazing JMS Edition... there are lots of options.
Full disclosure: I work for Kaazing.
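To show the shape of the Observer / pub-sub idea without any network code, here is a minimal in-process Python sketch. A real setup would put a WebSocket server or a message broker where this class sits; the class name, topic name, and message format are assumptions made for this example.

```python
from collections import defaultdict


class SettingsBroker:
    """Tiny in-process pub/sub hub: observers subscribe to a topic and get pushed updates."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        """Register a callback for a topic and return a function that unsubscribes it."""
        self._subscribers[topic].append(callback)

        def unsubscribe():
            self._subscribers[topic].remove(callback)

        return unsubscribe

    def publish(self, topic, message):
        """Push a message to every current subscriber; return how many were notified."""
        callbacks = list(self._subscribers[topic])
        for callback in callbacks:
            callback(message)
        return len(callbacks)


if __name__ == "__main__":
    broker = SettingsBroker()
    # The desktop app observes settings changes instead of polling the database.
    broker.subscribe("settings", lambda msg: print("desktop applies", msg))
    # The Android app publishes a change.
    broker.publish("settings", {"volume": 5})
```

The point of the design is that the desktop app is notified when something changes, so it no longer has to query the database in a loop.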