Apache Kafka - Aggregation - apache-camel

Is there any concept of aggregation in Kafka, like camel-aggregation in Apache Camel?
If yes, could anyone provide a detailed description of what it is and how to use it?

I am not familiar with Apache Camel, but you can use Kafka Streams to aggregate messages.
I'll just point to the docs for now: http://docs.confluent.io/current/streams/developer-guide.html#streams-developer-guide-dsl-aggregating
Also check out the examples: https://github.com/confluentinc/kafka-streams-examples/tree/5.3.1-post/
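As a minimal sketch of what the Kafka Streams aggregation DSL looks like, here is a per-key count (topic names `input-topic` and `counts-topic` are placeholders; this assumes kafka-streams on the classpath):

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.Produced;

public class CountAggregation {
    public static void main(String[] args) {
        StreamsBuilder builder = new StreamsBuilder();

        // Read the input topic as a stream of <key, value> records.
        KStream<String, String> input = builder.stream("input-topic");

        // Aggregate: count records per key. The result is a continuously
        // updated KTable with one count per key.
        KTable<String, Long> counts = input
                .groupByKey()
                .count();

        // Write the changelog of counts to an output topic.
        counts.toStream()
              .to("counts-topic", Produced.with(Serdes.String(), Serdes.Long()));

        // Print the resulting topology (normally you would start a
        // KafkaStreams instance with this topology instead).
        System.out.println(builder.build().describe());
    }
}
```

The same pattern works for `reduce()` and `aggregate()` when you need something other than a count.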

Related

Prometheus Query example for Flink metrics

Need some help. Does anyone have example Prometheus queries for the following metrics? These are gauge metrics, and I am not sure which operators to use for them.
flink_taskmanager_job_task_numRecordsInPerSecond
flink_jobmanager_job_fullRestarts
flink_taskmanager_job_task_isBackPressured
flink_jobmanager_job_numberOfFailedCheckpoints
flink_jobmanager_job_lastCheckpointDuration
thanks.
I think it would be good to look at some official Prometheus query examples, and the Grafana docs if you use Grafana.
Here is the query that I use:
sum by(job_name)(flink_jobmanager_job_totalNumberOfCheckpoints{job_name=~"myJobName_.+"})
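For the gauge metrics listed in the question, plain selectors with a `sum` or `max` aggregation are usually enough. These are only sketches — the `job_name` label matcher and the job name pattern depend on your setup:

```promql
# records in per second, summed across the subtasks of one job
sum by (job_name) (flink_taskmanager_job_task_numRecordsInPerSecond{job_name=~"myJobName_.+"})

# full restarts per job (the raw gauge; graph it over time to see restarts)
max by (job_name) (flink_jobmanager_job_fullRestarts)

# backpressure: the gauge is 1 while a task is backpressured
max by (task_name) (flink_taskmanager_job_task_isBackPressured)

# checkpoint health: failed-checkpoint count and last checkpoint duration
flink_jobmanager_job_numberOfFailedCheckpoints
flink_jobmanager_job_lastCheckpointDuration
```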

PyFlink datastream API support for windowing

Does Apache Flink's Python SDK (PyFlink) DataStream API support operators like windowing? All the examples I have seen so far for windowing with PyFlink use the Table API. The DataStream API does support these operators, but it looks like they are not available via PyFlink yet?
Thanks!
That's correct, PyFlink doesn't yet support the DataStream window API. Follow FLINK-21842 to track progress on this issue.
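In the meantime, windowing is available from PyFlink through the Table API / SQL, as the examples mentioned in the question suggest. A sketch of a one-minute tumbling-window count in Flink SQL (the table name `clicks` and its columns are made up; `event_time` is assumed to be a declared rowtime attribute):

```sql
SELECT
  user_id,
  TUMBLE_START(event_time, INTERVAL '1' MINUTE) AS window_start,
  COUNT(*) AS events_per_minute
FROM clicks
GROUP BY
  user_id,
  TUMBLE(event_time, INTERVAL '1' MINUTE)
```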

Migration from Mule to Apache Camel

I am trying to find out whether any tools or documentation are available for migrating from Mule to Apache Camel.
Please share if anything is available.
Thanks,
Arun K
After a quick Internet search I suspect the answer is no. Their models are probably too different, so you will need to rewrite the integrations from scratch. Note that Camel appears to favor a Java DSL approach, while Mule uses an XML DSL to configure flows.
However, if you have a significant number of integrations, it might make sense to create a translator from Mule's XML DSL to Camel's XML DSL, at least to get a base translation that can then be completed manually.

How to add Kafka as bounded source with Apache Flink 1.12 with DataStream API Batch mode

I want to use Kafka as a bounded data source with Apache Flink 1.12. I tried it using the FlinkKafkaConsumer connector, but it fails with the following error:
Caused by: java.lang.IllegalStateException: Detected an UNBOUNDED source with the 'execution.runtime-mode' set to 'BATCH'. This combination is not allowed, please set the 'execution.runtime-mode' to STREAMING or AUTOMATIC
at org.apache.flink.util.Preconditions.checkState(Preconditions.java:198) ~[flink-core-1.12.0.jar:1.12.0]
Based on the latest Flink documentation, Kafka can be used as a bounded source, but no example is provided of how to do this, nor is it mentioned whether this is the recommended approach.
Can someone help me with some working example code for this use case?
Here's an example using the newer KafkaSource connector, which (unlike FlinkKafkaConsumer) supports bounded reading:
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
env.setRuntimeMode(RuntimeExecutionMode.BATCH);

KafkaSource<String> source = KafkaSource
    .<String>builder()
    .setBootstrapServers(...)
    .setGroupId(...)
    .setTopics(...)
    .setDeserializer(...)
    .setStartingOffsets(OffsetsInitializer.earliest())
    // setBounded makes the source stop at these offsets,
    // which is what allows it to run in BATCH mode
    .setBounded(OffsetsInitializer.latest())
    .build();

env.fromSource(source, WatermarkStrategy.forMonotonousTimestamps(), "Kafka Source");
See the javadocs for more info.

Is there any way to index kafka outputs in Apache solr?

I'm new to Apache Solr and I want to index data from Kafka into Solr. Can anyone give a simple example of doing this?
The easiest way to get started on this would probably be to use Kafka Connect.
Connect is part of the Apache Kafka package, so it should already be installed on your Kafka node(s). Please refer to the quickstart for a brief introduction on how to run Connect.
For writing data to Solr there are two connectors that you could try:
https://github.com/jcustenborder/kafka-connect-solr
https://github.com/MSurendra/kafka-connect-solr
While I don't have any experience with either of them, I'd probably try Jeremy's first, based on its more recent commits and the fact that he works for Confluent.
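To give an idea of what submitting such a sink looks like, here is a rough standalone-mode config sketch. The connector class name and property keys are from memory and may not match the connector exactly — check the connector's README; the topic name and Solr URL are placeholders:

```properties
name=solr-sink
connector.class=com.github.jcustenborder.kafka.connect.solr.HttpSolrSinkConnector
tasks.max=1
topics=my-topic
solr.url=http://localhost:8983/solr
```

You would then start it with `connect-standalone worker.properties solr-sink.properties`, as shown in the Connect quickstart.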
