Creating Prometheus collectors directly in Flink? - apache-flink

I understand that Flink has its own metrics collection abstraction and supports integration with Prometheus. Due to the abstraction it doesn't support some Prometheus concepts directly. I'm trying to do something like this:
message_counter = Counter.build().name("messages").labelNames("source", "dest").register();
// Then..
void processMessage(Message m) {
    // ...
    message_counter.labels(m.source, m.dest).inc();
}
Obviously this is for situations where there is a relatively small set of sources and destinations. As far as I can tell, Flink's metrics require you to pre-register all combinations of the labels in advance, or maintain them in your own data structure and build them as necessary. That's doable, but it feels like I'm re-implementing one of the nice features of simpleclient in my own code.
Is there any way to bypass Flink's abstraction of Prometheus and instantiate counters directly? I've tried the following in combination with metrics.reporters: prom:
class FlinkMetricsExposingMapFunction extends RichMapFunction<Integer, Integer> {
    // Using Flink metrics
    private transient org.apache.flink.metrics.Counter eventCounter;

    // Using Prometheus directly
    private static io.prometheus.client.Counter message_counter =
            io.prometheus.client.Counter.build()
                    .name("messages")
                    .labelNames("source", "dest")
                    .register();

    @Override
    public Integer map(Integer value) {
        message_counter.labels("in", "out").inc();
        return value;
    }
}
The problem is that the counter is missing from the metrics endpoint (which happily serves the Flink-defined eventCounter). I examined the contents of defaultRegistry.metricFamilySamples() inside this map function and it only contains the counter I defined and nothing else.
Defining the counter using a transient and setting it in open() fails for different NPE-related reasons that I don't fully understand, but I suspect it's because Prometheus doesn't like the same counter being registered more than once, and I can't figure out or seem to guard against this happening.
Has anyone managed to get this working? Or am I completely on the wrong track?
Edit:
After banging my head against this for some time, I decided that it would be easier to re-implement Prometheus simpleclient's method for defining labels, which turns out to be fairly trivial. It would be nicer if Flink natively supported using labels in this kind of way or if this approach could be made a little more generic, but it's better than nothing.
private transient ConcurrentMap<List<String>, Counter> messageCounterMap;

private Counter labelledCounter(String source, String dest) {
    List<String> key = Arrays.asList(source, dest);
    Counter c = messageCounterMap.get(key);
    if (c != null) {
        return c;
    }
    // First time this label combination is seen: register a counter under nested groups.
    Counter c2 = getRuntimeContext().getMetricGroup().addGroup("source", source)
            .addGroup("dest", dest).counter("incoming_messages");
    Counter tmp = messageCounterMap.putIfAbsent(key, c2);
    return tmp == null ? c2 : tmp;
}

@Override
public void open(Configuration parameters) {
    this.messageCounterMap = new ConcurrentHashMap<List<String>, Counter>();
}

@Override
public Message map(Message msg) {
    labelledCounter(msg.source, msg.dest).inc();
    return msg; // pass the message through unchanged
}
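The "made a little more generic" idea can be sketched as a small cache keyed by the label values. This is only a sketch in plain Java: `LabelledMetrics` is a hypothetical helper name, and `LongAdder` stands in for a real counter type so the example runs without Flink on the classpath. In a Flink job the factory would presumably be something like `labels -> getRuntimeContext().getMetricGroup().addGroup("source", labels.get(0)).addGroup("dest", labels.get(1)).counter("incoming_messages")`, mirroring the snippet above.

```java
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.LongAdder;
import java.util.function.Function;

/** Caches one metric per label combination, creating it on first use (hypothetical helper). */
final class LabelledMetrics<M> {
    private final ConcurrentMap<List<String>, M> metrics = new ConcurrentHashMap<>();
    private final Function<List<String>, M> factory;

    LabelledMetrics(Function<List<String>, M> factory) {
        this.factory = factory;
    }

    /** Returns the metric for these label values; the factory runs at most once per combination. */
    M labels(String... labelValues) {
        // List.of copies the array into an immutable list with value-based equals/hashCode,
        // so it is safe to use as a map key. Null label values are rejected.
        return metrics.computeIfAbsent(List.of(labelValues), factory);
    }

    int size() {
        return metrics.size();
    }

    public static void main(String[] args) {
        // LongAdder is only a stand-in for a real counter type.
        LabelledMetrics<LongAdder> messages = new LabelledMetrics<>(labels -> new LongAdder());
        messages.labels("in", "out").increment();
        messages.labels("in", "out").increment();
        messages.labels("in", "dead-letter").increment();
        System.out.println(messages.labels("in", "out").sum()); // prints 2
        System.out.println(messages.size());                    // prints 2
    }
}
```

Compared with the get/putIfAbsent pair, `computeIfAbsent` avoids creating a second metric when two threads race on the same key, which matters if registering the same metric twice has side effects.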

Related

Flink integration test(s) with Testcontainers

I have a simple Apache Flink job that looks very much like this:
public final class Application {
public static void main(final String... args) throws Exception {
final var env = StreamExecutionEnvironment.getExecutionEnvironment();
final var executionConfig = env.getConfig();
final var params = ParameterTool.fromArgs(args);
executionConfig.setGlobalJobParameters(params);
executionConfig.setParallelism(params.getInt("application.parallelism"));
final var source = KafkaSource.<CustomKafkaMessage>builder()
.setBootstrapServers(params.get("application.kafka.bootstrap-servers"))
.setGroupId(params.get("application.kafka.consumer.group-id"))
// .setStartingOffsets(OffsetsInitializer.committedOffsets(OffsetResetStrategy.EARLIEST))
.setStartingOffsets(OffsetsInitializer.earliest())
.setTopics(params.get("application.kafka.listener.topics"))
.setValueOnlyDeserializer(new MessageDeserializationSchema())
.build();
env.fromSource(source, WatermarkStrategy.noWatermarks(), "custom.kafka-source")
.uid("custom.kafka-source")
.rebalance()
.flatMap(new CustomFlatMapFunction())
.uid("custom.flatmap-function")
.filter(new CustomFilterFunction())
.uid("custom.filter-function")
.addSink(new CustomDiscardSink()) // Will be a Kafka sink in the future
.uid("custom.discard-sink");
env.execute(params.get("application.job-name"));
}
}
The problem is that I would like to provide an integration test for the entire application, something like an end-to-end test (or set of tests) for the whole job. I'm using Testcontainers, but I'm not really sure how to move forward with this. For instance, this is what the test looks like (for now):
@Testcontainers
final class ApplicationTest {
    private static final DockerImageName DOCKER_IMAGE = DockerImageName.parse("confluentinc/cp-kafka:7.0.1");

    @Container
    private static final KafkaContainer KAFKA_CONTAINER = new KafkaContainer(DOCKER_IMAGE);

    @ClassRule // How come this work in JUnit Jupiter? :/
    public static MiniClusterResource cluster;

    @BeforeAll
    static void init() {
        KAFKA_CONTAINER.start();
        // ...probably need to wait and create the topic(s) as well
        final var config = new MiniClusterResourceConfiguration.Builder().setNumberSlotsPerTaskManager(2)
                .setNumberTaskManagers(1)
                .build();
        cluster = new MiniClusterResource(config);
    }

    @Test
    void main() throws Exception {
        // new Application(); // ...what's next?
    }
}
I'm not sure how to implement what's required to trigger the job as-is from that point on. Basically, I would like to execute what was defined before, without (almost) any modifications — I've seen plenty of examples that practically build the entire job again, so that's not an option.
Can somebody provide any pointers here?
MessageDeserializationSchema is unbounded, so isEndOfStream returns false. Not sure if that's an impediment.
In order to make the pipeline more testable, I suggest you create a method on your Application class that takes a source and a sink as parameters, and creates and executes the pipeline, using those connectors.
In your tests you can call that method with special sources and sinks that you use for testing. In particular, you will want to use a KafkaSource that uses .setBounded(...) in the tests so that it cleanly handles just the range of data intended for the test(s).
The solutions and tests for the Apache Flink training exercises are organized along these lines; for example, see RideCleansingSolution.java and RideCleansingIntegrationTest.java. These examples don't use Kafka or Testcontainers, but hopefully they'll still be helpful.
I would suggest you approach this as an opaque-box test by interacting with your application through its public API. This can be done either as an out-of-process test (e.g. by running your application in a container as well, using Testcontainers) or as an in-process test (by creating your Application and calling its main() method).
Now, in your comments you explained that you want to check for the side effects of interacting with your application (Kafka messages being published). To check this, connect to the KafkaContainer with your own KafkaConsumer from within the test and use a library such as Awaitility to wait until the messages have been received.
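The "wait until the messages have been received" step boils down to polling a condition with a timeout. The following is a hand-rolled stand-in for what a library like Awaitility provides, not Awaitility's own API: `Poll.until` is a hypothetical helper, and a background thread filling a list stands in for a KafkaConsumer collecting records from the container.

```java
import java.time.Duration;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.TimeoutException;
import java.util.function.BooleanSupplier;

/** Minimal poll-with-timeout helper (hypothetical; a stand-in for a library such as Awaitility). */
final class Poll {
    /** Re-checks the condition every interval until it holds, or fails once the timeout has passed. */
    static void until(BooleanSupplier condition, Duration timeout, Duration interval)
            throws InterruptedException, TimeoutException {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (!condition.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) {
                throw new TimeoutException("condition not met within " + timeout);
            }
            Thread.sleep(interval.toMillis());
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for records collected by a consumer polling the test topic.
        List<String> received = new CopyOnWriteArrayList<>();
        Thread producer = new Thread(() -> {
            try {
                Thread.sleep(100); // simulate the job taking a moment to publish
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            received.add("hello");
        });
        producer.start();

        Poll.until(() -> !received.isEmpty(), Duration.ofSeconds(5), Duration.ofMillis(20));
        System.out.println(received); // prints [hello]
    }
}
```

In a real test the condition would poll the consumer and append whatever it returns to the list, and the assertion on the collected messages would follow the wait.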

How to create a fan-out DeferredTask in Google App Engine - JAVA

I have a Java App Engine project and I am using DeferredTasks for push queues.
/** A hypothetical expensive operation we want to defer on a background task. */
public static class ExpensiveOperation implements DeferredTask {
#Override
public void run() {
System.out.println("Doing an expensive operation...");
// expensive operation to be backgrounded goes here
}
}
I want to be able to create multiple shards of a DeferredTask to get more throughput. Basically, I want to run one DeferredTask that then runs many more DeferredTasks (up to 1,000 of them). Essentially a fan-out task. How can I do that?
One issue is that when creating tasks you need to specify the name of them in the queue.yaml file. But if I want to have 1,000 tasks, do I really need to specify 1,000 of them in that file? It would get very tedious to write out "task-1", "task-2", etc.
Is there a better way to do this?
This is usually done by specifying a shard parameter for each task and reusing the same queue. As noted in your example, the entire Java object is serialized with DeferredTask. So you can simply pass in any values you want in a constructor. E.g.
public static class ShardedOperation implements DeferredTask {
    private final int shard;

    public ShardedOperation(int shard) {
        this.shard = shard;
    }
}
...
// run() of the task that fans out
@Override
public void run() {
    System.out.println("Fanning out an expensive operation...");
    Queue queue = QueueFactory.getDefaultQueue();
    for (int i = 0; i < 1000; ++i) {
        // each task carries its own shard number in its serialized payload
        queue.add(TaskOptions.Builder.withPayload(new ShardedOperation(i)));
    }
}
This matches the section you linked to https://cloud.google.com/appengine/docs/standard/java/taskqueue/push/creating-tasks#using_the_instead_of_a_worker_service where the default queue is used.
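To see why constructor arguments are enough, here is a minimal sketch of a task object surviving a serialization round trip with its shard number intact. It uses plain `java.io` serialization as a stand-in for whatever the task queue does with the payload, and `ShardedWork`, `serialize` and `deserialize` are hypothetical names. The class implements Runnable and Serializable directly, which, as far as I know, is what DeferredTask extends.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

/** A task that carries its shard number as serialized state (stand-in for a DeferredTask). */
final class ShardedWork implements Runnable, Serializable {
    private static final long serialVersionUID = 1L;
    private final int shard;
    private final int totalShards;

    ShardedWork(int shard, int totalShards) {
        this.shard = shard;
        this.totalShards = totalShards;
    }

    int shard() {
        return shard;
    }

    int totalShards() {
        return totalShards;
    }

    @Override
    public void run() {
        System.out.println("Processing shard " + shard + " of " + totalShards);
    }

    /** Turns the task into bytes, roughly what enqueueing a payload has to do. */
    static byte[] serialize(Serializable task) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(task);
        }
        return bytes.toByteArray();
    }

    /** Rebuilds the task from bytes, roughly what the worker does before calling run(). */
    static Object deserialize(byte[] payload) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(payload))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] payload = serialize(new ShardedWork(7, 1000));
        Runnable restored = (Runnable) deserialize(payload);
        restored.run(); // prints "Processing shard 7 of 1000"
    }
}
```

Because the fields travel with the object, a single queue can hold all 1,000 tasks and no per-task configuration is needed.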

Reading file that is being appended in Flink

We have a legacy application that writes its results as records to some local files. We want to process these records in real time, so we are planning to use Flink as the engine. I know that I can read text files using StreamExecutionEnvironment#readFile. It seems that we need something similar to PROCESS_CONTINUOUSLY there, but this flag causes the whole file to be reprocessed on each change, which is not what we want here.
Of course, I can write a custom source that saves the number of records per file in its state. But I suspect there might be some problem with such an approach, with checkpointing or something else. My reasoning is that if it were easy to implement reliably, it would already have been implemented in Flink.
Any tips / suggestions how to approach this?
You can do this rather easily with a custom source, so long as you are content to be reading from a single file (per source instance). You will need to use operator state and implement checkpointing. The state handling and checkpointing will look something like this:
public class CheckpointedFileSource implements SourceFunction<Event>, ListCheckpointed<Long> {
    // number of events emitted so far; this is the state that gets checkpointed
    private long eventCnt = 0;
    private volatile boolean running = true;

    @Override
    public void run(SourceContext<Event> sourceContext) throws Exception {
        final Object lock = sourceContext.getCheckpointLock();
        // skip over previously emitted events
        ...
        while (running) {
            // read the next event (and its timestamp) from the file
            ...
            synchronized (lock) {
                // update the count and emit under the checkpoint lock so that
                // a snapshot never sees one without the other
                eventCnt++;
                sourceContext.collectWithTimestamp(event, timestamp);
            }
        }
    }

    @Override
    public void cancel() {
        running = false;
    }

    @Override
    public List<Long> snapshotState(long checkpointId, long checkpointTimestamp) throws Exception {
        return Collections.singletonList(eventCnt);
    }

    @Override
    public void restoreState(List<Long> state) throws Exception {
        for (Long s : state) {
            this.eventCnt = s;
        }
    }
}
For a complete example see the checkpointed taxi ride data source used in the Flink training exercises. You’ll have to adapt it a bit, since it’s designed to read a static file, rather than one that is being appended to.
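The file-reading part that the skeleton leaves out could look roughly like the sketch below. Note that it swaps in a different bookkeeping technique: it tracks a byte offset rather than an event count, so resuming is a seek instead of re-reading and skipping. `TailReader` is a hypothetical helper, and it assumes newline-terminated UTF-8 records, a file that is only ever appended to, and an unread tail small enough to fit in memory. The offset is the value that would go into operator state.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;

/** Reads only the complete lines appended since the last call (hypothetical helper). */
final class TailReader {
    private long offset; // bytes consumed so far; this is the value to checkpoint

    TailReader(long restoredOffset) {
        this.offset = restoredOffset;
    }

    long getOffset() {
        return offset;
    }

    List<String> readNewLines(Path file) throws IOException {
        List<String> lines = new ArrayList<>();
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            long length = raf.length();
            if (length <= offset) {
                return lines; // nothing new yet
            }
            byte[] buf = new byte[(int) (length - offset)];
            raf.seek(offset);
            raf.readFully(buf);
            int start = 0;
            for (int i = 0; i < buf.length; i++) {
                if (buf[i] == '\n') {
                    lines.add(new String(buf, start, i - start, StandardCharsets.UTF_8));
                    start = i + 1;
                }
            }
            // Advance only past complete lines; a half-written record is re-read next time.
            offset += start;
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("records", ".log");
        Files.write(file, "first\nsecond\nthi".getBytes(StandardCharsets.UTF_8));
        TailReader reader = new TailReader(0);
        System.out.println(reader.readNewLines(file)); // prints [first, second]

        Files.write(file, "rd\n".getBytes(StandardCharsets.UTF_8), StandardOpenOption.APPEND);
        // Simulate a restart: a new reader resumes from the saved offset.
        TailReader restored = new TailReader(reader.getOffset());
        System.out.println(restored.readNewLines(file)); // prints [third]
        Files.delete(file);
    }
}
```

Inside a source function, the emit and the offset update would both need to happen under the checkpoint lock, just as the event count does in the skeleton above.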

Flink streaming job is not scaling as expected

We are in the middle of testing Flink's ability to scale, but we found that scaling is not working, no matter whether we add more slots or more Task Managers. We would expect linear, or at least close-to-linear, scaling, but the results even show degradation. Any comments would be appreciated.
Test details:
- VMware vSphere
- A simple pass-through test:
  - auto-generated source of 3 million records, 1 KB each, parallelism = 1
  - the source feeds a map operator that just returns the same record and sends a counter to StatsD; parallelism is 2, 4, or 6 depending on the case
- 3 TMs with 6 slots in total (2 per TM); each JM/TM has 32 vCPUs and 100 GB of memory
Result:
2 slots: 26 seconds, 3mil/26=115k TPS
4 slots: 23 seconds, 3mil/23=130k TPS
6 slots: 22 seconds, 3mil/22=136k TPS
As shown, there is almost no scaling. Any clue? Thanks.
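To put numbers on "almost no scaling", the timings above can be turned into speedup and parallel efficiency relative to the 2-slot run. This is just arithmetic on the figures reported in the question; `ScalingMath` and its method names are made up for the illustration.

```java
/** Speedup and efficiency arithmetic for the reported timings (hypothetical helper). */
final class ScalingMath {
    /** Records processed per second. */
    static double throughput(long records, double seconds) {
        return records / seconds;
    }

    /** How many times faster a run is than the baseline run. */
    static double speedup(double baselineSeconds, double seconds) {
        return baselineSeconds / seconds;
    }

    /** Measured speedup divided by the ideal (linear) speedup for the added slots. */
    static double efficiency(int baselineSlots, double baselineSeconds, int slots, double seconds) {
        double ideal = (double) slots / baselineSlots;
        return speedup(baselineSeconds, seconds) / ideal;
    }

    public static void main(String[] args) {
        long records = 3_000_000L;
        int[] slots = {2, 4, 6};
        double[] seconds = {26, 23, 22};
        for (int i = 0; i < slots.length; i++) {
            System.out.printf("%d slots: %.0f records/s, speedup %.2fx, efficiency %.0f%%%n",
                    slots[i],
                    throughput(records, seconds[i]),
                    speedup(seconds[0], seconds[i]),
                    100 * efficiency(slots[0], seconds[0], slots[i], seconds[i]));
        }
    }
}
```

Going from 2 to 6 slots triples the parallelism but yields a speedup of only about 1.18x (roughly 39% efficiency), which is the pattern you would expect when a stage that does not scale, such as the parallelism-1 source, dominates the run time.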
You really should be using a RichParallelSourceFunction. If you care about making the records from different instances of the source distinct, you can get ahold of each instance's index from the RuntimeContext, which is available via the getRuntimeContext() method in the RichFunction interface.
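One way a parallel source could split the 3 million generated records among its instances is by striding over the record index. The sketch below is plain Java rather than Flink code: `SubtaskSplit` is a hypothetical helper, and the subtask index and parallelism are ordinary parameters here, whereas in a Flink job they would come from the RuntimeContext (getIndexOfThisSubtask() and getNumberOfParallelSubtasks() in the Flink versions I'm familiar with).

```java
import java.util.ArrayList;
import java.util.List;

/** Splits a range of record indices among parallel source instances (hypothetical helper). */
final class SubtaskSplit {
    /** Record indices that instance subtaskIndex out of parallelism instances would generate. */
    static List<Long> indicesFor(int subtaskIndex, int parallelism, long totalRecords) {
        List<Long> indices = new ArrayList<>();
        // Start at this instance's own index and step by the number of instances,
        // so every record index is produced by exactly one instance.
        for (long i = subtaskIndex; i < totalRecords; i += parallelism) {
            indices.add(i);
        }
        return indices;
    }

    public static void main(String[] args) {
        int parallelism = 3;
        for (int subtask = 0; subtask < parallelism; subtask++) {
            System.out.println("subtask " + subtask + " -> " + indicesFor(subtask, parallelism, 10));
        }
        // prints:
        // subtask 0 -> [0, 3, 6, 9]
        // subtask 1 -> [1, 4, 7]
        // subtask 2 -> [2, 5, 8]
    }
}
```

A real source would emit each record inside the loop instead of collecting the indices into a list.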
Also, Flink has a built-in statsd metrics reporter that you should be using instead of rolling your own. Moreover, numRecordsIn, numRecordsOut, numRecordsInPerSecond, and numRecordsOutPerSecond are already being computed for you, so no need to create this instrumentation yourself. You can also access these metrics via Flink's web interface, or the REST API.
As for why you might be experiencing poor scalability with the Kafka consumer, there are many things that could cause this. If you are using event time processing, then idle partitions could be holding things up (see https://issues.apache.org/jira/browse/FLINK-5479). If the stream is keyed, then data skew could be an issue. If you are connecting to an external database or service, then it could easily be a bottleneck. If checkpointing is misconfigured it could cause this. Or insufficient network capacity.
I would start to debug this by looking at some key metrics in the Flink web UI. Is the load well balanced across the sub-tasks, or is it skewed? You could turn on latency tracking and see if one of the Kafka partitions is misbehaving (by inspecting the latency at the sink(s), which will be reported on a per-partition basis). And you could look for back pressure.
Please refer to the sample code:
public class passthru extends RichMapFunction<String, String> {
public void open(Configuration configuration) throws Exception {
... ...
stats = new NonBlockingStatsDClient();
}
public String map(String value) throws Exception {
... ...
stats.increment();
return value;
}
}
public class datagen extends RichSourceFunction<String> {
... ...
public void run(SourceContext<String> ctx) throws Exception {
int i = 0;
while (run){
String idx = String.format("%09d", i);
ctx.collect("{\"<a 1kb json content with idx in certain json field>\"}");
i++;
if(i == loop)
run = false;
}
}
... ...
}
public class Job {
public static void main(String[] args) throws Exception {
... ...
DataStream<String> stream = env.addSource(new datagen(loop)).rebalance();
DataStream<String> convert = stream.map(new passthru(statsdUrl));
env.execute("Flink");
}
}
The ReducingState code:
dataStream.flatMap(xxx).keyBy(new KeySelector<xxx, AggregationKey>() {
public AggregationKey getKey(rec r) throws Exception {
... ...
}
}).process(new Aggr());
public class Aggr extends ProcessFunction<rec, rec> {
private ReducingState<rec> store;
public void open(Configuration parameters) throws Exception {
store= getRuntimeContext().getReducingState(new ReducingStateDescriptor<>(
"reduction store", new ReduceFunction<rec>() {
... ...
}
public void processElement(rec r, Context ctx, Collector<rec> out)
throws Exception {
... ...
store.add(r);

getting numOfRecordsIn using counters in Flink

I want to show numRecordsIn for an operator in Flink, and to do this I have been following a presentation (PPT) by data Artisans, linked here. The code for the counter is given below:
public static class mapper extends RichMapFunction<String, String> {
    public Counter counter;

    @Override
    public void open(Configuration parameters) throws Exception {
        super.open(parameters);
        // register the counter in this operator's metric group
        this.counter = getRuntimeContext()
                .getMetricGroup()
                .counter("numRecordsIn");
    }

    @Override
    public String map(String s) throws Exception {
        counter.inc();
        System.out.println("counter val " + counter.getCount());
        return s;
    }
}
The problem is: how do I specify which operator I want to show numRecordsIn for?
Metric counters are exposed via Flink's metric system. To look at them, you have to configure a metric reporter. A description of how to register a metric reporter can be found here.
Flink includes a number of built-in metrics, including numRecordsIn. So if that's what you want to measure, there's no need to write any code to implement that particular measurement. Similarly for numRecordsInPerSecond, and a host of others.
The code you asked about causes the numRecordsIn counter to be incremented for the operator in which the metric is being used.
A good way to better understand the metrics system is to bring up a simple streaming job and look at the metrics in Flink's web UI. I also found it really helpful to query the monitoring REST API while a job was running.
