Can async subscriber example lose messages? - google-cloud-pubsub

I am just starting out with Pub/Sub. While reading the Google Cloud documentation, I ran into a code snippet, and I think I see a flaw in the example.
This is the code I am talking about. It uses the async subscriber.
    public class SubscriberExample {
      private static final String PROJECT_ID = ServiceOptions.getDefaultProjectId();
      // Messages are handed from the receiver callback to the main thread through this queue.
      private static final BlockingQueue<PubsubMessage> messages = new LinkedBlockingDeque<>();

      static class MessageReceiverExample implements MessageReceiver {
        @Override
        public void receiveMessage(PubsubMessage message, AckReplyConsumer consumer) {
          messages.offer(message);
          consumer.ack(); // acknowledged before the message has been processed
        }
      }

      public static void main(String... args) throws Exception {
        String subscriptionId = args[0];
        ProjectSubscriptionName subscriptionName = ProjectSubscriptionName.of(
            PROJECT_ID, subscriptionId);
        Subscriber subscriber = null;
        try {
          subscriber =
              Subscriber.newBuilder(subscriptionName, new MessageReceiverExample()).build();
          subscriber.startAsync().awaitRunning();
          while (true) {
            PubsubMessage message = messages.take();
            // ... process the message ...
          }
        } finally {
          if (subscriber != null) {
            subscriber.stopAsync();
          }
        }
      }
    }
My question is: what if a bunch of messages have been acknowledged, the BlockingQueue is not empty, and the server crashes? Then I would lose some messages, right? (They are acknowledged in Pub/Sub but never actually processed.)
Wouldn't the best implementation be to acknowledge a message only after it has been processed, instead of acknowledging it, leaving it on a queue, and assuming it will be processed? I understand that the queue decouples receiving messages from processing them and potentially increases throughput, but it still risks losing messages, right?

Yes, one should not acknowledge a message until it has been fully processed. Otherwise, the message may never be processed because it will not be redelivered in the event of a crash or restart if it was acknowledged. I have entered an issue to update the example.
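A minimal sketch of the ack-after-processing order, written in plain Java so that it stands alone. `AckReply` is a hypothetical stand-in for the client library's `AckReplyConsumer`, and `process` and `RecordingReply` are placeholders invented for illustration; none of them come from the Pub/Sub library.

```java
import java.util.ArrayList;
import java.util.List;

final class AckAfterProcessing {
    /** Hypothetical stand-in for the client library's AckReplyConsumer. */
    interface AckReply {
        void ack();
        void nack();
    }

    /** Test double that records which reply was sent. */
    static final class RecordingReply implements AckReply {
        boolean acked;
        boolean nacked;
        @Override public void ack() { acked = true; }
        @Override public void nack() { nacked = true; }
    }

    /** Payloads that were fully processed. */
    static final List<String> processed = new ArrayList<>();

    /** Placeholder for the real work; rejects empty payloads to simulate a failure. */
    static void process(String payload) {
        if (payload.isEmpty()) {
            throw new IllegalArgumentException("empty payload");
        }
        processed.add(payload);
    }

    /** Acks only after processing has finished; nacks on failure so the message can be redelivered. */
    static void receiveMessage(String payload, AckReply reply) {
        try {
            process(payload);
            reply.ack();
        } catch (RuntimeException e) {
            reply.nack();
        }
    }

    public static void main(String[] args) {
        RecordingReply ok = new RecordingReply();
        receiveMessage("hello", ok);
        RecordingReply failed = new RecordingReply();
        receiveMessage("", failed);
        System.out.println("acked=" + ok.acked + ", nacked=" + failed.nacked);
    }
}
```

With this ordering, a crash between receipt and the end of processing leaves the message unacknowledged, so it becomes eligible for redelivery instead of being lost.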


Flink - Need way to notify one stream from another

I have an Apache Flink use case that works as follows:
I have data events coming in through the first stream. Part of each event is a foreign key for which I expect data from the second stream. E.g.: I am getting data for major cities in the first stream, each event carrying a city code, and I need the average temperature over time for that city code streamed through the second stream. It is not possible to have temperatures streamed for all possible cities; we have to request each city for which we need the data.
So we need some way to "notify" the second stream source that we need data for this city "pushed" when we encounter it the first time in the first stream.
This would have been easy if this notification could be done from the first stream. The problem is that the second stream is coming to us through a websocket part of which is a control channel through which we have to make the request - so the request HAS to be made from the second stream.
1. Check event in the first stream. Read city code x.
2. Have we seen this city code? If not, notify the second stream that we need data for city code x.
3. Second stream sends a message to the source requesting data for x.
4. Data starts flowing in for city x, which is used to join downstream.
If notification from the first stream were possible, this would be easy: I could have done it from step 2, so that data starts flowing in the second stream. But that is not possible, as the request needs to be sent on the same websocket connection that feeds the second stream.
I have explored using CoProcessFunction or RichCoMapFunction for this, but it is not clear how this can be done. I have seen some examples of the Broadcast State Pattern, but even that does not seem to fit the use case.
Can someone help me with some pointers on possible solutions?
So I made it work using the suggestion of the side output stream. Thanks #whatisinthename and #kkrugler for the suggestions.
Still trying to figure out details, but here's a summary
From the notification stream (stream 1), create a side output stream (stream 1-1).
Use an extended class (TempRequester) of KeyedProcessFunction, to process the side output stream 1-1 and create Stream 2 from it. The KeyedProcessFunction has the websocket connection.
In the open method of the KeyedProcessFunction create the connection to websocket (handshaking etc.). Have a ListState state to keep the list of city codes.
In the processElement function of TempRequester, check the city code coming in from side output stream 1-1. If present in ListState, do nothing. Else, send a message through websocket control channel and request city data and add the code to ListState. Create a process timer (this is one time) to fire after 500 milliseconds or so. The websocket server writes the temp data very frequently and that is saved in a queue.
In the onTimer method, check the queue, read the data and push out (out.collect...). Create a timer again. So essentially, once the first city code gets in, we create a timer that runs every 500 milliseconds and dumps the records received out into the second stream.
Now the first and second streams can be joined downstream (I used the table API).
Not sure if this is the most elegant solution, but it worked. Thanks for the suggestions.
Here's the approximate main code:
    DataStream<Event> notificationStream = ...;
    final OutputTag<String> outputTag = new OutputTag<String>("cities-seen"){};

    SingleOutputStreamOperator<Event> mainDataStream = notificationStream.process(new ProcessFunction<Event, Event>() {
        @Override
        public void processElement(
                Event value,
                Context ctx,
                Collector<Event> out) throws Exception {
            // emit data to regular output
            out.collect(value);
            // emit data to side output
            ctx.output(outputTag, value.cityCode);
        }
    });

    DataStream<String> sideOutputStream = mainDataStream.getSideOutput(outputTag);
    DataStream<TemperatureData> temperatureStream = sideOutputStream
            .keyBy(value -> value)
            .process(new TempRequester());
    // set up the Java Table API and rest of SQL based joins ...
And the approximate code for TempRequester (ProcessFunction):
    public static class TempRequester extends KeyedProcessFunction<String, String, TemperatureData> {
        private ListState<String> allCities;
        private volatile boolean running = true;
        // This is the queue for requesting city codes
        private BlockingQueue<String> messagesToSend = new ArrayBlockingQueue<>(100);
        // This is the queue for receiving temperature data
        private ConcurrentLinkedQueue<TemperatureData> messages = new ConcurrentLinkedQueue<TemperatureData>();
        private static final int TIMEOUT = 500;
        private boolean initialized = false;

        @Override
        public void open(Configuration parameters) throws Exception {
            allCities = getRuntimeContext().getListState(new ListStateDescriptor<>("List of cities seen", String.class));
            ... rest of websocket client setup code ...
        }

        @Override
        public void close() throws Exception {
            running = false;
        }

        @Override
        public void processElement(String cityCode, Context ctx, Collector<TemperatureData> collector) throws Exception {
            boolean citycodeFound = StreamSupport.stream(allCities.get().spliterator(), false)
                    .anyMatch(s -> cityCode.equals(s));
            if (!citycodeFound) {
                messagesToSend.put(.. add city code ..);
            }
            if (!initialized) {
                // ctx.timestamp() may be null here, so base the timer on the current processing time
                ctx.timerService().registerProcessingTimeTimer(ctx.timerService().currentProcessingTime() + TIMEOUT);
                initialized = true;
            }
        }

        @Override
        public void onTimer(long timestamp, OnTimerContext ctx, Collector<TemperatureData> out) throws Exception {
            TemperatureData p;
            while ((p = messages.poll()) != null) {
                out.collect(p);
            }
            // re-arm the timer so the queue is drained again after TIMEOUT ms
            ctx.timerService().registerProcessingTimeTimer(ctx.timestamp() + TIMEOUT);
        }
    }
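The request-once-then-drain logic described above can be sketched without any Flink or websocket dependency. This is only an illustration under stated assumptions: `RequestOnceAndDrain` and its method names are hypothetical, a `HashSet` plays the role of the `ListState`, a list stands in for the websocket control channel, and `onTimer` here is an ordinary method rather than Flink's callback.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Queue;
import java.util.Set;
import java.util.concurrent.ConcurrentLinkedQueue;

final class RequestOnceAndDrain {
    // Plays the role of the ListState of city codes already requested.
    private final Set<String> requestedCities = new HashSet<>();
    // Stands in for the websocket control channel.
    private final List<String> sentRequests = new ArrayList<>();
    // Filled by the (hypothetical) websocket receiver thread.
    private final Queue<String> received = new ConcurrentLinkedQueue<>();

    /** Returns true if this call sent a request, false if the city was already known. */
    boolean onCityCode(String cityCode) {
        if (!requestedCities.add(cityCode)) {
            return false;
        }
        sentRequests.add(cityCode);
        return true;
    }

    /** Called when the server pushes a reading. */
    void onWebsocketData(String reading) {
        received.add(reading);
    }

    /** Called on each timer tick; drains whatever has arrived since the last tick. */
    List<String> onTimer() {
        List<String> out = new ArrayList<>();
        String reading;
        while ((reading = received.poll()) != null) {
            out.add(reading);
        }
        return out;
    }

    List<String> sentRequests() {
        return sentRequests;
    }
}
```

A set makes the "have we seen this city code" check a single lookup, whereas scanning a list grows with the number of cities.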

Google Cloud PubSub send the message to more than one consumer (in the same subscription)

I have a Java Spring Boot 2 application (app1) that sends messages to a Google Cloud Pub/Sub topic (it is the publisher).
Another Java Spring Boot 2 application (app2) is subscribed to a subscription to receive those messages. In this case I have more than one instance (k8s auto-scaling is enabled), so more than one pod of this app consumes messages from Pub/Sub.
Some messages are consumed by one instance of app2, but many others are sent to more than one app2 instance, so those messages are processed more than once.
Here is the code of consumer (app2):
private final static int ACK_DEAD_LINE_IN_SECONDS = 30;
private static final long POLLING_PERIOD_MS = 250L;
private static final int WINDOW_MAX_SIZE = 1000;
private static final Duration WINDOW_MAX_TIME = Duration.ofSeconds(1L);
private PubSubAdmin pubSubAdmin;
public ApplicationRunner runner(PubSubReactiveFactory reactiveFactory) {
return args -> {
createSubscription("subscription-id", "topic-id", ACK_DEAD_LINE_IN_SECONDS);
reactiveFactory.poll(subscription, POLLING_PERIOD_MS) // Poll the PubSub periodically
.map(msg -> Pair.of(msg, getMessageValue(msg))) // Extract the message as a pair
.bufferTimeout(WINDOW_MAX_SIZE, WINDOW_MAX_TIME) // Create a buffer of messages to bulk process
.flatMap(this::processBuffer) // Process the buffer
.doOnError(e -> log.error("Error processing event window", e))
private void createSubscription(String subscriptionName, String topicName, int ackDeadline) {
try {
pubSubAdmin.createSubscription(subscriptionName, topicName, ackDeadline);
} catch (AlreadyExistsException e) {
log.info("Pubsub subscription '{}' already configured for topic '{}': {}", subscriptionName, topicName, e.getMessage());
private Flux<Void> processBuffer(List<Pair<AcknowledgeablePubsubMessage, PreparedRecordEvent>> msgsWindow) {
return Flux.fromStream(
.collect(Collectors.groupingBy(msg -> msg.getRight().getData())) // Group the messages by same data
private Mono<Void> processDataBuffer(List<Pair<AcknowledgeablePubsubMessage, PreparedRecordEvent>> dataMsgsWindow) {
return processData(
.doOnSuccess(it ->
dataMsgsWindow.forEach(msg -> {
log.info("Mark msg ACK");
.doOnError(e -> {
log.error("Error on PreparedRecordEvent event", e);
dataMsgsWindow.forEach(msg -> {
log.error("Mark msg NACK");
private Mono<Void> processData(Data data, Set<Record> records) {
// For each message, make calculations over the records associated to the data
final DataQuality calculated = calculatorService.calculateDataQualityFor(data, records); // Arithmetic calculations
return this.daasClient.updateMetrics(calculated) // Update DB record with a DaaS to wrap DB access
.flatMap(it -> {
if (it.getProcessedRows() >= it.getValidRows()) {
return finish(data);
return Mono.just(data);
private Mono<Data> finish(Data data) {
return dataClient.updateStatus(data.getId(), DataStatus.DONE) // Update DB record with a DaaS to wrap DB access
.doOnSuccess(updatedData -> pubSubClient.publish(
new Qa0DonedataEvent(updatedData) // Publish a new event in another topic
.doOnError(err -> {
log.error("Error finishing data");
I need each message to be consumed by one and only one app2 instance. Does anybody know if this is possible? Any ideas on how to achieve it?
Maybe the right way is to create one subscription for each app2 instance and configure the topic to send each message to exactly one subscription instead of to every one. Is that possible?
According to the official documentation, once a message is sent to a subscriber, Pub/Sub tries not to deliver it to any other subscriber on the same subscription (the app2 instances are subscribers of the same subscription):
Once a message is sent to a subscriber, the subscriber should
acknowledge the message. A message is considered outstanding once it
has been sent out for delivery and before a subscriber acknowledges
it. Pub/Sub will repeatedly attempt to deliver any message that has
not been acknowledged. While a message is outstanding to a subscriber,
however, Pub/Sub tries not to deliver it to any other subscriber on
the same subscription. The subscriber has a configurable, limited
amount of time -- known as the ackDeadline -- to acknowledge the
outstanding message. Once the deadline passes, the message is no
longer considered outstanding, and Pub/Sub will attempt to redeliver
the message
In general, Cloud Pub/Sub has at-least-once delivery semantics. That means it is possible for messages that have already been acked to be redelivered, and for multiple subscribers on the same subscription to receive the same message. These two cases should be relatively rare for a well-behaved subscriber, but without keeping track of the IDs of all messages delivered across all subscribers, it is not possible to guarantee that there won't be duplicates.
If it is happening with some frequency, it would be good to check if your messages are getting acknowledged within the ack deadline. You are buffering messages for 1s, which should be relatively small compared to your ack deadline of 30s, but it also depends on how long the messages ultimately take to process. For example, if the buffer is being processed in sequential order, it could be that the later messages in your 1000-message buffer aren't being processed in time. You could look at the subscription/expired_ack_deadlines_count metric in Cloud Monitoring to determine if it is indeed the case that your acks for messages are late. Note that late acks for even a small number of messages could result in more duplicates. See the "Message Redelivery & Duplication Rate" section of the Fine-tuning Pub/Sub performance with batch and flow control settings post.
OK, after running tests, reading the documentation, and reviewing the code, I found a "small" error in it.
We had an incorrect "retry" in the "processDataBuffer" method. When an error happened, the messages in the buffer were marked as NACK, so they were delivered to another instance; but because of the retry, they were also processed again on the original instance, this time successfully, so they were marked as ACK as well.
As a result, some of them were processed twice.
private Mono<Void> processDataBuffer(List<Pair<AcknowledgeablePubsubMessage, PreparedRecordEvent>> dataMsgsWindow) {
return processData(
.doOnSuccess(it ->
dataMsgsWindow.forEach(msg -> {
log.info("Mark msg ACK");
.doOnError(e -> {
log.error("Error on PreparedRecordEvent event", e);
dataMsgsWindow.forEach(msg -> {
log.error("Mark msg NACK");
.retry(); // this retry has been deleted
My question is resolved.
After fixing that bug, I still receive duplicate messages. It is accepted that Google Cloud Pub/Sub does not guarantee exactly-once delivery when you use buffers or windows. This is exactly my scenario, so I have to implement a mechanism to remove duplicates based on the message ID.
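One possible shape for such a de-duplication mechanism is a bounded set of recently seen message IDs. The sketch below is plain Java and makes several assumptions: `MessageDeduplicator` and `firstTimeSeen` are hypothetical names, the ID is whatever string identifies the message, and the set lives in the memory of a single instance. With several app2 pods, the seen IDs would have to live in a store shared by all of them, which is not shown here.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

final class MessageDeduplicator {
    private final Set<String> seenIds;

    /** Remembers at most {@code capacity} IDs; the oldest ID is evicted first. */
    MessageDeduplicator(final int capacity) {
        this.seenIds = Collections.synchronizedSet(Collections.newSetFromMap(
            new LinkedHashMap<String, Boolean>(16, 0.75f, false) {
                @Override
                protected boolean removeEldestEntry(Map.Entry<String, Boolean> eldest) {
                    return size() > capacity;
                }
            }));
    }

    /** Returns true the first time an ID is seen, false for a duplicate still in the window. */
    boolean firstTimeSeen(String messageId) {
        return seenIds.add(messageId);
    }
}
```

A consumer would call `firstTimeSeen` before processing and simply ack a message for which it returns false. The capacity bounds memory use, at the cost of missing duplicates that arrive after their ID has been evicted.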

Update external Database in RichCoFlatMapFunction

I have a RichCoFlatMapFunction
DataStream<Metadata> metadataKeyedStream =
SingleOutputStreamOperator<Output> outputStream =
.assignTimestampsAndWatermarks(new RecordTimeExtractor())
.flatMap(new CustomCoFlatMap(metadataTable.listAllAsMap()));
public class CustomCoFlatMap extends RichCoFlatMapFunction<Record, Metadata, Output> {
private transient Map<String, Metadata> datasource;
private transient ValueState<Metadata> metadataState;
public void setDataSource(Map<String, Metadata> datasource) {
this.datasource = datasource;
public void open(Configuration parameters) throws Exception {
// read ValueState
metadataState = getRuntimeContext().getState(
new ValueStateDescriptor<>("metadataState", Metadata.class));
public void flatMap2(Metadata metadata, Collector<Output> collector) throws Exception {
// if metadata record is removed from table, removing the same from local state
if(metadata.getEventName().equals("REMOVE")) {
// update metadata in ValueState
public void flatMap1(Record record, Collector<Output> collector) throws Exception {
Metadata metadata = this.metadataState.value();
// if metadata is not present in ValueState
if(metadata == null) {
// get metadata from datasource
metadata = datasource.get(record.getId());
// if metadata found in datasource, add it to ValueState
if(metadata != null) {
Output output = new Output(record.getId(), metadata.getName(),
metadata.getVersion(), metadata.getType());
if(metadata.getId() == 123) {
// here I want to update metadata into another Database
// can I do it here directly ?
Here, in the flatMap1 method, I want to update a database. Can I do that operation in flatMap1? I am asking because it involves some wait time to query the DB and then update it.
While it is in principle possible to do this, it's not a good idea. Doing synchronous I/O in a Flink user function causes two problems:
You are tying up considerable resources that are spending most of their time idle, waiting for a response.
While waiting, that operator is creating backpressure that prevents checkpoint barriers from making progress. This can easily cause occasional checkpoint timeouts and job failures.
It would be better to use a KeyedCoProcessFunction instead, and emit the intended database update as a side output. This can then be handled downstream either by a database sink or by using a RichAsyncFunction.
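The idea of separating the per-record path from the database write can be illustrated in plain Java. This is a sketch under assumptions, not Flink code: `SideOutputSketch`, `DbUpdate`, `handle`, `updateAsync`, and `applyAll` are hypothetical names, a list stands in for the side output, and `CompletableFuture` stands in for a non-blocking database client.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Collectors;

final class SideOutputSketch {
    /** Hypothetical update request; in a job this would be the side-output element type. */
    static final class DbUpdate {
        final int metadataId;
        DbUpdate(int metadataId) {
            this.metadataId = metadataId;
        }
    }

    /** Per-record step: returns the normal result and, for id 123, queues an update instead of doing I/O. */
    static String handle(int metadataId, List<DbUpdate> sideOutput) {
        if (metadataId == 123) {
            sideOutput.add(new DbUpdate(metadataId));
        }
        return "output-" + metadataId;
    }

    /** Stand-in for a non-blocking database client call. */
    static CompletableFuture<String> updateAsync(DbUpdate update) {
        return CompletableFuture.supplyAsync(() -> "updated-" + update.metadataId);
    }

    /** Downstream step: starts all updates first, then gathers their results. */
    static List<String> applyAll(List<DbUpdate> updates) {
        List<CompletableFuture<String>> futures = updates.stream()
            .map(SideOutputSketch::updateAsync)
            .collect(Collectors.toList());
        return futures.stream()
            .map(CompletableFuture::join)
            .collect(Collectors.toList());
    }
}
```

The point of the split is that `handle` never waits on the database; only the downstream step deals with latency, and it can keep several requests in flight at once.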

How to process a message without message leaving the queue till a condition is met?

This is regarding a particular use case which I am planning to address via flink streaming.
A message is sent to Flink stream processing; the stream is keyed and thus partitioned as expected. However, the messages for each key need to be evaluated until a condition is met. For example, let's say there is a banking system where the account transactions (messages) for an account need to be processed in sequence; processing a message out of sequence is not possible because it would lead to an inconsistent system state. The system needs to wait for a message to be processed (maybe even for 2-3 days) before processing the next message in the sequence. How can this be achieved in Flink without blocking the processing of messages associated with other keys?
Thanks in advance !
Have you had a look at the CEP library? You could specify a pattern like:
    Pattern<Event, ?> pattern = Pattern.<Event>begin("firstOfSequence").where(new FilterFunction<Event>() {
        private static final long serialVersionUID = 5726188262756267490L;

        public boolean filter(Event value) throws Exception {
            return value.isFirstOfSequence();
        }
    }).followedBy("secondOfSequence").where(new FilterFunction<Event>() {
        private static final long serialVersionUID = 5726188262756267490L;

        public boolean filter(Event value) throws Exception {
            return value.isSecondOfSequence();
        }
    });
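Independently of CEP, the requirement in the question (hold later messages of a key until the current one is done, without blocking other keys) can be pictured as a per-key queue. The following is a plain-Java illustration only, not the CEP approach and not Flink code; `PerKeySequencer`, `onMessage`, and `onCompleted` are hypothetical names, and in a Flink job the pending queue would have to be kept in keyed state rather than in a `HashMap`.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class PerKeySequencer {
    // Messages waiting behind the in-flight message of their key.
    private final Map<String, Deque<String>> pending = new HashMap<>();
    // Keys that currently have a message in flight.
    private final Set<String> busy = new HashSet<>();
    // Log of messages released for processing, in release order.
    private final List<String> started = new ArrayList<>();

    /** A new message arrives: start it if the key is idle, otherwise park it. */
    void onMessage(String key, String message) {
        if (busy.add(key)) {
            started.add(key + ":" + message);
        } else {
            pending.computeIfAbsent(key, k -> new ArrayDeque<>()).add(message);
        }
    }

    /** The condition for the in-flight message of this key has been met: release the next one. */
    void onCompleted(String key) {
        Deque<String> queue = pending.get(key);
        if (queue == null || queue.isEmpty()) {
            busy.remove(key);
            pending.remove(key);
        } else {
            started.add(key + ":" + queue.poll());
        }
    }

    List<String> started() {
        return started;
    }
}
```

Because each key has its own queue, a key that waits for days only holds back its own messages; other keys keep flowing.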

GAE Channel Sending First Message Repetitively

So this was working on this project a few months ago. I'm using Google App Engine's Channel API to push messages to my GWT app. I'm using to interact through GWT.
Let's say I send one message to the client: "First Message"
The client will receive the message, "First Message" just fine.
Then let's say I send another message, "Second Message" to the client.
The client will again receive the message, "First Message".
This will continue happening. There have been some instances where I'll receive the second message, and it will be the message that gets stuck repeating.
When I finally close the page, and thus close the channel, I again receive the repeated message without sending something from the server.
Does anyone have any idea what is going on? I don't think this was happening when I was working on this a few months ago, and I can see no changes to the GAE Channel API.
Here is some code:
String json = AutoBeanHelper.toJson(proxy);
log.fine("Item's JSON Received: " + json);
List<ChannelEntity> channels = channelDAO.getByUserId();
if (channels.size() > 1) {
log.warning("Multiple channels for single user detected.");
ChannelService channelService = ChannelServiceFactory.getChannelService();
for (ChannelEntity channel : channels) {
channelService.sendMessage(new ChannelMessage(channel.getClientId(), json));
So whenever I store a new item of a specific type (this is in that entities update function):
1. I turn it into JSON.
2. I then log that JSON.
3. I get that user's channel.
4. I send it to that user's channel.
When I look at my logs, I see that the variable above that I'm logging is showing correctly, meaning I'm logging the correct JSON message but when I display the JSON in an alert on the client-side as soon as it gets to the client, it's the previous message that seems to be stuck repeating. I really don't see what I could be doing wrong here.
Let me know if you would like to see another part of the code. For good measure, here is the code on the client:
eventBus.addHandler(ReceivedChannelTokenEvent.TYPE, new ReceivedChannelTokenEventHandler() {
public void onEvent(ReceivedChannelTokenEvent event) {
ChannelFactory.createChannel(event.getChannelToken(), new ChannelCreatedCallback() {
public void onChannelCreated(Channel channel) {
final Socket channelSocket = channel.open(new SocketListener() {
public void onOpen() {
Window.alert("Channel Opened");
public void onMessage(String json) {
eventBus.fireEvent(new MessageReceivedEvent(json));
public void onError(SocketError error) {
Window.alert("Channel Error: " + error.getDescription());
if ( error.getDescription().equals(CHANNEL_ERROR_TOKEN_TIME_OUT) ) {
eventBus.fireEvent(new ChannelTimeOutEvent());
public void onClose() {
Window.alert("Channel Closed.");
Window.addWindowClosingHandler(new Window.ClosingHandler() {
public void onWindowClosing(ClosingEvent event) {
I see a lot of questions on SO where someone has a bug in their code but thinks it's part of the framework. Without seeing any code, I suspect there's some bug where you think you're sending "Second Message", but you're really sending a cached version of "First Message".
So I was finally able to figure it out. Within the app's onMessage function, the call
eventBus.fireEvent(new MessageReceivedEvent(json));
never returns, so onMessage never exits, and that causes me to receive the same message repeatedly.
