I'm trying to consume messages as a polling consumer, from time to time, from an ActiveMQ topic using a durable subscriber endpoint.
In my bean, I have a ConsumerTemplate from which I am trying to receive exchanges, and send to another URI.
The bean method is:
public void pollConsumer() throws Exception {
    long count = 0;
    try {
        if ( consumerEndpoint == null ) consumerEndpoint = consumer.getCamelContext().getEndpoint( endpointUri );
        logger.debug( "Consuming: " + consumerEndpoint.getEndpointUri() );
        consumer.start();
        producer.start();
        while ( true ) {
            logger.trace( "Awaiting message: " + ++count );
            Exchange exchange = consumer.receive( consumerEndpoint, 1000 );
            if ( exchange == null ) break;
            logger.trace( "Processing message: " + count );
            producer.send( exchange );
            consumer.doneUoW( exchange );
            logger.trace( "Processed message: " + count );
        }
        producer.stop();
        consumer.stop();
    } catch ( Throwable t ) {
        logger.error( "Something went wrong!", t );
        throw t;
    }
}
When called, the logger displays the "Consuming" message in the form
activemq://topic:fromQueue.Name?clientId=MyClient&durableSubscriptionName=MyClient&selector=RecordType+IN+%28+%271%27%2C+%272%27+%29+AND+SubType+%3D+%272%27
which is correct as far as I can see (the selector should read RecordType IN ('1', '2') AND SubType = '2' once the URL encoding is removed).
I get a single "Awaiting message" log, and nothing else, so it appears that nothing is retrieved.
Bizarrely, it also doesn't register on ActiveMQ as a durable subscriber, so it appears that it isn't doing anything at all, but it's not registering any errors either, so I'm rather baffled.
Can anyone suggest why this might not be working, or at least where I should start looking?
Your pollConsumer stops if it has to wait more than a second for a message to arrive on the queue/topic.
receive() waits 1 second for a message, after which it returns null, the while loop breaks, and the consumer is stopped.
Exchange exchange = consumer.receive( consumerEndpoint, 1000 );
==> if ( exchange == null ) break;
logger.trace("Processing message: " + count );
producer.send( exchange );
consumer.doneUoW( exchange );
It would be easier just to use an Apache Camel route to do what you describe.
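For example, a route along those lines might look like the sketch below (the URIs are placeholders based on the question, not a tested configuration):

from("activemq:topic:fromQueue.Name?clientId=MyClient&durableSubscriptionName=MyClient")
    .routeId("durable-topic-consumer")
    .to("direct:myProcessingRoute"); // hand the message to whatever processing you need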
Having noted @pcoates' answer and tried extending the timeout for testing purposes, it became obvious that the issue was that the durable subscription option on the URI wasn't being acted on, and as there were no new messages on the topic during the 1 second wait, nothing happened.
The answer to another question relating to durable subscriptions explains that you can't use a durable subscription from a polling consumer.
My workaround therefore was to subscribe to the topic and route messages to a new queue, and to have the polling consumer read from this new queue. It's not great, since I'd rather not have an additional queue, but it works and is less effort than writing a new version of JMSPollingConsumer.
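For reference, the workaround looks roughly like this (a sketch only; the buffer queue name and timer period are assumptions):

// A normal route keeps the durable subscription on the topic and forwards
// every message to an intermediate queue...
from("activemq:topic:fromQueue.Name?clientId=MyClient&durableSubscriptionName=MyClient")
    .to("activemq:queue:fromQueue.Name.buffer");

// ...and the timer-driven bean polls that queue instead of the topic,
// so endpointUri in the bean now points at the queue.
from("timer:myTimer?period=3600000")
    .beanRef("myConsumerBean", "pollConsumer");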
Related
I have a Java SpringBoot2 application (app1) that sends messages to a Google Cloud PubSub topic (it is the publisher).
Another Java SpringBoot2 application (app2) is subscribed to a subscription to receive those messages. But in this case I have more than one instance (k8s auto-scaling is enabled), so more than one pod of this app consumes messages from Pub/Sub.
Some messages are consumed by one instance of app2, but many others are delivered to more than one app2 instance, so processing is duplicated for those messages.
Here is the code of consumer (app2):
private final static int ACK_DEAD_LINE_IN_SECONDS = 30;
private static final long POLLING_PERIOD_MS = 250L;
private static final int WINDOW_MAX_SIZE = 1000;
private static final Duration WINDOW_MAX_TIME = Duration.ofSeconds(1L);

@Autowired
private PubSubAdmin pubSubAdmin;

@Bean
public ApplicationRunner runner(PubSubReactiveFactory reactiveFactory) {
    return args -> {
        createSubscription("subscription-id", "topic-id", ACK_DEAD_LINE_IN_SECONDS);
        reactiveFactory.poll(subscription, POLLING_PERIOD_MS) // Poll the PubSub periodically
            .map(msg -> Pair.of(msg, getMessageValue(msg))) // Extract the message as a pair
            .bufferTimeout(WINDOW_MAX_SIZE, WINDOW_MAX_TIME) // Create a buffer of messages to bulk process
            .flatMap(this::processBuffer) // Process the buffer
            .doOnError(e -> log.error("Error processing event window", e))
            .retry()
            .subscribe();
    };
}
private void createSubscription(String subscriptionName, String topicName, int ackDeadline) {
    pubSubAdmin.createTopic(topicName);
    try {
        pubSubAdmin.createSubscription(subscriptionName, topicName, ackDeadline);
    } catch (AlreadyExistsException e) {
        log.info("Pubsub subscription '{}' already configured for topic '{}': {}", subscriptionName, topicName, e.getMessage());
    }
}
private Flux<Void> processBuffer(List<Pair<AcknowledgeablePubsubMessage, PreparedRecordEvent>> msgsWindow) {
    return Flux.fromStream(
            msgsWindow.stream()
                .collect(Collectors.groupingBy(msg -> msg.getRight().getData())) // Group the messages by same data
                .values()
                .stream()
        )
        .flatMap(this::processDataBuffer);
}
private Mono<Void> processDataBuffer(List<Pair<AcknowledgeablePubsubMessage, PreparedRecordEvent>> dataMsgsWindow) {
    return processData(
            dataMsgsWindow.get(0).getRight().getData(),
            dataMsgsWindow.stream()
                .map(Pair::getRight)
                .map(PreparedRecordEvent::getRecord)
                .collect(Collectors.toSet())
        )
        .doOnSuccess(it ->
            dataMsgsWindow.forEach(msg -> {
                log.info("Mark msg ACK");
                msg.getLeft().ack();
            })
        )
        .doOnError(e -> {
            log.error("Error on PreparedRecordEvent event", e);
            dataMsgsWindow.forEach(msg -> {
                log.error("Mark msg NACK");
                msg.getLeft().nack();
            });
        })
        .retry();
}
private Mono<Void> processData(Data data, Set<Record> records) {
    // For each message, make calculations over the records associated to the data
    final DataQuality calculated = calculatorService.calculateDataQualityFor(data, records); // Arithmetic calculations
    return this.daasClient.updateMetrics(calculated) // Update DB record with a DaaS to wrap DB access
        .flatMap(it -> {
            if (it.getProcessedRows() >= it.getValidRows()) {
                return finish(data);
            }
            return Mono.just(data);
        })
        .then();
}
private Mono<Data> finish(Data data) {
    return dataClient.updateStatus(data.getId(), DataStatus.DONE) // Update DB record with a DaaS to wrap DB access
        .doOnSuccess(updatedData -> pubSubClient.publish(
            new Qa0DonedataEvent(updatedData) // Publish a new event on another topic
        ))
        .doOnError(err -> {
            log.error("Error finishing data");
        })
        .onErrorReturn(data);
}
I need each message to be consumed by one and only one app2 instance. Does anybody know if this is possible? Any idea how to achieve this?
Maybe the right way is to create one subscription for each app2 instance and configure the topic to send each message to exactly one subscription instead of to every one. Is that possible?
According to the official documentation, once a message is sent to a subscriber, Pub/Sub tries not to deliver it to any other subscriber on the same subscription (app2 instances are subscriber of the same subscription):
Once a message is sent to a subscriber, the subscriber should acknowledge the message. A message is considered outstanding once it has been sent out for delivery and before a subscriber acknowledges it. Pub/Sub will repeatedly attempt to deliver any message that has not been acknowledged. While a message is outstanding to a subscriber, however, Pub/Sub tries not to deliver it to any other subscriber on the same subscription. The subscriber has a configurable, limited amount of time -- known as the ackDeadline -- to acknowledge the outstanding message. Once the deadline passes, the message is no longer considered outstanding, and Pub/Sub will attempt to redeliver the message.
In general, Cloud Pub/Sub has at-least-once delivery semantics. That means it is possible for messages that have already been acked to be redelivered, and for multiple subscribers on the same subscription to receive the same message. These two cases should be relatively rare for a well-behaved subscriber, but without keeping track of the IDs of all messages delivered across all subscribers, it is not possible to guarantee that there won't be duplicates.
If it is happening with some frequency, it would be good to check if your messages are getting acknowledged within the ack deadline. You are buffering messages for 1s, which should be relatively small compared to your ack deadline of 30s, but it also depends on how long the messages ultimately take to process. For example, if the buffer is being processed in sequential order, it could be that the later messages in your 1000-message buffer aren't being processed in time. You could look at the subscription/expired_ack_deadlines_count metric in Cloud Monitoring to determine if it is indeed the case that your acks for messages are late. Note that late acks for even a small number of messages could result in more duplicates. See the "Message Redelivery & Duplication Rate" section of the Fine-tuning Pub/Sub performance with batch and flow control settings post.
Ok, after doing tests, reading documentation, and reviewing the code, I found a "small" error in it.
We had a spurious "retry" in the "processDataBuffer" method: when an error happened, the messages in the buffer were marked as NACK, so they were delivered to another instance, but because of the retry they were also executed again (this time successfully), so the messages were marked as ACK as well.
As a result, some of them were processed twice.
private Mono<Void> processDataBuffer(List<Pair<AcknowledgeablePubsubMessage, PreparedRecordEvent>> dataMsgsWindow) {
    return processData(
            dataMsgsWindow.get(0).getRight().getData(),
            dataMsgsWindow.stream()
                .map(Pair::getRight)
                .map(PreparedRecordEvent::getRecord)
                .collect(Collectors.toSet())
        )
        .doOnSuccess(it ->
            dataMsgsWindow.forEach(msg -> {
                log.info("Mark msg ACK");
                msg.getLeft().ack();
            })
        )
        .doOnError(e -> {
            log.error("Error on PreparedRecordEvent event", e);
            dataMsgsWindow.forEach(msg -> {
                log.error("Mark msg NACK");
                msg.getLeft().nack();
            });
        })
        .retry(); // this retry has been deleted
}
My question is resolved.
Even after correcting the bug mentioned above, I still received duplicated messages. It is accepted that Google Cloud Pub/Sub does not guarantee "exactly once delivery" when you use buffers or windows. That is exactly my scenario, so I had to implement a mechanism to remove duplicates based on the message id.
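For illustration, this is a minimal sketch of the kind of de-duplication filter I mean (the class name and sizes are mine; a single-instance in-memory set like this only works per pod, so with several app2 instances the seen-id store would have to be shared, e.g. Redis with a TTL):

import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/**
 * Minimal sketch of an in-memory de-duplication filter keyed by the Pub/Sub message id.
 * Illustration only: a bounded LRU set.
 */
public class SeenMessageIds {

    private final Set<String> seen;

    public SeenMessageIds(final int maxEntries) {
        // A LinkedHashMap in access order with removeEldestEntry gives a simple bounded LRU set.
        this.seen = Collections.newSetFromMap(new LinkedHashMap<String, Boolean>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Boolean> eldest) {
                return size() > maxEntries;
            }
        });
    }

    /** Returns true the first time an id is seen, false for duplicates. */
    public synchronized boolean markIfFirstSeen(String messageId) {
        return seen.add(messageId);
    }
}

It can then be applied in the reactive pipeline before processBuffer: ack and drop any message whose Pub/Sub message id has already been seen, and only pass first-seen messages on for processing.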
I need to consume JMS messages with Camel everyday at 9pm (or from 9pm to 10pm to give it the time to consume all the messages).
I can't see any "scheduler" option for URIs "cMQConnectionFactory:queue:myQueue" while it exists for "file://" or "ftp://" URIs.
If I put a cTimer before it, it will send an empty message to the queue rather than schedule the consumer.
You can use a route policy where you can set up, for example, a cron expression to tell when the route is started and when it is stopped.
http://camel.apache.org/scheduledroutepolicy.html
Other alternatives are to start/stop the route via the Java API or JMX etc. and have some other logic that knows when to do that according to the clock.
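For example, a sketch with a CronScheduledRoutePolicy (assuming camel-quartz2 is on the classpath; the endpoint and cron expressions are placeholders):

import org.apache.camel.routepolicy.quartz2.CronScheduledRoutePolicy; // ...quartz.CronScheduledRoutePolicy on older Camel versions

CronScheduledRoutePolicy policy = new CronScheduledRoutePolicy();
policy.setRouteStartTime("0 0 21 * * ?"); // start the route at 21:00
policy.setRouteStopTime("0 0 22 * * ?");  // stop it again at 22:00

from("activemq:queue:myQueue")
    .routeId("nightly-consumer")
    .routePolicy(policy)
    .noAutoStartup() // only runs while the policy keeps it started
    .to("direct:process");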
This is something that has caused me a significant amount of trouble. There are a number of ways of skinning this cat, and none of them are great as far as I can see.
One is to set the route not to start automatically, and use a schedule to start the route and then stop it again after a short time using the Control Bus EIP. http://camel.apache.org/controlbus.html
I didn't like this approach because I didn't trust that it would drain the queue completely once and only once per trigger.
Another is to use a pollEnrich to query the queue, but that only seems to pick up one item from the queue, whereas I wanted to drain it completely (and only once).
I wrote a custom bean that uses consumer and producer templates to read all the entries in a queue with a specified time-out.
I found an example on the internet somewhere, but it took me a long time to find, and quickly searching again I can't find it now.
So what I have is:
from("timer:myTimer...")
.beanRef( "myConsumerBean", "pollConsumer" )
from("direct:myProcessingRoute")
.to("whatever");
And a simple pollConsumer method:
public void pollConsumer() throws Exception {
    if ( consumerEndpoint == null ) consumerEndpoint = consumer.getCamelContext().getEndpoint( endpointUri );
    consumer.start();
    producer.start();
    while ( true ) {
        Exchange exchange = consumer.receive( consumerEndpoint, 1000 );
        if ( exchange == null ) break;
        producer.send( exchange );
        consumer.doneUoW( exchange );
    }
    producer.stop();
    consumer.stop();
}
where the producer is a DefaultProducerTemplate, consumer is a DefaultConsumerTemplate, and these are configured in the bean configuration.
This seems to work for me, but if anyone gives you a better answer I'll be very interested to see how it can be done better.
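For completeness, a rough sketch of how the templates could be wired up (the bean class and setter names are hypothetical; only the template creation calls are real Camel API):

public MyConsumerBean myConsumerBean(CamelContext context) throws Exception {
    MyConsumerBean bean = new MyConsumerBean(); // hypothetical bean holding the pollConsumer() method
    ConsumerTemplate consumer = context.createConsumerTemplate(); // a DefaultConsumerTemplate
    ProducerTemplate producer = context.createProducerTemplate(); // a DefaultProducerTemplate
    producer.setDefaultEndpointUri("direct:myProcessingRoute");   // where producer.send(exchange) delivers
    bean.setConsumer(consumer);
    bean.setProducer(producer);
    bean.setEndpointUri("activemq:queue:myQueue"); // the endpoint pollConsumer() drains
    return bean;
}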
I would like to know if the below is expected behaviour for Camel idempotent consumer:
I have removeOnFailure=true for the route, which basically means that when the exchange fails, the idempotent consumer should remove the identifier from the repository. This opens up a very interesting scenario that allows duplicates on the exchange.
Suppose I have identifier=12345 and the first attempt to execute the exchange was successful, which means the identifier is added to the idempotent repository. The next attempt to use the same identifier, i.e. 12345, fails as this is caught as a Duplicate Message (CamelDuplicateMessage). But at this point, having removeOnFailure=true removes the identifier from the repository, which on the next attempt allows the exchange to go through successfully without being caught as a duplicate. Hence, creating room for duplicates on the exchange.
Can someone advise whether this is expected behaviour or a bug?
Sample Route:
from("direct:Route-DeDupeCheck").routeId("Route-DeDupeCheck")
.log(LoggingLevel.DEBUG, "~~~~~~~ Reached to Route-DeDupeCheck: ${property.xref}")
.idempotentConsumer(simple("${property.xref}"), MemoryIdempotentRepository.memoryIdempotentRepository()) //TODO: To replace with Redis DB for caching
.removeOnFailure(true)
.skipDuplicate(false)
.filter(exchangeProperty(Exchange.DUPLICATE_MESSAGE).isEqualTo(true))
.log("~~~~~~~ Duplicate Message Found!")
.to("amq:queue:{{jms.duplicateQueue}}?exchangePattern=InOnly") //TODO: To send this to Duplicate JMS Queue
.throwException(new AZBizException("409", "Duplicate Message!"));
Your basic premise is wrong.
Next attempt to use same identifier i.e 12345 fails as this is caught as Duplicate Message (CamelDuplicateMessage)
When there is a duplicated message, it is not considered a failure. It is just skipped from further processing (unless you have the skipDuplicate option set to false).
Hence, the scenario you just explained cannot occur whatsoever.
It is very easy to test. Considering you have a route like this,
public void configure() throws Exception {
    //getContext().setTracing(true); Use this to enable tracing
    from("direct:abc")
        .idempotentConsumer(header("myid"),
                MemoryIdempotentRepository.memoryIdempotentRepository(200))
        .removeOnFailure(true)
        .log("Recieved id : ${header.myid}");
}
And a Producer like this
@EndpointInject(uri = "direct:abc")
ProducerTemplate producerTemplate;

for (int i = 0; i < 5; i++) {
    producerTemplate.sendBodyAndHeader("somebody", "myid", "1");
}
What you see in logs is
INFO 18768 --- [tp1402599109-31] route1 : Recieved id : 1
And just once.
My GCM Endpoint is derived from the code at /github.com/GoogleCloudPlatform/gradle-appengine-templates/tree/master/GcmEndpoints/root/src/main. Each Android client device registers with the endpoint. A message can be sent to the first 10 registered devices using this code:
#Api(name = "messaging", version = "v1", namespace = #ApiNamespace(ownerDomain = "${endpointOwnerDomain}", ownerName = "${endpointOwnerDomain}", packagePath="${endpointPackagePath}"))
public class MessagingEndpoint {
private static final Logger log = Logger.getLogger(MessagingEndpoint.class.getName());
/** Api Keys can be obtained from the google cloud console */
private static final String API_KEY = System.getProperty("gcm.api.key");
/**
* Send to the first 10 devices (You can modify this to send to any number of devices or a specific device)
*
* #param message The message to send
*/
public void sendMessage(#Named("message") String message) throws IOException {
if(message == null || message.trim().length() == 0) {
log.warning("Not sending message because it is empty");
return;
}
// crop longer messages
if (message.length() > 1000) {
message = message.substring(0, 1000) + "[...]";
}
Sender sender = new Sender(API_KEY);
Message msg = new Message.Builder().addData("message", message).build();
List<RegistrationRecord> records = ofy().load().type(RegistrationRecord.class).limit(10).list();
for(RegistrationRecord record : records) {
Result result = sender.send(msg, record.getRegId(), 5);
if (result.getMessageId() != null) {
log.info("Message sent to " + record.getRegId());
String canonicalRegId = result.getCanonicalRegistrationId();
if (canonicalRegId != null) {
// if the regId changed, we have to update the datastore
log.info("Registration Id changed for " + record.getRegId() + " updating to " + canonicalRegId);
record.setRegId(canonicalRegId);
ofy().save().entity(record).now();
}
} else {
String error = result.getErrorCodeName();
if (error.equals(Constants.ERROR_NOT_REGISTERED)) {
log.warning("Registration Id " + record.getRegId() + " no longer registered with GCM, removing from datastore");
// if the device is no longer registered with Gcm, remove it from the datastore
ofy().delete().entity(record).now();
}
else {
log.warning("Error when sending message : " + error);
}
}
}
}
}
The above code sends to the first 10 registered devices. I would like to send to all registered clients. According to http://objectify-appengine.googlecode.com/svn/branches/allow-parent-filtering/javadoc/com/googlecode/objectify/cmd/Query.html#limit(int) setting limit(0) accomplishes this. But I'm not convinced there will not be a problem for very large numbers of registered clients due to memory constraints or the time it takes to execute the query. https://code.google.com/p/objectify-appengine/source/browse/Queries.wiki?repo=wiki states "Cursors let you take a "checkpoint" in a query result set, store the checkpoint elsewhere, and then resume from where you left off later. This is often used in combination with the Task Queue API to iterate through large datasets that cannot be processed in the 60s limit of a single request".
Note the comment about the 60s limit of a single request.
So my question: if I modified the sample code at /github.com/GoogleCloudPlatform/gradle-appengine-templates/tree/master/GcmEndpoints/root/src/main to request all objects from the datastore by replacing limit(10) with limit(0), will this ever fail for a large number of objects? And if it will fail, at roughly what number of objects?
This is a poor pattern, even with cursors. At the very least, you'll hit the hard 60s limit for a single request. And since you're doing updates on the RegistrationRecord, you need a transaction, which will slow down the process even more.
This is exactly what the task queue is for. The best way is to do it in two tasks:
Your api endpoint enqueues "send message to everyone" and returns immediately.
That first task is the "mapper" which iterates the RegistrationRecords with a keys-only query. For each key, enqueue a "reducer" task for "send X message to this record".
The reducer task sends the message and (in a transaction) performs your record update.
Using Deferred this actually isn't much code at all.
The first task frees your client immediately and gives you 10 minutes to iterate RegistrationRecord keys rather than the 60s limit for a normal request. If you have your chunking right and batch your queue submissions, you should be able to generate thousands of reducer tasks per second.
This will effortlessly scale to hundreds of thousands of users, and might get you into millions. If you need to scale higher, you can apply a map/reduce approach to parallelize the mapping. Then it's just a question of how many instances you want to throw at the problem.
I have used this approach to great effect in the past sending out millions of apple push notifications at a time. The task queue is your friend, use it heavily.
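For illustration, the mapper task could look roughly like the sketch below (SendToOneReducer is a hypothetical second DeferredTask that loads the record, sends the GCM message and updates or deletes the record in a transaction; chunking and batch queue submission are omitted for brevity):

import static com.googlecode.objectify.ObjectifyService.ofy;

import com.google.appengine.api.taskqueue.DeferredTask;
import com.google.appengine.api.taskqueue.Queue;
import com.google.appengine.api.taskqueue.QueueFactory;
import com.google.appengine.api.taskqueue.TaskOptions;
import com.googlecode.objectify.Key;

public class SendToAllMapper implements DeferredTask {

    private final String message;

    public SendToAllMapper(String message) {
        this.message = message;
    }

    @Override
    public void run() {
        Queue queue = QueueFactory.getDefaultQueue();
        // Keys-only query: cheap to iterate, no entity payload is loaded.
        for (Key<RegistrationRecord> key :
                ofy().load().type(RegistrationRecord.class).keys().iterable()) {
            // One "reducer" task per registration (hypothetical class): it sends the
            // GCM message and performs the record update in a transaction.
            queue.add(TaskOptions.Builder.withPayload(new SendToOneReducer(key, message)));
        }
    }
}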
Your query will time out if you try to retrieve too many entities. You will need to use cursors in your loop.
No one can say how many entities can be retrieved before this timeout - it depends on the size of your entities, complexity of your query, and, most importantly, what else happens in your loop. For example, in your case you can dramatically speed up your loop (and thus retrieve many more entities before a timeout) by creating tasks instead of building and sending messages within the loop itself.
Note that by default a query returns entities in chunks of 20 - you will need to increase the chunk size if you have a large number of entities.
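For reference, the usual cursor loop with Objectify looks something like this sketch (the chunk size and the per-record action are examples only):

import static com.googlecode.objectify.ObjectifyService.ofy;

import com.google.appengine.api.datastore.Cursor;
import com.google.appengine.api.datastore.QueryResultIterator;
import com.googlecode.objectify.cmd.Query;

void iterateAllRecords() {
    Query<RegistrationRecord> query =
            ofy().load().type(RegistrationRecord.class).limit(200);
    Cursor cursor = null;
    while (true) {
        if (cursor != null) {
            query = query.startAt(cursor);   // resume where the previous page ended
        }
        QueryResultIterator<RegistrationRecord> it = query.iterator();
        boolean sawAny = false;
        while (it.hasNext()) {
            RegistrationRecord record = it.next();
            sawAny = true;
            // enqueue a task for this record instead of sending the message inline
        }
        if (!sawAny) {
            break;                           // nothing left to read
        }
        cursor = it.getCursor();             // checkpoint for the next page
    }
}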
I am trying to validate a message with Apache Camel (ver 2.10.4) that is sent to a Virtual Topic in FuseESB (based on Apache ServiceMix, ver 7.1.0), using validator:xsd (the message is XML in a TextMessage). When it fails validation, I want to redirect the message to another topic and stop processing, so it is not sent to the usual consumers, because they will fail on an invalid message.
I wanted to do the validation in the routing, so it is done once instead of on each of the multiple consumers.
Is this possible with Camel, and what would be the syntax?
My current approach is like this:
static final String ACTIVEMQ_TOPIC_PREFIX = "activemq:topic:";
static final String ACTIVEMQ_CONSUMER_PREFIX = "activemq:queue:Consumer.*.";
static final String TOPIC_ORDER_CREATED = "VirtualTopic.order.created";
static final String TOPIC_ORDER_CREATED_ERROR = "VirtualTopic.order.created.error";
static final String DIRECT_ORDER_CREATED_ERROR = "direct:orderCreatedError";

from(DIRECT_ORDER_CREATED_ERROR)
    .to(ACTIVEMQ_TOPIC_PREFIX + TOPIC_ORDER_CREATED_ERROR)
    .log("Message sent to " + TOPIC_ORDER_CREATED_ERROR);

// validate order.created topic message
// before sending to consumer queues.
from(ACTIVEMQ_TOPIC_PREFIX + TOPIC_ORDER_CREATED)
    .errorHandler(deadLetterChannel(DIRECT_ORDER_CREATED_ERROR))
    .choice() // validation is enabled with property
        .when(simple("${properties:" + PROP_VALIDATION_ENABLED + "} == true"))
            .log("Validating order created body")
            .to("validator:xsd/myxsd.xsd") // validate against xsd
            .onException(ValidationException.class)
                .handled(true)
                .maximumRedeliveries(0)
                .useOriginalMessage()
                // if invalid send to error topic
                .to(DIRECT_ORDER_CREATED_ERROR)
                .stop()
            .end()
    .end()
    .to(ACTIVEMQ_CONSUMER_PREFIX + TOPIC_ORDER_CREATED)
    .log("Message sent to " + TOPIC_ORDER_CREATED);
I see "Validating order created body" and "Message sent to VirtualTopic.order.created.error" in the logs. On webconsole, I see a message enqueued in error topic for one message of main topic.
The problem is the consumer of VirtualTopic.order.created still gets the invalid message
Could you please help me to find the right syntax to intercept message before it goes to consumers of VirtualTopic?
Thanks
You can just use the deadLetterChannel, have it use the original message, and then any errors get handled and the message is moved to the DLQ.
from(ACTIVEMQ_TOPIC_PREFIX + TOPIC_ORDER_CREATED)
    .errorHandler(deadLetterChannel(DIRECT_ORDER_CREATED_ERROR).useOriginalMessage())
    .choice() // validation is enabled with property
        .when(simple("${properties:" + PROP_VALIDATION_ENABLED + "} == true"))
            .log("Validating order created body")
            .to("validator:xsd/myxsd.xsd") // validate against xsd
    .end()
    .to(ACTIVEMQ_CONSUMER_PREFIX + TOPIC_ORDER_CREATED)
    .log("Message sent to " + TOPIC_ORDER_CREATED);
Also, if you are using onException in the route, you should put it at the top of the route, not in the middle.
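For example, the question's route could be restructured roughly as follows, with onException declared up front (a sketch reusing the constants from the question):

onException(ValidationException.class)
    .handled(true)
    .maximumRedeliveries(0)
    .useOriginalMessage()
    .to(DIRECT_ORDER_CREATED_ERROR); // handled(true) stops the original route from continuing

from(ACTIVEMQ_TOPIC_PREFIX + TOPIC_ORDER_CREATED)
    .choice() // validation is enabled with property
        .when(simple("${properties:" + PROP_VALIDATION_ENABLED + "} == true"))
            .log("Validating order created body")
            .to("validator:xsd/myxsd.xsd") // throws ValidationException on invalid input
    .end()
    .to(ACTIVEMQ_CONSUMER_PREFIX + TOPIC_ORDER_CREATED)
    .log("Message sent to " + TOPIC_ORDER_CREATED);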