Infinite AMQP Consumer with Alpakka - akka-stream

I'm trying to implement a very simple service connected to an AMQP broker with Alpakka. I just want it to consume messages from its queue as a stream at the moment they are pushed on a given exchange/topic.
Everything seemed to work fine in my tests, but when I tried to start my service, I realized that my stream was only consuming my messages once and then exited.
Basically I'm using the code from Alpakka documentation :
def consume()={
val amqpSource = AmqpSource.committableSource(
TemporaryQueueSourceSettings(connectionProvider, exchangeName)
.withDeclaration(exchangeDeclaration)
.withRoutingKey(topic),
bufferSize = prefetchCount
)
val amqpSink = AmqpSink.replyTo(AmqpReplyToSinkSettings(connectionProvider))
amqpSource.mapAsync(4)(msg => onMessage(msg)).runWith(amqpSink)
}
I tried to schedule the consume() execution every second, but I experienced OutOfMemoryException issues.
Is there any proper way to make this code run as an infinite loop ?

If you want to have a Source restarted when it fails or is cancelled, wrap it with RestartSource.withBackoff.

Related

how to use a simple way to determine the end of the streaming of client in asynchronous GRPC++?

Now I'm learning Bidirectional streaming in asynchronous GRPC++.
Thanks for the master:https://github.com/Mityuha/grpc_async. I get much useful information to know the realization principle of this mode.But I have a question about it:
Not much to say,the code is following:
the server:
if(!ok || mcounter >= greeting.size())//ctx_.IsCancelled() doesn't work
{
std::cout << "[ProceedMM]: Trying finish" << std::endl;
status_ = FINISH;
responder_.Finish(Status(), (void*)this);
}
the client:
void AsyncCompleteRpc()
{
void* got_tag;
bool ok = false;
while(cq_.Next(&got_tag, &ok))
{
AbstractAsyncClientCall* call = static_cast<AbstractAsyncClientCall*>(got_tag);
call->Proceed(ok);
}
std::cout << "Completion queue is shutting down." << std::endl;
}
in this server,the end of ClientStream is judged by the bool value of OK which is send by client.It isn't similar to the way of synchronous GRPC,which is judged the steaming end by the return of bool Read(RequestType* request) in the class of ServerReaderWriter in many times.It's so strange to find the same way in the class of ServerAsyncReaderWriter which is void Read(R* msg, void* tag).Though I know it's because of the asynchronous way.But if I don't know how much times of asynchronous streaming without the judgement of "OK", how to find the way like synchronous streaming to judge the end of client streaming.Because I test the performance by java which is the same code between synchronous with asynchronous ways,which don't have the bool value of OK in asynchronous ways.
So can someone help me?Or tell me some ways to deal with it or find a way to test the performance testing of GRPC++ by Bazel of in my another question.
I'm not 100% sure that I get the question, but what ok tells you is (when false) that the operation you requested couldn't be completed and nothing else will ever complete successfully on that side of the stream. So if you issue a Read operation and the Next gives you a !ok value, then you can be sure that no more data will ever come back from the client. A more detailed explanation is given in the comments for the CompletionQueue class.
Thanks and good luck with gRPC.
In the case of receiving a stream in an asynchronous client of gRPC, you will use a ClientAsyncReader<> class to receive data. This class differs when both send and receive are stream, but logic is the same.
This class has a Finish() method which you need to call after finishing sending your rpc data to server. When answer stream from server is finished, a message to CompletionQueue will be added which corresponds to this method. This Finish method returns final status when its message is returned in CQ. You can find out that your stream is finished. Your code will be similar to this:
response_reader_ = stub->PrepareAsyncXYZ(ctx_, req, cq);
response_reader_->StartCall(&start_data_);
response_reader_->Finish(&status_, &finish_data_);
in this sample, message in CQ will have finish_data_ tag and you can use it to handle it properly. You will probably will need to manage messages for Finish() and Read() by reference counting, because you will probably get an additional failed read message too. when message with finish_data_ is received in CQ, status_ will have the valid value of status.
At least it is how I wrote it.

How to schedule JMS consuming in Apache Camel?

I need to consume JMS messages with Camel everyday at 9pm (or from 9pm to 10pm to give it the time to consume all the messages).
I can't see any "scheduler" option for URIs "cMQConnectionFactory:queue:myQueue" while it exists for "file://" or "ftp://" URIs.
If I put a cTimer before it will send an empty message to the queue, not schedule the consumer.
You can use a route policy where you can setup for example a cron expression to tell when the route is started and when its stopped.
http://camel.apache.org/scheduledroutepolicy.html
Other alternatives is to start/stop the route via the Java API or JMX etc and have some other logic that knows when to do that according to the clock.
This is something that has caused me a significant amount of trouble. There are a number of ways of skinning this cat, and none of them are great as far as I can see.
On is to set the route not to start automatically, and use a schedule to start the route and then stop it again after a short time using the controlbus EIP. http://camel.apache.org/controlbus.html
I didn't like this approach because I didn't trust that it would drain the queue completely once and only once per trigger.
Another is to use a pollEnrich to query the queue, but that only seems to pick up one item from the queue, but I wanted to completely drain it (only once).
I wrote a custom bean that uses consumer and producer templates to read all the entries in a queue with a specified time-out.
I found an example on the internet somewhere, but it took me a long time to find, and quickly searching again I can't find it now.
So what I have is:
from("timer:myTimer...")
.beanRef( "myConsumerBean", "pollConsumer" )
from("direct:myProcessingRoute")
.to("whatever");
And a simple pollConsumer method:
public void pollConsumer() throws Exception {
if ( consumerEndpoint == null ) consumerEndpoint = consumer.getCamelContext().getEndpoint( endpointUri );
consumer.start();
producer.start();
while ( true ) {
Exchange exchange = consumer.receive( consumerEndpoint, 1000 );
if ( exchange == null ) break;
producer.send( exchange );
consumer.doneUoW( exchange );
}
producer.stop();
consumer.stop();
}
where the producer is a DefaultProducerTemplate, consumer is a DefaultConsumerTemplate, and these are configured in the bean configuration.
This seems to work for me, but if anyone gives you a better answer I'll be very interested to see how it can be done better.

Job Queue using Google PubSub

I want to have a simple task queue. There will be multiple consumers running on different machines, but I only want each task to be consumed once.
If I have multiple subscribers taking messages from a topic using the same subscription ID is there a chance that the message will be read twice?
I've tested something along these lines successfully but I'm concerned that there could be synchronization issues.
client = SubscriberClient.create(SubscriberSettings.defaultBuilder().build());
subName = SubscriptionName.create(projectId, "Queue");
client.createSubscription(subName, topicName, PushConfig.getDefaultInstance(), 0);
Thread subscriber = new Thread() {
public void run() {
while (!interrupted()) {
PullResponse response = subscriberClient.pull(subscriptionName, false, 1);
List<ReceivedMessage> messages = response.getReceivedMessagesList();
mess = messasges.get(0);
client.acknowledge(subscriptionName, ImmutableList.of(mess.getAckId()));
doSomethingWith(mess.getMessage().getData().toStringUtf8());
}
}
};
subscriber.start();
In short, yes there is a chance that some messages will be duplicated: GCP promises at-least-once delivery. Exactly-once-delivery is theoretically impossible in any distributed system. You should design your doSomethingWith code to be idempotent if possible so duplicate messages are not a problem.
You should also only acknowledge a message once you have finished processing it: what would happen if your machine dies after acknowledge but before doSomethingWith returns? your message will be lost! (this fundamental idea is why exactly-once delivery is impossible).
If losing messages is preferable to double processing them, you could add a locking process (write a "processed" token to a consistent database), but this can fail if the write is handled before the message is processed. But at this point you might be able to find a messaging technology that is designed for at-most-once, rather than optimised for reliability.

Stop Camel after too many retries

I am trying to implement more advanced Apache Camel error handling:
in case if there are too many pending retries then stop processing at all and log all collected exceptions somewhere.
First part (stop on too many retries) is already implemented by following helper method, that gets size of retry queue and I just stop context if queue is over some limit:
static Long getToRetryTaskCount(CamelContext context) {
Long retryTaskCount = null;
ScheduledExecutorService errorHandlerExecutor = context.getErrorHandlerExecutorService();
if (errorHandlerExecutor instanceof SizedScheduledExecutorService)
{
SizedScheduledExecutorService svc = (SizedScheduledExecutorService) errorHandlerExecutor;
ScheduledThreadPoolExecutor executor = svc.getScheduledThreadPoolExecutor();
BlockingQueue<Runnable> queue = executor.getQueue();
retryTaskCount = (long) queue.size();
}
return retryTaskCount;
}
But this code smells to me and I don't like it and also I don't see here any way to collect the exceptions caused all this retries.
There is also a new control bus component in camel 2.11 which could do what you want (source)
template.sendBody("controlbus:route?routeId=foo&action=stop", null);
I wouldn't try to shutdown the CamelContext, just the route in question...that way the rest of your app can still function, you can get route stats and view/move messages to alternate queues, etc.
see https://camel.apache.org/how-can-i-stop-a-route-from-a-route.html

how to manually set a task to run in a gae queue for the second time

I have a task that runs in GAE queue.
according to my logic, I want to determine if the task will run again or not.
I don't want it do be normally executed by the queue and then to put it again in the queue
because I want to have the ability to check the "X-AppEngine-TaskRetryCount"
and quit trying after several attempts.
To my understanding it seems that the only case that a task will re-executed is when an internal GAE error will happen (or If my code will take too long in a "DeadlineExceededException" cases..(And I don't want to hold the code "hostage" for that long :) )
How can I re-enter a task to the queue in a manner that GAE will set X-AppEngine-TaskRetryCount ++ ??
You can programmatically retry / restart a task using a self.error() in python.
From the docs: App engine retries a task by returning any HTTP status code outside of the range 200–299
And at the beginning of the task you can test for the number of retries using:
retries = int(self.request.headers['X-Appengine-Taskretrycount'])
if retries < 10 :
self.error(409)
return

Resources