Akka Streams, what happens with the buffer after you call shutdown on the killingSwitch? DSL Graph - akka-stream

I have a DSL graph that is connected to RabbitMQ (both source and sink).
If I start the service with 10 messages already in the queue and akka.stream.materializer.max-input-buffer-size set to 1, and I trigger the killingSwitch after one message has been processed and another one is in flight, it seems that I lose the message that is in the Akka Streams buffer. (The stream does not shut down until all in-flight jobs complete.)
I end up with 7 messages remaining in the queue.
Any idea how that buffer works, how to get access to it, or how to process that message as well?
Example:
messages in the queue at start
5646245d2b0000251a9fe92b
56def590430000fd1dac3e47
542560eae4b0ba04ec469e12 (the message that will get lost)
55835213e4b03eb77098e88e
569edf2851000098027fdad8
6cb975919f254472b61c012d0b76e119
53667258e4b09a003032bcb3
92e4765c5dae4c8485b0a3aa088b8c1b
5326b1c4e4b0b5ce16824303
5623f7912c000072223bc3af
acknowledged messages
5646245d2b0000251a9fe92b
56def590430000fd1dac3e47
542560eae4b0ba04ec469e12 (lost message)
processed messages
5646245d2b0000251a9fe92b
56def590430000fd1dac3e47
the messages in rabbitMQ buffer that are requeued
55835213e4b03eb77098e88e
messages left in the queue
55835213e4b03eb77098e88e
569edf2851000098027fdad8
6cb975919f254472b61c012d0b76e119
53667258e4b09a003032bcb3
92e4765c5dae4c8485b0a3aa088b8c1b
5326b1c4e4b0b5ce16824303
5623f7912c000072223bc3af

how to perform parallel processing of gcp pubsub messages in apache camel

I have the code below, which takes a message from a Pub/Sub source topic, transforms it according to a template, and then publishes the transformed message to a target topic.
To improve performance I need to do this in parallel. That is, I need to poll 500 messages, transform them in parallel, and then publish them to the target topic.
From the Camel GCP component documentation I believe the maxMessagesPerPoll and concurrentConsumers parameters will do the job. Due to the lack of documentation I am not sure how this works internally:
a) If I poll, say, 500 messages, will it create 500 parallel routes that process the messages and publish them to the target topic?
b) What about the ordering of the messages?
c) Should I be looking at parallel processing EIPs as an alternative?
The concept is not clear to me.
// my route
private void addRouteToContext(final PubSub pubSub) throws Exception {
    this.camelContext.addRoutes(new RouteBuilder() {
        @Override
        public void configure() throws Exception {
            errorHandler(deadLetterChannel("google-pubsub:{{gcp_project_id}}:{{pubsub.dead.letter.topic}}")
                    .useOriginalMessage().onPrepareFailure(new FailureProcessor()));
            // from topic
            from("google-pubsub:{{gcp_project_id}}:" + pubSub.getFromSubscription() + "?"
                    + "maxMessagesPerPoll={{consumer.maxMessagesPerPoll}}&"
                    + "concurrentConsumers={{consumer.concurrentConsumers}}")
                    // transform using the velocity template
                    .to("velocity:" + pubSub.getToTemplate() + "?contentCache=true")
                    // attach header to the transformed message
                    .setHeader("Header ", simple("${date:now:yyyyMMdd}")).routeId(pubSub.getRouteId())
                    // log the transformed event
                    .log("${body}")
                    // publish the transformed event to the target topic
                    .to("google-pubsub:{{gcp_project_id}}:" + pubSub.getToTopic());
        }
    });
}
a) If I poll, say, 500 messages, will it create 500 parallel routes that process the messages and publish them to the target topic?
No, Camel does not create 500 parallel threads in this case. As you suspect, the number of concurrent consumer threads is set with concurrentConsumers. So if you define 5 concurrentConsumers with a maxMessagesPerPoll of 500, every consumer will fetch up to 500 messages and process them one after the other in a single thread. That is, at most 5 messages are processed in parallel at any time.
b) What about the ordering of the messages?
As soon as you process messages in parallel, the order of the messages is no longer guaranteed. But this already happens with 1 consumer when you get processing errors and the failed messages are detoured to your deadLetterChannel and reprocessed later.
c) Should I be looking at parallel processing EIPs as an alternative?
Only if the concurrentConsumers option is not sufficient.
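To make the answer to (a) concrete, here is a minimal plain-JDK model of that behaviour. It is not Camel code and does not claim to mirror Camel's internals: the class name ConsumerModelSketch, the run method and the in-memory queue standing in for the subscription are all assumptions made for illustration. Each of the consumer threads repeatedly takes a batch of up to maxMessagesPerPoll messages and works through it sequentially, so the number of messages in flight never exceeds the number of consumers.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Plain-JDK model of the behaviour described in the answer; NOT Camel code.
public class ConsumerModelSketch {

    /** Returns {number of messages processed, highest number of messages in flight}. */
    static int[] run(int concurrentConsumers, int maxMessagesPerPoll, int totalMessages) {
        // In-memory stand-in for the subscription (an assumption of this sketch).
        BlockingQueue<Integer> subscription = new LinkedBlockingQueue<>();
        for (int i = 0; i < totalMessages; i++) {
            subscription.add(i);
        }
        AtomicInteger processed = new AtomicInteger();
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger maxInFlight = new AtomicInteger();

        // Fixed pool: one thread per concurrent consumer.
        ExecutorService consumers = Executors.newFixedThreadPool(concurrentConsumers);
        for (int c = 0; c < concurrentConsumers; c++) {
            consumers.submit(() -> {
                List<Integer> batch = new ArrayList<>();
                // "Poll": take up to maxMessagesPerPoll messages in one go.
                while (subscription.drainTo(batch, maxMessagesPerPoll) > 0) {
                    // The batch is processed one message after the other on this thread.
                    for (Integer message : batch) {
                        int now = inFlight.incrementAndGet();
                        maxInFlight.accumulateAndGet(now, Math::max);
                        processed.incrementAndGet(); // stand-in for transform + publish
                        inFlight.decrementAndGet();
                    }
                    batch.clear();
                }
            });
        }
        consumers.shutdown();
        try {
            consumers.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for consumers", e);
        }
        return new int[] {processed.get(), maxInFlight.get()};
    }

    public static void main(String[] args) {
        int[] result = run(5, 500, 10_000);
        System.out.println("processed=" + result[0] + ", max in flight=" + result[1]);
    }
}
```

With 5 consumers the reported maximum in flight is never above 5, whatever value is chosen for maxMessagesPerPoll. The batch size only changes how many messages each consumer holds at once, not how many are processed simultaneously.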
When you set the concurrentConsumers option (let's say concurrentConsumers=10), you are asking Camel to create a thread pool of 10 threads, and each of those 10 threads will pick up a different message from the Pub/Sub queue and process it.
The thing to note here is that the thread pool created for concurrentConsumers has a fixed size, which means that a fixed number of active threads are waiting at all times to process incoming messages. So 10 threads (since I specified concurrentConsumers=10) will be waiting to process my messages, even if there aren't 10 messages coming in simultaneously.
Obviously, this does not guarantee that the incoming messages will be processed in the same order. If you need the messages in their original order, have a look at the Resequencer EIP.
As for your third question, I don't think the google-pubsub component offers a parallel processing option. You can build your own using the Threads EIP, which would definitely give you more control over your concurrency.
Using Threads, your code would look something like this:
from("google-pubsub:project-id:destinationName?maxMessagesPerPoll=20")
    // the 2 parameters are 'pool size' and 'max pool size'
    .threads(5, 20)
    .to("direct:out");

blocking Inter task communication in RTOS

I'm writing a module that contains the highest-priority task. It should stay blocked until it receives a message from another task, and then start doing its work as the highest-priority task. It uses a mailbox mechanism for signaling.
My problem is this: I want the task that sends the signal to the higher-priority task to get a message back, also in blocking mode.
Here is my question: should I post through mailbox 1 and then fetch from mailbox 2, or is there a better solution?
I use FreeRTOS, if it helps.
EDIT
I think I described the problem very badly.
I mean: do I need 2 mailboxes to communicate from task to task or from ISR to task, or can I use just one mailbox with a different implementation?
For your edited question:
You have to use two message queues, one for each task; otherwise you won't be able to wait correctly.
So for your blocking message transfer, the code looks like this:
High priority task:
while (1) {
    /* block until a request arrives */
    xQueueReceive(high_prio_queue, &msg, portMAX_DELAY);
    /* your complex code */
    xQueueSend(low_prio_queue, &return_msg, timeout);
}
Low priority task:
xQueueSend(high_prio_queue, &msg, timeout);
/* will only wait if your high priority task gets blocked before sending */
xQueueReceive(low_prio_queue, &return_msg, portMAX_DELAY);
From ISR:
xQueueSendFromISR(high_prio_queue, &msg, &unblocked);
It is very simple. For example, using queues in FreeRTOS:
The task waits on the queue. It is in the blocked state:
while(1)
{
    xQueueReceive(queue, &object, portMAX_DELAY);
    ....
Another task sends the data to the queue:
xQueueSend(queue, &object, timeout);
When the data is received, the waiting task is given control. Afterwards it checks whether anything else is in the queue. If not, it waits in the blocked state again.

Mule 4 - Empty VM queue error consuming messages

I have a flow in Mule 4 that reads data from a CSV file and inserts it into Salesforce using a Batch:
All Salesforce results are inserted into a non-persistent VM queue (transient by default).
All messages are inserted for each block of records and are consumed without problems at the end of the batch.
However, when I have finished, the following error appears after 10 seconds:
Message : Tried to consume messages from VM queue 'productQueue' but it was empty after timeout of 10 SECONDS.
Error type : VM:EMPTY_QUEUE
Element : testing-threadingSub_Flow/processors/0/processors/0 @ testing-threading:testing-threading.xml:95 (Consume)
Element XML : <vm:consume doc:name="Consume" doc:id="6b7b2df6-c986-425c-a6f0-29613a876d37" config-ref="VM_Config" queueName="demoQueue" timeout="10"></vm:consume>
Why does the consumer of the queue run if there are no more messages to process?
I want this component to read messages only when it is its turn. Maybe I'm using the wrong kind of VM?
The VM consume operation tries to read from the queue until a configurable timeout expires, and then raises that error if the queue is still empty.
Somehow your foreach block is executing the consume more times than there are messages available. If you share your foreach XML configuration we might be able to see why.
Apart from finding out why foreach runs the consume more often than necessary, there are a few options to modify this behaviour:
Wrap the consume in a try to suppress the error:
<try doc:name="Try" >
    <vm:consume ... />
    <error-handler >
        <on-error-continue enableNotifications="false" logException="false" type=" ">
            <logger />
        </on-error-continue>
    </error-handler>
</try>
Or do not use a consume at all, and instead use a separate flow with a VM listener that listens for messages on that VM queue. This might change how your app needs to work.

FreeRtos problems in ADC task and Streaming Task

I have an ADC task that uses 4 channels and uses DMA for the transfer. I also have a streaming client that streams the ADC data through a TCP socket. I made the ADC task lower priority than the streaming client.
I'm sending an integer that indicates which ADC channel is selected to the streaming client through a message queue.
The problem is that I get a queue overflow when sending that ADC channel integer.
ADC TASK
if(bufferSelect != BUFFERS_NOT_READY)
{
    if(xQueueSend(g_adcQueue, &bufferSelect, 0) != pdPASS)
    {
        throwError(ERROR_MESSAGE_QUEUE_FULL);
        PRINTF("%s\r\n", getErrorMessage(ERROR_MESSAGE_QUEUE_FULL));
    }
    bufferSelect = BUFFERS_NOT_READY;
}
Streaming client task
/* obtain next buffer ready event */
if(xQueueReceive(g_adcQueue, &bufferSelect, 0) == pdFALSE)
{
    g_stopStreaming = true;
    continue;
}
You seem to handle the queue-full status as an error, which it normally isn't. One of the purposes of queues is to apply back-pressure to the producer, and that is exactly what you should do here: if the streaming task cannot digest the data you are throwing at it, you are simply producing too much.
The priority of the consumer only helps keep the queue fill level reasonable when there are no inactive (waiting for I/O) periods in the consumer code. As soon as the consumer has such wait periods, priority alone will not prevent the queue from becoming full.

Job Queue using Google PubSub

I want to have a simple task queue. There will be multiple consumers running on different machines, but I only want each task to be consumed once.
If I have multiple subscribers taking messages from a topic using the same subscription ID, is there a chance that a message will be read twice?
I've tested something along these lines successfully but I'm concerned that there could be synchronization issues.
client = SubscriberClient.create(SubscriberSettings.defaultBuilder().build());
subName = SubscriptionName.create(projectId, "Queue");
client.createSubscription(subName, topicName, PushConfig.getDefaultInstance(), 0);
Thread subscriber = new Thread() {
    @Override
    public void run() {
        while (!interrupted()) {
            // returnImmediately = false, maxMessages = 1
            PullResponse response = client.pull(subName, false, 1);
            List<ReceivedMessage> messages = response.getReceivedMessagesList();
            ReceivedMessage mess = messages.get(0);
            client.acknowledge(subName, ImmutableList.of(mess.getAckId()));
            doSomethingWith(mess.getMessage().getData().toStringUtf8());
        }
    }
};
subscriber.start();
In short, yes, there is a chance that some messages will be duplicated: GCP promises at-least-once delivery. Exactly-once delivery is theoretically impossible in any distributed system. You should design your doSomethingWith code to be idempotent if possible, so that duplicate messages are not a problem.
You should also acknowledge a message only once you have finished processing it. What would happen if your machine dies after acknowledge but before doSomethingWith returns? Your message would be lost! (This fundamental trade-off is why exactly-once delivery is impossible.)
If losing messages is preferable to processing them twice, you could add a locking step (write a "processed" token to a consistent database), but this loses the message if the token is written and the machine then fails before the message is processed. At that point you might be better off with a messaging technology that is designed for at-most-once delivery, rather than one optimised for reliability.
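As a minimal sketch of the two recommendations above (make the work idempotent, and acknowledge only after processing), the class below skips message IDs it has already processed and runs the acknowledgement callback only once the work has returned. IdempotentHandler, handle and the ack callback are hypothetical names introduced for illustration and are not part of any Pub/Sub client library. The set of processed IDs is kept in memory, which is an assumption that only holds for a single process; consumers on several machines would need a shared, consistent store instead.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.function.Consumer;

// Hypothetical helper for illustration; not part of any Pub/Sub client library.
public class IdempotentHandler {
    // Assumption: in-memory and single-process. Consumers on several machines
    // would need a shared, consistent store for these IDs instead.
    private final Set<String> processedIds = new HashSet<>();
    private final Consumer<String> work;

    public IdempotentHandler(Consumer<String> work) {
        this.work = work;
    }

    /**
     * Runs the work at most once per message ID and acknowledges only afterwards.
     * Returns true if the work ran, false if the delivery was a duplicate.
     * If the work throws, nothing is recorded or acknowledged, so the message
     * can be redelivered and retried.
     */
    public synchronized boolean handle(String messageId, String payload, Runnable ack) {
        boolean duplicate = processedIds.contains(messageId);
        if (!duplicate) {
            work.accept(payload);        // may throw: then no ack below
            processedIds.add(messageId); // record only after success
        }
        ack.run(); // duplicates are acknowledged too, to stop further redelivery
        return !duplicate;
    }

    public static void main(String[] args) {
        IdempotentHandler handler =
                new IdempotentHandler(p -> System.out.println("processing " + p));
        Runnable ack = () -> System.out.println("ack");
        handler.handle("id-1", "task A", ack); // prints "processing task A", then "ack"
        handler.handle("id-1", "task A", ack); // duplicate delivery: prints only "ack"
    }
}
```

Recording the ID after the work (rather than before) keeps the at-least-once behaviour: a crash in between leads to a repeat, not a loss. Recording it before the work would give the at-most-once variant described in the last paragraph.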
