In a producer using librdkafka, is it possible to know the number of messages that have been produced but not yet sent? I have turned acks off, since I am only interested in getting the message out of the producer. I want to avoid producing more messages when there is already a certain number of messages on the send queue (i.e. messages produced but not yet sent).
If you set statistics.interval.ms in the producer configuration, librdkafka will start sending its internal metrics to the callback you register.
You can register the callback with the function rd_kafka_conf_set_stats_cb. The statistics are delivered to the callback as JSON data.
One of the fields in that JSON, msg_cnt, gives you the count of queued messages that you are looking for.
More details are available at https://github.com/edenhill/librdkafka/blob/master/STATISTICS.md
I'm consuming messages from an SQS FIFO queue with maxMessagesPerPoll=5 set.
Currently I'm processing each message individually, which is a total waste of resources.
In my case, since we are using a FIFO queue and all of those 5 messages are related to the same object, I could process them all together.
I thought this could be done using the Aggregate pattern, but I wasn't able to get any results.
My consumer route looks like this:
from("aws-sqs://my-queue?maxMessagesPerPoll=5&messageGroupIdStrategy=usePropertyValue")
.process(exchange -> {
// process the message
})
I believe it should be possible to do something like this
from("aws-sqs://my-queue?maxMessagesPerPoll=5&messageGroupIdStrategy=usePropertyValue")
.aggregate(constant(true), new GroupedExchangeAggregationStrategy())
.completionFromBatchConsumer()
.process(exchange -> {
// process ALL messages together as I now have a list of all exchanges
})
but the processor is never invoked.
Second thing:
If I'm able to make this work, when is the ACK sent to SQS? When each individual message is processed, or when the aggregated batch finishes processing? I hope the latter.
When the processor is not called, the aggregator is probably still waiting for more messages to aggregate.
You could try completionSize(5) instead of completionFromBatchConsumer() as a test. If that works, the batch-completion definition is the problem.
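A minimal sketch of that test, reusing the route from the question (the completionTimeout is an assumed safety net in case fewer than 5 messages arrive in a poll):

from("aws-sqs://my-queue?maxMessagesPerPoll=5&messageGroupIdStrategy=usePropertyValue")
    .aggregate(constant(true), new GroupedExchangeAggregationStrategy())
        .completionSize(5)          // complete once 5 exchanges have been aggregated
        .completionTimeout(3000)    // assumed fallback so a short poll cannot stall the route
    .process(exchange -> {
        // process all aggregated exchanges together
    });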
Regarding the ACK against the broker: unfortunately, no. I think the message is committed when it arrives at the aggregator.
The Camel aggregator is a "stateful" component and therefore it must end the current transaction.
For this reason you can equip such components with persistent repositories to avoid data loss when the process is killed. Without a persistent repository attached, the already-aggregated messages would obviously be lost in such a scenario.
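For illustration, a sketch that attaches a persistent repository, here the LevelDB-backed one from camel-leveldb (the repository name and file path are assumptions):

AggregationRepository repo = new LevelDBAggregationRepository("sqsAgg", "data/sqsAgg.dat");

from("aws-sqs://my-queue?maxMessagesPerPoll=5&messageGroupIdStrategy=usePropertyValue")
    .aggregate(constant(true), new GroupedExchangeAggregationStrategy())
        .aggregationRepository(repo)  // pending aggregations survive a crash
        .completionSize(5)
    .process(exchange -> {
        // process the recovered or freshly aggregated batch
    });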
The problem lies in GroupedExchangeAggregationStrategy.
When I use this strategy, the output is an "array" of all exchanges. This means that the exchange that reaches the completion predicate no longer has the initial properties; instead it has CamelGroupedExchange and CamelAggregatedSize, which are of no use to completionFromBatchConsumer().
Since I don't actually need all the exchanges to be aggregated, it's enough to use GroupedBodyAggregationStrategy. Then the exchange properties remain as in the original exchange, and just the body contains an "array".
Another solution would be to use completionSize(Predicate predicate) with a custom predicate that extracts the necessary value from the grouped exchanges.
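A sketch of the working variant (the body type in the list depends on your payload):

from("aws-sqs://my-queue?maxMessagesPerPoll=5&messageGroupIdStrategy=usePropertyValue")
    .aggregate(constant(true), new GroupedBodyAggregationStrategy())
        .completionFromBatchConsumer()   // works again: the batch properties are preserved
    .process(exchange -> {
        // the body is now a List of the original message bodies
        List<?> bodies = exchange.getIn().getBody(List.class);
    });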
I have a scenario where I receive Message A as input. Message A must then be split into 3 different types of message and forwarded to other routes. It is important that the messages arrive in a precise order, i.e. A-1 must be sent before A-2, which must be sent before A-3.
To do this I have done the following (outline):
from("activemq:queue:somequeue-local")
.multicast().to("direct:a1","direct:a2","direct:a3");
from("direct:a1)
//split incoming message and prepare output document for A-1
.to("activemq:queue:otherqueue")
.from("direct:a2)
//split incoming message and prepare output document for A-2
.to("activemq:queue:otherqueue")
.from("direct:a3)
//split incoming message and prepare output document for A-3
.to("activemq:queue:otherqueue")
And in another context, responsible for sending out the info to the external system, I have
.from("activemq:queue:otherqueue?maxMessagesPerTask=1&concurrentConsumers=1&maxConcurrentConsumers=1")
// do different stuff based on which type we are called with then end with
.beanref("somebean","writeToFileAndCallImportbat");
Now, my problem is that when I get to the receiver, the messages arrive in random order: sometimes A-1, A-3, A-2; sometimes correctly as A-1, A-2, A-3.
I have tried adding JMSXGroupID and JMSXGroupSeq to the messages, but without any luck.
I have also tried skipping the MQ part entirely and using direct-vm: to call the shared receiver, but then it looks like I have three simultaneous invocations of the receiver at once, still in random execution order.
I was under the impression that multicast would run sequentially, unless otherwise prompted to?
Is there something fundamentally wrong with the approach taken?
I am using Camel version 2.12.
Or, said more plainly:
I would like a route that creates three different output messages, and executes a batch file on them, in order. How do I go about that?
If you use the Splitter pattern, have you checked whether the streaming property is set to false?
If streaming is enabled, Camel splits in a streaming fashion, which means it splits the input message into chunks. This reduces the memory overhead; for example, if you split big messages it is recommended to enable streaming. If streaming is enabled, the sub-message replies will be aggregated out of order, i.e. in the order they come back. If disabled, Camel processes the sub-message replies in the same order as they were split.
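For instance (a sketch; the tokenize expression is illustrative):

// default (no streaming): replies are aggregated in the original split order
.split(body().tokenize("\n"))

// streaming enabled: lower memory use, but replies are aggregated as they come back
.split(body().tokenize("\n")).streaming()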
So, it turned out not to be a problem with multicast after all.
Rather, in each of my sub-routes, I did this:
.split(..stax(SpecialClass)).streaming()
    .beanRef("transformationBean", "somefunction")
    .aggregate(constant("1"), new MyAggregator())
        .completionTimeout(5000)
        .completionSize(1000)
    .to(writeToFileAndRunBat)
Which I assumed meant "process all elements in the split, and if you aren't finished within 5 seconds or after 1000 elements, break out".
I changed it to
.split(..stax(SpecialClass), new MyAggregator()).streaming()
    .beanRef("transformationBean", "somefunction")
.end()
.to(writeToFileAndRunBat)
Come to think of it, it makes perfect sense: the first version couldn't really know when we were done, while the last (I assume) just iterates over all elements in the split and calls the aggregator for each.
Also, I should have had an .end() in the first version. So I guess the whole thing was just acting randomly.
I have set up the External Activator, which executes a simple application (at startup), and all it does is run the following SQL statement:
WAITFOR(receive top(1) * from [dbo].[DBTriggersQueue]), TIMEOUT 5000;
I have an update trigger on a table which adds messages to the queue.
My problem is that if I run several update scripts (4 updates one after another), the messages are added to the queue and my SBEA receives the event notification; however, it only receives 1-2 event notifications (according to the EATrace log), and therefore only 2 messages get dealt with by my simple app.
Does anyone know what could be preventing the notifications from being sent to, or received by, the SBEA after the initial 1-2?
It seems to be something to do with timing, because if I run each of my update scripts with a few seconds' delay, an event notification is sent per update and the messages are dealt with.
Activation is not a per-message trigger. When you're activated, you're supposed to RECEIVE all messages: keep issuing RECEIVE statements and processing the messages until RECEIVE returns an empty result set. Read more details at Understanding When Activation Occurs.
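A minimal sketch of that drain loop in Java over JDBC (the connection string is an assumption, and processMessage is a hypothetical placeholder for your handling):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class QueueDrainer {
    public static void main(String[] args) throws Exception {
        // Assumed connection string; adjust server, database and credentials.
        try (Connection con = DriverManager.getConnection(
                "jdbc:sqlserver://localhost;databaseName=MyDb;integratedSecurity=true");
             Statement stmt = con.createStatement()) {
            while (true) {
                // Wait up to 5 seconds for the next message, as in the question.
                try (ResultSet rs = stmt.executeQuery(
                        "WAITFOR (RECEIVE TOP(1) * FROM [dbo].[DBTriggersQueue]), TIMEOUT 5000")) {
                    if (!rs.next()) {
                        break; // empty result set: queue drained, exit until reactivated
                    }
                    processMessage(rs); // hypothetical per-message handling
                }
            }
        }
    }

    private static void processMessage(ResultSet rs) {
        // application-specific handling of one message
    }
}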
I'm trying to implement a mina service where the response to the final message should be based on the previous messages. Each message (header (1), data (n), end (1)) should receive a response, but the response to the "end" message should be based on the "header", and any "data" messages received as well as the "end" message. Currently, I'm routing the messages to an aggregator which completes when it finds a "header" and "end" message for a particular correlation id. Unfortunately, the response is being sent before (or at the same time?) the message is sent to the aggregator, so I don't have access to the aggregated message (which contains all the data I need to build the correct response) when building the response.
Is there a way to do this without manually storing and accessing the accumulated data (that is, without re-implementing Camel's aggregator)?
Edit:
Route is something like:
<camelContext>
    <route>
        <from uri="mina:..."/>
        <process ref="messageProcessor"/>
        <aggregate>
            <process ref="completeMessageProcessor"/>
        </aggregate>
    </route>
</camelContext>
I left out some tags and attributes (correlationExpression, completionPredicate, strategyRef, etc.) for clarity.
The messages were being aggregated properly, and they were being processed properly when "completed" (that is, when aggregated). But the response sent back through the mina endpoint to the client was the one generated by the messageProcessor, never the one generated by the completeMessageProcessor.
For example (and yes, it's a rather contrived example, but bear with me), let's say the protocol involves the client sending a header message which includes the total number of data messages it expects to send. Then it sends a number of data messages, which might be different in number to what it expected to send. Finally, it sends a footer, or end, message. The server should then respond back with the difference between the expected number of messages and the actual number of messages. With the route as written, that is impossible, since the number of messages is not known by the messageProcessor, which only processes individual messages. The completeMessageProcessor, having the aggregated message (consisting of header, all the data, and the end) does know this number, but the response generated at that point is not propagated back to the mina endpoint.
Changing the parsing of the messages to generate a message only when entire composed message is received is not an option, since the server must respond to the individual messages.
Off the top of my head, my guess is that the messageProcessor is setting up the OUT message, but the completeMessageProcessor is setting up the IN message, and the mina consumer response is expecting/using the OUT message instead.
You can add some logging to verify this. If this is the case, then you might change your messageProcessor to use the IN body instead (or use exchange headers) and add a transform after your completeMessageProcessor to set the OUT body based on the IN body:
<transform>
    <simple>${in.body}</simple>
</transform>
See this for more information: http://camel.apache.org/using-getin-or-getout-methods-on-exchange.html
UPDATE: After some discussion, the real issue is that the aggregator currently only handles "InOnly" exchanges.
I implemented my own frame decoder to parse the bytes received through a UDP socket (using NioDatagramChannelFactory and ConnectionlessBootstrap) according to our protocol.
Just to follow what is happening in the server while receiving messages, I added trace logs in each callback method of the decoder.
It appears that for almost every message the server receives, the event "channelInterestChanged" is received twice in the channelInterestChanged() method. The value of the event is first 0 (OP_NONE), then 1 (OP_READ).
I read the documentation about this, but I am still not sure I understand why I receive such events. I first thought it was because the receive buffer (or the selector queue) was full, but the server receives this event the same number of times as the "messageReceived" event (before the decode() method is called), and all the messages/frames are properly decoded as expected. When messages are missing, I do not see any event at all; in that case it is probably because the receive buffer of the datagram socket is full. But even if I increase this receive buffer, I continue to see these events and to miss messages.
So I am wondering why, for each message received, the server also receives two "channelInterestChanged" events, one with the OP_NONE value and one with the OP_READ value. Please note also that in the channel pipeline, after my frame decoder, there is an ExecutionHandler and another business-specific handler (which sends a JMS message to an ActiveMQ instance).
Any idea or explanation for me?
Thank you.
When a DownstreamChannelStateEvent is fired from a handler (e.g. by calling channel.setReadable() or channel.setWritable()), the event changes the channel's NIO selector key's interest ops in the NioDatagramWorker; later, an UpstreamChannelStateEvent is fired with the changed ops (i.e. OP_READ or OP_NONE).
Your frame decoder handler receives UpstreamChannelStateEvents because some other handler in the pipeline is changing the channel's read-interest ops (the purpose of calling channel.setReadable()/setWritable() is to throttle reads/writes and avoid congestion or an OutOfMemoryError in the application).
If you have a MemoryAwareThreadPoolExecutor in your pipeline (which monitors the amount of channel memory used), it may suspend or resume reading by calling channel.setReadable() at any time if the channel receives messages too fast. You may have to configure the MATPE instance with an optimal maxChannelMemorySize and maxTotalMemorySize, or disable the limits by setting them to 0.
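For example, a sketch under Netty 3 (MyFrameDecoder and MyBusinessHandler are hypothetical stand-ins for the handlers described above):

import org.jboss.netty.channel.ChannelPipeline;
import org.jboss.netty.channel.Channels;
import org.jboss.netty.handler.execution.ExecutionHandler;
import org.jboss.netty.handler.execution.OrderedMemoryAwareThreadPoolExecutor;

// 0 for maxChannelMemorySize and maxTotalMemorySize disables the memory
// accounting, so the executor never toggles the channel's read interest.
ExecutionHandler executionHandler = new ExecutionHandler(
        new OrderedMemoryAwareThreadPoolExecutor(16, 0, 0));

ChannelPipeline pipeline = Channels.pipeline(
        new MyFrameDecoder(),      // hypothetical: your protocol frame decoder
        executionHandler,
        new MyBusinessHandler());  // hypothetical: sends the JMS message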