Splitter/Aggregator with fire/forget and timeout - apache-camel

We have a splitter process which pushes messages to different queues. There's another process which collects and aggregates these messages for further processing.
We want to have a timeout between the moment of splitting and being aggregated.
IIUC, the aggregation timeout starts with the first message and is reset after every aggregated message (it is interval-based, not measured over the complete group).
What's the best way to solve this?

EDIT
Here's the best I was able to come up with, although it's a bit of a hack. First, you save a timestamp as a message header and publish it to the queue with the body:
from("somewhere")
    .split(body())
    .process(e -> e.getIn().setHeader("aggregation_timeout",
        ZonedDateTime.now().plusSeconds(COMPLETION_TIMEOUT)))
    .to("aggregation-route-uri");
Then, when consuming and aggregating, you use a custom aggregation strategy that will save the aggregation_timeout from the first message in the current group and then use a completionPredicate that reads that value to check whether the timeout has expired (alternatively, if you're aggregating in a way that keeps the message ordering intact, you could just read the header from the first message). Use a short completionTimeout as a safeguard for cases when the interval between two messages is long:
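from("aggregation-route-uri")
    .aggregate(bySomething())
    .aggregationStrategy((oldExchange, newExchange) -> {
        if (oldExchange == null) {
            // first message in the group: read its aggregation_timeout header
            // and set it as a property on the grouped exchange
            newExchange.setProperty("aggregation_timeout",
                newExchange.getIn().getHeader("aggregation_timeout", ZonedDateTime.class));
            return newExchange;
        }
        // perform aggregation into oldExchange here
        return oldExchange;
    })
    .completionTimeout(1000) // intentionally low value, here as a safeguard
    .completionPredicate(e -> {
        // complete once the deadline has been reached (now is at or past it)
        ZonedDateTime deadline = e.getProperty("aggregation_timeout", ZonedDateTime.class);
        return !ZonedDateTime.now().isBefore(deadline);
    })
    .process(e -> { /* do something with aggregates */ });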
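The direction of the deadline comparison is easy to get backwards, so here is a minimal plain-Java sketch of the check, with no Camel dependency. The class and method names (AggregationDeadline, deadlineFrom, isExpired) are hypothetical and exist only for illustration; the idea is that a group should complete once the current time is at or past the deadline stamped at split time.

```java
import java.time.Duration;
import java.time.ZonedDateTime;

// Hypothetical helper, not part of Camel: decides whether a group's deadline has passed.
final class AggregationDeadline {
    private AggregationDeadline() {}

    // Deadline stamped at split time: the split moment plus the allowed duration.
    static ZonedDateTime deadlineFrom(ZonedDateTime splitTime, Duration allowed) {
        return splitTime.plus(allowed);
    }

    // True once "now" is at or past the deadline, i.e. the group should complete.
    static boolean isExpired(ZonedDateTime deadline, ZonedDateTime now) {
        return !now.isBefore(deadline);
    }
}
```

Passing the current time in as a parameter, rather than calling ZonedDateTime.now() inside the check, keeps the logic easy to test; a completion predicate would pass ZonedDateTime.now() as the second argument.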

Related

Apache Camel - method shortcut to jump to aggregator when exception thrown in the middle of a multiple step split route

I'd like to ask whether there is a way to skip the rest of the split route and jump directly to the aggregator part when, in the exception handler, I mark the split route to continue.
I have a route like this:
receive a message
fetch config for 3 endpoints
merge config and message as a tuple for each endpoint, and create a list of it
.split(), and in the split route I convert message according to config for each endpoint(s1), fetch oauth token(s2), send to final endpoint with token(s3), collect response for each endpoint(s4), aggregate(split aggregator; splitting ends here, let's call it sa)
return as a whole one result
stop
As you can see, the split route has 4 steps (s1-s4); if any of these steps fails, I want to jump to the aggregation (sa). For example, it does not make sense to continue the split route if s1 or s2 fails.
I define an onException() clause to handle the exception and mark it to continue (continued(true)), because I want to reach the aggregator anyway. Also, if I mark it continued(false), not only the split route but the whole route (meaning the main route even before splitting) will be rolled back. I want to decide on rollback after collecting all the causes/exceptions from each split branch.
I have a workaround for a simple case: in the exception handler for errors in s2, I set an exchange property oauth_failed to true, and add a choice().when() check after s2; if this property is null, go to s3 (continue sending). Solely for this purpose I must isolate s3 as a separate route (direct:s3).
.bean(S2Bean.class)
.choice()
    .when(simple("${exchangeProperty.oauth_failed} == null")) // null = continue the flow
        .to("direct:s3")
    .endChoice()
    // otherwise, it will skip s3 and s4, and jump to aggregator directly
.end()
But, what can I do if s1 throws exception? Do I need to isolate s2 as a direct endpoint too? Then each step in the pipeline should be a separate endpoint. I don't like that.
Found a solution: use doTry and doCatch in the split route and don't call .stop().
from("direct:split")
    .doTry()
        .bean(S1Bean.class)
        .bean(S2Bean.class)
        .bean(S3Bean.class)
        .bean(S4Bean.class)
    .endDoTry()
    .doCatch(javax.ws.rs.ProcessingException.class) // oauth timeout
        .log(LoggingLevel.ERROR, "Time out, never retry, just aggregate")
        .bean(MyGenericExceptionHandler.class)
    .doCatch(Exception.class)
        .log(LoggingLevel.ERROR, "Other exceptions, mark as failed, aggregate")
        .bean(MyGenericExceptionHandler.class)
    .end();
In MyGenericExceptionHandler, call exchange.getIn().setBody(xxx) to set the body to the type the aggregator expects. The exception is available via exchange.getProperty(Exchange.EXCEPTION_CAUGHT, Exception.class); the response code is null. (I created a DTO that holds a status code and/or an exception, so that success and failure are aggregated with the same class.)
Don't call stop().
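A DTO that carries either a status code or the caught exception could look roughly like the sketch below. Everything here is an assumption for illustration: the class name BranchResult, its factory methods, and the anyFailed helper are hypothetical and do not come from the original route. The point is only that every split branch hands the aggregator the same type, so the rollback decision can be made once all branches are in.

```java
import java.util.List;

// Hypothetical DTO: holds either a status code (success) or the caught exception (failure).
final class BranchResult {
    private final Integer statusCode; // null when the branch failed before getting a response
    private final Exception error;    // null when the branch succeeded

    private BranchResult(Integer statusCode, Exception error) {
        this.statusCode = statusCode;
        this.error = error;
    }

    static BranchResult success(int statusCode) {
        return new BranchResult(statusCode, null);
    }

    static BranchResult failure(Exception error) {
        return new BranchResult(null, error);
    }

    boolean isFailed() {
        return error != null;
    }

    Integer getStatusCode() {
        return statusCode;
    }

    Exception getError() {
        return error;
    }

    // After aggregation: true if at least one branch failed, e.g. to decide on rollback.
    static boolean anyFailed(List<BranchResult> results) {
        for (BranchResult r : results) {
            if (r.isFailed()) {
                return true;
            }
        }
        return false;
    }
}
```

In this sketch the exception handler would set the body to BranchResult.failure(caughtException), while the last successful step would set it to BranchResult.success(code).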

Can I send an alert when a message is published to a pubsub topic?

We are using pubsub & a cloud function to process a stream of incoming data. I am setting up a dead letter topic to handle cases where a message cannot be processed, as described at Cloud Pub/Sub > Guides > Handling message failures.
I've configured a subscription on the dead-letter topic to retain messages for 7 days. We're doing this using Terraform:
resource "google_pubsub_subscription" "dead_letter_monitoring" {
  project = var.project_id
  name    = var.dead_letter_sub_name
  topic   = google_pubsub_topic.dead_letter.name
  expiration_policy { ttl = "" }
  message_retention_duration = "604800s" # 7 days
  retain_acked_messages      = true
  ack_deadline_seconds       = 600
}
We've tested our cloud function robustly and consequently our expectation is that messages will appear on this dead-letter topic very very rarely, perhaps never. Nevertheless we're putting it in place just to make sure that we catch any anomalies.
Given how rarely we expect messages to appear on the dead-letter topic, we need to set up an alert that sends an email when such a message appears. Is it possible to do this? I've looked through the alerts one can create at https://console.cloud.google.com/monitoring/alerting/policies/create but I didn't see anything that could accomplish this.
I know that I could write a cloud function to consume a message from the subscription and act on it accordingly, but I'd rather not have to do that; a monitoring alert feels like a much more elegant way of achieving this.
Is this possible?
Yes, you can use Cloud Monitoring for that. Create a new policy with the following configuration:
Select PubSub Topic and Published message. Observe the value every minute and count them (aligner in the advanced option). Now, in the config, when it's above 0 from the most recent value, the alert is raised.
To filter on only your topic you can add a filter by topic_id on your topic name.
Then, configure your alert to send an email. It should work!

How to split a message do some extra processing on one of them and aggregate them back

I need to configure some camel routes based on some configuration files.
All configured routes will need to split a message into one or two sub messages then do some JMS integration work on the first one and then aggregate together the JMS reply with the optional second message. In a simplified picture it will look like below:
message -- > split --> message 1 --> JMS request/reply --> aggregate --> more processing
\--> message 2 /
The aggregation will be done on completion size which I am able to know upfront if it is going to be 1 or 2 depending of the route meta data. When the second message is present no other processing is needed before being merged back with the JMS reply.
So in short, I need a split followed by routing followed by an aggregation, which is quite a common pattern. The only particularity is that when the second split message is present, I don't need to do anything with it before aggregating it back.
In the Java DSL it would look something like this:
from("direct:abc")
    // The splitter below will set the JmsIntegration flag
    .split().method(MySplitter.class, "split")
    .choice()
        .when(header("JmsIntegration"))
            .inOut("jms:someQueue")
        .otherwise()
            // what should I have on here?
            .to(???)
    .end()
    .aggregate(...).to(...);
So my questions would be:
What should I put on the otherwise branch?
What I need in fact is an if: if the split message needs JMS go to JMS and then move to aggregator if it is not just go straight to the aggregator. I am considering creating a dummy processor which will actually do nothing but this seems to me a naive approach.
Am I on the wrong path? If so, what would be the alternative?
Initially I was thinking about a message enricher, but I would not like to send the original message to the JMS endpoint.
I also considered putting my aggregation strategy inside my splitter but again I could not put it all together.
Based off your post it looks like you are trying to have the return of your enrichment merge with the original message, but you want to send a custom message to the jms endpoint. I would recommend storing your original message in either a bean or a cache or something of the sort, leveraging all of your conversions with camel and then have your aggregation strategy leverage your storage to return your desired format.
from("direct:abc")
    .split().method(MySplitter.class, "split")
    .choice()
        .when(header("JmsIntegration"))
            .beanRef("MyStorageBean", "storeOriginal")
            .convertBodyTo(MyJmsFormat.class)
            //This aggregation strategy could have a reference
            //to your storage bean and retrieve the instance
            .enrich("jms:someQueue", myCustomAggreationStrategyInstance)
        .otherwise()
    .end()
    .aggregate(...)
    .to("direct:continueProcessing");
Option #2: Based on your comment saying you needed the original message that the direct:abc endpoint received, this can be simplified a lot. In this example we can use Camel's existing original-message store to retrieve the message that was passed into direct:abc. If your message after the split has a JmsIntegration header, we convert the body to the desired format for the JMS call, then use the enrich statement to make the JMS call with a custom aggregator that gives you access to the message used to call the JMS endpoint, the message that came back, and the original message that direct:abc received. If your message does not have a JmsIntegration header, it goes to the otherwise branch of the route, which does no additional processing before ending the choice statement, and then the split messages are aggregated back together with whatever custom strategy you need.
from("direct:abc")
    .split().method(MySplitter.class, "split")
    .choice()
        .when(header("JmsIntegration"))
            .convertBodyTo(MyJmsFormat.class)
            //See aggregationStrategy sample below
            .enrich("jms:someQueue", myAggStrat)
        .otherwise()
            //Non JmsIntegration header messages come here,
            //but receive no work and are passed on.
    .end()
    .aggregate(...)
    .to("direct:continueProcessing");

//Your Custom Aggregator
public Exchange aggregate(Exchange oldExchange, Exchange newExchange) {
    //This logic will retrieve the original message passed into direct:abc
    Message originalMessage = oldExchange.getUnitOfWork().getOriginalInMessage();
    //TODO logic for manipulating your exchanges and returning the desired result
    return oldExchange; // placeholder: return whichever exchange holds your merged result
}
You said you considered using Enricher, but you don't want to send raw message. You can resolve this neatly by using a pre-JMS route:
from("direct:abc")
    .enrich("direct:sendToJms", new MyAggregation())
    .to("direct:continue");

from("direct:sendToJms")
    // do marshalling or conversion here as necessary
    .convertBodyTo(MyJmsRequest.class)
    .to("jms:someQueue");

public class MyAggregation implements AggregationStrategy {
    public Exchange aggregate(Exchange original, Exchange resource) {
        MyBody originalBody = original.getIn().getBody(MyBody.class);
        MyJmsResponse resourceResponse = resource.getIn().getBody(MyJmsResponse.class);
        Object mergeResult = ... // combine original body and resource response
        original.getIn().setBody(mergeResult);
        return original;
    }
}
The Splitter automatically aggregates split exchanges back together. However, the default aggregation strategy (since 2.3) is to return the original exchange. You can easily override the default strategy with your own by specifying it directly on the Splitter. Furthermore, if you don't have an alternative flow for your Choice, then it's much easier to use a Filter. Example:
from("direct:abc")
    .split().method(MySplitter.class, "split").aggregationStrategy(new MyStrategy())
        .filter(header("JmsIntegration"))
            .inOut("jms:someQueue")
        .end()
    .end()
    .to(...);
You still need to implement MyStrategy to combine the two messages.

Writing a GSM modem driver?

I've been working on an application which uses a GSM modem for one of two things; check its status using the built in HTTP stack by sending a GET request to the server, or sending data to the server (using UDP). I have tried several different methods to keep this as reliable as possible, and I'm finally ready to ask for help.
My application is written for the SIMCOM908 module and the PIC18 platform (I'm using a PIC18 Explorer for development).
So the problem is sometimes the modem is busy doing something, and misses a command. As a human, I would see that and just resend the command. Adding a facility for my MCU to timeout and resend isn't an issue.
What is an issue is that the modem sends unsolicited responses after different events. When the modem changes registration status (with the cell tower) it would respond with +CGREG: 1, ... or when the GPS is ready GPS Ready. These responses can happen at any time, including in the middle of a command (like creating an IP connection).
This is a problem, because I haven't thought of a way to deal with this. My application needs to send a command (to connect to the server, for example, AT+CIPSTART="UDP","example.com",5000). This command will respond with 'OK', and then, when the command has finished, 'CONNECT OK'. However, I need to be able to react to the many other possible responses, and I haven't figured out a way of doing this. What do I need to do in my code to wait for a response from the modem, check the response, and perform an action based on that response?
I am code limited (being on an 8-bit microcontroller!) and would like to keep repetition to a minimum. How can I write a response function that will take a response from the GSM module (solicited or not) and then let the rest of my program know what is happening?
Ideally, I'd like to do something with those responses, like keeping an internal state (when I hear GPS Ready, I know I can power the GPS, etc.).
Maybe there are some things I should think about, or maybe there's an open source project that already solves this problem?
Here's what I have so far:
/* Command responses */
enum {
    // Common
    OK = 0,
    ERROR,
    TIMEOUT,
    OTHER,
    // CGREG
    NOT_REGISTERED,
    // CGATT
    NOT_ATTACHED,
    // Network Status
    NO_NETWORK,
    // GPRS status
    NO_ADDRESS,
    // HTTP ACTION
    NETWORK_ERROR,
    // IP Stack State
    IP_INITIAL,
    IP_STATUS,
    IP_CONFIG,
    UDP_CLOSING,
    UDP_CLOSED,
    UDP_CONNECTING
} gsmResponse;

int gsm_sendCommand(const char * cmd) {
    unsigned long timeout = timer_getCurrentTime() + 5000;
    uart_clearb(GSM_UART); // Clear the input buffer
    uart_puts(GSM_UART, cmd); // Send the command to the module
    while (strstr(bf2, "\r") == NULL) { // Keep waiting for a response from the module
        if (timeout < timer_getCurrentTime()) { // Check we haven't timed out yet
            printf("Command timed out: %s\r\n", cmd);
            return TIMEOUT;
        }
    }
    timer_delay(100); // Let the rest of the response be received.
    return OK;
}

int gsm_simpleCommand(const char * cmd) {
    if (gsm_sendCommand(cmd) == TIMEOUT)
        return TIMEOUT;
    // Getting an ERROR response is quick, so if there is a response, this will be there
    if (strstr(bf2, "ERROR") != NULL)
        return ERROR;
    // Sometimes the OK (meaning the command ran) can take a while
    // As long as there wasn't an error, we can wait for the OK
    while (strstr(bf2, "OK") == NULL);
    return OK;
}
A simple command is any AT command that is specifically looking for OK or ERROR in response, something like AT. However, I also use it for more advanced commands like AT+CPIN? because it means I will have captured the whole response, and can further search for the +CPIN: READY. However, none of this actually responds to the unsolicited responses. In fact, the gsm_sendCommand() function will return early when an unsolicited response is received.
What is a good way to manage complex, occasionally unsolicited, status messages like this? Please note that this application is written in C and runs on an 8-bit microcontroller!
Having to handle both unsolicited messages as well as responses to requests in the same data stream is difficult since you will need to demultiplex the incoming stream and dispatch the results to the appropriate handler. It's a bit like an interrupt handler in that you have to drop what you were doing and handle this other bit of information which you were not necessarily expecting.
Some modules have a secondary serial port which can also be used for messages. If this is possible you could have unsolicited messages only appear on a single serial port while the main port is for your AT commands. This may not be possible, and some GSM modules will not support the complete command set on a secondary port.
Perhaps a better approach is to just disable unsolicited messages. Most commands allow the state to be requested. For example, while waiting for registration, instead of waiting for an unsolicited registration message to appear, simply poll the module for the current registration state. This allows you to always be in control, and you only have to handle the responses for the command just sent. If you're waiting for multiple events you can poll in a loop for each item in turn. This will generally make the code simpler as you only have to handle a single response at a time. The downside is that your response times are limited by your polling rate.
If you're set on continuing with the unsolicited message approach, I'd suggest implementing a small queue for unsolicited messages. While waiting for responses to a command, if the response does not match the command, just push the response on a queue. Then, when you've either received a response to your AT command or timed out you can process the unsolicited message queue afterwards.

Why are these deferred tasks not being executed in the order in which they were added?

I'm using Twilio to send sms's with appengine. Twilio doesn't accept sms's longer than 160 characters so I have to split them. I am splitting the sms's and sending them as follows:
def send_sms_via_twilio(mobile_number, message_text):
    client = TwilioRestClient(twilio_account_sid, twilio_auth_token)
    message = client.sms.messages.create(to=mobile_number, from_=my_twilio_number, body=message_text)

split_list = split_sms(long_message)
for each_message in split_list:
    send_sms_via_twilio(mobile_number, each_message)
However, I found that the order of sending varied. For example, sometimes I'd receive message 2/5, then 1/5, then 4/5, etc., and other times the order would be correct. The order of split_list is definitely correct. To overcome the incorrect order of the SMSs I tried
for each_message in split_list:
    deferred.defer(send_sms_via_twilio, mobile_number, each_message, _countdown=1)
However I encountered the same problem. I then tried
for each_message in split_list:
    deferred.defer(send_sms_via_twilio, mobile_number, each_message, _countdown=1, _queue="send-text-message")
and defined my queue as
- name: send-text-message
  rate: 1/s
  bucket_size: 10
  max_concurrent_requests: 1
  retry_parameters:
    task_retry_limit: 5
I thought the issue was concurrency (running in python27) and that limiting max_concurrent_requests would solve it. However, the issue is still present, i.e. the texts still get sent in the wrong order. I checked the logs but couldn't see any notification of task failure; the tasks just seem to be executing in the wrong order.
Is there something I am missing? How can I fix this issue?
Note that SMS messaging (specifically the underlying protocols such as SMPP) is asynchronous by definition. This means there is no way to specify the delivery order of distinct SMS messages.
There is a way to specify the order of SMS parts by using the UDH (User Data Header) in the binary body of those messages. But this works only for long SMS messages, those that are too long to be sent in one message. For example, if your message exceeds 160 GSM-7 characters or 70 UCS-2 (UTF-16) characters, it will be sent as more than one message with a UDH.
In that case the mobile phone won't show message parts as they arrive. It will collect them in memory until the last one comes and then assemble them in the right order. For the end user this is just a message longer than usual and you don't have to write "1/3", "2/3", ... in the message.
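To make the UDH idea concrete, here is a minimal Java sketch of the standard 6-byte concatenation header with an 8-bit reference number, plus the part count for GSM-7 text. The class and method names (ConcatUdh, header, partsNeeded) are hypothetical, and the sketch assumes the caller already knows the message length in GSM-7 septets (extended characters count as two). Whether and how a given provider's API accepts raw UDH bytes is not covered here.

```java
// Hypothetical helper: builds the concatenation UDH carried at the start of each SMS part.
final class ConcatUdh {
    // With a 6-byte UDH present, a GSM-7 part carries 153 characters instead of 160.
    static final int GSM7_CHARS_PER_PART = 153;

    private ConcatUdh() {}

    // reference: same value for every part of one message (0-255);
    // partNumber is 1-based and must not exceed totalParts.
    static byte[] header(int reference, int totalParts, int partNumber) {
        if (reference < 0 || reference > 255
                || totalParts < 1 || totalParts > 255
                || partNumber < 1 || partNumber > totalParts) {
            throw new IllegalArgumentException("invalid concatenation parameters");
        }
        return new byte[] {
            0x05,               // UDHL: five header bytes follow
            0x00,               // IEI: concatenated short message, 8-bit reference
            0x03,               // IEDL: three data bytes follow
            (byte) reference,   // message reference shared by all parts
            (byte) totalParts,  // total number of parts
            (byte) partNumber   // sequence number of this part
        };
    }

    // Number of SMS parts needed for a text of the given GSM-7 length.
    static int partsNeeded(int gsm7Length) {
        if (gsm7Length <= 160) {
            return 1; // fits in a single message, no UDH required
        }
        return (gsm7Length + GSM7_CHARS_PER_PART - 1) / GSM7_CHARS_PER_PART;
    }
}
```

The receiving phone groups parts by the shared reference byte and orders them by the sequence byte, which is why arrival order no longer matters for parts sent this way.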
Disclaimer: I work for a company that enables you to send and receive both multiple binary messages with user-specified headers (UDH) and/or standard long messages.
If you are not tied to Twilio, try using SMSified. They automatically split the message for you, ensure it is in the correct order, and add "1/2, 2/2..." to the end of the message. In other words, you just send the complete message to their REST API, no matter the length, and they handle the rest. Since they also use a REST API you can continue to use Python.
