Send aggregation result on the main route, not the input messages - apache-camel

I want to get the result of the aggregation as the main message of the route, not the original messages that came on the IN.
I also want to do this in one route.
I know I could use .to("direct:one_result") after the aggregation, but I am strongly constrained to doing this in one route, because I'm generating routes dynamically.
My .to("mock:out") will be replaced by a longer route definition.
from("direct:in").routeId("TEST_AGGREGATION_ROUTE")
.log("<IN> ${body}")
.aggregate(header("THE_ID"), (oldExchange, newExchange) -> {
final List<Object> body;
final Exchange outExchange;
if (oldExchange == null) {
outExchange = newExchange;
body = new ArrayList<>();
body.add(newExchange.getIn().getBody());
} else {
outExchange = oldExchange;
body = oldExchange.getIn().getBody(List.class);
body.add(newExchange.getIn().getBody());
}
outExchange.getIn().setBody(body);
return outExchange;
})
.completionSize(4)
.completionTimeout(30000)
.log("<AGGREGATION> size = ${body.size}") // HERE I GET THE AGGREGATION RESULT
.end()
.log("<OUT> ${body}") // HERE I GET THE INPUT MESSAGES
.to("mock:out")
;
The test output looks like:
TEST_AGGREGATION_ROUTE - <IN> BODY1
TEST_AGGREGATION_ROUTE - <OUT> BODY1
TEST_AGGREGATION_ROUTE - <IN> BODY2
TEST_AGGREGATION_ROUTE - <OUT> BODY2
TEST_AGGREGATION_ROUTE - <IN> BODY3
TEST_AGGREGATION_ROUTE - <OUT> BODY3
TEST_AGGREGATION_ROUTE - <IN> BODY4
TEST_AGGREGATION_ROUTE - <AGGREGATION> size = 4
TEST_AGGREGATION_ROUTE - <OUT> BODY4

There is a mistake in your routing. You should not process the final result of the aggregation "outside the loop" but in a sub-route. Do not put any statement after your end().
from("direct:in")
...
.aggregate(header("THE_ID"), (oldExchange, newExchange) -> {...})
.completionSize(4)
.completionTimeout(30000)
.to("direct:processAggregation")
.end();
from("direct:processAggregation")
.log("<AGGREGATION> size = ${body.size}")
.log("<OUT> ${body}");
Once the aggregation has reached its completion size, the whole aggregate is sent to the next "to(...)" endpoint inside the aggregate block. So whatever you want to do with each aggregate should be modelled in a separate route.
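Putting the two together, the example from the question could be restructured roughly like this (a sketch that keeps the original aggregation lambda and completion settings; "direct:processAggregation" is only an illustrative endpoint name):

from("direct:in").routeId("TEST_AGGREGATION_ROUTE")
    .log("<IN> ${body}")
    .aggregate(header("THE_ID"), (oldExchange, newExchange) -> {
        // same list-building strategy as in the question
        if (oldExchange == null) {
            List<Object> body = new ArrayList<>();
            body.add(newExchange.getIn().getBody());
            newExchange.getIn().setBody(body);
            return newExchange;
        }
        List<Object> body = oldExchange.getIn().getBody(List.class);
        body.add(newExchange.getIn().getBody());
        return oldExchange;
    })
    .completionSize(4)
    .completionTimeout(30000)
    .to("direct:processAggregation") // completed aggregates are routed here
    .end();

from("direct:processAggregation")
    .log("<AGGREGATION> size = ${body.size}")
    .log("<OUT> ${body}")
    .to("mock:out");

With this layout the <OUT> log and mock:out only ever see the aggregated list, never the individual input messages.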

Related

Use the next 10 results from a feeder and repeat the request 10 times in Gatling

I am using Gatling 3.6.1 and I am trying to repeat the request 10 times for the next 10 products from the feeder file. This is what I tried:
feed(products, 10)
  .repeat(10, "index") {
    exec(session => {
      val index = session("index").as[Int]
      val counter = index + 1
      session.set("counter", counter)
    })
    .exec(productIdsRequest())
  }

private def productIdsRequest() = {
  http("ProductId${counter}")
    .get(path + "products/${product_code${counter}}")
    .check(jsonPath("$..code").count.gt(2))
}
I am having trouble getting the counter value into my API URL.
I would like to have something like
products/${product_code1},
products/${product_code2} etc.
But instead, I get the error 'nested attribute definition is not allowed'.
So basically I would like every request to be called with one product from the feeder (in the batch of 10 products).
Can you please help?
Thanks!
Disclaimer: I don't know how your products feeder is defined.
If I understand you correctly, you just need to move .repeat up a level:
.repeat(10, "counter") {
feed(product)
.exec(http("ProductId ${counter}")
.get("products/${product_code}")
.check(jsonPath("$..code").count.gt(2)))
}

AKKA HTTP + AKKA stream 100% CPU utilization

I have a web API exposing one GET endpoint using Akka HTTP. The logic is: take the parameter from the requester, call an external web service using Akka Streams, and, based on the response, query another endpoint, also using an Akka stream.
The first external endpoint call looks like this:
def poolFlow(uri: String): Flow[(HttpRequest, T), (Try[HttpResponse], T), HostConnectionPool] =
  Http().cachedHostConnectionPool[T](host = uri, 80)

def parseResponse(parallelism: Int): Flow[(Try[HttpResponse], T), (ByteString, T), NotUsed] =
  Flow[(Try[HttpResponse], T)].mapAsync(parallelism) {
    case (Success(HttpResponse(_, _, entity, _)), t) =>
      entity.dataBytes.alsoTo(Sink.ignore)
        .runFold(ByteString.empty)(_ ++ _)
        .map(e => e -> t)
    case (Failure(ex), _) => throw ex
  }

def parse(result: String, data: RequestShape): (Coord, Coord, String) =
  (data.src, data.dst, result)

val parseEntity: Flow[(ByteString, RequestShape), (Coord, Coord, String), NotUsed] =
  Flow[(ByteString, RequestShape)] map {
    case (entity, request) => parse(entity.utf8String, request)
  }
And the stream consumer:
val routerResponse = httpRequests
  .map(buildHttpRequest)
  .via(RouterRequestProcessor.poolFlow(uri)).async
  .via(RouterRequestProcessor.parseResponse(2))
  .via(RouterRequestProcessor.parseEntity)
  .alsoTo(Sink.ignore)
  .runFold(Vector[(Coord, Coord, String)]()) {
    (acc, res) => acc :+ res
  }

routerResponse
Then I do some calculations on routerResponse and create a POST to the other external web service.
Second external stream consumer:
def poolFlow(uri: String): Flow[(HttpRequest, Unit), (Try[HttpResponse], Unit), Http.HostConnectionPool] =
  Http().cachedHostConnectionPoolHttps[Unit](host = uri)

val parseEntity: Flow[(ByteString, Unit), (Unit.type, String), NotUsed] = Flow[(ByteString, Unit)] map {
  case (entity, _) => parse(entity.utf8String)
}

def parse(result: String): (Unit.type, String) = (Unit, result)

val res = Source.single(httpRequest)
  .via(DataRobotRequestProcessor.poolFlow(uri))
  .via(DataRobotRequestProcessor.parseResponse(1))
  .via(DataRobotRequestProcessor.parseEntity)
  .alsoTo(Sink.ignore)
  .runFold(List[String]()) {
    (acc, res) => acc :+ res._2
  }
The GET endpoint consumes the first stream and then builds the second request based on the first response.
Notes:
The first external service is fast (1-2 seconds response time), and the second external service is slow (3-4 seconds response time).
The first endpoint is queried with parallelism=2 and the second endpoint with parallelism=1.
The service is running on an AWS ECS cluster, and for test purposes it is running on a single node.
The problem: the web service works for some time, but CPU utilization grows as it handles more requests. I would assume something to do with back pressure is being triggered, but the CPU also stays highly utilized after no requests are being sent, which is strange.
Does anybody have a clue what's going on?

Camel Rest DSL - AggregationStrategy strange behavior

The title is a bit of a teaser, and it is of course my fault if it does not work as it should.
I want to perform a data transfer from an RDBMS to Solr and MongoDB.
To do that, I have to complete the following steps (for example):
Get customer ids to transfer
Get customer details
Get customer invoices
Get customer payments
Then, aggregate and save to MongoDB and Solr for indexing.
Here is my code, but I cannot get it to work:
from("seda:initial-data-transfer")
.setProperty("recipientList", simple("direct:details,direct:invoices,direct:payments"))
.setProperty("afterAggregate", simple("direct:mongodb,direct:solr"))
.setBody(constant("{{query.initial-data-transfer.ids}}"))
.to(jdbc)
.process(new RowSetIdsProcessor())
.split().tokenize(",", 1000) // ~200k ids - group by 1000 ids
.to("direct:customers-ids");
from("direct:customers-ids")
.recipientList(exchangeProperty("recipientList").tokenize(","))
// ? .aggregationStrategy(new CustomerAggregationStrategy()).parallelProcessing()
.aggregate(header("CamelCorrelationId"), new CustomerAggregationStrategy())
.completionPredicate(new CustomerAggregationPredicate()) // true if details + invoices + payments, etc ....
// maybe a timeOut here ?
.process(businessDataServiceProcessor)
.recipientList(exchangeProperty("afterAggregate").tokenize(","));
from("direct:details")
.setHeader("query", constant("{{query.details}}"))
.bean(SqlTransform.class,"detailsQuery").to(jdbc)
.process(new DetailsProcessor());
from("direct:invoices")
.setHeader("query", constant("{{query.invoices}}"))
.bean(SqlTransform.class,"invoicessQuery").to(jdbc)
.process(new InvoicesProcessor());
I do not understand how the AggregationStrategy works.
Sometimes I can process 2 or 3 blocks of 1000 ids and save to MongoDB and Solr, but after that, all exchanges arrive empty in the AggregationStrategy...
I tried a lot of things, but each time the aggregation fails.
Thanks for your help
Update :
Here is a part of the CustomerAggregationStrategy :
public class CustomerAggregationStrategy implements AggregationStrategy {

    @Override
    public Exchange aggregate(Exchange oldExchange, Exchange newExchange) {
        Message newIn = newExchange.getIn();
        CustomerDataCollector collector = null;
        if (oldExchange == null) {
            int completionSize = newExchange.getProperty("completionSize", Integer.class);
            collector = new CustomerDataCollector(completionSize);
            CollectData(collector, newIn, newExchange);
            newIn.setBody(collector);
            return newExchange;
        }
        collector = oldExchange.getIn().getBody(CustomerDataCollector.class);
        CollectData(collector, newIn, newExchange);
        return oldExchange;
    }

    private void CollectData(CustomerDataCollector collector, Message message, Exchange exchange) {
        String recipientListEndpoint = (String) exchange.getProperty(Exchange.RECIPIENT_LIST_ENDPOINT);
        switch (recipientListEndpoint) {
            case "direct://details":
                collector.setDetails(message.getBody(Map.class));
                break;
            case "direct://invoices":
                collector.setInvoices(message.getBody(Map.class));
                break;
            case "direct://payments":
                collector.setPayments(message.getBody(Map.class));
                break;
        }
    }
}
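The CustomerAggregationPredicate used as the completion predicate is not shown in the question; a minimal sketch, assuming CustomerDataCollector exposes a hypothetical isComplete() check that returns true once details, invoices and payments have all been set, might look like this:

import org.apache.camel.Exchange;
import org.apache.camel.Predicate;

public class CustomerAggregationPredicate implements Predicate {

    @Override
    public boolean matches(Exchange exchange) {
        // The aggregated exchange carries the collector built up by the strategy above.
        CustomerDataCollector collector = exchange.getIn().getBody(CustomerDataCollector.class);
        // isComplete() is an assumption: true once all three parts have been collected.
        return collector != null && collector.isComplete();
    }
}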
Update :
I can log this in the CustomerAggregationStrategy :
String camelCorrelationId = (String)exchange.getProperty(Exchange.CORRELATION_ID);
[t-AggregateTask] .i.c.a.CustomerAggregationStrategy : CustomerAggregationStrategy.CollectData : direct://details ID-UC-0172-50578-1484523575668-0-5
[t-AggregateTask] .i.c.a.CustomerAggregationStrategy : CustomerAggregationStrategy.CollectData : direct://invoices ID-UC-0172-50578-1484523575668-0-5
[t-AggregateTask] .i.c.a.CustomerAggregationStrategy : CustomerAggregationStrategy.CollectData : direct://payments ID-UC-0172-50578-1484523575668-0-5
Same values for the CamelCorrelationId, as expected.
I think the CamelCorrelationId is correct, isn't it?
OK, it is better now.
After the tokenizer, I set the property CustomCorrelationId like this:
.split().tokenize(",", 1000)
    .setProperty("CustomCorrelationId", header("breadcrumbId"))
    .to("direct:customers-ids")
And aggregate on this value like this:
from("direct:customers-ids")
.recipientList(exchangeProperty("recipientList").tokenize(","))
from("direct:details")
.setHeader("query", constant("{{query.details}}"))
.bean(SqlTransform.class,"detailsQuery").to(jdbc)
.process(new DetailsProcessor())
.to("direct:aggregate");
...
from("direct:aggregate").routeId("aggregate")
.log("route : ${routeId}")
.aggregate(property("CustomCorrelationId"), new CustomAggregationStrategy())
.completionPredicate(new CustomerAggregationPredicate())
.process(businessDataServiceProcessor)
.recipientList(exchangeProperty("afterAggregate").tokenize(","));
This works fine now and the data is correctly aggregated. Thanks for your help.
You pointed me in the right direction.

Akka stream stops after one element

My akka stream is stopping after a single element. Here's my stream:
val firehoseSource = Source.actorPublisher[FirehoseActor.RawTweet](
  FirehoseActor.props(
    auth = ...
  )
)

val ref = Flow[FirehoseActor.RawTweet]
  .map(r => ResponseParser.parseTweet(r.payload))
  .map { t => println("Received: " + t); t }
  .to(Sink.onComplete({
    case Success(_) => logger.info("Stream completed")
    case Failure(x) => logger.error(s"Stream failed: ${x.getMessage}")
  }))
  .runWith(firehoseSource)
FirehoseActor connects to the Twitter firehose and buffers messages to a queue. When the actor receives a Request message, it takes the next element and returns it:
def receive = {
  case Request(_) =>
    logger.info("Received request for next firehose element")
    onNext(RawTweet(queue.take()))
}
The problem is that only a single tweet is being printed to the console. The program doesn't quit or throw any errors, and I've sprinkled logging statements around, and none are printed.
I thought the sink would keep applying pressure to pull elements through but that doesn't seem to be the case since neither of the messages in Sink.onComplete get printed. I also tried using Sink.ignore but that only printed a single element as well. The log message in the actor only gets printed once as well.
What sink do I need to use to make it pull elements through the flow indefinitely?
Ah I should have respected totalDemand in my actor. This fixes the issue:
def receive = {
  case Request(_) =>
    logger.info("Received request for next firehose element")
    while (totalDemand > 0) {
      onNext(RawTweet(queue.take()))
    }
}
I was expecting to receive a Request for each element in the stream, but apparently a single Request can carry demand for multiple elements.

Pagination in Google cloud endpoints + Datastore + Objectify

I want to return a List of "Posts" from an endpoint with optional pagination.
I need 100 results per query.
The code I have written is as follows; it doesn't seem to work.
I am referring to an example in the Objectify Wiki.
Another option I know of is using query.offset(100);
But I read somewhere that this just loads the entire table and then ignores the first 100 entries, which is not optimal.
I guess this must be a common use case and an optimal solution will be available.
public CollectionResponse<Post> getPosts(@Nullable @Named("cursor") String cursor, User auth) throws OAuthRequestException {
    if (auth != null) {
        Query<Post> query = ofy().load().type(Post.class).filter("isReviewed", true).order("-timeStamp").limit(100);
        if (cursor != null) {
            query.startAt(Cursor.fromWebSafeString(cursor));
            log.info("Cursor received :" + Cursor.fromWebSafeString(cursor));
        } else {
            log.info("Cursor received : null");
        }
        QueryResultIterator<Post> iterator = query.iterator();
        for (int i = 1; i <= 100; i++) {
            if (iterator.hasNext()) iterator.next();
            else break;
        }
        log.info("Cursor generated :" + iterator.getCursor());
        return CollectionResponse.<Post>builder().setItems(query.list()).setNextPageToken(iterator.getCursor().toWebSafeString()).build();
    } else throw new OAuthRequestException("Login please.");
}
Here is code using offsets, which seems to work fine:
@ApiMethod(
    name = "getPosts",
    httpMethod = ApiMethod.HttpMethod.GET
)
public CollectionResponse<Post> getPosts(@Nullable @Named("offset") Integer offset, User auth) throws OAuthRequestException {
    if (auth != null) {
        if (offset == null) offset = 0;
        Query<Post> query = ofy().load().type(Post.class).filter("isReviewed", true).order("-timeStamp").offset(offset).limit(LIMIT);
        log.info("Offset received :" + offset);
        log.info("Offset generated :" + (LIMIT + offset));
        return CollectionResponse.<Post>builder().setItems(query.list()).setNextPageToken(String.valueOf(LIMIT + offset)).build();
    } else throw new OAuthRequestException("Login please.");
}
Be sure to assign the query:
query = query.startAt(cursor);
Objectify's API uses a functional style. startAt() does not mutate the object.
Try the following:
Remove your for loop -- it's not clear why it is there. Just iterate through the results and build up the list of items that you want to send back; stick to the iterator rather than forcing it through 100 items in a loop.
Next, once you have iterated through it, use the iterator.getStartCursor() as the value of the cursor.
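Putting both suggestions together, the cursor-based method might look roughly like this (a sketch only, reusing the Post entity, LIMIT constant and imports from the code above; not the poster's actual implementation):

@ApiMethod(
    name = "getPosts",
    httpMethod = ApiMethod.HttpMethod.GET
)
public CollectionResponse<Post> getPosts(@Nullable @Named("cursor") String cursor, User auth) throws OAuthRequestException {
    if (auth == null) throw new OAuthRequestException("Login please.");

    Query<Post> query = ofy().load().type(Post.class)
            .filter("isReviewed", true)
            .order("-timeStamp")
            .limit(LIMIT);
    if (cursor != null) {
        // startAt() returns a new Query, so the result must be assigned back
        query = query.startAt(Cursor.fromWebSafeString(cursor));
    }

    // Iterate once, collecting the items to return; the iterator's cursor
    // then points just past the last item read.
    List<Post> items = new ArrayList<>();
    QueryResultIterator<Post> iterator = query.iterator();
    while (iterator.hasNext()) {
        items.add(iterator.next());
    }

    return CollectionResponse.<Post>builder()
            .setItems(items)
            .setNextPageToken(iterator.getCursor().toWebSafeString())
            .build();
}

Because the query is limited to LIMIT results, the while loop reads at most one page, and the returned page token can be passed back in as the cursor parameter on the next call.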
