Does Apache Camel create multiple threads for multiple from() inside configure() in 3 server nodes? - apache-camel

If we deploy the following Camel code on three WildFly nodes:
configure(){
from("sftp").to("MyQueue")
from("MyQueue").to("database")
}
How does the program execute on all three nodes?
Does it create 6 threads in total, i.e. on each node 1 thread for from("sftp") doing the polling and 1 thread for from("MyQueue") listening for what the SFTP route delivers?

I'm not sure about the SFTP consumer, but 1 thread sounds plausible because an SFTP endpoint does not support concurrent consumers.
That said, you should take care when you run multiple Camel applications that consume from the same FTP endpoint. Otherwise you will get lots of errors because they compete with each other for the same files.
For a JMS consumer, you can configure the number of concurrent consumers when you configure the broker connection.
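As a rough illustration of the arithmetic behind the question, here is a minimal sketch. It assumes every node runs its own Camel context with the same routes, that an SFTP route polls with one thread, and that a JMS route uses as many threads as its `concurrentConsumers` setting (an option of the camel-jms endpoint). The class and method names are hypothetical, and threads Camel or the broker client create internally are not counted.

```java
// Back-of-the-envelope estimate of consumer threads across a cluster.
// Assumption: each node runs its own Camel context with identical routes.
final class ConsumerThreadEstimate {

    // One polling thread per SFTP route; each JMS route gets `concurrentConsumers` threads.
    static int perNode(int sftpRoutes, int jmsRoutes, int concurrentConsumers) {
        return sftpRoutes + jmsRoutes * concurrentConsumers;
    }

    // Every node contributes the same number of consumer threads.
    static int acrossCluster(int nodes, int sftpRoutes, int jmsRoutes, int concurrentConsumers) {
        return nodes * perNode(sftpRoutes, jmsRoutes, concurrentConsumers);
    }

    // Builds a JMS endpoint URI that sets the camel-jms `concurrentConsumers` option.
    static String jmsEndpoint(String queue, int concurrentConsumers) {
        return "jms:queue:" + queue + "?concurrentConsumers=" + concurrentConsumers;
    }

    public static void main(String[] args) {
        // Three nodes, one SFTP route and one JMS route each, one JMS consumer per route.
        System.out.println(acrossCluster(3, 1, 1, 1));
        System.out.println(jmsEndpoint("MyQueue", 5));
    }
}
```

With one consumer per route this gives the 6 threads the question guesses at; raising `concurrentConsumers` only grows the JMS share.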

Related

Apache Camel, SpringBoot application in PCF results with message failures

We have developed an Apache Camel, Spring Boot based application that reads data from an Oracle table, transforms each record, and publishes it to a Kafka topic. We are using the Camel SQL component for DB integration and implemented the split and parallel processing pattern to achieve high throughput. The Oracle table is partitioned, so we create multiple routes, one route per table partition, to speed up the processing. We have 30 partitions in the table and so created 30 routes.
from(buildSelectSqlEndpoint(routeId))
    .process(new GenericEventProcessor("MessagesReceived"))
    .split(body())
    .parallelProcessing()
    .process(new GenericEventProcessor("SplitAndParallelProcessing"))
    .process(new InventoryProcessor())
    .process(new GenericEventProcessor("ConvertedToAvroFormat"))
    .to(buildKafkaProducerEndPoint(routeId));
I tested the application on my local laptop with an adequate load and it processes as expected. But when we deployed the application in PCF, we saw some of the threads failing. I have enabled the Camel debug log and I see the debug log line below:
Pipeline - Message exchange has failed: so breaking out of pipeline
for exchange
Because of this, thousands of messages are not published to Kafka.
From initial analysis, I figured out that Camel creates one thread for each route and, based on the maxMessagesPerPoll configuration, creates a number of threads equal to maxMessagesPerPoll. With split and parallel processing enabled, Camel creates additional threads equal to maxMessagesPerPoll for the Kafka publish. With this approach we will have hundreds of threads, and I am not sure whether that is an issue in PCF. I also tried reducing the number of routes to 2 to check the message delivery failures; I still see hundreds of failures, and with only 2 routes the total processing time for millions of records increases.
Could you please let me know how to use Apache Camel with container platforms like PCF? Are there any additional configurations we need in PCF or Camel?

spring boot different instances of a microservice and data integrity

If several instances of the same microservice each have their own database, for scalability, how do you update all the databases when a create, update or delete operation is made? Which tool compatible with Eureka and Zuul does Spring offer for that?
I would suggest using RabbitMQ.
The basic architecture of a message queue is simple: client applications called producers create messages and deliver them to the broker (the message queue). Other applications, called consumers, connect to the queue and subscribe to the messages to be processed. An application can be a producer, a consumer, or both. Messages placed onto the queue are stored until a consumer retrieves them.
Why use RabbitMQ?
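The producer, broker, and consumer roles described above can be sketched with a plain in-memory queue. This is only a stand-in written against the Java standard library, not the RabbitMQ client API; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// In-process stand-in for a broker: messages are stored until a consumer retrieves them.
final class InMemoryBrokerSketch {

    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();

    // Producer side: hand a message to the broker.
    void publish(String message) {
        queue.add(message);
    }

    // Consumer side: take up to `max` stored messages in arrival order.
    List<String> consume(int max) {
        List<String> received = new ArrayList<>();
        queue.drainTo(received, max);
        return received;
    }

    // Messages still waiting for a consumer.
    int pending() {
        return queue.size();
    }
}
```

A real broker adds what this sketch leaves out: network transport, persistence, acknowledgements, and routing between several queues.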
https://www.cloudamqp.com/blog/2015-05-18-part1-rabbitmq-for-beginners-what-is-rabbitmq.html
Official RabbitMQ documentation:
https://www.rabbitmq.com/
How to install RabbitMQ:
https://www.journaldev.com/11655/spring-rabbitmq
Configuration in a Spring Boot application:
https://spring.io/guides/gs/messaging-rabbitmq/
I would suggest an event-based architecture: when a service has done its work, it produces an event, and the other services that subscribe to that event then start their own work.
You can use a Kafka queue for this. Also, read about Distributed Sagas for Microservices.
One more thing: for inter-service communication, use UDP instead of TCP.
Most databases offer replication these days with near-zero latency. Unless you are using different database systems, you can let the database do the synchronization for you.

is camel-cxf consumer multithreaded and how to check if a component uses multiple threads?

We are using camel-cxf as a consumer (SOAP) in our project and asked ourselves whether camel-cxf uses multiple threads to react to requests.
We think it uses multiple threads, right?
But what does this mean for the rest of the route? Is everything multithreaded after "from", or is there a point of synchronization?
And what does this mean for "parallelProcessing" or "threads"?
In our case we use the jdbc component later in the route. Is camel-jdbc also using multiple threads?
How can we know in general what threading model is used by a given component?
Let's start with your last question:
How to know in general what threading model is used by a given component?
You are probably asking which component is single-threaded by default and which ones are multi-threaded?
You need to ask yourself which approach makes the most sense for a component and read the component's documentation. Normally the flags will tell you what behavior is applied by default. CXF is a component that requires a web server, Jetty in this case, so that a SOAP (over HTTP) client can call the service. HTTP is a stateless protocol and a web server has to scale to many clients, so it makes a lot of sense for a web server to be multi-threaded. So yes, two simultaneous requests to a CXF endpoint are handled by two separate (Jetty) threads. The route starting at the CXF endpoint is executed simultaneously by the Jetty threads that received the requests.
In contrast, if you are polling for file system changes, e.g. you want to check if a certain file was created, it makes no sense to apply multiple threads to the task of polling. Thus the file consumer is single-threaded. The thread employed by the file consumer to do the polling will also execute your route that processes the file(s) that were picked up during a poll.
If processing the files identified by a poll takes a long time compared to your polling interval, and you cannot afford to miss a poll, then you need to hand off the processing of the rest of the route to another thread so your polling thread is again free to do, well, polling. Enter the Threads DSL.
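The hand-off idea can be sketched with the Java standard library alone. This is not Camel's Threads DSL, just a minimal illustration of the same pattern under the assumption that a poll yields a list of file names; all names are hypothetical.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

final class PollerHandOffSketch {

    // Simulates one poll: the calling (polling) thread only submits work to the pool.
    // Returns "<file>@<thread name>" for each file, naming the thread that processed it.
    static List<String> pollAndHandOff(List<String> files, int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        List<String> processedBy = new CopyOnWriteArrayList<>();
        for (String file : files) {
            // The slow part of the route would run here, on a worker thread.
            pool.execute(() -> processedBy.add(file + "@" + Thread.currentThread().getName()));
        }
        // A real poller would return to polling at once; we wait only so the
        // result can be inspected after the call.
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return processedBy;
    }
}
```

None of the returned entries carries the caller's thread name, which is the point: the polling thread never executes the processing step.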
Then you have processors like the splitter that create many tasks from a single task. To make the splitter work for everyone, it must be assumed that the tasks created by the splitter cannot be performed out of order or fully independently of each other. So the safe default is to run the steps wrapped by the split step in the thread that executes the route as a whole. But if you, the route author, know that the individual split items can be processed independently of each other, then you can parallelize the processing of the steps wrapped by the split step by setting parallelProcessing="true".
Both the Threads DSL and parallelProcessing="true" acquire threads from a thread pool. Camel creates a pool for you. But if you want to use multiple pools or a pool with a different configuration, then you can always supply your own.
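To make the difference between the two split modes concrete, here is a minimal sketch in plain Java. It does not use Camel; it only mirrors the idea of a sequential default versus parallel processing on a pool supplied by the caller. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.function.Function;

final class SplitSketch {

    // Default-style split: every item is processed in order on the calling thread.
    static <T, R> List<R> splitSequential(List<T> items, Function<T, R> step) {
        List<R> out = new ArrayList<>();
        for (T item : items) {
            out.add(step.apply(item));
        }
        return out;
    }

    // Parallel split: items run on the supplied pool, so they may execute in any
    // order; results are still collected in input order.
    static <T, R> List<R> splitParallel(List<T> items, Function<T, R> step, ExecutorService pool) {
        List<Future<R>> futures = new ArrayList<>();
        for (T item : items) {
            Callable<R> task = () -> step.apply(item);
            futures.add(pool.submit(task));
        }
        List<R> out = new ArrayList<>();
        for (Future<R> future : futures) {
            try {
                out.add(future.get());
            } catch (Exception e) {
                throw new IllegalStateException("split item failed", e);
            }
        }
        return out;
    }
}
```

Passing the pool in as a parameter corresponds to supplying your own thread pool: its size bounds how many split items run at the same time.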

Akka Camel multiple consumers

I'm using Akka + Camel to consume messages from ActiveMQ, and I'm trying to figure out how to deploy this consumer on multiple machines without duplicating messages. In this case I'm consuming messages from a topic, and ActiveMQ should see one Akka system spread over several machines instead of several independent systems.
I tried to accomplish that using Akka Cluster, but the example where a frontend subscribes to a cluster of backends does not help, since my "backend" actor is the ActiveMQ consumer itself and I can't tell ActiveMQ to subscribe to my cluster.
Any ideas?
JMS versions before 2.0 do not allow multiple nodes to share a topic subscription (that is, without duplicating the message to each consumer). To cope with that, ActiveMQ provides Virtual Topics: you consume messages published to a topic from a queue, which allows multiple consumers with load balancing.
It's all naming conventions. You simply publish to the topic VirtualTopic.Orders and then consume from the queue Consumer.ClusterX.VirtualTopic.Orders. The naming conventions can be changed; see the docs.
http://activemq.apache.org/virtual-destinations.html
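The default naming convention described above can be written down as a small helper. This is a sketch that only builds the destination name as a string; the class and method names are hypothetical, and it assumes the default prefixes have not been reconfigured on the broker.

```java
final class VirtualTopicNames {

    // Queue a consumer group reads from under the default convention
    // Consumer.<group>.<topic>, where <topic> itself starts with "VirtualTopic.".
    static String consumerQueue(String group, String virtualTopic) {
        if (!virtualTopic.startsWith("VirtualTopic.")) {
            throw new IllegalArgumentException("not a virtual topic: " + virtualTopic);
        }
        return "Consumer." + group + "." + virtualTopic;
    }

    public static void main(String[] args) {
        // Every node of the same cluster consumes from this one queue and shares the load.
        System.out.println(consumerQueue("ClusterX", "VirtualTopic.Orders"));
    }
}
```

All nodes that should share the work use the same group name; a second, independent application would pick a different group and so receive its own copy of each message.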

RabbitMQ message consumers stop consuming messages

Our team is in a spike sprint to choose between ActiveMQ and RabbitMQ. We made 2 little producer/consumer spikes sending an object message with an array of 16 strings, a timestamp, and 2 integers. The spikes are OK on our dev machines (messages are consumed correctly).
Then came the benchmarks. We first noticed that sometimes, on our machines, when we were sending a lot of messages, the consumer would hang. It was still there, but the messages were accumulating in the queue.
Then we moved to the benchmark platform:
cluster of 2 RabbitMQ machines, 4 cores/3.2 GHz, 4 GB RAM, load balanced by a VIP
one to 6 consumers running on the RabbitMQ machines, saving the messages in a MySQL DB (same type of machine for the DB)
12 producers running on 12 AS machines (Tomcat), hit with JMeter running on another machine. The load is about 600 to 700 HTTP requests per second on the servlets, which produce the same load of RabbitMQ messages.
We noticed that sometimes consumers hang (well, they are not blocked, but they don't consume messages anymore). We can see that because each consumer saves around 100 msg/sec in the database, so when one stops consuming, the overall number of messages saved per second in the DB falls by the same ratio (if, say, 3 consumers stop, we fall from around 600 msg/sec to 300 msg/sec).
During that time, the producers are OK and still produce at the JMeter rate (around 600 msg/sec). The messages are in the queues and are taken by the consumers that are still "alive".
We load all the servlets with the producers first, then launch all the consumers one by one, checking that the connections are OK, then run JMeter.
We are sending messages to one direct exchange. All consumers are listening to one persistent queue bound to the exchange.
This point is a major factor in our choice. Have you seen this with RabbitMQ, and do you have an idea of what is going on?
Thank you for your answers.
It's always worth setting the prefetch count when using basic.consume:
channel.basicQos(100);
before the channel.basicConsume line, in order to ensure you never have more than 100 messages queued up in your QueueingConsumer.
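The effect of a prefetch limit can be pictured with a bounded buffer. The sketch below is an analogy written with the Java standard library, not the RabbitMQ client API: the buffer plays the role of the consumer-side queue, and the broker is modelled as simply being refused once the limit is reached. All names are hypothetical.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

final class PrefetchBufferSketch {

    private final BlockingQueue<String> buffer;

    PrefetchBufferSketch(int prefetch) {
        this.buffer = new ArrayBlockingQueue<>(prefetch);
    }

    // Broker side: returns false (the message stays on the broker) once the limit is reached.
    boolean deliver(String message) {
        return buffer.offer(message);
    }

    // Consumer side: handling a message frees one slot for the next delivery.
    // Returns null when nothing is buffered.
    String handleNext() {
        return buffer.poll();
    }

    // Messages currently held on the consumer side.
    int buffered() {
        return buffer.size();
    }
}
```

Without such a bound, a slow consumer can end up holding a large backlog in memory while other consumers sit idle.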
I have seen this behavior when using the RabbitMQ STOMP plugin. I haven't found a solution yet.
Are you using the STOMP plugin?
A RabbitMQ channel is not thread-safe, so check whether the consumer's channel is being used from more than one thread.
