Camel and source/target system availabilites strategy - apache-camel

I am new to Camel and am looking for patterns or strategies to manage the availability of a target system in a Camel route.
For example, say I want:
- to read input data from an file server
- process the data (data -> targetData)
- to send the target data (TargetData) to a target web site using Rest services (call it TargetSystem)
My question is what is the best strategy if the TargetSytem is down?
I understand that if a route fails it is possible to rollback the overall process. But in case TargetSystem is an external system and can be down for hours, I don't think trying to rollback the process untill the target system is up is a good approach.
Is there any pattern or strategy that fits well with this issue?
Regards
Gilles

This is the pattern I use with a couple of systems.
Persist your TargetData somewhere (Database table, JMS queue, Redis, ...)
Create a route that reads unsent TargetData and sends it to TargetSystem
If transfer is OK, mark TargetData accordingly (eg: set a flag in the table row, remove from Queue, etc...)
Trigger such route periodically from a timer and/or other routes. You can even trigger it with a shell command to manually "force" a resend of unsent data.
You can customize this to your needs and for example add logging where appropriate, keep track in a DB table when each tentative send occurs, how many times it failed, how many retries and so on...
You can now modularize your application in 2 modules: one that receives Data and process it to TargetData and another that manages the transfer of TargetData to TargetSystem.
Module can mean 2 CamelContextes, 2 OSGi bundles, 2 totally separate Java applications.

Related

What are the Camel constructs for "long-running routes"?

We are investigating Camel for use in a new system; one of our situations is that a set of steps is started, and some of the steps in the set can take hours or days to execute. We need to execute other steps after such long-elapsed-time steps are finished.
We also need to have the system survive reboot while some of the "routes" with these steps are in progress, i.e., the system might be rebooted, and we need the "routes" in progress to remain in their state and pick up where they left off.
I know we can use a queued messaging system such as JMS, and that messages put into such a queue can be handled to be persisted. What isn't clear to me is how (or whether) that fits into Camel -- would we need to treat the step(s) after each queue as its own route, so that it could, on startup, read from the queue? That's going to split up our processing steps into many more 'routes' than we would have otherwise, but maybe that's the way it's done.
Is/are there Camel construct/constructs which assist with this kind of system? If I know their terms and basic outline, I can probably figure it out, but what I really need is an explanation of what the constructs do.
Camel is not a human workflow / long-lasting tasks system. For that kind look at BPMS systems. Camel is more fitting for real time / near real time integrations.
For long tasks you persist their state in some external system like a message broker or database or BPMS, and then you can use Camel routes to process and move from one state to the next - or where Camel fit in such as integrating with the many different systems you can do OOTB with the 200+ Camel components.
Camel do provide graceful shutdown so you can safely shutdown or reboot Camel. But in the unlikely event of a crash, you may want to look at transactions and idempotency if you are talking about surviving a system crash.
You are referring to asynchronous processing of messages in routes. Camel has a couple of components that you can use to achieve this.
SEDA: In memory not persistent and can only call end points in the same route.
VM: In memory not persistent and can call endpoints in different camel contexts but limited to the same JVM. This component is a extension of SEDA.
JMS: Persistence is configurable on the queue stack. Much more heavy weight but also more fault tolerant than SEDA/JVM.
SEDA/JVM can be used as low overhead replacements for JMS components and in some cases I would use them exclusively. In your case the persistence element is a required so SEDA/JVM is not an option, but to keep things simple the examples will use SEDA as you can get some basics up and running quickly.
The example will assume the following we have a timer that kicks off and then there is two processes it needs to run. See screenshot below:
In this route the message flow is synchronous between the timer and the two process beans.
If we wanted to make these steps asynchronous we would need to break each step into a route of its own. We would then connect these routes using one of the components listed in the beginning. See the screenshot below:
Notice we have three routes each route only has one "processing" step in it. Route Test only has a timer which fires a messages to the SEDA queue called processOne. This message is received on the SEDA queue and sent to the Process_One bean. After this it is the send to the SEDA queue called processTwo, where it is received and passed to the Process_Two bean. All this is done asynchronously.
You can replace the SEDA components with JMS once you get to understand the concepts. I suspect that state tracking is going to be the most complicated part as Camel makes the asynchronous part easy.

Processing a million records as a batch in BizTalk

I am looking at suggestions on how to tackle this and whether I am using the right tool for the job. I work primarily on BizTalk and we are currently using BizTalk 2013 R2 with SQL 2014.
Problem:
We would be receiving positional flat files every day(around 50) from various partners and the theoretical total number of records received would be over a million records. Each record has some identifying information that will need to be sent to a web service which would come back essentially with a YES or NO based on which the incoming file is split into two files.
Originally, the scope for daily expected records was 10k which later ballooned to 100k and now is at a million records.
Attempt 1: Scatter-Gather pattern
I am debatching the records in a custom pipeline using the file disassembler, adding a couple of port configurable properties for the scatter part(following Richard Seroter's suggestion of implementing a round-robin assignment) where I control the number of scatter/worker orchestrations I spin up to call the web service and mark the records to be sent to 'Agency A' or 'Agency B' and finally push a control message that spins up the Gather/Aggregator orchestration that collects all the messages that are processed from the workers into the messagebox via correlation and creates two files to be routed to Agency A and Agency B.
So, every file that gets dropped will have it's own set of workers and a aggregator that would process the file.
This works well for files with fewer number of records but if a file has over 100k records, I see throttling happen and the file takes a long time to process and generate the two files.
I have put the receive location/worker & aggregator/send port on separate hosts.
It appears to be that the gatherer seems to be dehydrated and not really aggregating the records processed by the workers until all of them are processed and i think since the ratio of msgs published vs processed is very large, it is throttling.
Approach 2:
Assuming that the Aggregator orchestration is the bottleneck, instead of accumulating them in an orchestration, i pushed the processed records to a SQL db and 'split' the records into two XML files(basically a concatenate of msgs going to Agency A/B and wrapping it in XML declaration and using the correct msg type based on writing some of the context properties to the SQL table along with the record).
These aggregated XML records are polled and routed to the right agencies.
This seems to work okay with 100k records and completes in an acceptable amount of time. Now that the goal post/requirement has again changed with regard to expected volume, i am trying to see if BizTalk is even a feasible choice anymore.
I have indicated that BT is not the right tool for the job to perform such a task but the client is suggesting we add more servers to make it work. I am looking at SSIS.
Meanwhile, while doing some testing, some observations:
Increasing the number of workers improved processing(duh):
It looks like if each worker processed a fewer number of records in it's queue/subscription, they finished their queue quickly. When testing this 100k record file, using 100 workers completed in under 3 hrs. This is with minimal activity on the server from other applications.
I am trying to get the web service hosting team to give me a theoretical maximum no of concurrent connection they can handle. I am leaning towards asking them to see if they can handle 1000 calls and maybe the existing solution would scale with my observations.
I have adjusted a few settings for the host with regard to message count and physical memory threshold so it won't balk with the volume but I am still unsure. I didn't have to mess with these settings before and can use advice to monitor any particular counters.
The post is a bit long but I am hoping this gives an idea on what I did so far. Any help/insight appreciated in tackling this problem. If you are suggesting alternatives, i am restricted to .NET or MS based tools/frameworks but would love to hear on other options as well.
I will try to answer or give more detail if you want to clarify or understand something I didn't make clear.
First, 1 million records/messages is not the issue, but you can make it a problem by handling it poorly.
Here's the pattern I would lay out first.
Load the records into SQL Server with SSIS. This will be very fast.
Process/drain the records into you BizTalk app for...well, whatever needs to be done. Calling the service etc.
Update the SQL Record with the result.
When that process is complete, query out the Yes and No batches as one (large) message each, transform and send.
My guess is the Web Service will be the bottleneck unless it's specifically designed for such a load. You will probably have to tune BizTalk to throttle only when necessary but don't worry about that just yet. A good app pattern is more important.
In such scenarios, you should consider following approach:
De-batch the file and store individual records to MSMQ. You can easily achieve this without any extra coding effort, all you need is to create a send port using MSMQ adapter or WCF custom with netmsmq binding. If required, you can also create separate queues depending on different criteria you may have in your messages.
Receive the messages from MSMQ using receive location on a separate host.
Send them to web service on a different BizTalk host.
Try using messaging only scenarios, you can handle service response using a pipeline component if required. You can use Map on send port itself. In worst case if you need orchestration, it should only be to handle one message processing without any complex pattern.
You can again push messages back to two MSMQ for two different agencies based of web service response.
You can then receive those messages again and write them to file, you can simply use a send port with FileAppend option or use a custom pipeline component to write the received messages to file without aggregating them in orchestration. You can gather them in orchestration, if per file you don't have more than few thousand messages.
With this approach you won't have any bottleneck within BizTalk and you don't need to use complex orchestration pattern which usually end up having many persistent points.
If web service becomes a bottleneck, then you can control the rate of received message from MSMQ using 1) Ordered Delivery on MSMQ receive location and if required 2) using BizTalk host throttling by changing two properties Message Count in Db to a very low number e.g. 1000 from 50K default and increasing Spool and Tracking Data Multiplier accordingly e.g. 500 from 10 default to make sure the multiply of both number is enough for not to cause throttling due to messages within BizTalk. You can also reduce the number of worker threads on BizTalk host to make it little slow.
Please note MSMQ is part of Windows OS and does not require any additional setup. Usually installed by default, if not you can add using add-remove features. You can also use IBM MQ if your organization has the infrastructure. But for one million messages, MSMQ will be just fine.
Apologies on the late update*
We've decided to use SSIS to bulk import the file to a table and since the lookup web service is part of the same organization and network although using a different stack, they have agreed to allow us to call their lookup table upon which their web service is based on and we are using a 'merge' between those tables to identify 'Y' or 'N' and export them out via SSIS as well.
In short, we've skipped using BT. The time it now takes is within a couple of mins for a 1.5 million record file to be processed and send the split files.
Appreciate all the advice provided here.

Very long camel redelivery policy

I am using Camel and I have a business problem. We consume order messages from an activemq queue. The first thing we do is check in our DB to see if the customer exists. If the customer doesn't exist then a support team needs to populate the customer in a different system. Sometimes this can take a 10 hours or even the following day.
My question is how to handle this. It seems to me at a high level I can dequeue these messages, store them in our DB and re-run them at intervals (a custom coded solution) or I could note the error in our DB and then return them back to the activemq queue with a long redelivery policy and expiration, say redeliver every 2 hours for 48 hours.
This would save a lot of code but my question is if approach 2 is a sound approach or could lead to resource issues or problems with not knowing where messages are?
This is a pretty common scenario. If you want insight into how the jobs are progressing, then it's best to use a database for this.
Your queue consumption should be really simple: consume the message, check if the customer exists; if so process, otherwise write a record in a TODO table.
Set up a separate route to run on a timer - every X minutes. It should pull out the TODO records, and for each record check if the customer exists; if so process, otherwise update the record with the current timestamp (the last time the record was retried).
This allows you to have a clear view of the state of the system, that you can then integrate into a console to see what the state of the outstanding jobs is.
There are a couple of downsides with your Option 2:
you're relying on the ActiveMQ scheduler, which uses a KahaDB variant sitting alongside your regular store, and may not be compatible with your H/A setup (you need a shared file system)
you can't see the messages themselves without scanning through the queue, which is an antipattern - using a queue as a database - you may as well use a database, especially if you can anticipate needing to ever selectively remove a particular message.

Resuming Camel Processing after power failure

I'm currently developing a Camel Integration app in which resumption from a previous state of processing is important. When there's a power outage, for instance, it's important that all previously processed messages are not re-processed. The processing should resume from where it left off before the outage.
I've gone through a number of possible solutions including Terracotta and Apache Shiro. I'm not sure how to use either as documentation on the integration with Apache Camel is scarce. I've not settled on the two, however.
I'm looking for suggestions on the potential alternatives I can use or a pointer to some tutorial to get me started.
The difficulty in surviving outages lies primarily in state, and what to do with in-flight messages.
Usually, when you're talking state within routes the solution is to flush it to disk, or other nodes in the cluster. Taking the aggregator pattern as an example, aggregated state is persisted in an aggregation repository. The default implementation is in memory, so if the power goes out, all the state is lost. However, there are other implementations, including one for JDBC, and another using Hazelcast (a lightweight in-memory data grid). I haven't used Hazelcast myself, but JDBC does a synchronous write to disk. The aggregator pattern allows you to resume from where you left off. A similar solution exists for idempotent consumption.
The second question, around in-flight messages is a little more complicated, and largely depends on where you are consuming from. If you're in the middle of handling a web service request, and the power goes out, does it matter if you have lost the message? The user can simply retry. Any effects on external systems can be wrapped in a transaction, or an idempotent consumer with JDBC idempotent repository.
If you are building out integrations based on messaging, you should consume within a transaction, so that if your server goes down, the messages go back into the broker and can be replayed to another consumer.
Be careful when using seda: or threads blocks, these use an in-memory queue to pass exchanges between threads, any messages flowing down these sorts of routes will be lost if someone trips over the power cable. If you can't afford message loss, and need this sort of processing model, consider using a JMS queue as the endpoints between the two routes (with transactions to ensure you pick up where you left off).

Message Queue or DataBase insert and select

I am designing an application and I have two ideas in mind (below). I have a process that collects data appx. 30 KB and this data will be collected every 5 minutes and needs to be updated on client (web side-- 100 users at any given time). Information collected does not need to be stored for future usage.
Options:
I can get data and insert into database every 5 minutes. And then client call will be made to DB and retrieve data and update UI.
Collect data and put it into Topic or Queue. Now multiple clients (consumers) can go to Queue and obtain data.
I am looking for option 2 as better solution because it is faster (no DB calls) and no redundancy of storage.
Can anyone suggest which would be ideal solution and why ?
I don't really understand the difference. The data has to be temporarily stored somewhere until the next update, right.
But all users can see it, not just the first person to get there, right? So a queue is not really an appropriate data structure from my interpretation of your system.
Whether the data is written to something persistent like a database or something less persistent like part of the web server or application server may be relevant here.
Also, you have tagged this as real-time, but I don't see how the web-clients are getting updates real-time without some kind of push/long-pull or whatever.
Seems to me that you need to use a queue and publisher/subscriber pattern.
This is an article about RabitMQ and Publish/Subscribe pattern.
I can get data and insert into database every 5 minutes. And then client call will be made to DB and retrieve data and update UI.
You can program your application to be event oriented. For ie, raise domain events and publish your message for your subscribers.
When you use a queue, the subscriber will dequeue the message addressed to him and, ofc, obeying the order (FIFO). In addition, there will be a guarantee of delivery, different from a database where the record can be delete, and yet not every 'subscriber' have gotten the message.
The pitfalls of using the database to accomplish this is:
Creation of indexes makes querying faster, but inserts slower;
Will have to control the delivery guarantee for every subscriber;
You'll need TTL (Time to Live) strategy for the records purge (considering delivery guarantee);

Resources