I have a JBoss Fuse project, where I receive data (entities) from an external source and process it. I would like to implement event sourcing to be able to simulate the data consumption afterwards.
Is there a possibility to include some kind of event sourcing in camel routes? Is there a certain event store, which works best with camel and can be easily integrated?
Thanks in advance!
Best Regards, Sandra
akka-persistence is one such implementation of event sourcing. It can be integrated with a Camel route relatively easily:
http://doc.akka.io/docs/akka/current/java/camel.html#akka-camel-components
akka-persistence supports a number of pluggable storage backends, including cassandra, jdbc, redis, just to name a few popular options.
Related
I am using Apache camel for quite long time and found it to be a fantastic solution for all kind of system integration related business need. But couple of years back I came accross the Apache Nifi solution. After some googleing I found that though Nifi can work as ETL tool but it is actually meant for stream processing.
In my opinion, "Which is better" is very bad question to ask as that depend on different things. But it will be nice if somebody can describe more about the basic comparison between the two and also the obvious question, when to use what.
It will help to take decision as per my current requirement, which will be the good option in my context or should I use both of them together.
The biggest and most obvious distinction is that NiFi is a no-code approach - 99% of NiFi users will never see a line of code. It is a web based GUI with a drag and drop interface to build pipelines.
NiFi can perform ETL, and can be used in batch use cases, but it is geared towards data streams. It is not just about moving data from A to B, it can do complex (and performant) transformations, enrichments and normalisations. It comes out of the box with support for many specific sources and endpoints (e.g. Kafka, Elastic, HDFS, S3, Postgres, Mongo, etc.) as well as generic sources and endpoints (e.g. TCP, HTTP, IMAP, etc.).
NiFi is not just about messages - it can work natively with a wide array of different formats, but can also be used for binary data and large files (e.g. moving multi-GB video files).
NiFi is deployed as a standalone application - it's not a framework or api or library or something that you integrate in to something else. It is a fully self-contained, realised application that is fully featured out of the box with no additional development. Though it can be extended with custom development if required.
NiFi is natively clustered - it expects (but isn't required) to be deployed on multiple hosts that work together as a cluster for performance, availability and redundancy.
So, the two tools are used quite differently - hopefully that helps highlight some of the key differences
It's true that there is some functional overlap between NiFi and Camel, but they were designed very differently:
Apache NiFi is a data processing and integration platform that is mostly used centrally. It has a low-code approach and prefers configuration.
Apache Camel is an integration framework which is mostly used in distributed solutions. Solutions are coded in Java. Example solutions are adapters, flows, API's, connectors, cloud functions and so on.
They can be used very well together. Especially when using a message broker like Apache ActiveMQ or Apache Kafka.
An example: A java application is enhanced with Camel so that it can send messages to Kafka. In NiFi the first step is consuming those messages from Kafka. Then in the NiFi flow the message is changed in various steps. In the middle the message is put on another Kafka topic. A Camel function (CamelK) in the cloud does various operations on the message, when it's finished it put the message on a Kafka topic. The message goes through a NiFi flow which at the end calls an API created with Camel.
In a blog I wrote in detail on the various ways to combine Camel and Nifi:
https://raymondmeester.medium.com/using-camel-and-nifi-in-one-solution-c7668fafe451
What is mediation engine as referred to in camels documentation(below link)?
https://camel.apache.org/manual/latest/faq/what-is-camel.html
A use-case example too would be greatly appreciated.
The mediation engine reference in this context is originating from the topic of Enterprise Application Integration and is closely related to what GoF Mediator Pattern does - That is encapsulate communication between entities. In the case of EAI, a mediator/mediation engine sits between multiple disparate systems and acts as a broker between them, instead of letting the systems communicate directly.
Mediation approach in EAI offers capabilities like
Reduced coupling between systems. For instance, you do not have to learn and implement a legacy mainframe protocol in modern systems, just because you want to get some data from main frames. A mediation engine like Apache Camel could talk REST over HTTPS at one end and some archaic mainframe protocol on the other end.
Ease of migration: Once the mainframes area replaced with something else, you could just change the mediation layer to do something about it, instead of modifying multiple impacted systems that used to talk to mainframes.
Access to single resource/service via multiple channels: Let's say you have an old system that currently does SOAP over HTTP but you would like to offer REST with JSON payload to some of your new customers. Instead of building completely new systems up-front for this purpose, you could throw Apache Camel as a mediator and it would accept JSON payloads at one end and SOAP on the other end. Whoever wants to talk JSON can go through Camel and whoever wants to do SOAP may continue in a direct connection to the legacy system. Someday if some hypothetical FooBar protocol becomes popular, and if Apache Camel provides you a FooBar component, your users who demands FooBar support could be routed through Camel, to the system that still speaks SOAP.
All these stuff is discussed in detail in the Enterprise Integration Patterns site and book. Apache Camel implements truck loads of patterns as described in the EIP Book. I hope this answer helped you to understand the mediation role Apache Camel could play in Enterprise IT ecosystems.
From Camel in action
The core feature of Camel is its routing and mediation engine. A
routing engine selectively moves a message around, based on the
route’s configuration. In Camel’s case, routes are configured with a
combination of enterprise integration patterns and a domain-specific
language.
In this link, https://camel.apache.org/manual/latest/faq/what-is-camel.html indicated projects that can be components of the Camel route to and from where messages can be sent and consumed (https://camel.apache.org/components/latest/activemq-component.html, https://camel.apache.org/components/latest/cxf-component.html).
Apache camel is kind of a ESB middleware. Mediation with respect to Camel would mean the following
Data format transformation : If Application A speaks JSON and Application B understands CSV format. You can use Apache Camel to transfer JSON to CSV.
Protocol transformation : If Application A knows only to call webservices but Application B prefers reading data from a message queue. You can use Apache Camel to receive this data by exposing a webservice and then push it to a queue for application B to consume.
Content transformation - Filtering or Enriching data : During this transformation process, you can also transform the data by filtering or enriching data fields based on what application B needs. In this way no change is required in A as it sends what it has and no change is required in application B as it gets what it needs.
Connectors : Many ESB now have in-built connectors to connect directly with ERP or SAS based applications. For example, a kafka connector. https://camel.apache.org/blog/Camel-Kafka-connector-intro/
I am slowly getting familiar with Camel, however I am struggling to understand the level of granularity at which it should be considered. Should Camel be used only if passing messages from one application to another, or it is also appropriate to use Camel to pass messages between components and / or layers within a single application?
For example, I have a requirement to expose a web service that accepts bookings, validates them and writes them to a queue. Would you recommend using camel in this scenario or does it really depend on the level of flexibility I want my solution to allow.
Put another way, if I was required to save the bookings to a database I would never have considered camel and instead just built it as a traditional app that calls a DAL to save the booking. Of course I could use camel-ibatis to insert the data but in this context using camel seems overkill.
Thank you for any pointers on this.
As you obviously suspect, it's somewhat of a grey line. The more that you need to flexible, the more benefit you'll get from using Camel.
Just this past week, I built a prototype of an app that needed to accept an HTTP post, put the data on a queue, and then pull messages from the queue and use them to update a Mongo database.
Initially, I used Camel, and it worked well. Then, the requirement for the HTTP POST was removed (it became just consuming messages from a queue and updating the database), and the database update became more complex than was easily supported via a simple string-based camel mongo endpoint spec, so I wound up doing away with camel, and rewrote it with just a jms connector and the Mongo api.
So, as usual, it depends. I would say that if you're just moving data between two endpoints, and there are no content-based decisions or routing, then you probably won't benefit from using Camel. Once you actually want to use one or more of the Enterprise Integration Patterns, then Camel will be a benefit.
As camel has lots of component, it looks like it can do every thing. But sometime it could be more straight forward if you can use the third party lib directly and don't have lots of business logic which can leverage the Enterprise Integration Patterns.
The real benefit of Camel is you can focus on how to route your message to meet your business logic needs without caring about implementation detail of the component.
as suggested, there are no hard rules...here is my take
use Camel to simplify technology challenges
complex message routing algorithms (EIPs)
technology integrations (components)
use Camel for these types of requirements
highly event based processes (EIPs)
exposing multiple interfaces to biz logic (http, file, jms, etc)
complex runtime management needs (lifecycle, policies)
that said...
don't use for just one simple use case
don't add unnecessary complexity
have a quorum of use cases/reasons/justification to use it
along these lines, I presented the following at ApacheCon focused on why/how to use Camel:
http://www.consulting-notes.com/2014/06/apachecon-2014-presentation-apache.html
Can I set hook on changing or adding some rows in table and get notified someway when such event araised? I discover web and only stuck with pipes. But there is no way to get pipe message immediately when it was send. Only periodical tries to receive.
Implementing an Observer pattern from a database should generally be avoided.
Why? It relies on vendor proprietary (non-standard) technology, promotes database vendor lock-in and support risk, and causes a bit of bloat. From an enterprise perspective, if not done in a controlled way, it can look like "skunkworks" - implementing in an unusual way behaviour commonly covered by application and integration patterns and tools. If implemented at a fine-grained level, it can result in tight-coupling to tiny data changes with a huge amount of unpredicted communication and processing, affecting performance. An extra cog in the machine can be an extra breaking point - it might be sensitive to O/S, network, and security configuration or there may be security vulnerabilities in vendor technology.
If you're observing transactional data managed by your app:
implement the Observer pattern in your app. E.g. In Java, CDI and javabeans specs support this directly, and OO custom design as per Gang Of Four book is a perfect solution.
optionally send messages to other apps. Filters/interceptors, MDB messages, CDI events and web services are also useful for notification.
If users are directly modifying master data within the database, then either:
provide a singular admin page within your app to control master data refresh OR
provide a separate master data management app and send messages to dependent apps OR
(best approach) manage master data edits in terms of quality (reviews, testing, etc) and timing (treat same as code change), promote through environments, deploy and refresh data / restart app to a managed shedule
If you're observing transactional data managed by another app (shared database integration) OR you use data-level integration such as ETL to provide your application with data:
try to have data entities written by just one app (read-only by others)
poll staging/ETL control table to understand what/when changes occured OR
use JDBC/ODBC-level proprietary extension for notification or polling, as well mentioned in Alex Poole's answer OR
refactor overlapping data operations from 2 apps into a shared SOA service can either avoid the observation requirement or lift it from a data operation to a higher level SOA/app message
use an ESB or a database adapter to invoke your application for notification or a WS endpoint for bulk data transfer (e.g. Apache Camel, Apache ServiceMix, Mule ESB, Openadaptor)
avoid use of database extension infrastructure such as pipes or advanced queuing
If you use messaging (send or recieve), do so from your application(s). Messaging from the DB is a bit of an antipattern. As a last resort, it is possible to use triggers which invoke web services (http://www.oracle.com/technetwork/developer-tools/jdev/dbcalloutws-howto-084195.html), but great care is required to do this in a very coarse fashion, invoking a business (sub)-process when a set of data changes, rather than crunching fine-grained CRUD type operations. Best to trigger a job and have the job call the web service outside the transaction.
In addition to the other answers, you can look at database change notification. If your application is Java-based there is specific documentation covering JDBC, and similar for .NET here and here; and there's another article here.
You can also look at continuous query notification, which can be used from OCI.
I know link-only answers aren't good but I don't have the experience to write anything up (I have to confess I haven't used either, but I've been meaning to look into DCN for a while now...) and this is too long for a comment *8-)
Within the database itself triggers are what you need. You can run arbitrary PL/SQL when data is inserted, deleted, updated, or any combination thereof.
If you need to have the event propagate outside the database you would need a way to call out to your external application from your PL/SQL trigger. Some possible options are:
DBMS_PIPES - Pipes in Oracle are similar to Unix pipes. One session can write and a separate session can read to transfer information. Also, they are not transactional so you get the message immediately. One drawback is that the API is poll based so I suggest option #2.
Java - PL/SQL can invoke arbitrary Java (assuming you load your class into your database). This opens the door to do just about any type of messaging you'd like including using JMS to push messages to a message queue. Depending on how you implement this you can even have it being transactionally tied the INSERT/UPDATE/DELETE statement itself. The listening application would then just listen to the JMS queue and it wouldn't be tied to the DB publishing the event at all.
Depending on your requirements use triggers or auditing
Look at DBMS_ALERT, DBMS_PIPE or (preferably) AQ (Advanced queuing) it's Oracle's internal messaging system. Oracle's AQ has its own API, but also can treated like Java JMS provider.
There are also techniques like Stream or (XStream) but those are quite complex.
we have a few various applications that stores its data and we need one common service which provides access to these data.
With the applications I mean for example Atlassian Jira, Confluence, SVN, Git, LDAP, few internal mysql databases, etc. Some of them offers you SOAP API, REST API, various command line clients, for some you have to directly access database to get data.
What we want is a common REST API interface, to access all possible data sources. Of course, we have to solve authentication and authorization, caching and many more tasks.
It seems that something like ESB - Enterprise service bus and EIP - Enterprise integration patterns is answer to our needs.
For start, we are playing and actually dig in to Apache Camel - it's not full EIP stack, it's "just" a integration framework. But I guess it's good enough for us right now.
My question is - what you mean about the solution? Are we on the good way?
Thanks!
Camel has a lot of connectors, so that would be a great start.
If you are afraid it is too thin, then take a look at Apache ServiceMix, which provides a deployment (OSGi) container for camel routes (and other things). Camel comes bundled within the standard service mix release out of the box.
The hard task is probably to design the generic API good enough to cover your use cases.
A GIT repo and a Database are very different, is this very generic? Do you only want to access "text" data or something?
I like the approach with camel non the less, since it's rather generic and flexible in these kind of scenarios. That you will need