Versioning AMQP Messages

Are there any established best practices around using versioned messages with AMQP?
Assuming I'm doing semantic versioning of my message schemas, I'd like to support the current major version of the message along with one previous major version. Is this a realistic expectation?
What are the pros and cons to the different options for handling versioned messages? I've seen a versioned routing key and a version message header as possible solutions.
How do I determine which version of a message to send to a particular consumer? Or do I send multiple versions and somehow correlate them so that in a send/reply scenario I'm only waiting for a single response?
What are some of the other concerns around building a system that supports versioned messages?
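One of the options mentioned above, the versioned routing key, can be sketched as follows. This is an illustrative scheme, not an established standard: the key format and function names are hypothetical, and the idea is simply that a consumer binds its queue to every major version it supports.

```python
# Hypothetical sketch: encode the schema's major version in the routing key
# (e.g. "order.created.v2") so consumers bind only to versions they support.

def routing_key(message_type: str, major_version: int) -> str:
    """Build a routing key carrying the schema's major version."""
    return f"{message_type}.v{major_version}"

def binding_keys(message_type: str, current_major: int) -> list:
    """A consumer supporting the current and one previous major version
    binds its queue with both keys on a topic exchange."""
    supported = [current_major]
    if current_major > 1:
        supported.append(current_major - 1)
    return [routing_key(message_type, v) for v in supported]
```

With this scheme the producer always publishes under its current version's key, and upgrading a consumer is just a matter of adding or dropping queue bindings; no correlation of duplicate messages is needed because only one copy of each message is sent.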

Related

Apache Camel vs Apache Nifi

I have been using Apache Camel for quite a long time and have found it to be a fantastic solution for all kinds of system-integration business needs. But a couple of years back I came across Apache NiFi. After some googling I found that though NiFi can work as an ETL tool, it is actually meant for stream processing.
In my opinion, "which is better" is a bad question to ask, as the answer depends on many things. But it would be nice if somebody could describe the basic comparison between the two and, the obvious question, when to use which.
That would help me decide, given my current requirements, which is the better option in my context, or whether I should use both of them together.
The biggest and most obvious distinction is that NiFi is a no-code approach - 99% of NiFi users will never see a line of code. It is a web-based GUI with a drag-and-drop interface for building pipelines.
NiFi can perform ETL, and can be used in batch use cases, but it is geared towards data streams. It is not just about moving data from A to B, it can do complex (and performant) transformations, enrichments and normalisations. It comes out of the box with support for many specific sources and endpoints (e.g. Kafka, Elastic, HDFS, S3, Postgres, Mongo, etc.) as well as generic sources and endpoints (e.g. TCP, HTTP, IMAP, etc.).
NiFi is not just about messages - it can work natively with a wide array of different formats, but can also be used for binary data and large files (e.g. moving multi-GB video files).
NiFi is deployed as a standalone application - it's not a framework, API or library that you integrate into something else. It is a fully self-contained, realised application that is fully featured out of the box with no additional development, though it can be extended with custom development if required.
NiFi is natively clustered - it expects (though isn't required) to be deployed on multiple hosts that work together as a cluster for performance, availability and redundancy.
So, the two tools are used quite differently - hopefully that helps highlight some of the key differences.
It's true that there is some functional overlap between NiFi and Camel, but they were designed very differently:
Apache NiFi is a data processing and integration platform that is mostly used centrally. It has a low-code approach and prefers configuration.
Apache Camel is an integration framework which is mostly used in distributed solutions. Solutions are coded in Java. Example solutions are adapters, flows, APIs, connectors, cloud functions and so on.
They can be used very well together. Especially when using a message broker like Apache ActiveMQ or Apache Kafka.
An example: a Java application is enhanced with Camel so that it can send messages to Kafka. In NiFi, the first step is consuming those messages from Kafka. Then in the NiFi flow the message is changed in various steps. In the middle, the message is put on another Kafka topic. A Camel function (Camel K) in the cloud does various operations on the message; when it's finished, it puts the message on a Kafka topic. The message goes through a NiFi flow which at the end calls an API created with Camel.
In a blog post I describe in detail the various ways to combine Camel and NiFi:
https://raymondmeester.medium.com/using-camel-and-nifi-in-one-solution-c7668fafe451

Guidelines to understand when to use Apache Camel

I am slowly getting familiar with Camel, however I am struggling to understand the level of granularity at which it should be considered. Should Camel be used only for passing messages from one application to another, or is it also appropriate to use Camel to pass messages between components and/or layers within a single application?
For example, I have a requirement to expose a web service that accepts bookings, validates them and writes them to a queue. Would you recommend using Camel in this scenario, or does it really depend on the level of flexibility I want my solution to allow?
Put another way, if I were required to save the bookings to a database I would never have considered Camel and would instead just have built a traditional app that calls a DAL to save the booking. Of course I could use camel-ibatis to insert the data, but in this context using Camel seems overkill.
Thank you for any pointers on this.
As you obviously suspect, it's somewhat of a grey line. The more flexibility you need, the more benefit you'll get from using Camel.
Just this past week, I built a prototype of an app that needed to accept an HTTP post, put the data on a queue, and then pull messages from the queue and use them to update a Mongo database.
Initially, I used Camel, and it worked well. Then the requirement for the HTTP POST was removed (it became just consuming messages from a queue and updating the database), and the database update became more complex than was easily supported via a simple string-based Camel Mongo endpoint spec, so I wound up doing away with Camel and rewrote it with just a JMS connector and the Mongo API.
So, as usual, it depends. I would say that if you're just moving data between two endpoints, and there are no content-based decisions or routing, then you probably won't benefit from using Camel. Once you actually want to use one or more of the Enterprise Integration Patterns, then Camel will be a benefit.
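To illustrate the kind of content-based decision meant here: the Content-Based Router EIP inspects a message and picks a destination from its content. Camel expresses this declaratively in its Java DSL; the sketch below shows the same decision logic language-agnostically, with hypothetical field and endpoint names.

```python
# Minimal sketch of the Content-Based Router EIP: inspect the message
# payload and choose a destination endpoint. Endpoint URIs are illustrative.

def route(message: dict) -> str:
    """Pick a destination endpoint based on message content."""
    if message.get("priority") == "high":
        return "jms:queue:urgent"
    if message.get("type") == "booking":
        return "jms:queue:bookings"
    return "jms:queue:default"
```

Once you have even one decision like this, plus retries, format conversions or multiple endpoints, Camel's routing DSL starts paying for itself; with none, a plain connector is simpler.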
As Camel has lots of components, it looks like it can do everything. But sometimes it can be more straightforward to use a third-party library directly, if you don't have much business logic that can leverage the Enterprise Integration Patterns.
The real benefit of Camel is that you can focus on how to route your messages to meet your business logic needs without caring about the implementation details of the components.
As suggested, there are no hard rules... here is my take:
use Camel to simplify technology challenges
complex message routing algorithms (EIPs)
technology integrations (components)
use Camel for these types of requirements
highly event based processes (EIPs)
exposing multiple interfaces to business logic (HTTP, file, JMS, etc.)
complex runtime management needs (lifecycle, policies)
that said...
don't use for just one simple use case
don't add unnecessary complexity
have a quorum of use cases/reasons/justification to use it
along these lines, I presented the following at ApacheCon focused on why/how to use Camel:
http://www.consulting-notes.com/2014/06/apachecon-2014-presentation-apache.html

rules engine combining node.js in server and php in ui

I am starting an application that combines many components:
node.js on the server side for its light asynchronous event-driven behaviour (and also for socket.io) ==> it will capture data received on a socket port, store it in the database, refresh the UI and trigger events.
a PHP framework for the UI part, handling the usual treatment and rendering HTML pages.
and of course a database server, most likely to collect shared data
Other components can be added as the overall architecture evolves, but for now my main concern is to integrate a rules engine somewhere in there.
So the required process is:
a connected user will define his own rules via the web interface.
those rules will be stored in the database in a specific format (to be figured out)
these same rules are processed at runtime via a rules engine (probably compatible with node.js)
when constraints are met, the matching events should be executed
I do have previous experience with rules engines; normally, when it comes to distributed systems, I would go for one supporting an API or SOAP communication to link different environments (I have done that between .NET and Java apps using OpenRules).
But in this case I am considering Drools, as it has a node.js-compatible port, nools. So my idea is that user rules will be stored in the DB via Drools, which should have the same syntax to be executed via nools on the server side.
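The core of the process described above (rules persisted as data, evaluated at runtime, actions fired when constraints are met) can be sketched independently of Drools/nools syntax. The rule format and field names below are purely illustrative, standing in for whatever format the DB ends up storing:

```python
# Illustrative sketch (not Drools/nools syntax): rules stored as data rows,
# evaluated against an incoming fact; matching rules yield their actions.

def evaluate(rules, fact):
    """Return the actions of every rule whose conditions the fact satisfies."""
    fired = []
    for rule in rules:
        if all(fact.get(k) == v for k, v in rule["conditions"].items()):
            fired.append(rule["action"])
    return fired

# Example of rules as they might come back from the database:
rules = [
    {"conditions": {"sensor": "temp", "alert": True}, "action": "notify_ui"},
    {"conditions": {"sensor": "door"}, "action": "log_event"},
]
```

A real engine like nools adds efficient matching (Rete), a rule DSL and conflict resolution, but the storage question is the same: you persist conditions and actions in some serialisable format and rebuild the rule set at runtime.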
At this point, I am still gathering details to pick an adequate architecture. Has anyone gone through this before? I would really appreciate your feedback regarding your experiences with rules engines in such use cases.
Thanks

How to design/develop an integration layer or bus for different external services/apps

We are currently looking into replacing one of our apps with possibly an ESB or some similar tool and was looking for some insights into how best to approach this.
We currently have a standalone service that consumes/interacts with different external services and data sources, some delivered through SOAP web services and others where we just use a DB connection. This service is exposed through SOAP and we have other apps that consume this service but are very tightly coupled to it. Now we also have other apps that need to consume some of the external services, and we would like to replace this altogether with an ESB or some sort of SOA platform.
What would be the best way to replace this 'external' services integration layer with an ESB? We were thinking of having a 'global' contract/API in which all of the services we consume are exposed as one single contract, where all the possible operations and data structures that we use are exposed under one single namespace. Would this be the best way of approaching this? And if so, are there any tools that could help us automate this process, or do we basically have to handcraft this contract/API? This would also mean that for any changes to the underlying services/APIs we would have to update this new API as well.
If not then the other option I see is to basically use the 'ESB' as a 'proxy' layer in which all of our sources are exposed as they are, so we would end up with several different 'contracts' / API endpoints, but I don't really see the value in this.
Also, given the above, what would be the best tool for the job? Is a full-blown ESB overkill, or are we much better off rolling our own using something like Apache Camel or Spring Integration?
A few more details:
We are currently integrating over 5 different external services with more to come in the future.
Only a couple of apps consuming our current app at the moment but several other apps/systems in the future will need to consume some of these external services.
We are currently using a single method of communication (SOAP) between these services but some apps might use pub/sub messaging in the future, although SOAP will still be the main protocol used.
I am new to ESB integration so I apologize in advance if I'm misunderstanding a lot of these technologies and the problems they are meant to solve.
Any help/tips/pointers will be greatly appreciated.
Thanks.
You need to put some design thought into what you want to achieve over time.
There are multiple benefits and potential pitfalls with an ESB introduction.
Here are some typical benefits/use cases
When your applications are hard to change or have very different release cycles - then it's convenient to have an ESB in the middle that can adapt to the changes quickly. This is very much the case when your organization buys a lot of COTS products and cloud services that might come with an update the next day that breaks the current API.
When you need to adapt data from one master data system to several other systems and they might not support the same interfaces, e.g. a CRM system might want data imported via web services as soon as it's available, an ERP wants data through DB/staging tables, and a production system wants data every weekend in a flat file delivered via FTP. To keep the master data system clean and easy to maintain, just implement one single integration service in the master data system, and adapt this interface to the various other applications within the ESB platform instead.
Aggregation or splitting of data from various sources to protect your sensitive systems might be a use case. Say you have an old system that can only take small updates of information at a time and it's not worth upgrading this system - then an integration solution that can do aggregation, splitting or throttling can be a good solution.
Other benefits and use cases include the ability to track and wire-tap every message passing between systems - which can even be used together with business intelligence tools to gather KPIs.
A conceptual ESB can also introduce a canonical message format that is used for all services that need to communicate. If a lot of applications share the same data with several other applications (not only point to point), then the benefits of a canonical message format can outweigh the cost (which is/can be high). An ESB server might be useful for dealing with canonical data as it is usually very good at mapping from one format to another.
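The canonical-format idea above reduces the number of mappings you maintain: each system maps to and from one shared shape, rather than to every other system. A minimal sketch, with hypothetical field names for an imagined CRM and ERP:

```python
# Sketch of the canonical message format: each system-specific shape is
# mapped once to a shared canonical shape, so N systems need N mappings
# instead of up to N*(N-1) point-to-point mappings. Field names are invented.

def from_crm(record: dict) -> dict:
    """Map a CRM-style record to the canonical customer format."""
    return {"customer_id": record["CustId"], "name": record["FullName"]}

def from_erp(record: dict) -> dict:
    """Map an ERP-style record to the same canonical format."""
    return {"customer_id": record["id"], "name": record["name"]}
```

Any consumer then only needs to understand the canonical shape, which is exactly the mapping work an ESB's transformation tooling is built for.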
However, introducing an ESB without a plan what benefits you are trying to achieve is not really a good thing, since it introduces overhead - you need another server to keep alive, you need perhaps another team to understand all data flows. You need particular knowledge with your integration product. Finally, you need to be able to have some governance around it so that your ESB initiative does not drift away from the goals/benefits you have foreseen.
You should choose some technology that you are comfortable with - or think you can be comfortable with. Apache Camel is indeed very powerful and my favorite integration engine - but it's not an ESB as it does not come with a runtime that you can use to deploy/manage/monitor your integration services with. You can use it together with most Java EE application servers or even better - Apache ServiceMix (= Karaf+Camel+ActiveMQ+CXF) which is built for this task.
The same goes for Spring Integration - you need to run it somewhere, an app server or the like.
There is a large set of different products, both open source and commercial, that do these things.

Observer pattern in Oracle

Can I set a hook on changing or adding rows in a table and get notified somehow when such an event arises? I searched the web and only came up with pipes. But there is no way to get a pipe message immediately when it is sent - only periodic attempts to receive it.
Implementing an Observer pattern from a database should generally be avoided.
Why? It relies on vendor proprietary (non-standard) technology, promotes database vendor lock-in and support risk, and causes a bit of bloat. From an enterprise perspective, if not done in a controlled way, it can look like "skunkworks" - implementing in an unusual way behaviour commonly covered by application and integration patterns and tools. If implemented at a fine-grained level, it can result in tight-coupling to tiny data changes with a huge amount of unpredicted communication and processing, affecting performance. An extra cog in the machine can be an extra breaking point - it might be sensitive to O/S, network, and security configuration or there may be security vulnerabilities in vendor technology.
If you're observing transactional data managed by your app:
implement the Observer pattern in your app. E.g. in Java, the CDI and JavaBeans specs support this directly, and custom OO design as per the Gang of Four book is a perfect solution.
optionally send messages to other apps. Filters/interceptors, MDB messages, CDI events and web services are also useful for notification.
If users are directly modifying master data within the database, then either:
provide a singular admin page within your app to control master data refresh OR
provide a separate master data management app and send messages to dependent apps OR
(best approach) manage master data edits in terms of quality (reviews, testing, etc.) and timing (treat the same as a code change), promote through environments, deploy and refresh data / restart the app on a managed schedule
If you're observing transactional data managed by another app (shared database integration) OR you use data-level integration such as ETL to provide your application with data:
try to have data entities written by just one app (read-only by others)
poll a staging/ETL control table to understand what/when changes occurred OR
use JDBC/ODBC-level proprietary extension for notification or polling, as well mentioned in Alex Poole's answer OR
refactoring overlapping data operations from the two apps into a shared SOA service can either avoid the observation requirement or lift it from a data operation to a higher-level SOA/app message OR
use an ESB or a database adapter to invoke your application for notification or a WS endpoint for bulk data transfer (e.g. Apache Camel, Apache ServiceMix, Mule ESB, Openadaptor)
avoid use of database extension infrastructure such as pipes or advanced queuing
If you use messaging (send or receive), do so from your application(s). Messaging from the DB is a bit of an antipattern. As a last resort, it is possible to use triggers which invoke web services (http://www.oracle.com/technetwork/developer-tools/jdev/dbcalloutws-howto-084195.html), but great care is required to do this in a very coarse fashion, invoking a business (sub-)process when a set of data changes, rather than crunching fine-grained CRUD-type operations. Best to trigger a job and have the job call the web service outside the transaction.
In addition to the other answers, you can look at database change notification. If your application is Java-based there is specific documentation covering JDBC, and similar for .NET here and here; and there's another article here.
You can also look at continuous query notification, which can be used from OCI.
I know link-only answers aren't good but I don't have the experience to write anything up (I have to confess I haven't used either, but I've been meaning to look into DCN for a while now...) and this is too long for a comment *8-)
Within the database itself triggers are what you need. You can run arbitrary PL/SQL when data is inserted, deleted, updated, or any combination thereof.
If you need to have the event propagate outside the database you would need a way to call out to your external application from your PL/SQL trigger. Some possible options are:
DBMS_PIPES - Pipes in Oracle are similar to Unix pipes. One session can write and a separate session can read to transfer information. Also, they are not transactional, so you get the message immediately. One drawback is that the API is poll-based, so I suggest option #2.
Java - PL/SQL can invoke arbitrary Java (assuming you load your class into your database). This opens the door to do just about any type of messaging you'd like, including using JMS to push messages to a message queue. Depending on how you implement this you can even have it transactionally tied to the INSERT/UPDATE/DELETE statement itself. The listening application would then just listen to the JMS queue and it wouldn't be tied to the DB publishing the event at all.
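The general trigger-plus-event-table pattern behind these options (a trigger records each change, an external application picks the changes up) can be sketched without Oracle. Below, SQLite from the Python standard library stands in for the database, and all table and column names are illustrative:

```python
# Sketch of the trigger-based change capture pattern, using SQLite in place
# of Oracle/PL-SQL. A trigger writes each change into an event table that
# an external application reads (here, by polling).
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT);
    CREATE TABLE order_events (order_id INTEGER, status TEXT);
    CREATE TRIGGER orders_changed AFTER UPDATE ON orders
    BEGIN
        INSERT INTO order_events VALUES (NEW.id, NEW.status);
    END;
""")
conn.execute("INSERT INTO orders VALUES (1, 'new')")
conn.execute("UPDATE orders SET status = 'shipped' WHERE id = 1")

# The observing application polls the event table and dispatches each row.
events = conn.execute("SELECT order_id, status FROM order_events").fetchall()
```

In Oracle the trigger body would be PL/SQL, and the "dispatch" step is where the options above differ: polling the table, DBMS_PIPE, or pushing to a queue via Java/JMS.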
Depending on your requirements use triggers or auditing
Look at DBMS_ALERT, DBMS_PIPE or (preferably) AQ (Advanced Queuing), Oracle's internal messaging system. Oracle's AQ has its own API, but can also be treated as a Java JMS provider.
There are also techniques like Streams or XStream, but those are quite complex.
