Best practices for an OSGi bundle deployment strategy with Apache Camel

For integration purposes we're using Apache Camel and Karaf with OSGi, so we are creating OSGi bundles. However, what best practices exist when it comes to structuring the bundles?
The integrations are fairly straightforward: an incoming document type (via some protocol like HTTPS, SFTP, or JMS), transformation to another document type, and outbound transport again via some protocol. The basic setup is always the same and follows the VETO pattern: validate, enrich, transform, operate. Each unique combination of protocol and docType defines an integration.
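For illustration, one such integration might look roughly like the following Camel route sketch; all endpoint URIs, the XSD, and the enricher bean are invented examples, not part of our actual setup:

```java
import org.apache.camel.builder.RouteBuilder;

// One VETO-style integration sketched as a Camel route.
public class OrderToInvoiceRoute extends RouteBuilder {
    @Override
    public void configure() {
        from("sftp://partner-a/inbox?username=it&password=secret") // inbound protocol
            .to("validator:classpath:schemas/orderV1.xsd")         // Validate
            .bean(OrderEnricher.class)                             // Enrich (hypothetical bean)
            .to("xslt:classpath:xslt/orderV1-to-invoiceV2.xsl")    // Transform
            .to("jms:queue:partner-b.invoices");                   // Operate: hand off via JMS
    }
}
```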
We decouple the connectivity (which includes validation) from the other steps via JMS. For the ETO steps we separate each into its own Java class with its corresponding XSLT. However, what's the added value of the OSGi framework, and how should we divide the integrations between OSGi bundles?
Take into account making changes, maintenance, and deployments. Consider two dozen integration points (unique endpoints) with 50 different integrations running between them, in other words 50 unique transformations between two different docTypes. We could put all the code and XSLTs of all 50 integrations in one bundle (with another bundle handling connectivity), or create 50 bundles with one integration each. What are the best practices, if any, when it comes to deployment strategy? What should we take into account?

You can check out the examples in the Apache Karaf GitHub repository to see how bundles for OSGi applications are structured there. Christian Schneider has also given a talk about OSGi best practices and has some examples in his repository as well.
Generally you want to keep your bundles small, with as few dependencies as possible. For this reason I would recommend having only one integration per bundle. This makes installing integrations and their dependencies easier and offers some flexibility if you ever decide to split integrations between multiple Karaf instances.
For connectivity, your best bet is usually to use, create, and publish OSGi services. For example, with pax-jdbc-config you can use config files to create new DataSource-type services, which you can then use to connect to different databases from your integration bundles.
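As a sketch, such a DataSource is typically defined with a single config file like etc/org.ops4j.datasource-crm.cfg; all names and values below are invented for illustration:

```properties
# etc/org.ops4j.datasource-crm.cfg -- picked up by pax-jdbc-config,
# which registers a javax.sql.DataSource service for the integration bundles to use
osgi.jdbc.driver.class = org.postgresql.Driver
url = jdbc:postgresql://db-host:5432/crm
user = crm_user
password = secret
dataSourceName = crmDataSource
```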
Publishing your own custom services is pretty easy with Declarative Services, and they can readily be used to share connections to internal systems, blob storages, data access objects, etc. while maintaining loose coupling by hiding the actual implementations behind interfaces. For services, the recommended approach is to have separate API and implementation bundles, so that bundles using the service only need a dependency on the API bundle.
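A minimal sketch of that API/implementation split with Declarative Services annotations (the interface and class names here are invented):

```java
// --- API bundle (this package is exported): BlobStore.java ---
public interface BlobStore {
    void put(String key, byte[] data);
}

// --- Implementation bundle (package kept private): S3BlobStore.java ---
import org.osgi.service.component.annotations.Component;

@Component // DS registers this as a BlobStore service when the bundle starts
public class S3BlobStore implements BlobStore {
    @Override
    public void put(String key, byte[] data) {
        // talk to the actual blob storage here
    }
}

// --- Consumer bundle (depends only on the API bundle): ArchiveProcessor.java ---
import org.osgi.service.component.annotations.Component;
import org.osgi.service.component.annotations.Reference;

@Component
public class ArchiveProcessor {
    @Reference
    private BlobStore blobStore; // injected by DS; the implementation stays hidden
}
```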
For deployment you can create your own custom Karaf distribution with the bundles pre-installed, deploy bundles using Karaf features, or use the hot-deploy folder. For the latter two you might want to configure Karaf to use external folders for bundle configurations and hot deployment, since the process of updating Karaf is essentially replacing it with a new installation.
[Edit]
If we have 50+ transformations and put each in its own bundle, then I would have more than 50 bundles to manage. And this is just a single instance; other customers would have their own instances, again running 50+ or 100+ bundles.
I would say that the key thing here is to simplify and to identify repetition across bundles. Often these repeated parts can be converted into something more generic and reusable.
With OSGi you can use Declarative Services and OsgiDefaultCamelContext to instantiate a Camel integration instance per configuration file, which can be useful if you have multiple integrations that work pretty much the same way and differ only in minor details. I don't know whether Camel supports this with Blueprint.
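A rough sketch of that pattern using DS factory configurations; the PID and property names are invented, and in Karaf each file named etc/com.example.integration-&lt;name&gt;.cfg would then spin up its own Camel context:

```java
import java.util.Map;

import org.apache.camel.builder.RouteBuilder;
import org.apache.camel.core.osgi.OsgiDefaultCamelContext;
import org.osgi.framework.BundleContext;
import org.osgi.service.component.annotations.*;

// One Camel context per configuration file: DS creates one component
// instance per factory configuration of the PID below.
@Component(configurationPid = "com.example.integration",
           configurationPolicy = ConfigurationPolicy.REQUIRE)
public class ConfigurableIntegration {

    private OsgiDefaultCamelContext context;

    @Activate
    void activate(BundleContext bundleContext, Map<String, Object> config) throws Exception {
        // property names are invented for this sketch
        String sourceUri = (String) config.get("source.uri"); // e.g. sftp://...
        String targetUri = (String) config.get("target.uri"); // e.g. jms:queue:...
        context = new OsgiDefaultCamelContext(bundleContext);
        context.addRoutes(new RouteBuilder() {
            @Override
            public void configure() {
                from(sourceUri).to(targetUri);
            }
        });
        context.start();
    }

    @Deactivate
    void deactivate() throws Exception {
        context.stop();
    }
}
```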
With many bundles, efficient use of Karaf features or OSGi features (R8) can be vital for handling the added complexity. Features make it considerably easier to install and update OSGi applications that consist of multiple bundles, configuration files, and other features.
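For example, a single feature can pull in the Camel dependencies plus one integration bundle and its configuration in one step; this is a hedged sketch and all coordinates are invented:

```xml
<features name="example-integrations" xmlns="http://karaf.apache.org/xmlns/features/v1.4.0">
  <!-- installing this one feature installs everything the integration needs -->
  <feature name="integration-orders" version="1.0.0">
    <feature>camel-core</feature>
    <feature>camel-jms</feature>
    <bundle>mvn:com.example/orders-integration/1.0.0</bundle>
    <configfile finalname="/etc/com.example.integration-orders.cfg">
      mvn:com.example/orders-integration/1.0.0/cfg
    </configfile>
  </feature>
</features>
```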
Still, there's really no rule for how big is too big for a single OSGi bundle. Grouping closely related things into a single bundle can make a lot of sense and helps avoid splintering things too much.

Related

Vespa application config best practices

What is the best way to dynamically provide configuration to a Vespa application?
It seems that the only method discussed is baking configuration values into the application package, but is there any way to provide configuration values outside of that? I.e., are there CLI tools to update individual configuration values at runtime?
Are there any recommendations or best practices for managing configuration across different environments (i.e. production vs. development)? At Oath/VMG, is configuration checked into source control or managed outside of that?
Typically all configuration changes are made by deploying an updated application package. As you suggest, this is usually done by a CI/CD setup which builds and deploys the application package from a git repository whenever that changes.
This way it is easy to ensure changes have been reviewed (before merge), to track all changes that have been made, and to roll them back if necessary. It is also easy to verify that the same changes which have been deployed and tested (preferably by automated tests) in a development/test environment are the ones deployed to production, because the same application package is deployed through each of those environments in order.
It is, however, also possible to update files in a deployed application package and create a new session from it, which may be useful if your application package contains some huge resources. See https://docs.vespa.ai/documentation/cloudconfig/deploy-rest-api-v2.html#use-case-modify
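A hedged sketch of that flow against the config server's deploy REST API; the host, tenant, file path, and placeholders below are illustrative, so see the linked page for the exact calls:

```sh
# create a new session from the currently active application package
curl -X POST "http://configserver:19071/application/v2/tenant/default/session?from=<active-application-url>"

# overwrite a single file in the new session
curl -X PUT --data-binary @searchdefinitions/music.sd \
  "http://configserver:19071/application/v2/tenant/default/session/<session-id>/content/searchdefinitions/music.sd"

# prepare and activate the modified session
curl -X PUT "http://configserver:19071/application/v2/tenant/default/session/<session-id>/prepared"
curl -X PUT "http://configserver:19071/application/v2/tenant/default/session/<session-id>/active"
```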

Multiple CamelContexts in one OSGi module: what to be careful about?

The application has several Camel contexts, each doing its own thing; as such, they do not need to communicate with each other. They are in the same module because they share some classes.
Are there any issues to watch out for when running multiple contexts in a single OSGi module?
What is the recommendation and best practice in this case?
It is fairly subjective. IMHO the two big things to consider are process control and upgrade impacts. Remember: during a bundle upgrade, all the contexts will stop and then restart.
You still have the ability to do fine-grained process control (start, stop, pause, resume) at the Camel context and route level without having to rely on bundle start/stop.
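For instance, here is a sketch of suspending and resuming a single route by its id (Camel 3 style API; the route id is invented, and Camel 2 exposed these methods directly on CamelContext):

```java
import org.apache.camel.CamelContext;

// fine-grained control of one route without touching the bundle or the other contexts
public class RouteControl {
    public void pauseAndResume(CamelContext context) throws Exception {
        context.getRouteController().suspendRoute("orders-route");
        context.getRouteController().resumeRoute("orders-route");
        // stopRoute(...) / startRoute(...) are available the same way
    }
}
```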
If you want fine-grained upgradability, you could put the Java classes in their own bundle and export the packages. Then put the Camel contexts in their own bundles and import the Java classes from the shared bundle. You then have the ability to upgrade individual Camel contexts without having to upgrade all the contexts at once (and force them all to stop).
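In bnd terms the split looks roughly like this (two bnd.bnd sketches; bundle and package names are invented):

```properties
# bnd.bnd of the shared classes bundle
Bundle-SymbolicName: com.example.integration.shared
Export-Package: com.example.integration.shared;version=1.0.0

# bnd.bnd of one Camel-context bundle; it can now be upgraded on its own
Bundle-SymbolicName: com.example.integration.orders
Import-Package: com.example.integration.shared;version="[1.0,2.0)", *
```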
One single recommendation: have stateless beans/processors/aggregators.
All state related to the processing of your message body must live in the Exchange headers/properties.
Static final constants are fine.
Read-only configuration properties are fine too.
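A sketch of what that looks like in practice (Camel 3 style; the property and header names are invented):

```java
import org.apache.camel.Exchange;
import org.apache.camel.Processor;

// Stateless processor: no mutable instance fields, so concurrent exchanges
// and bundle restarts are safe; all per-message state lives on the Exchange.
public class RetryCountingProcessor implements Processor {

    private static final int MAX_ATTEMPTS = 3; // static final constants are fine

    @Override
    public void process(Exchange exchange) {
        int attempts = exchange.getProperty("myapp.attempts", 0, Integer.class);
        exchange.setProperty("myapp.attempts", attempts + 1);
        exchange.getMessage().setHeader("myapp.lastAttempt", attempts + 1 >= MAX_ATTEMPTS);
    }
}
```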

What is the general practice for an Express and React based application? Keeping the server and client code in the same or different projects/folders?

I come from a Microsoft background where I always kept server and client applications in separate projects.
Now I am writing a client-server application with Express as the back end and React as the front end. Since I am totally new to these two tools, I would like to know:
What is the general practice: keeping the Express (server) code base and React (client) code base as separate projects, or keeping the server and client code bases together in the same project? I could not think of the pros and cons of either approach.
Your recommendations are welcome!
PS: please do not mark this question as opinionated; I believe I have a valid reason to ask for recommendations.
I would prefer keeping the server and client as separate projects, because that way we can easily manage their dependencies, dev dependencies, and unit test files.
Also, if we ever need to move to a different framework for the front end at a later point, we can do that without disturbing the server.
In my opinion, it's probably best to have separate projects here. But you made me think a little about the "why" for something that seems obvious at first glance, but maybe is not.
My expectation is that a project should be mostly organized one-to-one on building a single type of target, whether that be a website, a mobile app, a backend service. Projects are usually an expression of all the dependencies needed to build or otherwise output one functioning, standalone software component. Build and testing tools in the software development ecosystem are organized around this convention, as are industry expectations.
Even if you could make the argument that there are advantages to monolithic projects that generate multiple software components, you are going against people's expectations and that creates the need for more learning and communication. So all things being equal, it's better to go with a more popular choice.
Other common disadvantages of monolithic projects:
greater tendency for design to become tightly coupled and brittle
longer build times (if using one "build everything" script)
takes longer to figure out what the heck all this code in the project is!
It's also quite possible to make macro-projects that work with multiple sub-projects, and in a way get the benefits of both approaches. This is basically just some kind of build script that grabs the output of sub-project builds and does something useful with them in combination, e.g. deploy to a server environment or run automated tests.
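One hedged example of such a macro-project is an npm workspace root whose package.json ties the sub-projects together; the names below are invented, and this assumes npm 7 or newer:

```json
{
  "name": "myapp-root",
  "private": true,
  "workspaces": ["client", "server"],
  "scripts": {
    "build": "npm run build --workspaces",
    "test": "npm run test --workspaces"
  }
}
```

Running `npm run build` at the root then builds each sub-project in turn, while client and server keep their own package.json, dependencies, and tests.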
Finally, all devs should be equipped with tools that let them hop between discrete projects easily. If there are pains in doing this, it's best to solve them without resorting to a monolithic project structure.
Some examples of practices that help with developing React/Node-based software that relies on multiple projects:
The IDE easily supports editing multiple projects. And not in some cumbersome "one project loaded at a time" way.
Projects are deployed to a repository that can be easily used by npm or yarn to load in software components as dependencies.
Use "npm link" to work with editable local versions of sub-projects all at once. More generally, don't require a full publish and deploy action to have access to sub-projects you are developing along with your main React-based project.
Use automated build systems like Jenkins to handle macro tasks like building projects together, deploying, or running automated tests.
Use versioning scrupulously in package.json. Let each software component have its own version number and follow the semver convention, which indicates when changes may break compatibility.
If you have a single team (or developer) working on the front-end and back-end software, then set the dependency versions in package.json to always pull the latest versions of the sub-projects (packages).
If you have separate teams working on the front-end and back-end software, you may want to constrain the dependency versions to major version numbers only, using a semver range in package.json. (Basically, you want some protection from breaking changes; see the sketch after this list.)
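As a sketch, the difference shows up in the dependency ranges (package names invented): `"*"` always resolves to the latest published version (the single-team case), while `"^2.0.0"` accepts any 2.x release but refuses a breaking 3.0.0 (the separate-teams case):

```json
{
  "dependencies": {
    "@myorg/shared-utils": "*",
    "@myorg/api-client": "^2.0.0"
  }
}
```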

What is the best strategy to externalise database properties in a multi-module maven project?

I have a multi-module Maven project with a number of Spring Boot applications, a couple of which (let's call them A and B) connect to a database. (I have a separate module with the database-related code on which both applications depend.) I am also using Flyway to maintain the database versioning and the database structure.
What is the best approach to maintaining the database properties? At the moment I have three places where I repeat the same thing: the application.yml of module A and the application.yml of module B (since both are separate Spring Boot applications), and then the Flyway plugin configuration in the pom.xml, which needs the same properties to perform its tasks like clean, repair, and migrate.
What is the proper approach to centralise and externalise this information, like the database URL, username, and password? I am also facing the issue that each time I pull new code onto the test system I have to update the same data again because it gets overwritten, and the database configuration on the test system differs from my local development environment.
What is the best strategy to manage this?
Externalise your configuration into a configuration module. This, of course, depends on how flexible Flyway and Spring Boot are about consuming classpath-based properties.
Or look at something like Archaius and make your configuration truly externalised, centralised, and dynamic by having it backed by, say, an external datastore. There is more work involved here, but it gives you additional benefits, like being able to change config in one place and have the changes dynamically picked up in running applications everywhere.
It's not an easy problem to solve and definitely involves some work to make your tools cooperate by hooking into their lifecycle.
For Flyway you could use the properties-maven-plugin. That way you can externalise the credentials to a properties file. An example is described here.
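A hedged pom.xml sketch of that approach, reading an external properties file during the build and feeding the values into the Flyway plugin (the file path and property names are invented):

```xml
<build>
  <plugins>
    <plugin>
      <groupId>org.codehaus.mojo</groupId>
      <artifactId>properties-maven-plugin</artifactId>
      <executions>
        <execution>
          <phase>initialize</phase>
          <goals>
            <goal>read-project-properties</goal>
          </goals>
          <configuration>
            <files>
              <!-- kept outside the repository, different per environment -->
              <file>${user.home}/myproject/db.properties</file>
            </files>
          </configuration>
        </execution>
      </executions>
    </plugin>
    <plugin>
      <groupId>org.flywaydb</groupId>
      <artifactId>flyway-maven-plugin</artifactId>
      <configuration>
        <url>${db.url}</url>
        <user>${db.user}</user>
        <password>${db.password}</password>
      </configuration>
    </plugin>
  </plugins>
</build>
```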
For the Spring Boot applications I would recommend Spring Cloud Config. With Spring Cloud Config you can externalise your configuration to a Git repository, which can be discovered via a Eureka service, e.g. as shown here. I would also consider restructuring the modules into independent microservices. A good infrastructure for a microservice-based architecture is provided by the JHipster project.
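A minimal sketch of the client side of that setup, assuming a config server already backed by the Git repository (the application name and URI are invented):

```yaml
# bootstrap.yml of module A: on startup, Spring Cloud Config fetches the
# shared database properties from the config server instead of application.yml
spring:
  application:
    name: module-a
  cloud:
    config:
      uri: http://config-server:8888
```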

What are the development differences between Apache products and Redhat Fuse?

We have been using the Apache ActiveMQ and Camel products for a while now, but want to look at a good base ESB. I've been reading the Red Hat site about Fuse but have been unable to find a good summary of the significant differences between Fuse and Apache for coders.
From a designer's/developer's point of view, what are the significant differences between Fuse and the Apache Camel and ActiveMQ we have been using? I get the lovely overview stuff, Fuse IDE and the ESB management tools. But I really just want to know the differences at the code level, i.e. does it introduce more useful Camel endpoints? Are there additional libraries of genuinely useful things that will make my life as a designer/coder easier? Are there any pitfalls to look out for?
I just need a few pointers to help me in my search, not a tome. Or better still, a quick link to a document that goes over all this (ever hopeful :o)!). I have a short time to form a view to go forward on, or the opportunity will pass me by.
Thank you.
SK
At the code level there is "no" difference. The process is that we develop on the Apache projects and sync the code changes to the Red Hat / Fuse git repos. There we cherry-pick the changes we want to go into our branches, to keep the product stable. We also backport fixes to older branches if our customers need that, etc. (e.g. you can influence that).
The products from Red Hat are also supported for a much longer timespan than the community support from Apache. There is a guaranteed lifetime, which you can find here: https://access.redhat.com/support/policy/updates/jboss_notes/
There are only a few additional Camel components in the Fuse / JBoss Fuse products, and they are part of the open source project Fuse Fabric (http://fuse.fusesource.org/fabric/), itself part of the JBoss Fuse products. Fuse Fabric is in the process of being donated to Apache ServiceMix so it can benefit that community as well, allowing ServiceMix to bundle Fabric out of the box. Fabric has a few Camel components that allow sending messages to any Camel endpoint with automatic load balancing in a clustered/cloud environment. And there is another Camel component for electing a master, so the route runs only on the master node; if the master dies, another node takes over.
I also think that this move is a testimony to the open source willingness the Fuse team has and continues to have. We do as much as possible in the open. For example, the new project hawtio (http://hawt.io/) is also fully open source, ASL licensed, and a GitHub project that anyone can contribute to or fork.
And the JBoss Fuse product can be patched in production. So if you need a hotfix ASAP, we can provide a fix as a .zip file which can be applied using a built-in patch tool in the product. This isn't possible with the stock Apache releases.
A few links for further material (from our old site and the JBoss community site):
http://fusesource.com/enterprise-support/support-offerings/
http://fusesource.com/community/apache-committers-and-fuse/
http://www.jboss.org/products/fuse
http://www.davsclaus.com/2013/04/apache-camel-web-dashboard-with-hawtio.html
Disclosure: I work for Fusesource / Red Hat.
On a code level, the difference is very small, if any at all.
What you get from the commercial Red Hat package is support, a package that has been tested, and the operational benefits that you mention.
It's all about what happens after the code is written, when you put things into production and the coder is no longer around to handle incidents.
Apache ActiveMQ and Camel are open source projects. Red Hat Fuse bundles them, and possibly many other components, into one package that can be used as a single ESB. I see the biggest difference as the support you can get: you can get support for something your organisation has not produced. And the tools that come with the package do help during development and maintenance, in my view.
