How to implement continuous delivery on a platform consisting of multiple applications which all depend on one database and on each other?

We are working on an old project consisting of multiple applications which all use the same database and strongly depend on each other. Because of the size of the project, we can't refactor the code so that all applications use the API as the single source of access to the database. The platform contains the following applications:
Website
Admin / CMS
API
Cronjobs
Right now we want to start implementing a CI/CD pipeline using GitLab. We are currently running into problems because we can't update the database for the deployment of one application without breaking all the other applications (unless we deploy all applications at once).
I was thinking about a solution where one pipeline triggers all the other pipelines. Every pipeline would execute all newly added database migrations and test whether its application still works as it should. If all pipelines succeed, the deployment of all applications would be started.
I doubt whether this is a good solution, because this change would only increase the already high coupling between our applications. Does anybody know a better way to implement CI/CD for our platform?
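For reference, this "one pipeline triggers all other pipelines" idea maps onto GitLab's multi-project pipelines. A minimal sketch of such a parent pipeline, assuming hypothetical project paths and wrapper scripts, could look like this:

```yaml
# Parent pipeline: run the shared migrations, trigger every application's pipeline,
# and only deploy if all of them pass. Project paths and scripts are hypothetical.
stages:
  - migrate
  - verify
  - deploy

run-migrations:
  stage: migrate
  script:
    - ./run-new-migrations.sh          # apply the newly added database migrations

website:
  stage: verify
  trigger:
    project: platform/website
    strategy: depend                   # this job succeeds only if the downstream pipeline does

admin:
  stage: verify
  trigger:
    project: platform/admin
    strategy: depend

api:
  stage: verify
  trigger:
    project: platform/api
    strategy: depend

cronjobs:
  stage: verify
  trigger:
    project: platform/cronjobs
    strategy: depend

deploy-all:
  stage: deploy
  when: manual
  script:
    - ./deploy-all.sh                  # only reachable when every verify job succeeded
```

With strategy: depend the parent pipeline waits for each downstream pipeline and mirrors its result, so the deploy stage is reached only when every application passed.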

You have to stop thinking about these as separate applications. You have a monolith with multiple modules; until they can be decoupled, they are all one application and will have to be deployed as such.
Fighting this by pretending they aren't is likely a waste of time; your effort would be better spent actually decoupling these systems.

There are likely a lot of solutions, but one that I've done in the past is create a separate repository for the CI/CD of the entire system.
Each individual repo builds that component, and then you can create tags as they are released or ready for CI at a system level.
The separate CI/CD repo pulls in the appropriate tag of each item and runs CI/CD against all of them as one unit. This lets you pin which tag of each repo to use, which should prevent this pipeline from failing whenever changes are made to the individual components.
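As a rough sketch, the system-level pipeline in that dedicated CI/CD repository could pin one tag per component and exercise them all as one unit; the repository URLs, tag values and scripts below are hypothetical:

```yaml
# System-level CI/CD repository: pin a released tag of every component,
# test them together, then deploy the whole platform.
variables:
  WEBSITE_TAG: "1.4.2"                 # bump these as components release new tags
  ADMIN_TAG: "2.0.1"
  API_TAG: "3.1.0"
  CRONJOBS_TAG: "1.2.0"

stages:
  - system-test
  - deploy

system-test:
  stage: system-test
  script:
    - git clone --branch "$WEBSITE_TAG"  https://gitlab.example.com/platform/website.git
    - git clone --branch "$ADMIN_TAG"    https://gitlab.example.com/platform/admin.git
    - git clone --branch "$API_TAG"      https://gitlab.example.com/platform/api.git
    - git clone --branch "$CRONJOBS_TAG" https://gitlab.example.com/platform/cronjobs.git
    - ./run-migrations-and-system-tests.sh   # migrations plus end-to-end tests against one database

deploy-platform:
  stage: deploy
  when: manual
  script:
    - ./deploy-all.sh "$WEBSITE_TAG" "$ADMIN_TAG" "$API_TAG" "$CRONJOBS_TAG"
```

Because the tags are pinned in this repository, a half-finished change pushed to an individual component can't break the system pipeline until you deliberately bump its tag here.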

Ask yourself why these "distinct applications" are using "one and the same database". Is it because every single one of those "distinct applications" deals with one and the same business semantics? If so, as Rob already stated, then you simply have one single application (and on top of that, there will be no decoupling, precisely because your business semantics are singular/atomic/...).
Or are there discernible portions in the DB structure such that a highly accurate mapping could be identified, saying "this component uses that portion", etc.? In that case, what is it that causes you to say things like "can't update the database for the deployment of ..."? (BTW, "update the database" is not the same thing as "restructure the database". Please, please, please be precise.) The answer to that will identify what you have to tackle.

Related

Is it a good practice to have the database within the same container as the app?

We have several sites running under a CMS using virtual machines. Basically we have three VMs running the CMS and a SQL instance to store data. We plan to transition to containers, but to be honest I don't have much experience with them, and my boss plans to have the full app (CMS and DB) within one image and then deploy as many containers as needed (initially three). My doubt is that, as far as I know, containers work better when the different parts are separated and used as microservices, so I don't know whether it's a good idea to have the full app within one container.
Short answer is: No.
It's best practice with containers to have one process per container. The container has an entrypoint, basically the command that is executed when the container starts; this entrypoint is the command that starts your process. If you want more than one process, you need a script in the container that starts them and puts them in the background, complicating the whole setup. See also the Docker docs.
There are some more downsides.
A container should only contain what it needs to run its process. If you have more than one process, you end up with one big container. You're also no longer free in your choice of base image; you need to find one that fits all the processes you want to run. And you might run into dependency trouble, because the different processes might need different versions of a dependency (such as a library).
You're unable to scale independently. E.g. you could have five CMS containers that all use the same database, for redundancy and performance. That's not possible when everything lives in the same container.
Detecting/debugging faults. If more than one process runs in a container, the container might fail because one of the processes failed, but you can't be sure which one. With a single process, you know exactly why the container failed. It's also easier to monitor health, because there is one health-check endpoint per container. Last but not least, the container's logs represent the logs of a single process, not of several mixed together.
Updating becomes easier. When updating your CMS to the next version or updating the database, you only need to update the container image of that process. E.g. the database doesn't need to be stopped and started when you update the CMS.
The container can be reused more easily. You can e.g. use the same image everywhere and mount the customer specifics from a volume, ConfigMap or environment variable.
If you do want your CMS and database together, you can use the sidecar pattern in Kubernetes: simply declare a pod with multiple containers in the manifest, as sketched below. Note that this, too, will not make it horizontally scalable.
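A minimal sketch of that multi-container pod, with hypothetical image and secret names:

```yaml
# One pod, two containers: the CMS and its database are scheduled and scaled as a unit.
apiVersion: v1
kind: Pod
metadata:
  name: cms-with-db
spec:
  containers:
    - name: cms
      image: my-cms:latest               # hypothetical CMS image
      ports:
        - containerPort: 80
    - name: database
      image: mysql:8.0
      env:
        - name: MYSQL_ROOT_PASSWORD
          valueFrom:
            secretKeyRef:
              name: db-secret            # hypothetical secret holding the password
              key: password
```

Since Kubernetes replicates the whole pod, adding replicas would also duplicate the database, which is exactly why this layout still doesn't give you independent horizontal scaling.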
That's a fair question that most of us face at some point. One tends to put everything in the same container for convenience, but then later regrets that choice.
So, best to do it right from the start and to have one container for the app and one for the database.
According to Docker's documentation,
Up to this point, we have been working with single container apps. But, we now want to add MySQL to the application stack. The following question often arises - “Where will MySQL run? Install it in the same container or run it separately?” In general, each container should do one thing and do it well.
(...)
So, we will update our application to work like this:
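As an illustration of where the quoted walkthrough is heading (my sketch, not the continuation of the docs), a minimal docker-compose file with the application and MySQL as two separate containers on a shared network:

```yaml
# Two services, two containers: the app and MySQL each run alone and can be
# updated, scaled and debugged separately. All names and credentials are placeholders.
services:
  app:
    image: my-cms:latest          # hypothetical application image
    ports:
      - "8080:80"
    environment:
      DB_HOST: db                 # the app reaches MySQL via the service name
      DB_PASSWORD: example
    depends_on:
      - db
  db:
    image: mysql:8.0
    environment:
      MYSQL_ROOT_PASSWORD: example
    volumes:
      - db-data:/var/lib/mysql    # keep the data outside the container's lifecycle
volumes:
  db-data:
```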
It's not clear what you mean by CMS (content/customer/... management system). Nonetheless, milestones on the way to creating/separating an application (monolith/microservices) would probably be:
if the application is a smaller one, start with a monolithic structure (the whole application as one executable on an application/web server)
otherwise, determine which parts should be separated (-> Domain-Driven Design)
if the smaller monolithic structure is getting bigger and you add more domain-related services, you pull it apart along well-defined separations according to your domain landscape:
Once you have a firm grasp on why you think microservices are a good idea, you can use this understanding to help prioritize which microservices to create first. Want to scale the application? Functionality that currently constrains the system's ability to handle load is going to be high on the list. Want to improve time to market? Look at the system's volatility to identify those pieces of functionality that change most frequently, and see if they would work as microservices. You can use static analysis tools like CodeScene to quickly find volatile parts of your codebase.
"Building Microservices" - Sam Newman
Database
According to the principle of "hiding internal state", every microservice should have its own database.
If a microservice wants to access data held by another microservice, it should go and ask that second microservice for the data. ... which allows us to clearly separate functionality
In the long run this could be perfected into completely separated end-to-end slices (UI-LOGIC-DATA), each backed by its own database. In the context of microservices:
sharing databases is one of the worst things you can do if you’re trying to achieve independent deployability
So, more or less, the general approach of choice would be one database per service, with every other service reaching that data only through the owning service's API.
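As a rough illustration of that end state, with purely hypothetical service names, each slice ships with its own database and no slice queries another slice's tables directly:

```yaml
# Database-per-service sketch: orders-api owns orders-db, billing-api owns billing-db.
# billing-api asks orders-api over HTTP instead of reading orders-db.
services:
  orders-api:
    image: orders-api:latest
    environment:
      DB_HOST: orders-db
    depends_on:
      - orders-db
  orders-db:
    image: postgres:16
    environment:
      POSTGRES_PASSWORD: example

  billing-api:
    image: billing-api:latest
    environment:
      DB_HOST: billing-db
      ORDERS_API_URL: http://orders-api:8080   # cross-service data access goes through the API
    depends_on:
      - billing-db
  billing-db:
    image: postgres:16
    environment:
      POSTGRES_PASSWORD: example
```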

Rolling back changes in Salesforce

I am just getting started with doing moderate web development work in Salesforce for my company, and I'm looking for some feedback/insight into the deployment process. Right now it looks like we will be doing a fair amount of custom work using Visualforce and Apex. What I am wondering is: if I screw something up in my production org (data or metadata), is there a way to roll back to a snapshot or previously released version of my org that still works? With the mediocre development tools, I am worried that when bugs do arise I won't have a good, fast way of addressing the situation.
I was reading about different ways to set up source control here:
How can multiple developers efficiently work on one force.com application?
But I haven't found anyone walking through the process of essentially reverting a change set or changing branches. Are the protections built into salesforce good enough that I just won't have to worry about bugs in production? Should I just not worry about having to revert a change set?
One of the ways this is handled is through the proper use of sandbox orgs associated with your production org. You can always keep a sandbox org that holds the "blessed" instances of everything while you use another sandbox org to do major development destined for deployment to production. In the event that something goes seriously wrong when the new development in your dev sandbox is deployed to production, you can roll forward from your blessed sandbox to restore what was working previously.
That being said, you're on to something when you ask about not worrying about bugs in production. Not that they won't happen, because they will, but rather that you'll soon begin to get a different sense for what broken means. A change set is only one way to get changes from one org to another, and a rather recent development on the platform. They have some limitations like not moving custom setting data, but generally work really well.
But it's true that when you've got good unit tests in place, coupled with all of the rest of the imposed referential integrity checks, it's really not that common to "break the build" so to speak, and wish to revert to some global snapshot of everything at a different point in time. More frequently, in my experience, you will revert isolated units back to previous versions and can do this with sandboxes or source control by pushing an earlier version forward until a fix is found.
Adam
I've been researching an app on the AppExchange that at least looks like it will give me what I want. The product is Snapshot by Dreamfactory. Interestingly, the sales people I talked to at Dreamfactory told me that Salesforce uses their app internally to manage changes. I find it kind of unfortunate that this capability isn't included with my license, but... here are the specifics of what I found that will be helpful for my question:
The ability to take a snapshot of your org's metadata and copy or deploy it to another org. This will allow me to deploy/roll back changes.
The ability to diff two different snapshots (from different orgs) and see the details of what changed. This will help me track down the cause of problems when they do arise.

force.com ISV development, deployment, support

We're an ISV that's completed our first app on force.com. It's an xRM-like app with extended workflow to build out complex campaigns (not simple marketing-like campaigns) and integration with on-premise software. The platform brings enormous value, and at the same time some challenges. Interested in other ISV experiences around the following:
Application upgrade process. Customers expect cloud app upgrade to "just happen". Reality is that there's inevitable manual pre- and post-upgrade steps that can fill many pages. We don't want to burden the customer with this, and at the same time while we're happy to do the upgrade work for the customer, we don't want access to customer data and the need for elaborate security assurances that come along with that access. A conundrum.
Development environment. Agile/scrum development relies on achieving full test automation and continuous integration, yet full automation beyond unit test seems difficult or impossible.
Background processing. Constraints on scheduled jobs, callouts, and futures, and issues with transaction management present challenges to traditional software development.
Curious what other ISVs have found.
Thanks!
I am now working at my second Force.com ISV and so have a fair amount of experience in releasing products on the platform (I have seen 4 separate product releases, one of which included 3 version releases and one which included another version update).
If possible, you should try to remove any pre/post-install steps the user is required to do. It sounds tough, and it is, but it's the biggest reason people don't adopt a product. The idea is that it is quick and easy to install - one click - and any extra effort detracts from the user experience. Ensuring your system is data independent is a good way of getting around the data security issues you referred to, and obviously you can offer consultancy to do the upgrade work. A sensible idea might be to have a list of all the objects and fields that are affected by your product's installation and then check the customer org before installing. I would also say that installing in a sandbox and doing a couple of weeks of user testing can highlight any problems you may have in future very effectively.
It is not true that full test automation beyond unit tests cannot be achieved; it is actually very simple. The key is having the necessary framework set up. You would have a central version control system where your code is stored (a key part of agile). Then you create a script so that when code is committed, it runs an install on an SFDC org, running all tests and reporting back. You can then have this script run a set of Apex classes or upload a bunch of CSV files to load data, followed by either fuller Apex tests that exercise functionality or Selenium running a set of tests. You can also reuse this test data and script for knocking out demo environments for the sales guys.
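Expressed as a modern YAML-based CI pipeline, the commit-triggered flow described above might look roughly like this; the scripts are hypothetical wrappers around whatever Salesforce deploy and test tooling you already use:

```yaml
# Runs on every commit: push the metadata to a test org, then run the test layers.
stages:
  - deploy-to-org
  - test

deploy-to-test-org:
  stage: deploy-to-org
  script:
    - ./deploy-metadata.sh "$SFDC_TEST_ORG"      # install the committed code into the test org

run-tests:
  stage: test
  script:
    - ./run-apex-tests.sh "$SFDC_TEST_ORG"       # run all Apex unit tests and report back
    - ./load-test-data.sh "$SFDC_TEST_ORG"       # upload the CSV fixtures
    - ./run-selenium-suite.sh "$SFDC_TEST_ORG"   # browser-level tests against the org
```

The same scripts and test data can then be reused to knock out demo environments, as mentioned above.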
The governor and background processing limits are a bit tight, but they keep being increased. Maybe you should integrate with Heroku or similar to do some larger external processing? I will say, though, that I think it improves programming abilities in general, making you think about what it is you're doing and the best way to do it. This then leads to a more pleasant end-user experience. Batch Apex jobs are a good way of doing this processing, and you can use the AsyncApexJob object to report the status of a run back to users.
Hope that helps and gives you a different perspective!
Paul

CMS Database design - Master database or Multi-Db per site

I am in the process of designing the CMS that I am about to create. I was thinking about the database and how I want to approach it.
Do you think it's best to create one master database for all my clients' websites, or should I have one database per site?
What are the benefits and drawbacks of both approaches? I am always thinking about the future, so I was considering adding memcached or APC caching to the project, to offer as an option to my clients.
Just trying to learn the best practices and what other developers' approach would be.
I've run both. My business chooses to separate client-specific data into separate tables so that if one happens to become corrupted, not all of them are taken down. In an ideal world this might never happen, but Murphy's law... It does seem very easy to find things with them separated. You will know with 100% certainty that one client's content will never show up on another's page.
If you do go down that route, be prepared to create scripts that build and configure databases for you. There's nothing fun about building a great system and having demand for it, only to spend your time manually setting up DBs and installs all day long. Also, choosing DB names is one additional step that isn't there with a single database - a headache that will repeat itself seemingly over and over again.
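As one possible shape for such a provisioning script, an Ansible playbook could create the per-client database and user; the host, client name and password variable are placeholders:

```yaml
# Creates one database and one user per client. Run with e.g.
#   ansible-playbook new-client.yml --extra-vars "client=acme client_db_password=..."
- hosts: dbserver
  vars:
    client: acme                          # placeholder client name
  tasks:
    - name: Create the client's database
      community.mysql.mysql_db:
        name: "cms_{{ client }}"
        state: present

    - name: Create the client's database user
      community.mysql.mysql_user:
        name: "cms_{{ client }}"
        password: "{{ client_db_password }}"
        priv: "cms_{{ client }}.*:ALL"
        state: present
```

Any scripting approach works; the point is that onboarding a new client becomes one command instead of a manual setup session.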
Develop the single master DB. It will take a small amount of additional effort and add a little bit more complexity to the database design, but will give you a few nice features. The biggest is being able to share data between sites.
Designing for a master database means that you have the option to combine sites when it makes sense, but also lets you install a master per site. Best of both worlds.
It depends greatly upon the amount of customization each client will require. If you foresee clients asking for many one-off features specific to their deployment, separate databases based on a single core structure might make sense. I would highly recommend trying to make any customizations usable by all clients, though, and keeping all structure defined in one place/database instead of duplicating it across multiple databases. By using one database, you make updating the structure straightforward and the implementation consistent across all sites, so they can all use the same CMS code.

Using a common database for collaborative development

Some of the people on my project seem to think that using a common development database, with everyone connecting to it, is the best approach. I think it isn't, and that each developer having their own database (with periodically updated data dumps) is best. Am I right or wrong? Have you encountered any problems with either of these approaches?
Disk space and CPU should be cheap enough that every developer can run their own instance of the database, with an automated build under version control. This is needed to allow developers to be bold in hacking on the database, in isolation from any other developer's concurrent hacking.
The caveat being, of course, that any changes they make to their private instance are useless to anyone else unless they can be automatically applied during the build process. So there needs to be a firm policy that application code can't depend on any database state unless that state is represented by version-controlled, unit-tested changes to the DDL.
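One common way to keep those DDL changes version-controlled and automatically applied during the build is a migration tool. For instance, a Liquibase changelog (the table and columns below are purely illustrative) lives in the repository next to the application code and is run against each developer's private instance:

```yaml
# db/changelog.yml - every schema change is a new changeSet committed with the code.
databaseChangeLog:
  - changeSet:
      id: 1
      author: dev
      changes:
        - createTable:
            tableName: customer
            columns:
              - column:
                  name: id
                  type: int
                  autoIncrement: true
                  constraints:
                    primaryKey: true
                    nullable: false
              - column:
                  name: name
                  type: varchar(255)
```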
For an excellent guide on the theory and practice of treating the database definition as another part of the project code, and coordinating changes and refactorings, see Refactoring Databases: Evolutionary Database Design by Scott W. Ambler and Pramod Sadalage.
I like having my own copy of the database for development, because it gives you the flexibility to rapidly change things without worrying how it will impact others.
However, if all the developers are hacking away on their own copy of the database, it becomes more and more difficult to merge everyone's work together in the end.
I think you can get the best of both worlds by letting developers work on a local copy during day-to-day development, but each developer should probably merge their work into a common copy on a pretty regular basis. Writing a lot of unit tests helps too.
We share a single database amongst all our developers (20-odd), but we've structured it so that everyone has their own tables.
You don't need a separate database per developer if you structure the application right. It should be configurable which database or table prefix the application uses anyway, so you can easily move it between environments (unit test, system test, acceptance test, production, disaster recovery and so on).
The advantage to using a single database is that the cost of maintenance is amortized. You don't have your DBAs trying to handle a lot of databases (or, if you're a small-DB shop, you don't have every developer trying to maintain their own database when they're better utilized in developing).
Having a single point of failure is not a good thing, is it?
I prefer a single, shared database. But it's very dependent on the situation and the applications being developed.
What works for me may not work for you. Go with your gut.
If you are working with Hibernate or any Hibernate-based platform, you can configure your database schema to be created when you start your server (the create-drop option; a minimal configuration sketch follows at the end of this answer). This is very useful when you are adding new attributes to your classes. If this is the case, each developer must have their own copy of the DB.
If you are not changing the DB structure at all then you can use a single shared DB.
In this second case it's not a must. I prefer to have my own DB where I can do whatever I want. On the other hand, remember that some queries can take a long time, and that will affect your whole team if you are sharing a DB.
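If the Hibernate-based platform happens to be Spring Boot (an assumption, not something stated in this answer), the create-drop behaviour mentioned above is a one-line setting pointed at each developer's private database; all values below are placeholders:

```yaml
# application-dev.yml - each developer points this at their own local database.
spring:
  datasource:
    url: jdbc:mysql://localhost:3306/myapp_dev
    username: dev
    password: dev
  jpa:
    hibernate:
      ddl-auto: create-drop   # rebuild the schema from the entity classes on startup, drop it on shutdown
```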
