How to organize related applications into git repos?

What is the decision tree for knowing when to split a suite of related and/or cohesive applications into git repos and/or branches? Should I keep each app in its own repo? Or all apps and dependencies in a single repo? Or something in between?
The answer to "How should I organize multiple related applications using git?" claims that a repository per project is appropriate, but gives no clues as to what a project would be.
And then there's the question of dev, test, integration-test, and production checkouts when the git repos are split. The answer to "how do you organize your programming work" lists some branch/tag options, but ignores the multi-app details.
There's also the DB schema! Incremental definition of the schema helps, but again, where would one keep this definition if the DB spans back-end and front-end apps?
Some examples I've been pondering:
a front-end web app and its back-end CGI/DB: one repo or two?
a set of web back-ends that use features from other back-ends
a set of front-end apps that share CSS and jQuery plug-ins
selenium scripts that test front-end features across dependent code - in the front-end app repo or the dependent code repo?
If I want to work on a single app, it's hard (well, tedious and error-prone) to check out just one directory of a repo, so I have to check out (or at least clone) the entire tree. That implies that git is not really built for keeping all the apps and dependencies in a single tree.
But if I keep each of the projects (apps, frameworks, dependencies, doc trees, CSS) in its own repo, then I end up chasing my tail on dependency resolution; that is, I don't know which versions of each app are compatible. I think git tags are a good way to go, if only I could move them to newer versions that maintain compatibility.
When apps split or merge -- as happens often when refactoring models down to baser models -- can I move the git history of just those files to another repo? I don't see how to do this, so that leans towards a single repo for it all.
If I develop a new feature across apps, it would be nice for branches to represent features.
I think I want a repo of repos -- does that exist?

This is about using a component approach: a component being a coherent set of files which have their own history (own set of branches, tags and merges).
It should include only what cannot be generated (although the DB schema can sometimes be added to the repo, as seen in "What is the right approach to deal with Rails db/schema.rb file in GIT?". You can still generate it, though, as shown in "What is the preferred way to manage schema.rb in git?", to avoid needless conflicts).
A component can evolve without another one having to evolve. See "Structuring related components in git".
That is the main criteria which allows you to answer: "X and Y: one or two repos?".
You can split a repo into two later, but be aware that this will change their history: other contributors will need to reset their own repos to that new history.
You can group those different component repos into one with submodules, as explained here (that is the "repo of repos"), or, if you want to have only one repo, with subtree, as illustrated here.
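A minimal sketch of the submodule approach, using throwaway local repositories in place of real remote URLs (all names and paths here are illustrative):

```shell
set -e
tmp=$(mktemp -d) && cd "$tmp"
G="git -c user.email=dev@example.com -c user.name=dev"

# Stand-in for a component repo (in real life, a remote URL)
git init -q component
(cd component && $G commit -q --allow-empty -m "component: initial history")

# The "repo of repos": each submodule pins a component at an exact commit
git init -q suite && cd suite
git -c protocol.file.allow=always submodule add "$tmp/component" component
$G commit -q -m "Pin component at its current commit"
git submodule status component
```

Each submodule records one exact commit, so the grouping repo answers the "which versions are compatible" question by construction; `git clone --recurse-submodules` restores the whole pinned set. (The `protocol.file.allow` override is only needed because this sketch uses local paths instead of real remotes.)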

Related

Is more granular versioning in a monorepo with a container possible?

My team has a monorepo written with React, built with Webpack, and managed with Lerna.
Currently, our monorepo contains a package for each screen in the app, plus a "container" package that is basically a router that lazily serves each screen. The container package has all the screens' packages as dependencies.
The problem we keep running into is that, by Lerna's convention, that container package always contains the latest version of each screen. However, we aren't always ready to take the latest version of each screen to production.
I think we need more granular control over the versions of each screen/dependency.
What would be a better way to handle this? Module Federation? peerDependencies? Are there other alternatives?
I don't know if this is right for your use case, as you may need to stick with a monorepo for some reason, but we have a similar situation where our front end needs to pull in different screens from different custom packages. The way we handle this is to structure each screen (or set of screens) as its own npm package in its own directory (this can be as simple as creating a corresponding package.json), publish it to its own private Git repository, and then install it in the container package via npm as you would any other module (you will need to create a Git token or set up SSH access if you use a private repo).
The added benefit of this is that you can use Git release tags to mark commits with versions (we wrote our own utility that manages this process for us automatically using Git hooks to make this easier), and then you can use the same semver ranges that you would with a regular npm package to control which version you install.
For example, one of the dependencies in your container's package.json could look something like this: "my-package": "git+ssh://git@github.<company>.com:<org or user>/<repo>#semver:^1.0.0". On the GitHub side, you would mark your commit with the tag v1.0.0. Now just import your components and render as needed.
However, we aren't always ready to take the latest version of each screen to production.
If this is a situation that occurs very often, then maybe a monorepo is not the best way to structure the project code? A monorepo is ideal to accommodate the opposite case, where you want to be sure that everything integrates together before merging.
This is often the fastest way to develop, as you don't end up pushing an indeterminate amount of integration work into the future. If you (or someone else) have to come back to it later you'll lose time context switching. Getting it out of the way is more efficient and a monorepo makes that as easy as it can be.
If you need to support older versions of code for some time because it's depended on by code you can't change (yet), you can sometimes just store multiple versions on your main branch. React is a great example, take a look at all the .new.js and .old.js files:
https://github.com/facebook/react/tree/e099e1dc0eeaef7ef1cb2b6430533b2e962f80a9/packages/react-reconciler/src (the current last commit on main)
Sure, it creates some "duplication", but if you need to have both versions working and maintained for the foreseeable future, it's much easier if they're both there all the time. Everybody gets to pull it, nobody can miss it because they didn't check out some tag/branch.
You shouldn't overdo this either, of course. Ideally it's a temporary measure you can remove once no code depends on the old version anymore.

What are my options for merging subsets of repositories?

This question is more of an application architecture and source control type of question.
I have 2 Github repositories, one is a React single page application and the other is for a React website. For my single page application, I am making the code publicly available and the application links to its repository. For my website, I want to keep the repository private but incorporate the single page application into it so people can use it without having to download and build code.
Can I get some options on how to merge changes to the single page application repository with the website repository?
So far I am just merging code to the website manually by copying it over and pushing the code, but that is a problematic way of doing things. Neither repo is completely up and running yet, so there is still time for me to make architecture changes. Maybe there are git commands to handle everything?
Any help is appreciated, including suggested architecture/repo changes.
I think the best option here would be to use git cherry-pick, but in an automated way:
Build a simple script that listens for push events from your single-page app repo via git webhooks. That way you can detect the merge-into-master event automatically
Get the hash of that commit
Plug that hash into a git cherry-pick command applied to your private website repo. You can apply this commit on a separate branch in that repo and merge it into master when you think it's appropriate
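The cherry-pick step itself can be sketched with two throwaway local repos standing in for the public SPA repo and the private website repo (all names are illustrative; the webhook listener is omitted):

```shell
set -e
tmp=$(mktemp -d) && cd "$tmp"
G="git -c user.email=ci@example.com -c user.name=ci"

# Stand-ins for the two repositories
git init -q spa
(cd spa && $G commit -q --allow-empty -m "spa: initial")
git init -q site
(cd site && $G commit -q --allow-empty -m "site: initial")

# A change lands in the SPA repo -- the commit the webhook would report
(cd spa && echo "render()" > app.js && git add app.js && $G commit -q -m "spa: new feature")
hash=$(git -C spa rev-parse HEAD)

# In the website repo: fetch the SPA's objects, then cherry-pick onto a staging branch
cd site
git remote add spa "$tmp/spa"
git fetch -q spa
git switch -q -c spa-updates
$G cherry-pick "$hash"
```

The staging branch (spa-updates here) lets you review the imported commit before merging it into the website's master; if the cherry-pick conflicts, you can abort it without touching master.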

What is the general practice for an Express and React based application? Keeping the server and client code in the same or different projects/folders?

I am from a microsoft background where I always used to keep server and client applications in separate projects.
Now I am writing a client-server application with Express as the back end and React as the front end. Since I am a total newbie with these two tools, I would like to know:
What is the general practice?
Keeping the Express (server) code base and the React (client) code base as separate projects, or keeping the server and client code bases together in the same project? I could not think of the pros and cons of either approach.
Your valuable recommendations are welcome!
PS: Please do not mark this question as opinionated; I believe I have a valid reason to ask for recommendations.
I would prefer keeping the server and client as separate projects because that way we can easily manage their dependencies, dev dependencies, and unit test files.
Also, if we need to move to a different framework for the front end at a later point, we can do that without disturbing the server.
In my opinion, it's probably best to have separate projects here. But you made me think a little about the "why" for something that seems obvious at first glance, but maybe is not.
My expectation is that a project should be mostly organized one-to-one on building a single type of target, whether that be a website, a mobile app, a backend service. Projects are usually an expression of all the dependencies needed to build or otherwise output one functioning, standalone software component. Build and testing tools in the software development ecosystem are organized around this convention, as are industry expectations.
Even if you could make the argument that there are advantages to monolithic projects that generate multiple software components, you are going against people's expectations and that creates the need for more learning and communication. So all things being equal, it's better to go with a more popular choice.
Other common disadvantages of monolithic projects:
greater tendency for design to become tightly coupled and brittle
longer build times (if using one "build everything" script)
takes longer to figure out what the heck all this code in the project is!
It's also quite possible to make macro-projects that work with multiple sub-projects, and in a way have the benefits of both approaches. This is basically just some kind of build script that grabs the output of sub-project builds and does something useful with them in a combination, e.g. deploy to a server environment, run automated tests.
Finally, all devs should be equipped with tools that let them hop between discrete projects easily. If there are pains in doing this, it's best to solve them without resorting to a monolithic project structure.
Some examples of practices that help with developing React/Node-based software that relies on multiple projects:
The IDE easily supports editing multiple projects. And not in some cumbersome "one project loaded at a time" way.
Projects are deployed to a repository that can be easily used by npm or yarn to load in software components as dependencies.
Use "npm link" to work with editable local versions of sub-projects all at once. More generally, don't require a full publish and deploy action to have access to sub-projects you are developing along with your main React-based project.
Use automated build systems like Jenkins to handle macro tasks like building projects together, deploying, or running automated tests.
Use versioning scrupulously in package.json. Let each software component have its own version number and follow the semver convention, which indicates when changes may break compatibility.
If you have a single team (developer) working on front and back end software, then set the dependency versions in package.json to always get the latest versions of sub-projects (packages).
If you have separate teams working on front-end and back-end software, you may want to relax the dependency versions to major version numbers only, with a semver range, in package.json. (Basically, you want some protection from breaking changes.)
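The last two points might look like this in a container's package.json (package names and versions are illustrative): a single team can float on the latest published version, while separate teams pin a semver range for protection from breaking changes:

```json
{
  "dependencies": {
    "our-shared-components": "latest",
    "their-api-client": "^2.0.0"
  }
}
```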

How to version and manage angularjs components for different projects

This is more of a curious question than a technical one. In my company we have an MVP with lots of angularjs components, but now, we are offering the MVP to different companies with specific needs.
Here's what it will look like in real life scenario:
Company 1
Module 1
Module 2
Module 3
Company 2
Module 1 (with a specific feature or change)
Module 3
Company 3
Module 2
Module 3
Module 4 (only for this project)
And we are looking for a versioning system that could fit our future business model because, as we speak, we are using branches for different companies and other branches for specific component features.
You can see the hell this has become. It's really hard to maintain and it's even harder to deploy the different versions of the application.
I'll be glad to share my findings if we come up with a solution for this case. I'll write a blog post if that's the case.
Thanks!
Are you looking for process guidance, or tools?
From a tools standpoint, you could use npm with its private package service, or just point it at some private git repo. Bower can do the same.
In the Windows space there's NuGet; you can host repositories for it yourself, and there are services out there for that, too.
Git has support for submodules and subtrees, but I don't personally recommend them. Making dependencies part of your actual git history is complicated.
The biggest thing from a process perspective is probably just avoid breaking changes. Put the effort into design of shared components up front so you're not having to redesign everything around the shared component when it changes drastically because it didn't work right the way it was built the first time around.
Treat your shared modules as if they're open-source projects. Keep good documentation and clean code, and adhere to semantic versioning. Apply version numbers to stable builds (git tag them so they're easy to check out). Put someone in charge of accepting changes to the component so they can keep track of what everyone else is doing with it and guide its development.
Fork it into a new package if the requirements one project has are wildly different from the others'. Maintaining a component with too many different requirements can become a nightmare.
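Tagging stable builds so they're easy to check out by version might look like this (a throwaway repo; all names are illustrative):

```shell
set -e
tmp=$(mktemp -d) && cd "$tmp"
G="git -c user.email=dev@example.com -c user.name=dev"

git init -q shared-module && cd shared-module
echo "export default {};" > index.js
git add index.js && $G commit -q -m "First stable build"
$G tag -a v1.0.0 -m "Stable release 1.0.0"

echo "// backwards-compatible fix" >> index.js
git add index.js && $G commit -q -m "Fix a bug, no API change"
$G tag -a v1.0.1 -m "Patch release 1.0.1"   # patch bump per semver

# A consumer checks out a known-good version by tag:
git checkout -q v1.0.0
git describe --tags
```

Annotated tags (`-a`) carry a tagger and message, which makes them a better fit for marking releases than lightweight tags.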

Manage cross-platform projects on Github

I'm looking for a tidy way to manage my cross-platform HTML+JS projects in github.
Here's my typical working process:
I complete development of my app for iOS
I start working on Android platform version
I start working on XXXXXXX platform
...
From step 2 onward I come out with:
commits that can be merged into the head repository
commits that cannot be merged, so I have at least two versions of some of the files that compose the project
My problem is that forking/branching for each platform forces me to duplicate changes to the shared part of the project too. Maybe there's something I'm missing in both branching and forking.
Which method do you use to organize your code on GitHub so as to preserve both the differences and the unity of the project?
It sounds like branches might be the way to go: create an android branch, etc., and if you need to branch those further then create android/branch1, android/branch2 and so forth.
When you need to merge files between branches you might want to use the git cherry-pick command to select the commits to merge. I would also probably do this on a temporary local branch before pushing, to make it easy to recover from screw-ups!
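Here is a minimal sketch of that workflow in a throwaway repo (branch and file names are illustrative): an android branch carries one commit worth sharing and one platform-only commit, and only the shared one is cherry-picked back via a temporary branch:

```shell
set -e
tmp=$(mktemp -d) && cd "$tmp"
G="git -c user.email=dev@example.com -c user.name=dev"

git init -q app && cd app
echo "shared code" > core.js
git add core.js && $G commit -q -m "Shared HTML+JS core"
main=$(git branch --show-current)

# Platform branch: one shareable fix, one Android-only change
git switch -q -c android
echo "bug fix" > fix.js && git add fix.js && $G commit -q -m "Fix shared bug"
echo "android ui" > android.js && git add android.js && $G commit -q -m "Android-only UI"
fix=$(git rev-parse HEAD~1)

# Try the shared fix on a temporary branch first, then fast-forward for real
git switch -q "$main"
git switch -q -c try-fix
$G cherry-pick "$fix"
git switch -q "$main"
git merge -q --ff-only try-fix
git branch -q -d try-fix
```

If the cherry-pick conflicts, you can `git cherry-pick --abort` and delete try-fix without ever touching the main branch, which is the easy recovery from screw-ups mentioned above.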
