I recently started using the submodule approach for a project: inside one repo, a couple of other repos are wired in and behave like sub folders. I’d like to share some first impressions of it, and I intend to improve my understanding of the approach along the way.
To understand what a submodule is, we first need a solid understanding of what a repository is, though I believe most of us already have that.
A repository (aka repo) is a physical code storage location, synced between your local machine and a server through terminal commands. Believe it or not, I think of the repo as a physical layer, much like a hard disk. If multiple repos need to play along, we use a package management system such as npm, where a file called package.json specifies the dependencies between them.
Since a repo is a physical entity, it needs to be built, tested, and deployed atomically so that developers can consume it later. A great example is GitHub, which pushes this idea to the extreme in a great way.
When you install a dependency (a repo), a local copy is fetched and stored under node_modules in your project. You could change the source code there, but normally we don’t. To change a dependency, you clone its repo as a separate project and do everything there. Essentially, a repo is a stand-alone software unit, a straightforward computer science concept that the industry has been using for the past thirty years, as far as I remember.
So what is a submodule? Interestingly, each submodule is also a repo, exactly like the repo we talked about earlier. However, under your current project you can store a local copy of it as a sub folder. Remember that a dependency is stored under node_modules; a submodule is stored under a regular sub folder. Just as you can have a src folder, you can have a react-modal folder inside your project folder.
You might think a submodule folder is a directory alias, but it’s not; it’s a full checkout of the repo. When you commit in the parent project, you don’t commit the contents of the submodule folder. Instead, the parent commit merely records the commit hash of the submodule repo, which is just a string. In that sense, people call it an alias.
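To make the "commit hash, not folder contents" point concrete, here is a minimal sketch that builds two throwaway repos in a temp directory and inspects what the parent actually stores. The repo names (react-modal-origin, my-app), the file contents, and the demo identity are all made up for illustration. The protocol.file.allow override is only needed because the sketch clones the submodule from a local path, which recent Git versions refuse by default; with a normal remote URL you would not need it.

```shell
set -e
work=$(mktemp -d)
cd "$work"

# Placeholder identity for the throwaway commits below.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# A stand-in for the library repo (hypothetical name and content).
git init -q react-modal-origin
(
  cd react-modal-origin
  echo "export {}" > index.js
  git add index.js
  git commit -q -m "Initial commit"
)

# The parent project, with a regular src folder.
git init -q my-app
cd my-app
mkdir src
echo "// app code" > src/app.js
git add src
git commit -q -m "Initial app"

# Add the library as a submodule living in a regular sub folder.
git -c protocol.file.allow=always submodule add "$work/react-modal-origin" react-modal
git commit -q -m "Add react-modal as a submodule"

# .gitmodules maps the sub folder to the repo it came from.
cat .gitmodules

# The parent's tree holds a single entry for the folder: a commit reference.
git ls-tree HEAD react-modal

entry_mode=$(git ls-tree HEAD react-modal | awk '{print $1}')
recorded=$(git ls-tree HEAD react-modal | awk '{print $3}')
checked_out=$(git -C react-modal rev-parse HEAD)
```

The ls-tree line shows the entry with mode 160000 and type commit, and the hash it prints is the same one the submodule folder currently has checked out. That single hash is all the parent project versions.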
The reason submodules exist, as I understand it, is so that we can install a snapshot of a repo as a sub folder without declaring it as a dependency. For instance, the sub folder could point to a particular commit of the repo, whether or not that commit has been merged (or, on your own machine, even pushed). This creates quite a bit of flexibility when developing a project with lots of moving parts at the same time, for instance early development work that you suspect may later split into multiple projects, or a debugging session.
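Pointing the sub folder at a commit that has not been merged can be sketched like this. Again everything runs against throwaway local repos; the names (react-modal-origin, my-app, the experiment branch) are hypothetical, and the protocol.file.allow override is only there because the demo uses a local path as the submodule URL.

```shell
set -e
work=$(mktemp -d)
cd "$work"

# Placeholder identity for the throwaway commits below.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Library repo with one commit on its default branch and one unmerged
# commit on a branch called "experiment" (hypothetical name).
git init -q react-modal-origin
(
  cd react-modal-origin
  echo "stable" > index.js
  git add index.js
  git commit -q -m "Initial commit"
  git checkout -q -b experiment
  echo "work in progress" > index.js
  git commit -q -am "Try a new idea"
  git checkout -q -   # back to the default branch
)

# Parent project with the library added as a submodule.
git init -q my-app
cd my-app
echo "// app code" > app.js
git add app.js
git commit -q -m "Initial app"
git -c protocol.file.allow=always submodule add "$work/react-modal-origin" react-modal
git commit -q -m "Add react-modal as a submodule"

# Move the sub folder to the unmerged commit, then record that in the parent.
git -C react-modal checkout -q origin/experiment
git add react-modal
git commit -q -m "Point react-modal at the experiment commit"

# Show which commit the parent now pins.
git submodule status

pinned=$(git ls-tree HEAD react-modal | awk '{print $3}')
experiment=$(git -C "$work/react-modal-origin" rev-parse experiment)
```

The parent's history now says "use react-modal at exactly this commit", even though that commit lives only on a side branch of the library.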
However, when it comes to merging new work into the trunk through a pull request, IMHO it’s a different story, one that submodules can’t handle without special consideration. Let’s walk through it.
As pointed out, a submodule is a repo: you can go in, fix a bug, commit it, and raise a pull request that waits for approval. That is the developer experience for the author. How about the person who’s going to review this pull request (PR)? He takes a look at the change, and he can’t blindly approve it just because you say the bug is fixed. He could do one of the following:
- He can duplicate everything the author did: check out the parent project’s change from another PR, pull this submodule PR, put the two together, and see whether everything works as stated. This is a difficult path to take for a PR approval.
- The submodule repo is well written, with a full set of test cases, both unit and visual. Approving this PR is then only a matter of agreeing on the features added at the submodule level, so he approves once all tests pass. Happy path.
- The author doesn’t need approval for his submodule PR. He is the god: he commits and approves at the same time. Happy path.
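To give a feel for why the first option is the difficult one, here is a rough sketch of what the reviewer would have to do by hand. It simulates the submodule PR with a local branch called fix-bug; the repo names, branch name, file contents, and the review folder are all assumptions for the demo, and a real setup would fetch the PR branch from the hosting service instead. The protocol.file.allow override is only needed because the submodule URL here is a local path.

```shell
set -e
work=$(mktemp -d)
cd "$work"

# Placeholder identity for the throwaway commits below.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Library repo; the branch "fix-bug" stands in for the open submodule PR.
git init -q react-modal-origin
(
  cd react-modal-origin
  echo "buggy" > index.js
  git add index.js
  git commit -q -m "Initial commit"
  git checkout -q -b fix-bug
  echo "fixed" > index.js
  git commit -q -am "Fix the bug"
  git checkout -q -   # back to the default branch
)

# Parent project that already uses the library as a submodule.
git init -q my-app
(
  cd my-app
  echo "// app code" > app.js
  git add app.js
  git commit -q -m "Initial app"
  git -c protocol.file.allow=always submodule add "$work/react-modal-origin" react-modal
  git commit -q -m "Add react-modal as a submodule"
)

# Reviewer side: get the parent together with its submodules...
git -c protocol.file.allow=always clone -q --recurse-submodules "$work/my-app" review

# ...then swap the PR branch into the submodule folder.
git -C review/react-modal fetch -q origin fix-bug
git -C review/react-modal checkout -q FETCH_HEAD

# At this point the reviewer would run the parent project's own tests
# against the combined state (whatever test command the project uses).
reviewed=$(cat review/react-modal/index.js)
echo "submodule content under review: $reviewed"
```

Even in this toy form the reviewer has to juggle two repos and two checkouts before any testing starts, which is the overhead the first bullet describes.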
Now you see one of the issues, which starts from a simple fact: a submodule is a repo! That means you shouldn’t (and often can’t) bypass the regular workflow of committing and asking for approval, unless you have the permission to. So I conclude that submodules can only be used successfully on special occasions.
Here is another opinion on this, from SweetCode: “In most cases, Git submodules are used when your project becomes more complex, and while your project depends on the main Git repository, you might want to keep their change history separate.”
Even at this early stage of understanding submodules, I’m convinced they can’t fit into the regular pull request process for a team with varying levels of experience. The efficiency gained via submodules doesn’t justify asking everyone to hop on this train and enforcing this style of working, with the risk that any one of the repos could be under development at the time. This is my two cents: if you expect all trains to move fast without collision, then we are talking about a mafia business where you hold some god power. In reality, there is no timeline where everything can move very fast, not in a democratic world at least.