1) Comparing a photo storage app to the Linux kernel doesn't make much sense. Just because a much bigger project in an entirely different (and more complex) domain uses monorepos, doesn't mean you should too.
2) What the hell is a monorepo? I feel dumb for asking the question, and I feel like I missed the boat on understanding it, because no one defines it anymore. Yet I feel like every mention of monorepo is highly dependent on the context the word is used in. Does it just mean a single version-controlled repository of code?
3) Can these issues with sync'ing repos be solved with better use of `git submodule`? It seems to be designed exactly for this purpose. The author says "submodules are irritating" a couple times, but doesn't explain what exactly is wrong with them. They seem like a great solution to me, but I also only recently started using them in a side project
One of my repos has a dependency on another repo (that I also own). I initialized it as a git submodule (e.g. my_org/repo1 has a submodule of my_org/repo2).
Git submodules have some places where you can surprisingly lose branches/stashed changes.
This concerns me, as git generally behaves as a leak-proof abstraction in my experience. Can you elaborate or share where I can learn more about this issue?
"The other main caveat that many people run into involves switching from subdirectories to submodules. If you’ve been tracking files in your project and you want to move them out into a submodule, you must be careful or Git will get angry at you. "
Though apparently newer versions of git are better about not losing submodule branches, so my concerns were outdated.
> Does it just mean a single version-controlled repository of code?
Yeah, the idea is that all of your projects share a common repo. This has advantages and drawbacks. Google is most famous for this approach, although I think they technically have three now: one for Google, one for Android, and one for Chrome.
> They seem like a great solution to me
They don't work in a team context because they're extra steps that people don't do, basically. And for some reason a lot of people find them confusing.
https://github.com/google/ contains 2700+ repositories. I don't know necessarily how many of these are read-only clones from an internal monorepo versus how many are separate projects that have actually been open-sourced, but the latter is more than zero.
I've never worked for Google. But my understanding is that their deployment code doesn't rely on any of those 2700+ repositories. I believe it doesn't rely on anything that isn't checked into the monorepo.
If they spin out an open-source project, they either (1) continue development internally and (maybe) do periodic releases by exporting that directory from the monorepo; or (2) allow development to occur externally and periodically import changes when upgrading the version used by the monorepo.
Either way, the point is that to build any Google service, you check out the monorepo and type whatever their equivalent of 'make' is. No external dependencies.
> What the hell is a monorepo? I feel dumb for asking the question, and I feel like I missed the boat on understanding it, because no one defines it anymore. Yet I feel like every mention of monorepo is highly dependent on the context the word is used in. Does it just mean a single version-controlled repository of code?
In my mind, a mono repo is one company, one (or a very small number of) source code repository. When I started working at Yahoo, everything was in CVS on one hostname (backed by a NetApp Filer), and that was a mono repo. When you got into the weeds, there were actually a couple of separate repos: Yahoo Japan had a separate repo, DNS and prod ops had a separate repo, and there were a couple more. But mostly everything was in one repo, just organized by directories so that most people only checked out part of it, because not many people needed to look at all the code (or had the disk space for it either). That evolved into separate SVN repos for each group that wanted to move to SVN. I assume they moved to git at some point after I left.
Same deal when I was at Whatsapp. When I started, we had one SVN repo that everyone shared, and that was a mono repo; when we moved to git, each client had their own repo, server had a repo, and there was a common repo for docs and other things that needed sharing. Facebook had a lot of repos, but one repo was very large and required 'monorepo' style tools. As a first step for monorepo style tools, in a large company with a large git repo, you need something to sequence commits, because otherwise everyone gets stuck in the git pull; git push loop until they manage to win the race. This wasn't an issue with a large CVS repo, because commits are file based, and while you might have conflicts within your team, you didn't need a global lock. I don't remember having issues with it in SVN either, but my memory is fuzzy and the whole company SVN repo I had was a lot smaller than the whole company CVS repo.
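To make the git pull; git push race concrete, here's a rough sketch with throwaway local repos. Everything in it is made up for illustration (the `alice` and `bob` clones, the `shared.git` bare repo, the `g` helper and its demo identity); it only shows the retry loop a developer ends up in, not any real sequencing tool.

```shell
#!/bin/sh
set -eu

# Throwaway sandbox; all names below are hypothetical.
work=$(mktemp -d)
cd "$work"

# Helper: git with a fixed demo identity so commits and rebases work anywhere.
g() { git -c user.name=demo -c user.email=demo@example.com "$@"; }

# One shared repo, two developers with their own clones.
g init -q --bare shared.git
g clone -q shared.git alice 2>/dev/null
g clone -q shared.git bob 2>/dev/null

# Alice commits and pushes first, so she wins the race.
echo a > alice/a.txt
g -C alice add a.txt
g -C alice commit -q -m "alice change"
g -C alice push -q origin HEAD

# Bob commits an unrelated file, but his push is now behind the remote.
echo b > bob/b.txt
g -C bob add b.txt
g -C bob commit -q -m "bob change"
branch=$(git -C bob symbolic-ref --short HEAD)

# The loop: push, get rejected, pull, try again.
tries=0
until g -C bob push -q origin HEAD 2>/dev/null; do
  tries=$((tries + 1))
  g -C bob pull -q --rebase origin "$branch"
done
echo "bob needed $tries pull(s) before his push landed"
```

With two developers the loop ends after one retry. With thousands pushing to the same branch, someone else has usually landed a commit by the time your pull finishes, which is why something has to put the commits in order.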
Maybe, I'd say a monorepo is a large repo where the majority of users/developers aren't going to need or want most of the tree.
> Can these issues with sync'ing repos be solved with better use of `git submodule`? It seems to be designed exactly for this purpose. The author says "submodules are irritating" a couple times, but doesn't explain what exactly is wrong with them. They seem like a great solution to me, but I also only recently started using them in a side project
I don't use submodules often, and I'm not sure if some of the irritations have been fixed, but in my use I run into two things: a) git clone requires additional work to get a full checkout of all the submodules; b) git pull requires additional work to update the submodules. I'm sure there's some other issues with some git features; but I was actually fine with CVS and don't really care about git features :P
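For reference, the two extra steps look roughly like this. It's a minimal sketch using throwaway local repos named repo1 and repo2 after the example upthread; the temp directory, the `g` helper, and the demo identity are my own assumptions. The `protocol.file.allow=always` setting is only there because recent git versions refuse local-path submodules by default.

```shell
#!/bin/sh
set -eu

# Throwaway sandbox; paths and names are hypothetical.
work=$(mktemp -d)
cd "$work"

# Helper: git with a demo identity and local file transport allowed.
g() {
  git -c user.name=demo -c user.email=demo@example.com \
      -c protocol.file.allow=always "$@"
}

# repo2 is the dependency.
g init -q repo2
echo "lib" > repo2/lib.txt
g -C repo2 add lib.txt
g -C repo2 commit -q -m "add lib"

# repo1 pulls in repo2 as a submodule.
g init -q repo1
g -C repo1 submodule --quiet add "$work/repo2" repo2
g -C repo1 commit -q -m "add repo2 as submodule"

# (a) A plain clone leaves the submodule directory empty...
g clone -q repo1 plain
before=$(ls -A plain/repo2)
if [ -z "$before" ]; then echo "plain clone: repo2 is empty"; fi

# ...until you run the extra step.
g -C plain submodule --quiet update --init --recursive

# Or ask for the submodules at clone time.
g clone -q --recurse-submodules repo1 full

# (b) Pull and update the submodule checkouts in one go.
g -C full pull -q --recurse-submodules
```

None of this is hard, but each step is easy to forget, and a forgotten step leaves the submodule empty or pinned at a stale commit.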