Git submodules let you treat any repository as a versioned "package" that can be included in any other git repo, so the same code can be reused across many projects. The cost is having to understand the ins and outs of submodules; hopefully this page can serve as a guide.
clone a repo that has submodules
git clone --recurse-submodules -j8 git://github.com/foo/bar.git
Note: -j8 is an optional performance optimization, available since Git 2.8, that fetches up to 8 submodules at a time in parallel; see man git-clone.
setting up submodules after a regular git clone
If you cloned without --recurse-submodules, for example with git clone git://github.com/foo/bar.git, git does not fetch the contents of any submodules. It only creates an empty directory for each submodule, at that submodule's path. To actually get the contents of those submodules, run the following:
git submodule update --init --recursive
This will recursively initialize and update submodules.
You may see the following pair of commands recommended elsewhere. We do not use them because they only initialize and update the submodules of the current git project and do not recurse into nested submodules. The command above works for every repository, whereas the ones below do not.
git submodule init
git submodule update
Why are init and update different commands
The init command makes a submodule "active". When a submodule is active, running update actually fetches the contents of that repo. If you want the contents of all submodules, initialize everything and then run update. If you only want a subset of the submodules, initialize just the ones you want (git submodule init accepts paths) before running update.
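To make the init/update split more concrete: git submodule init records a submodule's URL in the local .git/config, while .gitmodules lists every submodule the project declares. The Python sketch below compares the two to show which submodules have been initialized. The file contents, the example.com URLs, and the helper names are made up for illustration, and the sketch ignores other ways git can mark a submodule active (such as the submodule.active setting).

```python
import configparser


def submodule_names(config_text):
    """Return the names of submodule sections that carry a url entry."""
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(config_text)
    prefix = 'submodule "'
    names = set()
    for section in parser.sections():
        # Section headers look like: submodule "libs/a"
        if section.startswith(prefix) and section.endswith('"'):
            if parser[section].get("url"):
                names.add(section[len(prefix):-1])
    return names


def initialized_submodules(gitmodules_text, git_config_text):
    """Split declared submodules into (initialized, not yet initialized)."""
    declared = submodule_names(gitmodules_text)
    initialized = submodule_names(git_config_text)
    return sorted(declared & initialized), sorted(declared - initialized)


# Hypothetical .gitmodules: two submodules are declared.
GITMODULES = """
[submodule "libs/a"]
    path = libs/a
    url = https://example.com/a.git
[submodule "libs/b"]
    path = libs/b
    url = https://example.com/b.git
"""

# Hypothetical .git/config after running: git submodule init libs/a
GIT_CONFIG = """
[core]
    bare = false
[submodule "libs/a"]
    active = true
    url = https://example.com/a.git
"""

ready, pending = initialized_submodules(GITMODULES, GIT_CONFIG)
print("initialized:", ready)
print("not yet initialized:", pending)
```

In this made-up state, git submodule update would fetch libs/a and skip libs/b until it is initialized as well.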
adding submodules
You usually add a submodule with git submodule add <URL>. By default this command does not clone submodules nested inside the added repository. To make sure all nested submodules are cloned and initialized, run git submodule update --init --recursive after adding the submodule:
git submodule add <URL>
cd newly_added_dir
git submodule update --init --recursive
I want to do this every time, so I use the following bash alias, which you can add to your .bashrc if you like:
alias git-subadd='f(){ git submodule add "$1" && (cd "$(basename "$1" .git)" && git submodule update --init --recursive); }; f'
This alias defines a function and calls it immediately. The cd and the update run in a subshell, so your shell stays in the original directory afterwards.
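The alias relies on basename "$1" .git to guess the directory that git submodule add created. As a rough illustration of what that expression computes, here is a small Python equivalent. The function name is hypothetical, and it only matches git's default directory name when no explicit path is passed to git submodule add.

```python
def submodule_dir_from_url(url):
    """Mimic `basename "$url" .git`: last path component minus a trailing .git."""
    name = url.rstrip("/").rsplit("/", 1)[-1]
    # basename leaves the name alone when it consists of the suffix only.
    if name.endswith(".git") and name != ".git":
        name = name[: -len(".git")]
    return name


print(submodule_dir_from_url("git://github.com/foo/bar.git"))      # bar
print(submodule_dir_from_url("git@github.com:username/repo.git"))  # repo
```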
how the active working directory affects git
Suppose you have a project with these directories: A/B/C. A is a regular git repository, B is a submodule of A, and C is another submodule (i.e. it's a submodule of B). Git behaves differently depending on which directory it is run from.
When you are in A, everything acts as usual: changes made to the source files of A are detected when you run git status, and changes to the commit that submodule B points to are also picked up.
When you are in B, git acts as if B were a regular git repository, so git status and all other git commands run with respect to B. Any changes made to A's source files are completely ignored; you are entirely within the context of the B project. This context also applies to any subpath of B that is not another submodule.
When you are in C, the same thing occurs again, but now within the context of C. The general rule is that a git command runs in the context of the first submodule encountered while walking back up the file tree towards the base git project.
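The lookup rule above can be sketched in Python. This is only an approximation under stated assumptions: it treats the nearest ancestor containing a .git entry as the context (a regular repository has a .git directory, a submodule has a .git file), and it ignores things real git also considers, such as the GIT_DIR environment variable. The A/B/C layout is built in a temporary directory, and the gitdir paths written into the files are illustrative only.

```python
import tempfile
from pathlib import Path


def git_context(path):
    """Return the closest directory (path itself or an ancestor) holding a .git entry."""
    path = Path(path).resolve()
    for candidate in [path, *path.parents]:
        if (candidate / ".git").exists():
            return candidate
    return None


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp).resolve()
    a = root / "A"
    b = a / "B"
    c = b / "C"
    (c / "src").mkdir(parents=True)
    (a / ".git").mkdir()  # regular repository: .git is a directory
    # Submodules: .git is a small file pointing at the real git directory.
    (b / ".git").write_text("gitdir: ../.git/modules/B\n")
    (c / ".git").write_text("gitdir: ../../.git/modules/B/modules/C\n")

    contexts = {
        "A": git_context(a).name,
        "A/B": git_context(b).name,
        "A/B/C/src": git_context(c / "src").name,
    }
    print(contexts)
```

A path inside C resolves to C, not to B or A, because the walk stops at the first .git entry it meets.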
git pulling in a git directory with submodules
the git pull introduces a new submodule
If this happens, the submodule is left in the same state as after a regular clone, so see the section "setting up submodules after a regular git clone" above for next steps.
the git pull updates an existing submodule
A git pull that updates a submodule only changes which commit the submodule points to in the git metadata. You then need to run git submodule update to check out that commit. Note that if the new commit introduces a new sub-submodule, i.e. a new submodule inside that submodule, a plain git submodule update will not fetch it, so you have to run git submodule update --init --recursive to initialize and then update those sub-submodules.
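Since the safe sequence is always the same, it can be wrapped in a small helper. The sketch below is one possible shape, not an established tool: the function name and the injectable run parameter are assumptions made so the sequence can be inspected without touching a real repository.

```python
import subprocess


def pull_and_sync_submodules(repo_dir=".", run=subprocess.run):
    """Run git pull, then bring all submodules (nested ones included) in line."""
    commands = [
        ["git", "pull"],
        ["git", "submodule", "update", "--init", "--recursive"],
    ]
    for command in commands:
        # check=True makes subprocess.run raise on the first failing command.
        run(command, cwd=repo_dir, check=True)
    return commands


# Dry run: record the commands instead of executing them.
recorded = []
pull_and_sync_submodules(run=lambda cmd, **kwargs: recorded.append(" ".join(cmd)))
print(recorded)
```

Calling pull_and_sync_submodules() with no arguments runs the real commands in the current directory.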
git pulling inside of a submodule (note this is different from the previous section)
Suppose you are in a git directory D that contains a submodule S, and you know that S has received new commits upstream. To bring those changes into this project, change directory to D/S. Because the git context is now that of S, you can run git pull there to update to the newest commit. If S is on a detached HEAD, switch to a branch first, as described in the "HEAD detached" section below.
Now go back to the D directory and run git status; you'll see something like this:
$ git status
On branch main
Your branch is up to date with 'origin/main'.
Changes not staged for commit:
(use "git add <file>..." to update what will be committed)
(use "git restore <file>..." to discard changes in working directory)
modified: S (new commits)
no changes added to commit (use "git add" and/or "git commit -a")
This is telling you that the git metadata for submodule S has changed: it now points to a new commit, hence the (new commits) message. The plural is used because the message is generic.
All you have to do now is stage the change with git add S and run git commit in D (or use git commit -a). This records in D that you are using the new version of the submodule S.
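The same condition can be detected from a script by reading the output of git submodule status, which prefixes each line with a flag: a space when the submodule matches the recorded commit, + when the checked-out commit differs from it, - when the submodule is not initialized, and U when it has merge conflicts. The sketch below parses that format. The sample output, its hashes, and the submodule names tools and docs are made up, and paths containing spaces are not handled.

```python
def parse_submodule_status(output):
    """Parse `git submodule status` output into (state, commit, path) tuples."""
    states = {
        " ": "in sync",
        "+": "differs from recorded commit",
        "-": "not initialized",
        "U": "merge conflict",
    }
    entries = []
    for line in output.splitlines():
        if not line.strip():
            continue
        flag, rest = line[0], line[1:]
        fields = rest.split()
        commit, path = fields[0], fields[1]
        entries.append((states.get(flag, "unknown"), commit, path))
    return entries


# Hypothetical output as it might look when run in D after pulling inside S.
SAMPLE = (
    "+1111111111111111111111111111111111111111 S (heads/main)\n"
    " 2222222222222222222222222222222222222222 tools (v1.0)\n"
    "-3333333333333333333333333333333333333333 docs\n"
)

needs_commit = [
    path
    for state, _, path in parse_submodule_status(SAMPLE)
    if state == "differs from recorded commit"
]
print(needs_commit)
```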
going deep
By default most of the git submodule commands only operate one layer deep. For example, if you're in a git directory whose submodules themselves contain submodules, running git submodule update won't update the nested submodules. In that case run this:
git submodule update --init --recursive
To do this from within each submodule in your project, run:
git submodule foreach --recursive 'git submodule update --init --recursive'
mistakes
ssh submodules, github organizations and collaborators
When submodules are added from an organization using SSH links, anyone who cannot authenticate to GitHub over SSH will be unable to clone them, because SSH access always requires authentication, even for public repositories. One fix is to add people as collaborators to the organization, but adding everyone just so they can clone the submodules eventually becomes unwieldy. Here's another way to fix this:
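- If you haven't added your submodule yet, add it using the methods described above, e.g.
git submodule add git@github.com:username/repo.git path/to/submodule
- Now edit .gitmodules to look like this, with an HTTPS url that anyone can clone from and the SSH address as the pushUrl:
[submodule "path/to/submodule"]
path = path/to/submodule
url = https://github.com/username/repo.git
pushUrl = git@github.com:username/repo.git
- Run git submodule sync --recursive to apply the .gitmodules configuration to your local .git/config.
Note that pushUrl is not among the keys listed in the gitmodules documentation, so git may only carry over url. If pushes from the submodule still go over HTTPS, run git remote set-url --push origin git@github.com:username/repo.git inside the submodule.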
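If many submodules need the same treatment, the HTTPS form of each address can be derived from the SSH one. The following is a minimal sketch, assuming scp-style SSH addresses (git@host:owner/repo.git) and a host that serves the same path over HTTPS, as GitHub does. The function name is hypothetical.

```python
import re


def ssh_to_https(url):
    """Convert an scp-style SSH address (git@host:owner/repo.git) to HTTPS."""
    match = re.fullmatch(r"git@([^:/]+):(.+)", url)
    if match is None:
        return url  # not an scp-style SSH address; leave it untouched
    host, path = match.groups()
    return f"https://{host}/{path}"


print(ssh_to_https("git@github.com:username/repo.git"))
```

Addresses that are already HTTPS pass through unchanged, so the function can be applied to every url entry in .gitmodules.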
HEAD detached from 54b9bf8
If you see that you are on a detached HEAD, the submodule is checked out at a specific commit rather than on a branch, so any commits you make there belong to no branch and are easy to lose. If you only have uncommitted changes, you can switch to main with git checkout main. If you have already committed changes, move them onto main like this:
git switch -c temp-work
git switch main
git merge temp-work
git branch -d temp-work
To avoid this problem in the future, it helps to know why it usually occurs. When you clone a repository and initialize its submodules, for example with git clone --recursive URL, git checks out every submodule at the exact commit recorded by the parent repository. This gives you reproducible behavior when you clone, but it also leaves each submodule on a detached HEAD. If you know that you want to be on the main branch of every submodule instead, run this after the fact (it assumes each submodule has a branch called main):
git submodule foreach --recursive git checkout main
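Whether a repository is on a branch or detached is visible in its HEAD file: on a branch the file holds a line such as ref: refs/heads/main, while a detached HEAD holds a bare commit hash. The sketch below interprets that content. The function name is hypothetical and the hash is a made-up placeholder. For a submodule, the HEAD file normally lives under the parent repository's .git/modules directory rather than in the submodule's own folder.

```python
def describe_head(head_text):
    """Interpret the contents of a git HEAD file."""
    head_text = head_text.strip()
    prefix = "ref: refs/heads/"
    if head_text.startswith(prefix):
        return ("branch", head_text[len(prefix):])
    # Anything else is treated as a raw commit hash, i.e. a detached HEAD.
    return ("detached", head_text)


print(describe_head("ref: refs/heads/main\n"))
print(describe_head("0123456789abcdef0123456789abcdef01234567\n"))
```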
copying a directory with submodules
In a Git repository, I have a subdirectory (e.g., client/) that contains a mix of regular files and nested submodules. I want to duplicate this entire directory to a new location within the same repository (e.g., single_player/). However, simply copying the directory with cp doesn't register the submodules in the new location: Git doesn't update .gitmodules or .git/modules, and the new submodule paths aren't tracked. The script below copies the regular files and then re-adds each submodule with git submodule add so that Git recognizes it in its new location.
import configparser
import os
import shutil
import subprocess
import sys
from pathlib import Path


def get_repo_root():
    """Return the top-level directory of the current Git repository."""
    result = subprocess.run(['git', 'rev-parse', '--show-toplevel'],
                            stdout=subprocess.PIPE, text=True, check=True)
    return Path(result.stdout.strip()).resolve()


def parse_gitmodules(repo_root):
    """Map each submodule path listed in .gitmodules to its URL."""
    # interpolation=None keeps '%' characters in URLs from being misread
    config = configparser.ConfigParser(interpolation=None)
    gitmodules_path = repo_root / '.gitmodules'
    if not gitmodules_path.exists():
        return {}
    config.read(gitmodules_path)
    submodules = {}
    for section in config.sections():
        if not section.startswith("submodule"):
            continue
        path = config[section].get("path")
        url = config[section].get("url")
        if path and url:
            submodules[path] = url
    return submodules


def copy_directory_with_submodules(src, dst):
    repo_root = get_repo_root()
    submodules = parse_gitmodules(repo_root)
    src = Path(src).resolve()
    dst = Path(dst).resolve()
    submodules_to_add = []
    for root, dirs, files in os.walk(src):
        rel_root = Path(root).relative_to(src)
        dst_root = dst / rel_root
        # Check if any submodules exist at this level
        to_remove = []
        for d in dirs:
            full_path = (Path(root) / d).resolve()
            rel_path = full_path.relative_to(repo_root).as_posix()
            if rel_path in submodules:
                dst_submodule_path = (dst / rel_root / d).relative_to(repo_root)
                submodules_to_add.append((dst_submodule_path.as_posix(), submodules[rel_path]))
                to_remove.append(d)
        # Prevent os.walk from descending into submodules
        for d in to_remove:
            dirs.remove(d)
        # Copy regular files, creating the destination directory first
        os.makedirs(dst_root, exist_ok=True)
        for f in files:
            src_file = Path(root) / f
            dst_file = dst_root / f
            shutil.copy2(src_file, dst_file)
    # Re-add submodules; paths are relative to the repo root, so run git there
    for rel_path, url in submodules_to_add:
        print(f"Adding submodule: {url} -> {rel_path}")
        subprocess.run(['git', 'submodule', 'add', url, rel_path], cwd=repo_root, check=True)
    print("✅ Done.")


if __name__ == "__main__":
    if len(sys.argv) != 3:
        print("Usage: python copy_directory_with_submodules.py <src_dir> <dst_dir>")
        sys.exit(1)
    copy_directory_with_submodules(sys.argv[1], sys.argv[2])