Fixing Persistence Issues with Git Inside Docker Containers
Published:
TL;DR: Cloned Git directories inside Docker containers can persist across builds or runs, causing branching and committing issues. Explicitly removing the directory before cloning prevents these errors.
Background
Recently, I have been working again on the MITE (Minimum Information about a Tailoring Enzyme) database webpage. This database stores experimentally verified substrate- and reaction-specificities of natural product-acting tailoring enzymes and is usable as a knowledge base, a reference database, or for machine-learning applications.
All data in MITE is sourced from the scientific literature, hand-curated, and peer-reviewed. This time-consuming process is only possible thanks to the community-driven nature of the database. So far (mite_data v1.18), 50 scientists from 13 countries have contributed to the MITE database, which was recently published in Nucleic Acids Research.
To invite community participation, we want to allow researchers to perform data submission directly via the MITE web application. To facilitate maintenance and peer review, these entries are committed to the mite_data GitHub repository. You can read about the exact procedure in our publication.
Initially, this system worked well, but after a few webpage updates, branching started to fail, leading to 500 internal server errors upon submission, which were also reproducible locally. What was going on here?
Workflow overview
Before diving into the issue, here is a brief technical overview of the workflow. The MITE webpage runs a Docker Compose stack consisting of containers for the Flask application, a PostgreSQL database, and an Nginx server.
The data submission procedure looks like this:
- Users submit data through the web form.
- The data is saved as a MITE data schema-formatted JSON file on disk.
- The file is copied into a local mite_data directory, which is cloned (using git) into the Flask container on startup.
- Flask uses the Python subprocess module to create a feature branch for the submission, commit and push the changes, open a pull request on GitHub, check out the main branch, and delete the local feature branch, so that the clone is ready for the next submission.
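To make the last step concrete, here is a minimal shell sketch of the Git commands such a submission cycle boils down to. It is not the actual mite_web code, which drives Git from Python via subprocess and may differ in detail. The function names run and submit_entry, the remote name origin, the branch name main, and the pull request title and body are all assumptions made for illustration.

```shell
# Hypothetical sketch of the per-submission Git steps (POSIX sh).
# Set DRY_RUN=1 to print the commands instead of executing them.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# submit_entry REPO_DIR BRANCH FILE
# FILE is a path relative to REPO_DIR; "origin" and "main" are assumed names.
submit_entry() (
    repo_dir="$1"; branch="$2"; file="$3"
    cd "$repo_dir" || exit 1
    run git checkout -b "$branch" &&              # create the feature branch
    run git add "$file" &&
    run git commit -m "Add submission $file" &&
    run git push --set-upstream origin "$branch" &&
    run gh pr create --base main --head "$branch" \
        --title "New submission: $file" --body "Submitted via the web form." &&
    run git checkout main &&                      # return to the main branch
    run git branch -D "$branch"                   # drop the local feature branch
)
```

Every step assumes that the clone is in a known state: the local main branch is current and no feature branch is left over from a previous run. If that assumption is violated, for example by a stale clone, the chain stops at the first failing command.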
This error occurred in mite_web v1.6.3 and was fixed in v1.6.4.
The Problem
The logs produced by the issue were vague, but opening an interactive shell in the Docker container and inspecting the cloned mite_data directory provided a hint: files from recent releases were missing. Running git fetch revealed that the local main branch was over 100 commits behind the remote. Apparently, rebuilding the container and cloning mite_data had left the clone at an outdated tip, which could be fast-forwarded. Even more puzzling: on container startup, git failed to clone the mite_data repo because the directory already existed. This persisted despite attempts to fully tear down the containers:
docker-compose down -v --rmi all
docker-compose build --no-cache
docker-compose up --build --force-recreate
There were no mounted volumes, and the mite_data directory was not part of the mite_web repository, meaning that it could not have been accidentally copied into the container upon building. A build with --no-cache should rebuild the container from scratch, including the mite_data directory. My current guess is that the folder was somehow, somewhere cached on disk and pulled into the container during rebuilds.
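The diagnosis described above (fetch, then compare the local branch with the remote) can be wrapped in a small shell helper. This is only a sketch: the function name commits_behind is hypothetical, and it assumes the remote is called origin and the default branch main.

```shell
# Hypothetical helper: print how many commits a local branch lags behind
# its remote counterpart. Assumes a remote named "origin".
commits_behind() (
    repo_dir="$1"
    branch="${2:-main}"
    # Refresh the remote-tracking ref so the comparison uses current data.
    git -C "$repo_dir" fetch --quiet origin "$branch" || exit 1
    # Count commits reachable from origin/<branch> but not from the local branch.
    git -C "$repo_dir" rev-list --count "$branch..origin/$branch"
)
```

Inside the container, a call such as commits_behind /mite_web/mite_web/mite_data main prints 0 for an up-to-date clone. Any larger number means the clone is stale.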
The Fix
While it is still unclear to me why the mite_data directory persisted in the container, the workaround was relatively simple: remove the mite_data directory on startup, and clone a fresh copy (possibly inspired by XKCD).
MITE_DATA_DIR="/mite_web/mite_web/mite_data"
# Remove any leftover clone so that the clone below always starts fresh.
if [ -d "$MITE_DATA_DIR" ]; then
    rm -rf "$MITE_DATA_DIR"
fi
gh repo clone https://github.com/mite-standard/mite_data.git "$MITE_DATA_DIR"
Conclusion
Regardless of the cause of this behavior, my suggestion to anybody running into similar issues is: always check the status of the Git directory inside the container to catch stale clones or branches before they cause runtime errors.
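One way to automate such a check is a small guard that runs on startup and fails loudly if the clone is not pristine. The sketch below is an illustration, not part of mite_web: the function name check_clone_clean is hypothetical, and it assumes that a healthy clone has a clean working tree and no local branches other than main.

```shell
# Hypothetical startup check: succeed only if the clone has no uncommitted
# changes and no local branches besides the main one.
check_clone_clean() (
    repo_dir="$1"
    main_branch="${2:-main}"
    # Uncommitted or untracked files (empty output means a clean tree).
    dirty=$(git -C "$repo_dir" status --porcelain) || exit 2
    # Local branch names, one per line, excluding the main branch.
    extra=$(git -C "$repo_dir" for-each-ref --format='%(refname:short)' refs/heads/ \
        | grep -v -x -F -- "$main_branch")
    if [ -n "$dirty" ] || [ -n "$extra" ]; then
        echo "stale clone in $repo_dir" >&2
        if [ -n "$extra" ]; then
            echo "leftover branches: $extra" >&2
        fi
        exit 1
    fi
)
```

The function returns 0 for a clean clone, 1 if it finds uncommitted changes or leftover branches, and 2 if git status itself fails (for instance because the directory is not a Git repository), so a startup script can abort or re-clone accordingly.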