Work in progress
Documenting the process of creating a Python monorepo1 with tooling that supports an enjoyable developer experience.
Most applications consist of more than a single deployable component. For example, a public API for a web app, a private API for internal consumption and job processors just to name a few.
These components are often closely related within a domain2. One option is to define this domain once and distribute it to the components as a package (e.g. NPM3, PyPI4, etc.). Any changes to the package will need to be redistributed to the components as a new version, requiring an update to those components. If the package changes often, the developer experience of making a change and then updating the components quickly becomes toilsome.
Alternatively, we could keep all the deployable components and the domain definition together in a single repository. This reduces the friction of sharing the domain with the components. Support for monorepos differs between programming languages; this post focuses on how to achieve this with Python.
The repository will be structured around Clean Architecture5. This approach defines a core consisting of the domain models, interfaces and business logic handlers. The core interfaces are then implemented in an infrastructure layer, where specific technologies and dependencies are chosen. Finally, the deployable components (e.g. API) depend on the infrastructure and core layers to provide the functionality for the presentation layer.
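As a rough sketch of how these layers relate, the snippet below puts all three in one file. Every name in it (`Address`, `AddressRepository`, `SaveAddressHandler`, `InMemoryAddressRepository`) is hypothetical and chosen for illustration; it is not the repository's actual code, and a real infrastructure layer would likely use a database rather than a dictionary.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Protocol


# --- core: domain model, interface and business logic handler ---
@dataclass(frozen=True)
class Address:
    id: str
    street: str
    city: str


class AddressRepository(Protocol):
    """Interface owned by the core; it names no specific technology."""

    def save(self, address: Address) -> None: ...

    def get(self, address_id: str) -> Optional[Address]: ...


class SaveAddressHandler:
    """Business logic that depends only on the core interface."""

    def __init__(self, repository: AddressRepository) -> None:
        self._repository = repository

    def handle(self, address: Address) -> None:
        self._repository.save(address)


# --- infrastructure: implements the core interface with a chosen technology ---
class InMemoryAddressRepository:
    def __init__(self) -> None:
        self._items: Dict[str, Address] = {}

    def save(self, address: Address) -> None:
        self._items[address.id] = address

    def get(self, address_id: str) -> Optional[Address]:
        return self._items.get(address_id)


# --- presentation: a deployable component wires the layers together ---
def main() -> None:
    repository = InMemoryAddressRepository()
    handler = SaveAddressHandler(repository)
    handler.handle(Address(id="1", street="1 Example St", city="Exampleville"))
    print(repository.get("1"))


if __name__ == "__main__":
    main()
```

The handler never imports the infrastructure class, so swapping the in-memory store for another implementation only touches the wiring in the presentation layer.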
The example domain will be an address book where we can save addresses. The structure will be as follows:
The monorepo will support the following features:
- Simple and predictable dependency management
- Easy to add, remove and document packages
- Consistent package versions installed
- Debugging should be easy
- Enforce consistent best practices
- Sorting imports
- Automated fixes
- Enforce consistent style for readability
All the above should support both the command line and an integrated development environment (IDE), Visual Studio Code6 in this case. Command line support is important to enable scripting for use in CI/CD pipelines, as well as to provide an escape hatch that prevents IDE lock-in.
Simple and predictable dependency management
A simple approach to Python package management would be to use pip7. This is fine if you work on a single Python repository on your machine. Dependency conflicts can arise once you start working on multiple Python repositories as dependencies are installed into the global scope. To isolate the dependencies of each repository, we can use virtual environments (venv)8 - which requires some ceremony to ensure we are installing and running Python within the venv.
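Part of that ceremony is knowing which interpreter is actually running. A minimal sketch of one way to check, relying on the standard library behaviour that `sys.prefix` and `sys.base_prefix` differ inside a venv (the function name `in_virtual_environment` is made up for this example):

```python
import sys


def in_virtual_environment() -> bool:
    # Inside a venv, sys.prefix points at the venv directory while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("venv active:", in_virtual_environment())
    print("interpreter:", sys.executable)
```

Running this with the global interpreter and again with the venv's interpreter shows which scope a package install would land in.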
Poetry9 is a dependency management tool for Python that abstracts away the above complexity, providing a consistent interface for installing and running Python code.
A new project can be created using:
poetry new [project-name]
The example has two projects
Each has a pyproject.toml that uses Poetry as the build-system and defines the dependencies used.
[tool.poetry]
name = "libs"
version = "0.1.0"
description = ""
authors = ["tsukiy0 <[email protected]>"]

[tool.poetry.dependencies]
python = "^3.9"

[tool.poetry.dev-dependencies]
black = "^21.12b0"
Faker = "^11.3.0"
flake8 = "^4.0.1"
isort = "^5.10.1"
mypy = "^0.931"
pytest = "^6.2.5"
pytest-asyncio = "^0.17.0"

[build-system]
requires = ["poetry-core>=1.0.0"]
build-backend = "poetry.core.masonry.api"
By default, Poetry does not install packages in a local venv folder in the project. This can be an issue for IDEs that resolve tools from a local venv, resulting in broken tool integrations like linting, formatting and testing. To fix this, we can configure Poetry to install to a local venv10 in a poetry.toml file:
[virtualenvs]
in-project = true
poetry add [dependency-name]
poetry remove [dependency-name]
poetry env remove .venv
poetry install
poetry run pytest
poetry run flake8 ./[project-name] ./tests
poetry run isort --profile black .
poetry run mypy .
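For CI/CD pipelines, the commands above could be chained by a small script that stops at the first failure. This is a sketch under assumptions rather than part of the example repository: `checks_for` and `run_checks` are hypothetical helpers, and the package directory is assumed to share the project's name, as in the `./[project-name]` placeholder of the flake8 command.

```python
import subprocess
from pathlib import Path
from typing import Callable, List, Sequence

Runner = Callable[[Sequence[str], Path], int]


def checks_for(package_name: str) -> List[List[str]]:
    """The commands from this section, run through Poetry so they use the project's venv."""
    return [
        ["poetry", "run", "pytest"],
        ["poetry", "run", "flake8", f"./{package_name}", "./tests"],
        ["poetry", "run", "isort", "--profile", "black", "."],
        ["poetry", "run", "mypy", "."],
    ]


def run_subprocess(command: Sequence[str], cwd: Path) -> int:
    # Run one command inside the project directory and return its exit code.
    return subprocess.run(list(command), cwd=cwd).returncode


def run_checks(
    project_dir: Path, package_name: str, runner: Runner = run_subprocess
) -> bool:
    """Run every check in order; stop and report at the first non-zero exit code."""
    for command in checks_for(package_name):
        if runner(command, project_dir) != 0:
            print("failed:", " ".join(command))
            return False
    return True
```

Calling `run_checks(Path("libs"), "libs")` from the repository root would run the checks for a project named `libs`, assuming it lives in a `libs` directory. The `runner` parameter exists so the sequencing can be tested without invoking Poetry.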