In my previous two posts, I’ve gone over what package types python has, and how the package building works, especially with the introduction of the PEP-517/518. Although the changes were primarily to make things more robust, we did run into a few issues while implementing it and releasing it. This post will go over a few, hopefully serving as lessons learned for all of us and presenting some interesting problems to solve in the future.
Looking at the changes of PEP-517 and PEP-518, one can identify those build backends (aka setuptools, flit) had very little to do, only also to expose their functionality via a Python module. Most heavy work is on the build frontend, which now needs to generate the isolated Python and then invoke the build backends in a new way. When we’re talking about build frontends nowadays, our options are mostly pip or poetry (and tox for developers).
These projects are maintained by the community, by a handful of active developers, in their free time. They are not getting paid for it, and they need to be careful to consider the myriad ways these tools are used. Considering this, it’s not much of a surprise it took almost two years after the PEP acceptance to come out with a first implementation. Planning, testing, and implementation have been going on for over a year in the background.
Against all the preparations, though, inevitably, the first release did break a few packages, mostly where people performed some operations that caught the maintainers by surprise. Let’s try to understand a few of these examples and how did they get addressed.
PEP-518 Link to heading
The PEP introduces the TOML file format. A format specially created to be easy to
read/write configurations. While packaging configuration is exposed under the
build-system section, other tools are
free to put their configuration under the
tool:name section if they own the PyPi namespace for the name. Various tools
started to take advantage of this right away (such as towncrier,
When pip 18.0 (released 2018 July 22) added support for PEP-518 packages
pyproject.toml initially broke, as the PEP-518 mandated that all packages having the
must specify the
build-backend section. But packages beforehand used it only as a configuration file for these
other projects since they didn’t specify it pre-emptively; when pip ran into these files, it just raised errors
complaining of invalid
PEP-517 Link to heading
The pip wheel cache issue. Link to heading
The way pip installs in a PEP-517 world is first to generate a wheel and then extract that. To be in PEP-517 world one
must specify the
build-backend key. Otherwise, all frontends per specification need to fallback to using the
When pip builds wheels, it does it by default via a caching system. This is a speed-up mechanism so that if multiple virtual environments need the same wheel, we don’t keep rebuilding it but instead re-use it. The PEP-517 wheel build operation also takes advantage of this.
This becomes troublesome, though, when you disable the cache. Now there’s no target folder where to build the wheel. So the build itself fails see the attached issue. The problem manifested early, though, but en masse, as most CI systems run with this option turned on. Just a day later pip 19.0.1 fixed this.
pyproject.toml not being included into setuptools
Link to heading
It turns out there’s more work on the build backend than just expose their API as described in PEP-517. The backend also
needs to ensure that
pyproject.toml is attached to the built source package. Otherwise, the build backend on the user
machine will not be able to use it. setuptools 1650 will fix this for
setuptools. One can include
pyproject.toml by specifying it inside
MANIFEST.in on earlier versions.
Importing the built package from within
Link to heading
Another unexpected issue was when a package was importing from within
setup.py. The version of the package by
convention is exposed both as metadata for the package (in case of setuptools inside the
setup.py as the
argument to the
setup function), but also under the
__version__ variable at the root of the package. One could
specify the content of the variable in both places, but then it becomes troublesome to keep it in sync.
As a workaround, many packages started putting it inside a
version.py at the root of the package, and then import it
from mypy.version import __version__ as version from both the
setup.py and the package root. This worked because
when someone calls a python script, the current working directory is automatically attached to the
sys.path (so you
can import stuff exposed underneath it).
This behavior of adding the current working directory though was never mandated. It was more of a side-effect as calling
the build via
python setup.py sdist. As this behavior is a side-effect (not a guarantee) all projects that import from
setup.py should explicitly add the scripts folder to the
sys.path, at the start of the build.
It’s up for debate if importing the built package during the packaging (when it’s not yet built/distributed) is a good
idea or not (though the Python Packaging group is leaning towards it’s not). Nevertheless, the fact of the matter is
setuptools exposed its interface via the
setuptools.build_meta, it chooses not to add the current working
directory to the system path.
The PEP never mandates for the frontend to do this addition, as most build backend (declarative by nature) will never
need this. Such functionality was deemed to be under the responsibility of the front end.
think if users want this functionality, they should be explicit in their
setup.py and prepend the respective path to
To simplify the pip code base pip decided to opt in into PEP-517 all people having a
pyproject.toml into the
setuptools backend. Now with this issue even packages that haven’t opted in to PEP-517 started to break. To fix this,
setuptools added a new build backend (
setuptools.build_meta:__legacy__) designed to be used by frontends as a
default when the build backend is left unspecified; when projects add the
build-backend key, they will have to also
setup.py to either add the source root to their
sys.path or avoid importing from the source root.
self bootstrapping backends Link to heading
Another interesting problem was raised that has a much tighter user base but exposes an interesting problem. In case we
don’t want to use wheels, and we provide only via source distribution, how should we resolve the problem of how we
provide the build backend’s build backend? For example,
setuptools packages itself via
setuptools. So were
setuptools specify this via PEP-517, the build frontend would be put inside an infinite loop.
To install the library
pugs it would first try to create an isolated environment. This environment needs
so the build frontend will need to build a wheel to satisfy it. The wheel build would itself trigger the creation of an
isolated environment, which has build dependency again
How to break this loop? Mandate all build backends must be exposed as wheels? Allow backends that can build themselves? Should these self-build backends allow to take on dependencies? There’s a long discussion with various options, pros and cons, so if you’re interested, make sure to head over the python Discourse board and give your opinion.
Conclusion Link to heading
Packaging is hard. Improving a packaging system without any breakage where users can write and run arbitrary code during the packaging in their free time is probably impossible. With PEP-518, though now build dependencies are explicit and build environments easy to create. With PEP-517, we can start moving into a more declarative packaging namespace that allows less space for users to make mistakes and provide better messages when things inevitably go wrong. Granted, as we go through these changes, some packages might break, and we might disallow some practices that worked until now. We (PyPa maintainers) don’t do it in bad faith, so when errors do pop up please do fill in a detailed error report with what went wrong, how you tried to use it, and what is your use case.
We’re trying to improve the packaging ecosystem genuinely. As such, we’ve created the integration-test repository, as an effort to ensure that in the future, we can catch at least some of these edge cases before they land on your machine. If you have any suggestion or requirements for any part of the packaging feel free to start a discussion on the Discuss Python forums packaging section, or open an issue for the relevant tool at hand.
That’s all for now. Thanks for reading through it all! I would like to thank Paul Ganssle for reviewing the packaging series post, and Tech At Bloomberg for allowing me to do open source contributions during my working hours.