0020 Sources for Python Packaging

Summary

By default, do not use PyPI as the source for Python packages; use the platform only if there is no alternative.

Motivation

Historically, Arch Linux has relied upon source distribution (sdist) tarballs hosted on PyPI for its Python packages.

However, over the years, using the platform has become increasingly burdensome for packagers:

  • stable, predictable download links are no longer publicly advertised (instead, the web UI advertises download links that contain hashes)
  • OpenPGP signatures for sources were relegated to obscurity because they were not shown in the download section
  • eventually, existing OpenPGP signatures became unavailable altogether, since they were deprecated in May 2023 (https://blog.pypi.org/posts/2023-05-23-removing-pgp/)

Moreover, sdist tarballs have several issues that do not occur with upstream-provided sources:

  • some upstreams ship sdist tarballs that cannot be used for packaging (e.g. missing license file, missing tests). These problems often stem from the multitude of tools that must be configured to create sdist tarballs (e.g. MANIFEST.in files). Although such issues are easy to fix, in practice a fix depends on upstream's availability and willingness.
  • the contents of sdist tarballs are the product of an arbitrary process that is governed by rulesets which may differ among tools, and that runs on arbitrary machines (or in a pipeline).
  • some files in sdist tarballs, such as setup.py and setup.cfg, are formatted unconditionally by tooling (e.g. setuptools#3672). This makes it very cumbersome to patch these files using upstream-available patches, duplicating work for packagers.
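Missing files such as licenses or tests can be spotted by comparing the file list of an sdist tarball with that of an upstream tarball. The following is a minimal sketch and not an existing Arch Linux tool: the function names `relative_files` and `missing_from_sdist` are hypothetical, and the code assumes that both archives unpack into a single top-level directory (whose name may differ between the two).

```python
import tarfile
from pathlib import PurePosixPath


def relative_files(archive: tarfile.TarFile) -> set[str]:
    """Return the regular files of an archive, without the top-level directory."""
    files = set()
    for member in archive.getmembers():
        parts = PurePosixPath(member.name).parts
        # Skip directories, links, and anything not below the top-level directory.
        if member.isfile() and len(parts) > 1:
            files.add("/".join(parts[1:]))
    return files


def missing_from_sdist(upstream: tarfile.TarFile, sdist: tarfile.TarFile) -> list[str]:
    """List files that exist in the upstream archive but not in the sdist."""
    return sorted(relative_files(upstream) - relative_files(sdist))
```

Both arguments would be opened with `tarfile.open()`. Files that legitimately exist only in the sdist (such as PKG-INFO) are ignored, because only the upstream-minus-sdist difference is reported.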

To ease the workload on packagers, who otherwise have to get functional sdist tarballs marshalled through PyPI, we should rely on upstream sources by default and actively discourage the use of sdist tarballs.

Specification

By default, packagers are advised to rely upon the sources of the respective upstream projects directly. The use of sdist tarballs hosted on PyPI is strongly discouraged.

Upstream sources may be, for example:

  • auto-generated or special-purpose tarballs and optional signature files
  • specifically pinned commits (e.g. those of tags), which may also be signed

Exceptions to the above should only be considered in rare cases, after contacting upstream about the issues and finding no resolution:

  • upstream does not provide source tarballs or commits
  • sdist tarballs contain specifically crafted files that cannot be obtained through upstream sources
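To illustrate how the default could be checked mechanically, the sketch below flags source entries that point at PyPI. It is an assumption-laden example rather than part of this specification: the helper name `is_pypi_source` and the `PYPI_HOSTS` list are hypothetical and would need adjusting, and the code assumes PKGBUILD-style source entries, which may carry a `filename::` rename prefix or a VCS prefix such as `git+`.

```python
from urllib.parse import urlparse

# Assumed set of hostnames that serve PyPI-hosted files; extend as needed.
PYPI_HOSTS = {"pypi.org", "pypi.io", "files.pythonhosted.org"}

# VCS prefixes that may precede the URL in a source entry.
VCS_PREFIXES = ("git+", "hg+", "svn+", "bzr+")


def is_pypi_source(source: str) -> bool:
    """Return True if a PKGBUILD-style source entry points at PyPI."""
    # Drop an optional "filename::" rename prefix.
    url = source.split("::", 1)[-1]
    # Drop a VCS prefix so that urlparse sees a plain URL.
    for prefix in VCS_PREFIXES:
        if url.startswith(prefix):
            url = url[len(prefix):]
            break
    # Local files (e.g. patches) have no hostname and are never flagged.
    host = urlparse(url).hostname or ""
    return host in PYPI_HOSTS
```

With these assumptions, an entry such as `https://files.pythonhosted.org/packages/source/f/foo/foo-1.0.tar.gz` is flagged, whereas `git+https://github.com/example/foo.git#tag=v1.0` (a made-up upstream repository) is not. A flagged entry is only a prompt for review, since the exceptions listed above may still apply.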

Drawbacks

Some upstreams are unresponsive, have no freely accessible or usable sources, or are outright unmaintained. In those cases it may be very hard to obtain workable upstream-based sources in a timely manner.

Unresolved Questions

Alternatives Considered

We could continue using sdist tarballs from PyPI, which come without signature files. However, this would not give packagers a clear policy, and it increases their workload whenever the sdist tarballs are broken.