buildbtw build service#

Summary#

This RFC proposes a first vertical slice of a build service to streamline rebuilding packages with many reverse dependencies. It tackles problems such as repetitive manual work, blocked staging repositories, missed reverse dependencies and a lack of shared context, such as build logs. Goals include automating rebuilds and build ordering, enabling unattended builds, and providing isolated repos for parallel work. Builds run in secure virtual machines via a GitLab executor, organized into namespaces with histories and automatic dependency graphs. Interaction happens through a CLI (pkgctl) and a web UI with SSO.

Terms#

Reverse dependencies, or dependents, of a given package are any other packages that directly or indirectly depend on that package.

A build namespace is a group of specific package versions that need to be built. For more details, refer to the Specification section.

A dependency cycle is any non-empty path through a dependency graph where the first and last node are equal. A cyclic dependency graph is a graph that contains at least one cycle.

A pacman staging repository contains packages which are not intended for general public consumption. Packages from staging repositories are installed as dependencies when building new versions of other packages.

Motivation#

Changes in packages with lots of reverse dependencies can result in huge todo lists where lots of packages need to be rebuilt. Much of this work is trivial, but manual and repetitive.

Builds run on machines of package maintainers, or on a build server. In the second case, the tooling currently requires a persistent SSH connection while the build is running. As a consequence, package maintainers need to keep their machines running during builds.

Since all builds that go through [staging] or [testing] use the same shared pacman repositories for installing dependencies, large overlapping rebuilds can block other releases of packages while they are in progress.

It is possible to miss reverse dependencies when creating todo lists for rebuilds. The rebuild order is determined manually, may be non-obvious and can be difficult to follow.

Collaborating on changes in package sources happens via GitLab, but build logs are not shared between collaborators by default, which makes investigation of output, warnings or errors cumbersome. Furthermore, each build needs to be dispatched manually.

The goal of buildbtw is to alleviate these problems by:

  • Creating a configurable and self-documenting process for determining which reverse dependencies to rebuild when a package changes
  • Automatically determining build order for any set of builds
  • Creating infrastructure for unattended builds, allowing package maintainers to work asynchronously
  • Allowing collaboration via package source diffs, build logs and discussions linked to build context
  • Providing an isolated pacman repository for each rebuild to allow working on builds in parallel
  • Increasing confidence by building reverse dependencies that don’t need to be released in order to detect unexpected build and check failures

Scope#

To focus the discussion, this RFC covers only a subset of the functionality we envision and represents a vertical slice. Excluded topics are: GitOps, unattended signing & releasing, auto-resolving of complex dependency cycles, and explicit support for other architectures. These topics will be covered in future RFCs and are not planned for implementation in the near future.

Usually, RFCs should cover goals rather than the planned implementation. However, in this RFC we cover specific workflows and technical requirements because they will impact package maintenance. Some of these are technically implementation details, but we think it is important to discuss how they influence what the system can do and how users will interact with it.

Current State#

After extensive interviews with package maintainers we established a comprehensive set of user stories describing planned features, which led to the goals outlined above.

We’ve concluded the proof-of-concept phase, in which we built an experimental version of the concepts covered in the specification below. Our impression is that the workflows and technology will work well to fulfill the goals outlined in the motivation and collected from our user stories.

For ease of reading, the RFC uses present tense to describe the planned end result we want to build. Apart from basic scaffolding, the specification has not been implemented yet.

Specification#

Running Builds#

The system allows for building untrusted package sources without compromising security. Besides providing defense in depth, this will allow automatic CI builds for merge requests opened by non-staff contributors in the future. To accomplish this, builds run inside a vmexec virtual machine, which uses QEMU under the hood. To further increase confidence and reduce the attack surface, build nodes can be labeled. Labels allow us to physically separate nodes that build officially releasable packages from nodes that build untrusted merge requests to check whether builds succeed.

buildbtw uses a GitLab runner custom executor (we’ll call this “GitLab executor” for short). This provides a web interface for following build logs in real time, and can distribute workload across multiple build servers. When builds run for merge requests, this integrates with GitLab’s review UI as well.

Build Namespaces#

A “build namespace” is the basic unit of packaging work in buildbtw. It references an arbitrary number of “changesets”, which are git branches of the same name across all package source repositories.

For example, a namespace for upgrading git to a new version might contain a changeset referencing the branch git-2.51.1 in the git repository. If the upgrade breaks another package, e.g. git-crypt, the git-2.51.1 branch in the git-crypt repository might be added as a second changeset to the same namespace to incorporate a fix.

By default, a namespace builds all its changesets, as well as all reverse dependencies of its changesets. Not all results of these builds need to be released; they are run to detect unexpected build and check failures introduced by changes in the namespace.

In the example git namespace from above, this means the namespace would build dependents like forgejo, git-branchless, etc.

Each build attempt in a namespace is called an “iteration” containing a fixed, ordered set of packages to build. Each build inside an iteration references a fixed git commit. This provides an immutable history of build attempts over time, with pinned build instructions for each point in time.
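The relationship between namespaces, changesets and iterations can be pictured with a small data model. This is only a sketch: all class and field names are hypothetical and not taken from the buildbtw code base, the commit hashes and the user name are placeholders, and a changeset is modelled here as a (repository, branch) pair.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class PinnedBuild:
    """One build inside an iteration, pinned to a fixed git commit."""
    pkgbase: str
    commit: str


@dataclass(frozen=True)
class Iteration:
    """An immutable build attempt: a fixed, ordered set of pinned builds."""
    builds: tuple[PinnedBuild, ...]


@dataclass
class Namespace:
    """A unit of packaging work (hypothetical model)."""
    name: str
    created_by: str
    # Each changeset is a (repository, branch) pair.
    changesets: set[tuple[str, str]] = field(default_factory=set)
    iterations: list[Iteration] = field(default_factory=list)

    def new_iteration(self, builds):
        """Append a new iteration; earlier iterations are never modified."""
        iteration = Iteration(tuple(builds))
        self.iterations.append(iteration)
        return iteration


# Placeholder data for illustration only.
ns = Namespace(name="git-2.51.1", created_by="alice")
ns.changesets.add(("git", "git-2.51.1"))
first = ns.new_iteration(
    [PinnedBuild("git", "aaaa111"), PinnedBuild("git-crypt", "bbbb222")]
)
```

Freezing `Iteration` and `PinnedBuild` mirrors the requirement that the history of build attempts is immutable: a fix results in a new iteration rather than a modified old one.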

New iterations are automatically created in the following cases:

  • Any of the packages involved get a new commit containing changed build instructions. This includes new commits for reverse dependencies.
  • New reverse dependencies which weren’t built before are added.
  • Changesets are added to or removed from the namespace.

For the example git namespace from above, an iteration would build git, git-crypt, and all their dependents. If a build fails and a package maintainer pushes a new commit to fix the failure, a new iteration is created.

When any of the changesets in a namespace get new commits, a new iteration is created to check if the new sources build successfully. New iterations can be created manually to retry failed builds.
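The conditions for automatically creating a new iteration can be sketched as a comparison between the previous iteration and the current state. The function name and data shapes below are assumptions made for illustration. As a simplification, any differing commit counts as changed build instructions, and any package that was not built before is reported as a new reverse dependency.

```python
def new_iteration_reasons(prev_commits, cur_commits, prev_changesets, cur_changesets):
    """Return the reasons for creating a new iteration (empty list if none).

    prev_commits / cur_commits map package name -> pinned git commit.
    prev_changesets / cur_changesets are sets of (repository, branch) pairs.
    """
    reasons = []

    # Packages built before whose pinned commit has changed.
    changed = sorted(
        pkg for pkg, commit in prev_commits.items()
        if pkg in cur_commits and cur_commits[pkg] != commit
    )
    if changed:
        reasons.append("new commits: " + ", ".join(changed))

    # Packages that were not part of the previous iteration.
    added = sorted(set(cur_commits) - set(prev_commits))
    if added:
        reasons.append("new reverse dependencies: " + ", ".join(added))

    if cur_changesets != prev_changesets:
        reasons.append("changesets added or removed")

    return reasons


# Placeholder commits for illustration only.
changesets = {("git", "git-2.51.1")}
reasons = new_iteration_reasons(
    {"git": "a1", "git-crypt": "b1"},
    {"git": "a2", "git-crypt": "b1", "forgejo": "c1"},
    changesets,
    changesets,
)
```

Manually triggered retries are not covered by this sketch, since they do not depend on a change in state.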

Each iteration builds against an isolated pacman repository into which build artifacts are uploaded. This means that parallel rebuilds will be independent of each other, and may be incompatible with each other. When uploading releases from namespaces to official repositories, package maintainers need to make sure the packages were built against the latest versions available in those repositories. To aid with this, buildbtw will show warnings for namespaces built against outdated dependencies, and for overlapping packages with other ongoing namespaces.
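The two kinds of warnings mentioned above could be derived roughly as follows. This is a minimal sketch under stated assumptions: the function names, the namespace names and all version strings are made up, and versions are compared by simple inequality rather than with pacman's version comparison.

```python
from itertools import combinations


def overlapping_packages(namespaces):
    """Map each pair of namespace names to the packages both of them build.

    `namespaces` maps namespace name -> set of package names.
    """
    overlaps = {}
    for name_a, name_b in combinations(sorted(namespaces), 2):
        shared = set(namespaces[name_a]) & set(namespaces[name_b])
        if shared:
            overlaps[(name_a, name_b)] = shared
    return overlaps


def outdated_dependencies(built_against, repo_versions):
    """Names of dependencies whose repository version differs from the one
    the namespace was built against. A dependency missing from the
    repository is also reported."""
    return sorted(
        name for name, version in built_against.items()
        if repo_versions.get(name) != version
    )


# Illustrative data only.
overlaps = overlapping_packages({
    "git-2.51.1": {"git", "git-crypt", "forgejo"},
    "openssl-rebuild": {"openssl", "git"},
    "unrelated": {"bar"},
})
outdated = outdated_dependencies(
    {"openssl": "3.5.0-1", "zlib": "1.3.1-2"},
    {"openssl": "3.5.1-1", "zlib": "1.3.1-2"},
)
```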

Namespaces record which user created them, but are open to modifications by all package maintainers.

Automatic Build Graphs#

When building an iteration in a namespace, buildbtw needs to determine two things:

  • Which packages might break due to the changes in this namespace and thus need to be rebuilt,
  • In what order to build the packages.

For this, a dependency graph with check dependencies, make dependencies and runtime dependencies is created. It consists of all packages explicitly listed in the changesets of the namespace, as well as all of their direct and indirect reverse dependencies of any kind.
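As an illustration of how the set of packages in such a graph can be collected, the following sketch inverts the three kinds of dependencies into one "dependents" map and walks it breadth-first from the changeset packages. The function name, the input format and the example dependency data are assumptions, not the actual implementation; the `pkg-*` names are hypothetical packages.

```python
from collections import deque


def build_set(changeset_pkgs, depends, makedepends, checkdepends):
    """Changeset packages plus all direct and indirect reverse
    dependencies of any kind.

    Each of the three maps goes from package name -> set of dependencies.
    """
    # Invert all three dependency kinds into: dependency -> direct dependents.
    dependents = {}
    for kind in (depends, makedepends, checkdepends):
        for pkg, deps in kind.items():
            for dep in deps:
                dependents.setdefault(dep, set()).add(pkg)

    # Breadth-first walk over reverse edges, starting at the changesets.
    result = set(changeset_pkgs)
    queue = deque(changeset_pkgs)
    while queue:
        current = queue.popleft()
        for dependent in dependents.get(current, ()):
            if dependent not in result:
                result.add(dependent)
                queue.append(dependent)
    return result


# Made-up dependency data for illustration.
packages = build_set(
    {"git"},
    depends={"git-crypt": {"git"}, "forgejo": {"git"}},
    makedepends={"pkg-a": {"git-crypt"}},
    checkdepends={"pkg-b": {"pkg-a"}, "pkg-c": {"zlib"}},
)
```

Here `pkg-b` ends up in the build set only through a chain of make and check dependencies, while `pkg-c` is left out because nothing connects it to `git`.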

Dependency graphs without cycles are built in topological order. For cyclic dependency graphs, there are tentative concepts for automatically breaking up the cycles, which will be discussed in a future RFC. Until then, buildbtw will not support building cyclic graphs.
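A topological build order, including the rejection of cyclic graphs, can be sketched with `graphlib` from the Python standard library (available since Python 3.9). The function name and the example graph are assumptions. Grouping packages into batches additionally shows which builds could run in parallel.

```python
from graphlib import CycleError, TopologicalSorter


def build_batches(deps):
    """Group packages into batches so that every package only depends on
    packages from earlier batches.

    `deps` maps package name -> set of dependencies. Dependencies outside
    the graph are ignored. Raises CycleError for cyclic graphs.
    """
    sorter = TopologicalSorter(
        {pkg: {d for d in pkg_deps if d in deps} for pkg, pkg_deps in deps.items()}
    )
    sorter.prepare()  # raises CycleError if the graph contains a cycle

    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        batches.append(ready)
        sorter.done(*ready)
    return batches


# Made-up dependency data for illustration.
batches = build_batches({
    "git": {"glibc"},           # glibc is not part of this graph
    "git-crypt": {"git"},
    "forgejo": {"git"},
})

try:
    build_batches({"a": {"b"}, "b": {"a"}})
    cycle_rejected = False
except CycleError:
    cycle_rejected = True
```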

Since these graphs can get very large and take considerable time to build, buildbtw provides options for pruning them, e.g. by excluding certain packages and their dependents. These options will also allow users to manually remove cycles from build graphs until this can be done automatically.
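Pruning by exclusion can be pictured as removing the excluded packages together with everything that transitively depends on them. The sketch below uses a simple fixed-point loop; the function name and the example data are assumptions, and `forgejo-plugin` is a hypothetical package.

```python
def prune(deps, excluded):
    """Drop excluded packages and everything that transitively depends on them.

    `deps` maps package name -> set of dependencies.
    """
    removed = set(excluded)
    changed = True
    while changed:
        changed = False
        for pkg, pkg_deps in deps.items():
            # A package is removed as soon as one of its dependencies is.
            if pkg not in removed and removed & set(pkg_deps):
                removed.add(pkg)
                changed = True
    return {pkg: set(pkg_deps) for pkg, pkg_deps in deps.items() if pkg not in removed}


# Made-up dependency data for illustration.
graph = {
    "git": set(),
    "git-crypt": {"git"},
    "forgejo": {"git"},
    "forgejo-plugin": {"forgejo"},
}
pruned = prune(graph, {"forgejo"})
```

Excluding one member of a dependency cycle removes the whole cycle from the graph in this sketch, because every other member of the cycle transitively depends on the excluded package.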

User Interface#

The primary mode of interaction with buildbtw is pkgctl. For browsing on the go and viewing complex build graphs, a complementary web UI is provided. Both CLI and web UI show the statuses of currently running and past builds.

Web authentication works via single-sign-on provided by Keycloak. The CLI authenticates itself using API tokens managed via the web UI.

Compatibility with existing workflows#

buildbtw aims to be compatible with existing packaging workflows. While buildbtw is designed to eventually replace pkgctl’s current offload flag with a backwards-compatible, improved version, packaging infrastructure should continue to allow building and releasing packages locally.

This means that buildbtw needs to take into account that packages manually released by developers can conflict with builds from buildbtw namespaces. The future solution for this is the release queue, which automatically keeps rebuilding packages slated for release on top of the most recent target repository state until no conflicts remain. Until the release queue is specified in a future RFC and implemented, existing strategies for preventing conflicts will need to remain, and adapt when working with per-namespace pacman repositories.

Without the release queue, releasing packages will work like the current offload functionality: artifacts are downloaded to the package maintainer’s local machine, signed, and uploaded to the dbscripts servers.

Because rebuilds in namespaces are not uploaded directly into official repositories as before, overlapping rebuilds have increased chances of conflicting with each other. Our hope is that automating the build step for large rebuilds will already significantly reduce the time a pacman repository needs to be locked. One way to handle overlapping rebuilds is to provide adequate visibility and warn package maintainers of packages overlapping in namespaces. For two overlapping namespaces A and B, the maintainers can then coordinate to schedule a new iteration for namespace B, triggering a full rebuild, once namespace A has been released to the official repositories. Another solution is to explicitly combine overlapping rebuilds into a single build namespace.

Building for Volunteer Maintenance#

The initial work on buildbtw is sponsored by Valve Corporation. As a critical piece of infrastructure, buildbtw can’t rely on continued sponsorship.

As such, one of the goals of the implementation is to attract volunteer contributors and create a welcoming environment:

  • Up-to-date and extensive documentation
  • Extensive, easy-to-run test suite
  • Straightforward setup for the development environment
  • Frequent outreach inside and outside the community
  • Guidance for getting started as a contributor, with good first issues, mentoring and extensive reviews
  • Keeping the design as simple (KISS) as possible with respect to the complexity of a fully automated build service

Drawbacks#

Building inside virtual machines results in some performance overhead.

Building all transitive dependents by default requires considerable CPU time and memory. We hope to alleviate this by prioritizing builds that are slated for release over builds that only check for failures. Providing configuration options for pruning build graphs or disabling these checks altogether is an escape hatch, should the load prove to be too high, the time needed be too long, or other undesired impacts occur.
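Prioritizing builds slated for release over check-only builds could be implemented with an ordinary priority queue. The following is a sketch only: the class, its methods and the two priority levels are hypothetical and say nothing about how the GitLab executor actually schedules jobs.

```python
import heapq
from itertools import count

# Lower numbers are scheduled first.
RELEASE, CHECK_ONLY = 0, 1


class BuildQueue:
    """Priority queue that serves release builds before check-only builds."""

    def __init__(self):
        self._heap = []
        # The counter is a tie-breaker that keeps FIFO order within a priority.
        self._counter = count()

    def push(self, pkg, slated_for_release):
        priority = RELEASE if slated_for_release else CHECK_ONLY
        heapq.heappush(self._heap, (priority, next(self._counter), pkg))

    def pop(self):
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)


queue = BuildQueue()
queue.push("forgejo", slated_for_release=False)
queue.push("git", slated_for_release=True)
queue.push("git-branchless", slated_for_release=False)
queue.push("git-crypt", slated_for_release=True)

order = [queue.pop() for _ in range(len(queue))]
```

In this example, `git` and `git-crypt` are served before the two check-only builds even though `forgejo` was queued first.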

Unresolved Questions#

These topics are not planned for implementation in the near future, and will be covered in future RFCs. Nonetheless, all of these have been thoroughly considered in the specification of this RFC, and we’ve linked the corresponding GitLab issues or milestones for more information.

Alternatives Considered#

Building our own software involves a significant maintenance burden and commitment for the future. As such, we evaluated several existing solutions and tried to make them meet our requirements.

Open Build Service#

The Open Build Service (OBS) is a mature, actively maintained project which is used, among others, to build packages for openSUSE. It can run builds in KVM virtual machines and use PKGBUILDs to build Arch Linux packages. While OBS can automatically rebuild dependents of a package, there is no way to see a graphical representation of the dependency graph. Dependency cycles are shown as a list of packages, without showing the dependency relations between packages.

OBS supports many packaging ecosystems, but features for some specific distributions are more extensive. For example, support for rpmlint and lintian is built in, while support for makepkg lints or namcap is not. Arch Linux packages may require modification to build on OBS.

Of course, OBS is extensible, and most Arch Linux tooling can be integrated in one way or another, but software focused on non-Arch Linux use cases will still result in friction for Arch Linux package maintainers in day-to-day usage. For the same reason that a build system geared towards the RPM ecosystem makes sense for other distributions, a tailor-made solution will provide a better experience for Arch Linux package maintainers.

Buildbot#

Buildbot is a mature project used, among others, to build packages for Void Linux. It is described as a “framework for automating software build, test, and release processes”. It emphasizes flexibility and extensibility through Python code. The initial stages of buildbtw’s proof-of-concept were built on top of Buildbot. However, we quickly discovered a few properties that didn’t match our vision.

Buildbot is extremely flexible at configuration time: you can run arbitrary Python code to create a complex configuration with a huge number of watched source repositories, dependencies between builds, multiple targeted architectures, and distribution onto multiple build workers. This configuration is created once, when the system starts. Our problem with this model is that we only know which packages we want to build at runtime, when a specific commit is created. The REST API provides no way to create new builds dynamically, and dependencies cannot be declared at runtime at all. Mapping ad-hoc, dynamically created pacman repositories into this model can only be accomplished using a host of workarounds.

As a real-world example, Buildbot’s static configuration model is reflected in the way Void Linux has configured their Buildbot instance. Builds are only run after merge requests are merged, and builds for all packages are run in a single build job, so there is no easy way to see the build logs or status for an individual package. As far as we could determine, Void Linux does not build against per-rebuild staging repositories, but uses global staging repositories.