Building and testing at scale often involves codebases with millions of lines of code, worked on by thousands of developers. In this environment, it is a challenge to ensure that:
- Every change made by every developer is built and tested, through a continuous integration process.
- The work performed by one team is seamlessly integrated with work from all other teams in a given organisation.
- The time that elapses before each developer receives feedback on a change they have made is as short as possible.
Creating a distributed build and test framework
One solution to the problem of building as quickly as possible is to delegate that responsibility: not to the machine that the developer is working on, or even to a server that the CI job has been dispatched to, but to a server farm. These machines pool their resources to perform build tasks faster than a single machine could, and cache their results so that compiler actions do not need to be repeated.
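The caching idea can be illustrated with a minimal sketch. It assumes that an action is identified by hashing its command line together with the content of its inputs, so an identical request is served from the cache instead of being executed again. The names `action_key`, `ActionCache` and `fake_compile` are hypothetical and exist only for this illustration; a real server farm would share the cache over the network rather than keep it in memory.

```python
import hashlib
import json


def action_key(command, input_files):
    """Derive a cache key from the command line and the content of every input."""
    inputs = {
        name: hashlib.sha256(data).hexdigest()
        for name, data in input_files.items()
    }
    payload = json.dumps({"command": command, "inputs": inputs}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class ActionCache:
    """In-memory stand-in for a shared cache service (hypothetical)."""

    def __init__(self):
        self._results = {}
        self.executions = 0

    def run(self, command, input_files, execute):
        """Return the cached result, executing the action only on a miss."""
        key = action_key(command, input_files)
        if key not in self._results:
            self.executions += 1
            self._results[key] = execute(command, input_files)
        return self._results[key]


def fake_compile(command, input_files):
    # Placeholder for a real compiler invocation: returns a fake object file.
    return b"object:" + hashlib.sha256(input_files["main.c"]).digest()
```

With this sketch, two developers who submit the same command over byte-identical sources trigger one execution, while any change to a source file or a compiler flag produces a new key and therefore a new execution.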
Such an idea is not new: tools such as ccache and distcc have existed for over 15 years. However, these tools have a fairly narrow focus, restricted to the dispatch of compilation tasks from clang or gcc to a battery of servers, and the caching of those results. This is an important part of a build, but it is by no means the only consideration. For this approach to be useful to developers on large projects, there needs to be:
- Consistency in environment: The build and test environment must be well defined and trusted. There must be no ambiguity in the results of a change caused by differences between the developer and CI environments.
- Consistency in performance: The distributed build and test environment must scale elastically, so that the time to build and test a change is predictable and bounded, irrespective of where that change sits in the dependency stack.
- Consistency in infrastructure: As the complexity of the codebase increases, the testing requirements increase, as does the amount of infrastructure required. Maintaining separate build and test harnesses is challenging and often results in duplication of infrastructure. Ideally, a distributed build and test environment should be a single environment, maintained by a single team.
To fulfil the above requirements, we need a toolkit, not just a single tool.
Creating a toolkit with the remote execution API
The remote execution API (REAPI) is a protobuf-based, open source API that provides a consistent way to manage the execution of binaries on a remote system. With a community of contributors, the remote execution API has created an ecosystem of clients and servers that can serve as a toolkit for a variety of build and test problems. However, what is interesting is that these clients focus on different use cases.
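To give a feel for what a client hands to a server, the sketch below models three of the API's messages (`Digest`, `Command` and `Action`) as plain Python dataclasses. It is an illustration under stated assumptions, not the real protobuf bindings: JSON stands in for the protobuf encoding, only a few fields are shown, the input root is passed as opaque bytes rather than a serialised directory tree, and `digest_of`, `encode` and `build_action` are hypothetical helpers.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Digest:
    """Identifies a blob by the hash of its content and its size."""
    hash: str
    size_bytes: int


@dataclass(frozen=True)
class Command:
    """What to run: argument vector, environment and expected outputs."""
    arguments: tuple
    environment_variables: tuple = ()  # assumption: (name, value) pairs
    output_paths: tuple = ()


@dataclass(frozen=True)
class Action:
    """Ties a command to the input tree it must run against."""
    command_digest: Digest
    input_root_digest: Digest
    do_not_cache: bool = False


def digest_of(blob: bytes) -> Digest:
    return Digest(hashlib.sha256(blob).hexdigest(), len(blob))


def encode(message) -> bytes:
    # Assumption: JSON stands in for the protobuf wire encoding.
    return json.dumps(asdict(message), sort_keys=True).encode("utf-8")


def build_action(command: Command, input_root_blob: bytes, store: dict) -> Digest:
    """Place the blobs in a content-addressed store and return the action digest."""
    command_blob = encode(command)
    action = Action(digest_of(command_blob), digest_of(input_root_blob))
    action_blob = encode(action)
    for blob in (command_blob, input_root_blob, action_blob):
        store[digest_of(blob).hash] = blob
    return digest_of(action_blob)
```

Because every message is referenced by the digest of its content, any client that describes the same command over the same inputs arrives at the same action digest, which is what allows different tools to share one execution and caching service.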
If your use case is relatively simple, and you are only concerned with making your C/C++ compilation faster, then tools such as recc and Goma occupy that problem space, similar to tools such as ccache and distcc. However, where the application being developed is large and complex, a tool that supports fast, incremental builds may be important, to facilitate consistency of performance. Bazel is the most well established and mature tool in this space, but there are a variety of other tools, including Pants, Buck, and Please. Finally, if you are concerned about the consistent construction of environments, then an integration tool such as BuildStream is useful, with support for multiple build systems, to create systems in multiple output formats, in a sandboxed environment.
To serve these clients and their different use cases, we want a single, consistent infrastructure. It should scale elastically and be flexible in deployment, either to cloud services or on premises. Several implementations exist, including Buildbarn, Buildfarm and BuildGrid.