Tue 15 December 2020

Building the abseil-hello Bazel project for a different architecture using a dynamically generated toolchain

Bazel and BuildStream are tools which are commonly used to organize and delegate the building of software. There are a few key differences between them but largely they are concerned with the same development problems. A more complete discussion of how the two can work together was presented by Daniel Silverstone at BazelCon 2019.

In order to execute builds, Bazel requires a statically defined toolchain which must be built, deployed, and configured independently of the project. This is actually a strength of Bazel since it is toolchain-agnostic. However, it necessarily places a burden on developers: they must develop some external mechanism to control, produce, and configure the toolchain with an acceptable level of confidence in its provenance and correctness, so as to ensure a homogeneous development experience across their organization. That work must be duplicated for every toolchain required.

BuildStream was designed with a focus on system integration, leveraging existing build systems such as Bazel. This positions it to integrate products from many different build systems. One of the main use cases from the Bazel developer's perspective is the consumption of products from external builds without the need to develop bespoke rules such as rules_foreign_cc, change the build of that project, or provide custom Bazel packaging for it. Toolchains provide a ready example of this type of problem.

The abseil-hello project provides a simple 'hello world' using Abseil's string-implementation library, and it is provided with packaging for Bazel. A fork of abseil-hello has been developed that is built and executed remotely on an x86_64 host using a dynamically configured toolchain (bazel-toolchains-fdsdk) for both aarch64 and x86_64 targets. The toolchains themselves are specified by a BuildStream project; building and configuring them for Bazel is handled with minimal developer interaction by leveraging a new Bazel extension, rules_buildstream. Note that these specific toolchains are used for demonstration purposes only: any toolchain used for an actual project should produce binaries that are ABI-compatible with both the execution platform and any linked libraries.

rules_buildstream

rules_buildstream is a Bazel extension providing a repository rule for loading BuildStream artifacts as an external repository. For example the BuildStream element hello.bst can be built and checked out in the path of Bazel external repositories by invoking bst_element in the WORKSPACE:

bst_element(
    name = "my-bst-project",
    element = "hello.bst",
    project_dir = "/path/to/my-bst-project"
)

The Bazel label of the artifacts above is my-bst-project and the absolute path to the BuildStream project providing hello.bst is passed as a parameter.

When this workspace is loaded, Bazel delegates the build and artifact checkout of the hello.bst element to BuildStream via a shell. The output is then available on the local filesystem in the path of the external repositories (addressable by @my-bst-project//my-bst-project) without requiring additional intervention. Additional configuration can also be passed to BuildStream via the rule parameters.
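The delegation described above amounts to two BuildStream invocations: a build followed by an artifact checkout. The following is a minimal Python sketch of that sequence, not the actual implementation of rules_buildstream. The function name bst_commands and the checkout directory are hypothetical, and the checkout command assumes the BuildStream 2 command line.

```python
from typing import List


def bst_commands(element: str, checkout_dir: str) -> List[List[str]]:
    """Return the BuildStream invocations for one element, in execution order."""
    return [
        # Build the element (and anything it depends on).
        ["bst", "build", element],
        # Check the built artifact out into a directory.
        # Assumption: BuildStream 2 syntax; BuildStream 1 uses
        # `bst checkout ELEMENT LOCATION` instead.
        ["bst", "artifact", "checkout", "--directory", checkout_dir, element],
    ]
```

Each command would be run from the BuildStream project directory, for instance with subprocess.run(cmd, cwd="/path/to/my-bst-project", check=True), so that a failed build stops the sequence before the checkout.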

bazel-toolchains-fdsdk

The bazel-toolchains-fdsdk project is an attempt to provide dynamic rules_cc toolchain configuration for Bazel for both aarch64 and x86_64 targets. The toolchains are defined as BuildStream projects and can be automatically built, deployed to the local system and configured for Bazel consumption.

The toolchain itself is defined by BuildStream elements which package the necessary binaries. These elements are based on elements of the freedesktop-sdk project, which is consumed here as a BuildStream junction element. Because the entire toolchain is defined in BuildStream, specific versions of build tools can be tracked in source control systems such as git. Additionally, the toolchain can be easily changed without affecting the Bazel configuration. For example, executing a BuildStream build with bst build toolchain-x86_64-deploy.bst is sufficient to build the x86_64 toolchain with a good guarantee of both provenance and correctness.

The configuration necessary for registering the toolchain with Bazel is provided by the gen_build_defs() rule:

load("@fdsdk_toolchain_repo//toolchain:gen_build_defs.bzl", "gen_build_defs")

gen_build_defs(
  name = "fdsdk",
  arch = "x86_64",
  archive = "@bazel-toolchain//bazel-toolchain",
  parent_repo = "@fdsdk_toolchain_repo",
  build_template_file = "@fdsdk_toolchain_repo//toolchain:BUILD.in",
  wrapper_template_file = "@fdsdk_toolchain_repo//toolchain:wrapper.in",
)

register_toolchains("@fdsdk//:cc-toolchain")

In the above invocation, archive is the label of the external repository providing the toolchain files. The toolchain can then be selected for a Bazel project by passing --crosstool_top=@fdsdk//:fdsdk --cpu=x86_64 on the Bazel command line or by setting these options in a .bazelrc configuration.

Building the abseil-hello project locally

The abseil-hello Bazel WORKSPACE specification has been modified to make use of both the dynamic toolchains and the BuildStream extension. The demo can be locally built for an x86_64 target using --cpu=x86_64 --crosstool_top=@fdsdk//:fdsdk and for an aarch64 target with --cpu=aarch64 --crosstool_top=@fdsdk_aarch64//:fdsdk.
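The pairing of target architecture and toolchain flags can be captured in a small lookup. The sketch below is illustrative only: the flag values are taken from the paragraph above, while the helper names (CROSSTOOL_TOP, toolchain_flags, build_command) and the default target :hello_main are assumptions made for the example.

```python
from typing import Dict, List

# Flag values come from the text above; the table itself is illustrative.
CROSSTOOL_TOP: Dict[str, str] = {
    "x86_64": "@fdsdk//:fdsdk",
    "aarch64": "@fdsdk_aarch64//:fdsdk",
}


def toolchain_flags(arch: str) -> List[str]:
    """Return the Bazel flags that select the toolchain for `arch`."""
    try:
        crosstool = CROSSTOOL_TOP[arch]
    except KeyError:
        raise ValueError(f"no toolchain configured for {arch!r}") from None
    return [f"--cpu={arch}", f"--crosstool_top={crosstool}"]


def build_command(arch: str, target: str = ":hello_main") -> List[str]:
    """Assemble the argv of a local `bazel build` for the given architecture."""
    return ["bazel", "build", *toolchain_flags(arch), target]
```

Keeping the mapping in one place means an unsupported architecture fails immediately with a clear message rather than partway through a Bazel invocation.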

Both bst_element and gen_build_defs are loaded from external repositories using git_repository from the standard @bazel_tools repository:

load("@bazel_tools//tools/build_defs/repo:git.bzl", "git_repository")

git_repository(
    name = "rules_buildstream",
    branch = "master",
    remote = "https://gitlab.com/celduin/buildstream-bazel/rules_buildstream.git",
)

load("@rules_buildstream//bst:defs.bzl", "bst_element")

git_repository(
  name = "fdsdk_toolchain_repo",
  tag = "0.1.1",
  remote = "https://gitlab.com/Celduin/buildstream-bazel/bazel-toolchains-fdsdk",
)

load("@fdsdk_toolchain_repo//toolchain:gen_build_defs.bzl", "gen_build_defs")

Additionally, the bazel-toolchains-fdsdk project has been deployed to /tmp. Within that project, the toolchains are defined by the elements toolchain-x86_64-deploy.bst and toolchain-aarch64-deploy.bst for the x86_64 and aarch64 toolchains respectively. These elements are built and checked out as external repositories by invocation of bst_element:

bst_element(
  name = "bazel-toolchain",
  element = "toolchain-x86_64-deploy.bst",
  timeout = 14400,
  quiet = False,
  project_dir = "/tmp/bazel-toolchains-fdsdk",
)

bst_element(
  name = "aarch64-toolchain",
  element = "toolchain-aarch64-deploy.bst",
  timeout = 14400,
  quiet = False,
  project_dir = "/tmp/bazel-toolchains-fdsdk-arm",
)

The toolchains are subsequently available at @<name>//<name> (where <name> is the value of the name parameter). Invocation of gen_build_defs automatically provides configuration for the toolchains which can be registered with Bazel as cc-toolchains:

gen_build_defs(
  name = "fdsdk",
  arch = "x86_64",
  archive = "@bazel-toolchain//bazel-toolchain",
  parent_repo = "@fdsdk_toolchain_repo",
  build_template_file = "@fdsdk_toolchain_repo//toolchain:BUILD.in",
  wrapper_template_file = "@fdsdk_toolchain_repo//toolchain:wrapper.in",
)

gen_build_defs(
  name = "fdsdk_aarch64",
  arch = "aarch64",
  archive = "@aarch64-toolchain//aarch64-toolchain",
  parent_repo = "@fdsdk_toolchain_repo",
  build_template_file = "@fdsdk_toolchain_repo//toolchain:BUILD.in",
  wrapper_template_file = "@fdsdk_toolchain_repo//toolchain:wrapper.in",
)

register_toolchains("@fdsdk//:cc-toolchain")
register_toolchains("@fdsdk_aarch64//:cc-toolchain")
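The two gen_build_defs invocations above differ only in name, arch, and archive, and the archive label follows directly from the name given to bst_element. The following Python sketch makes that relationship explicit. It is a plain-Python illustration rather than Starlark that could be loaded into a WORKSPACE, and the helper names (bst_repo_label, gen_build_defs_kwargs, toolchain_label) are hypothetical.

```python
from typing import Dict

# Repository providing the templates, as named in the WORKSPACE above.
PARENT_REPO = "@fdsdk_toolchain_repo"


def bst_repo_label(name: str) -> str:
    """Label of the files checked out by a bst_element called `name`."""
    return f"@{name}//{name}"


def gen_build_defs_kwargs(name: str, arch: str, bst_element_name: str) -> Dict[str, str]:
    """Keyword arguments matching the gen_build_defs invocations shown above."""
    return {
        "name": name,
        "arch": arch,
        "archive": bst_repo_label(bst_element_name),
        "parent_repo": PARENT_REPO,
        "build_template_file": f"{PARENT_REPO}//toolchain:BUILD.in",
        "wrapper_template_file": f"{PARENT_REPO}//toolchain:wrapper.in",
    }


def toolchain_label(name: str) -> str:
    """Label passed to register_toolchains for a generated repository."""
    return f"@{name}//:cc-toolchain"
```

For example, gen_build_defs_kwargs("fdsdk_aarch64", "aarch64", "aarch64-toolchain") reproduces the arguments of the second invocation, and toolchain_label("fdsdk_aarch64") gives the label registered for it.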

Remote builds

The demo and the toolchains can be built remotely by a REAPI worker which has support for the Bazel and BuildStream clients. The remote endpoints used in the pipeline were provided by the celduin-infra project. At the time of writing, these endpoints no longer exist, but they can be reproduced in a container using kind by following the instructions at celduin-infra.

The local BuildStream client is configured for remote execution. The local Bazel client consumes a similar configuration specifying the worker:

build:rex --remote_executor=grpcs://push.public.aws.celduin.co.uk:443
build:rex --remote_instance_name=remote-execution

# The available platform properties are defined at https://gitlab.com/celduin/infrastructure/celduin-infra/-/blob/master/kubernetes/clusters/cluster.libsonnet#L123
build:rex --remote_default_exec_properties=OSFamily=Linux
build:rex --remote_default_exec_properties=container-image=docker://marketplace.gcr.io/google/rbe-ubuntu16-04@sha256:6ad1d0883742bfd30eba81e292c135b95067a6706f3587498374a083b7073cb9

Executing the build for the rex configuration then delegates the toolchain build and the construction of the hello-world project to the worker:

bazel build --tls_client_key=${GITLAB_CAS_PUSH_KEY} --tls_client_certificate=${GITLAB_CAS_PUSH_CERT} --config=${ARCH} --config=rex :hello_main

The log of this build for the aarch64 target can be seen at https://gitlab.com/celduin/buildstream-bazel/abseil-hello-fork/-/jobs/654643209. The execution of the binary produced is then demonstrated at https://gitlab.com/celduin/buildstream-bazel/abseil-hello-fork/-/jobs/654643212.
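The remote invocation above depends on three environment variables: GITLAB_CAS_PUSH_KEY, GITLAB_CAS_PUSH_CERT, and ARCH. If one of them is unset, the shell expands it to an empty string and the resulting failure may be hard to interpret. The sketch below assembles the same command in Python and checks the variables first. The function name remote_build_command and the REQUIRED_ENV tuple are assumptions made for the example, not part of the project.

```python
from typing import List, Mapping

# Variables referenced by the remote build command in the text above.
REQUIRED_ENV = ("GITLAB_CAS_PUSH_KEY", "GITLAB_CAS_PUSH_CERT", "ARCH")


def remote_build_command(env: Mapping[str, str]) -> List[str]:
    """Assemble the remote `bazel build` argv, refusing to run with unset variables."""
    missing = [name for name in REQUIRED_ENV if not env.get(name)]
    if missing:
        raise ValueError("unset environment variables: " + ", ".join(missing))
    return [
        "bazel",
        "build",
        f"--tls_client_key={env['GITLAB_CAS_PUSH_KEY']}",
        f"--tls_client_certificate={env['GITLAB_CAS_PUSH_CERT']}",
        f"--config={env['ARCH']}",
        "--config=rex",
        ":hello_main",
    ]
```

In a pipeline this would typically be called as remote_build_command(os.environ) and the result passed to subprocess.run with check=True.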

The BuildStream/Bazel ecosystem

In this example, BuildStream is used to dynamically produce and configure an entire toolchain for Bazel. The definition of these toolchains is in a human-readable form which is easily controlled using common SCMs such as git. Consequently, the additional burden of building and configuring the toolchain is removed from the developer. Additionally, this example demonstrates that it is possible to delegate the toolchain build (and in fact the build of the entire project) to a remote worker. Combining this with remote caching can remove many of the issues caused by heterogeneous developer environments across organizations: the remote worker nodes can be replicated and can retrieve toolchains (and other build artifacts) from federated remote caches. This can not only greatly accelerate builds but also adds assurance in toolchain provenance and therefore in build correctness.

In the future, hybrid BuildStream/Bazel build techniques may be used to manage all third-party dependencies. The bazelize plugin can provide some level of native Bazel packaging for BuildStream artifacts. This can be used immediately to provide packaging for C/C++ libraries, which are a common external dependency for Bazel projects. It is possible that this may also be adapted to provide additional toolchain handling. You can read more about this plugin at the Codethink blog.
