clang-tidy is a wonderful tool to run static analysis on C/C++ code. The main downside is that it takes quite a long time to run. For our project, it requires around 25 minutes to do a full build and more than 40 minutes to do a full clang-tidy check. This year, we began to move our build system to Bazel. Naturally, we want to run clang-tidy with Bazel. Leveraging Bazel’s great dependency tracking and remote cache support to reduce the clang-tidy check time.

There is a great lib for running clang-tidy with Bazel named bazel_clang_tidy. It’s very easy to set up and easy to use. The only downside of it is that the check is not reproducible. In this article, we will share how we update it to support reproducible check.

Before introducing our modification we will first explain how bazel_clang_tidy works briefly. bazel_clang_tidy uses aspects to extract source files and compilation flags from a C/C++ target. Then call clang-tidy to check the source files with the compilation flags.

The main reason it’s not reproducible is that it uses the clang-tidy in PATH and people may have different versions of clang-tidy installed. In order to have reproducible results, we need to know which version of clang-tidy we are using and make sure if the version changed the Bazel would detect it.

Inspired by android_ndk_repository rule, we wrote a llvm_repo repository rule in a file named llvm.bzl. The llvm_repo requires an attr named llvm_version . This is the clang-tidy version we would expect. And an environment variable named LLVM_HOME, which points to the directory that contains bin/clang-tidy.

_llvm_repo_attrs = {
    "llvm_version": attr.string(doc='LLVM version', mandatory=True)
}

llvm_repo = repository_rule(
    implementation = _llvm_repo_impl,
    attrs = _llvm_repo_attrs,
    doc = "Setup llvm repo.",
    environ = ["LLVM_HOME"],
)

In the implementation of llvm_repo , we will first check if the clang-tidy in LLVM_HOME is the same asllvm_version. If it doesn’t match, we will call fail.

def _llvm_repo_impl(ctx):
    """Implementation of the llvm_repo rule."""
    llvm_home = ctx.os.environ["LLVM_HOME"]
    llvm_version = ctx.attr.llvm_version
    clang_tidy_version = ctx.execute(["{0}/bin/clang-tidy".format(llvm_home), "--version"])
    if clang_tidy_version.return_code != 0:
        fail("Failed to run clang-tidy.")
    if "LLVM version {0}".format(llvm_version) not in clang_tidy_version.stdout:
        fail("LLVM version not match.")

If the version matches, we will create a symlink inside llvm_repo points to the LLVM_HOME.

def _llvm_repo_impl(ctx):
    """Implementation of the llvm_repo rule."""
    ...
    ctx.symlink(llvm_home, "llvm-"+llvm_version)

In the origin bazel_clang_tidy lib, it has a sh_binary to run clang-tidy. The wrapper script is needed to create the declared output file for Bazel even if there is no error reported. We still need this script, but we need to update it so it will use the clang-tidy in llvm_repo .

We first declare the clang-tidy in llvm_repo as a data dependence of run_clang_tidy .

sh_binary(
    name = "clang_tidy",
    srcs = ["run_clang_tidy.sh"],
    data = [":llvm-{llvm_version}/bin/clang-tidy", "//:clang_tidy_config"],
    tags = ["no-sandbox"],
    visibility = ["//visibility:public"],
    deps = ["@bazel_tools//tools/bash/runfiles"],
)

Then we use the runfiles library to get the real path of clang-tidy

set -ue

# Usage: run_clang_tidy <OUTPUT> [ARGS...]

OUTPUT=$1
shift

# clang-tidy doesn't create a patchfile if there are no errors.
# make sure the output exists, and empty if there are no errors,
# so the build system will not be confused.
rm -f $OUTPUT
touch $OUTPUT

$(rlocation "{repo_name}/llvm-{llvm_version}/bin/clang-tidy") "$@"

Note in order to use the runfiles library you need to append the initialization script of runfiles library before your script. It’s omitted here.

For runfiles library to find the file in an external repo, the repository name is needed. So we use the ctx.file to dynamically create the build file and the run_clang_tidy.sh in llvm_repo. We put the full llvm.bzl in the end.

Now we can tell the bazel_clang_tidy to use the clang-tidy from llvm_repo. Assuming we put the llvm.bzl file in root directory of the repo(you can put it anywhere you like), we load it in WORKSPACE and set up llvm_repo.

load("//:llvm.bzl", "llvm_repo")

llvm_repo(
    name = "llvm",
    llvm_version = "13.0.1",
)

Then tell bazel_clang_tidy to use clang-tidy from llvm_repo with the @bazel_clang_tidy//:clang_tidy config flag.

build:clang-tidy --@bazel_clang_tidy//:clang_tidy=@llvm//:clang_tidy

If we want a hermetic check, we can package clang-tidy and upload it to a server, then use ctx.download_and_extract in llvm_repo implementation to download it. But so far we’re happy with creating a symlink to a pre-installed clang-tidy.

Another problem is that the generated result file contains the absolute paths of the source codes and build directories. We use a very simple way to fix this. Since all Bazel actions are running in execution root and it’s the execution root part we want to remove from the paths in result files. We simply call pwd to get the current execution root path and use sed to replace it to a reproducible value in file.

We replace it with the workspace name but you can choose whatever you want.

$(rlocation "{repo_name}/llvm-{llvm_version}/bin/clang-tidy") "$"
EXECUTION_ROOT="$(pwd)"
sed -i '' "s=$EXECUTION_ROOT=$WORKSPACE_NAME=g" "$OUTPUT"

That’s it. The modified bazel_clang_tidy is far from perfect but works great in our project. The average analysis time reduced dramatically. Hope this can help your project too. Thanks!

llvm.bzl

BUILD_CONTENT = """
sh_binary(
    name = "clang_tidy",
    srcs = ["run_clang_tidy.sh"],
    data = [":llvm-{llvm_version}/bin/clang-tidy", "//:clang_tidy_config"],
    tags = ["no-sandbox"],
    visibility = ["//visibility:public"],
    deps = ["@bazel_tools//tools/bash/runfiles"],
)

filegroup(
    name = "clang_tidy_config_default",
    data = [
        ".clang-tidy",
        # '//example:clang_tidy_config', # add package specific configs if needed
    ],
)

label_flag(
    name = "clang_tidy_config",
    build_setting_default = ":clang_tidy_config_default",
    visibility = ["//visibility:public"],
)
"""


RUNFILES_INIT = """
# --- begin runfiles.bash initialization ---
# Copy-pasted from Bazel's Bash runfiles library (tools/bash/runfiles/runfiles.bash).
set -euo pipefail
if [[ ! -d "${RUNFILES_DIR:-/dev/null}" && ! -f "${RUNFILES_MANIFEST_FILE:-/dev/null}" ]]; then
  if [[ -f "$0.runfiles_manifest" ]]; then
    export RUNFILES_MANIFEST_FILE="$0.runfiles_manifest"
  elif [[ -f "$0.runfiles/MANIFEST" ]]; then
    export RUNFILES_MANIFEST_FILE="$0.runfiles/MANIFEST"
  elif [[ -f "$0.runfiles/bazel_tools/tools/bash/runfiles/runfiles.bash" ]]; then
    export RUNFILES_DIR="$0.runfiles"
  fi
fi
if [[ -f "${RUNFILES_DIR:-/dev/null}/bazel_tools/tools/bash/runfiles/runfiles.bash" ]]; then
  source "${RUNFILES_DIR}/bazel_tools/tools/bash/runfiles/runfiles.bash"
elif [[ -f "${RUNFILES_MANIFEST_FILE:-/dev/null}" ]]; then
  source "$(grep -m1 "^bazel_tools/tools/bash/runfiles/runfiles.bash " \
            "$RUNFILES_MANIFEST_FILE" | cut -d ' ' -f 2-)"
else
  echo >&2 "ERROR: cannot find @bazel_tools//tools/bash/runfiles:runfiles.bash"
  exit 1
fi
# --- end runfiles.bash initialization ---
"""


RUN_CLANG_TIDY_SH = """
set -ue

# Usage: run_clang_tidy <OUTPUT> [ARGS...]

OUTPUT=$1
shift

# clang-tidy doesn't create a patchfile if there are no errors.
# make sure the output exists, and empty if there are no errors,
# so the build system will not be confused.
rm -f $OUTPUT
touch $OUTPUT

$(rlocation "{repo_name}/llvm-{llvm_version}/bin/clang-tidy") "$@"
EXECUTION_ROOT="$(pwd)"
sed -i '' "s=$EXECUTION_ROOT=$WORKSPACE_NAME=g" "$OUTPUT"
"""


DOT_CLANG_TIDY = """
UseColor: true

Checks: >
    bugprone-*,
    cppcoreguidelines-*,
    google-*,
    performance-*,
HeaderFilterRegex: ".*"

WarningsAsErrors: "*"
"""


def _llvm_repo_impl(ctx):
    """Implementation of the llvm_repo rule."""
    llvm_home = ctx.os.environ["LLVM_HOME"]
    llvm_version = ctx.attr.llvm_version
    clang_tidy_version = ctx.execute(["{0}/bin/clang-tidy".format(llvm_home), "--version"])
    if clang_tidy_version.return_code != 0:
        fail("Failed to run clang-tidy.")
    if "LLVM version {0}".format(llvm_version) not in clang_tidy_version.stdout:
        fail("LLVM version not match.")
    ctx.file("BUILD",
             BUILD_CONTENT.format(llvm_version=llvm_version),
             executable=False)
    ctx.file("run_clang_tidy.sh",
             RUNFILES_INIT+RUN_CLANG_TIDY_SH.format(repo_name=ctx.name,
                                                    llvm_version=llvm_version),
             executable=True)
    ctx.file(".clang-tidy", DOT_CLANG_TIDY, executable=False)
    ctx.symlink(llvm_home, "llvm-"+llvm_version)


_llvm_repo_attrs = {
    "llvm_version": attr.string(doc='LLVM version', mandatory=True)
}


llvm_repo = repository_rule(
    implementation = _llvm_repo_impl,
    attrs = _llvm_repo_attrs,
    doc = "Setup llvm repo.",
    environ = ["LLVM_HOME"],
)