Bazel vs. CMake: Discover the advantages of Bazel migration
Krzysztof Romanowski
Head of Scala & Dev Tooling
Published: Apr 29, 2024 | 10 min read
While CMake has been a longstanding and widely used build system in the software development community, it has limitations that Bazel addresses.
The decision to replace a vital piece of software like a build system demands rigorous examination and assessment of various factors. As experts in creating a seamless developer experience, we wanted to share our knowledge with you. We’ll delve into the primary drawbacks of CMake and how Bazel tackles these challenges head-on.
Let’s take a sneak peek.
| Feature | CMake | Bazel |
|---|---|---|
| Cache | No (underlying tool-dependent) | Full (local and shared) |
| Incremental builds | No (underlying tool-dependent) | Yes |
| Remote execution | No | Yes (self-service and commercial) |
| Dependency management | Complicated & Confusing | Explicit & Methodic |
| Supported languages | C/C++, C#, Swift, ASM, and more | Wide variety such as C++, Java, Python, Go, Rust, Scala, Kotlin, Haskell, Swift |
If you have ever found yourself wondering whether CMake suits your needs, chances are you have encountered its limitations and shortcomings firsthand. In such circumstances, you might be willing to investigate alternative solutions.
One of those alternatives is Bazel - a modern, open-source build system developed by Google.
Bazel supports many programming languages and - courtesy of Starlark, a Python-like language - offers flexibility and extensibility, enabling developers to customize and extend Bazel’s functionality to suit their project’s unique requirements.
A cache is an essential component of any modern build system. It's critical for enhancing build performance, especially in medium to large-scale projects where builds can be time-consuming.
By caching build results, redundant work is avoided, saving both time and money, and enabling more frequent software releases, thus delivering faster.
CMake itself has no built-in cache for build artifacts. It is a build system framework that provides a platform-independent way to describe the build process and abstracts away the details of the underlying build system (e.g. make on Unix or a Visual Studio solution on Windows).
Consequently, the presence of caching functionality relies on the capabilities of the underlying build system invoked by CMake during the build.
While third-party tools like ccache can be integrated to introduce cache-like capabilities to an extent, such functionality is not available out-of-the-box.
One of the many impressive features of Bazel is its capability to cache build artifacts and intermediate build results, utilizing them for subsequent builds. Bazel uses two types of cache - local and shared (remote).
The concept of a remote cache, in particular, is intriguing as it addresses some limitations of local caching:
Scalable storage space: The cache in Bazel can occupy significant disk space, sometimes reaching tens of gigabytes. A dedicated remote resource, such as cloud storage, lets you adjust the available space to current demand.
Efficiency: With a shared cache populated by each developer, there is no need to wait for an initial local build to fill the cache if the remote cache already contains the necessary artifacts.
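The interplay between a local and a shared cache can be sketched in a few lines of Python. This is an illustrative model under stated assumptions, not Bazel's implementation: `action_key`, `ActionCache`, and `fake_compile` are hypothetical names, a plain dict stands in for the remote cache service, and the key is simply a SHA-256 digest over the command line and the contents of the input files.

```python
import hashlib


def action_key(command, inputs):
    """Derive a cache key from a command line and its input file contents.

    `inputs` maps file names to their bytes. Hypothetical helper, for
    illustration only.
    """
    digest = hashlib.sha256()
    for arg in command:
        digest.update(arg.encode() + b"\0")
    for name in sorted(inputs):  # sort so the key does not depend on dict order
        digest.update(name.encode() + b"\0")
        digest.update(hashlib.sha256(inputs[name]).digest())
    return digest.hexdigest()


class ActionCache:
    """Two-level lookup: a local store backed by a shared (remote) store."""

    def __init__(self, remote):
        self.local = {}
        self.remote = remote  # a dict standing in for a shared cache service
        self.executions = 0   # how many times real work had to be done

    def build(self, command, inputs, run):
        key = action_key(command, inputs)
        if key in self.local:
            return self.local[key]
        if key in self.remote:
            # Someone else already produced this result: download, do not rebuild.
            self.local[key] = self.remote[key]
            return self.local[key]
        self.executions += 1
        result = run(command, inputs)
        self.local[key] = result
        self.remote[key] = result
        return result


def fake_compile(command, inputs):
    # Stand-in for a real compiler invocation.
    return b"object code for " + inputs["main.cpp"]


shared = {}
alice = ActionCache(shared)
bob = ActionCache(shared)
cmd = ["g++", "-c", "main.cpp"]
src = {"main.cpp": b"int main() { return 0; }"}
alice.build(cmd, src, fake_compile)  # executes and populates both caches
bob.build(cmd, src, fake_compile)    # served from the shared cache
```

In this model the second developer never runs the compiler, because the key derived from identical inputs already exists in the shared store. Changing a single byte of `main.cpp` produces a different key and therefore a cache miss.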
In CMake, support for incremental builds depends on the build tool that executes the build after CMake generates build recipes for a target configuration. Consequently, the effectiveness and performance of incremental builds in CMake projects vary with the specific build system used.
In Bazel, on the other hand, incremental builds are a core feature that significantly improves development efficiency. Bazel maintains a detailed dependency graph, which lets it identify the components affected by changes made since the previous build.
Subsequently, it selectively rebuilds only the affected targets within the relevant subgraph, while retrieving unaffected parts from either the local or remote cache.
Bazel closely observes changes by conducting granular dependency analyses at the file and target levels. Therefore, when a source file is modified, Bazel accurately identifies the specific components that require rebuilding.
This meticulous approach ensures that Bazel builds remain robust and deterministic, providing the hermeticity that CMake, among other build tools, lacks.
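The idea of rebuilding only the affected subgraph can be illustrated with a small Python sketch. It is a simplified model, not Bazel's actual algorithm: the target names, file names, and the `affected_targets` helper are made up for the example, and the graph is a plain dictionary.

```python
from collections import deque

# Hypothetical graph: each target lists the targets or files it depends on.
DEPS = {
    "//app:main": ["//lib:net", "//lib:util"],
    "//lib:net": ["//lib:util", "lib/net.cpp"],
    "//lib:util": ["lib/util.cpp"],
    "//tools:fmt": ["tools/fmt.cpp"],
}


def affected_targets(deps, changed_files):
    """Return every target that transitively depends on a changed file."""
    # Invert the graph: node -> targets that depend on it directly.
    rdeps = {}
    for target, direct in deps.items():
        for dep in direct:
            rdeps.setdefault(dep, set()).add(target)

    affected = set()
    queue = deque(changed_files)
    while queue:
        node = queue.popleft()
        for parent in rdeps.get(node, ()):
            if parent not in affected:
                affected.add(parent)
                queue.append(parent)
    return affected


to_rebuild = affected_targets(DEPS, ["lib/net.cpp"])
print(sorted(to_rebuild))  # prints ['//app:main', '//lib:net']
```

Editing `lib/net.cpp` marks `//lib:net` and `//app:main` for rebuilding, while `//lib:util` and `//tools:fmt` are untouched and could be taken from the cache.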
Distributed (also known as remote) builds revolutionized the software development process by distributing build tasks across multiple machines, harnessing the power of parallelization to improve build times and enhance scalability and productivity.
CMake lacks inherent support for remote execution. As with incremental builds, the availability of distributed builds depends heavily on the underlying build tool that CMake uses for compiling and linking.
In the case of Visual Studio, commercial solutions like Incredibuild offer distributed execution capabilities integrated into the build process.
Additionally, community-driven solutions and plugins exist to integrate CMake with remote execution platforms like BuildStream. However, these integrations are not part of the official CMake distribution but rather unofficial extensions.
Remote execution, like many other features of top-notch build systems, is a core functionality of Bazel, seamlessly integrated into its build architecture.
Its capabilities allow build tasks to be distributed to remote execution servers, which execute the tasks in isolated environments and return the results to the local machine. That, combined with remote caching of build artifacts, minimizes redundant work and boosts the build speed greatly.
Multiple remote execution services are available for Bazel, both commercial and self-service.
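The scheduling idea behind distributed builds (run every task whose dependencies are finished, in parallel, on whatever worker is free) can be sketched with Python's standard library. This is only a local model under assumptions: a thread pool stands in for remote workers, and `run_distributed`, `fake_execute`, and the action names are hypothetical.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from graphlib import TopologicalSorter


def run_distributed(graph, execute, workers=4):
    """Run actions in dependency order, dispatching ready ones in parallel.

    `graph` maps each action to the set of actions it depends on.
    The thread pool stands in for a fleet of remote workers.
    """
    sorter = TopologicalSorter(graph)
    sorter.prepare()
    results = {}
    in_flight = {}  # future -> action name
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while sorter.is_active():
            # Everything whose dependencies are done can be sent out now.
            for action in sorter.get_ready():
                in_flight[pool.submit(execute, action)] = action
            done, _ = wait(in_flight, return_when=FIRST_COMPLETED)
            for future in done:
                action = in_flight.pop(future)
                results[action] = future.result()
                sorter.done(action)  # may unlock dependent actions
    return results


def fake_execute(action):
    # Stand-in for shipping the action to a worker and fetching its output.
    return action.upper()


GRAPH = {
    "link_app": {"compile_a", "compile_b"},
    "compile_a": set(),
    "compile_b": set(),
}
outputs = run_distributed(GRAPH, fake_execute)
```

Here both compile actions are dispatched at once, and `link_app` is submitted only after both have returned.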
Dependency management lies at the heart of any build system, dictating how a project's external libraries, headers, and resources are handled during the build process. In this regard, Bazel's approach stands head and shoulders above CMake's traditional methods.
Dependency management in CMake often feels like navigating a labyrinth of manual configurations and workarounds. While CMake provides mechanisms for specifying dependencies within CMakeLists.txt files, the process is error-prone, cumbersome, and lacks the robustness needed for large-scale projects.
Developers are left grappling with intricacies such as finding and linking external libraries, managing header paths, and ensuring consistency across different platforms.
Dependency management in Bazel, strengthened by Bzlmod, transforms the handling of external dependencies within projects. Bazel's methodical approach involves explicitly declaring dependencies in BUILD files, ensuring transparency and predictability, which are crucial in medium and large-scale projects with dozens of external dependencies.
Bzlmod further streamlines this process by providing a developer-friendly interface for managing dependencies, allowing developers to specify them using familiar package management conventions like version constraints.
Its flexibility extends to handling dependencies from various sources, simplifying integration from remote repositories, local files, or custom sources.
Moreover, in languages like C++, it is up to developers whether these dependencies are built from source or reused as pre-built binaries.
Additionally, Bzlmod enhances discoverability through a centralized repository of commonly used libraries (Bazel Central Registry) and automates dependency updates, ensuring projects remain current.
This combination lets developers manage dependencies efficiently and focus on delivering high-quality software instead of spending time resolving conflicts and investigating the internals of a build tool.
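To make version resolution more concrete: Bazel's documentation describes Bzlmod's resolution as Minimal Version Selection, where the highest version of a module requested anywhere in the dependency graph is the one used. The Python sketch below is a simplified model of that idea, not Bzlmod's code. The `select_versions` helper, the registry layout, and all module names and versions are made up for illustration.

```python
def parse_version(text):
    """Turn '1.10.2' into (1, 10, 2) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def select_versions(root_deps, registry):
    """Pick, for each module, the highest version requested in the graph.

    `registry` maps (module, version) to that release's own dependencies.
    Hypothetical data structure, for illustration only.
    """
    selected = {}
    seen = set()
    pending = list(root_deps.items())
    while pending:
        module, version = pending.pop()
        if (module, version) in seen:
            continue
        seen.add((module, version))
        current = selected.get(module)
        if current is None or parse_version(version) > parse_version(current):
            selected[module] = version
        # Follow this release's own requirements as well.
        pending.extend(registry.get((module, version), {}).items())
    return selected


# Made-up modules and versions.
REGISTRY = {
    ("json_lib", "1.2.0"): {"string_utils": "1.0.0"},
    ("json_lib", "1.1.0"): {"string_utils": "0.9.0"},
    ("http_lib", "2.0.1"): {"string_utils": "1.4.0", "json_lib": "1.1.0"},
}
resolved = select_versions({"json_lib": "1.2.0", "http_lib": "2.0.1"}, REGISTRY)
```

In this example `string_utils` is requested at three different versions and the sketch settles on `1.4.0`, so every part of the build sees one consistent version instead of several competing copies.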
In today's polyglot programming landscape, projects often span multiple programming languages, each with its own set of tools, conventions, and dependencies. Here again, Bazel emerges as a champion of interoperability and efficiency.
While CMake does support multiple languages (such as C/C++, C#, Swift or ASM), its approach to handling this diversity can feel disjointed. Developers encounter language-specific nuances, leading to potential inconsistencies and increased complexity. Integrating components written in different languages may require navigating separate conventions, resulting in added maintenance overhead.
In many cases, software written in various languages is built using different build systems, only to be combined during the packaging and deployment stage. Maintaining more than one build system poses risks and costs that can be reduced by migrating to Bazel.
Bazel's approach to multi-language support reflects its commitment to simplicity and unity. It offers an integrated interface for defining build targets and dependencies across languages, streamlining development efforts. What's more, adding support for a new language in Bazel is remarkably straightforward.
Bazel's extensible architecture allows developers to define custom rules for new languages with relative ease, enabling rapid integration of additional languages into the build system. Whether working with C++, Java, Python, Go, Rust, Scala, Kotlin, Haskell, Swift, or any other language, Bazel provides a consistent and intuitive experience. This unified approach facilitates seamless cross-language integration, empowering teams to harness the strengths of each language while maintaining a unified codebase.
The expanding Bazel community has developed rules for nearly every recognized programming language, ensuring that it's exceptionally rare to encounter a language that cannot be supported in Bazel. This demonstrates the inclusive nature of the entire ecosystem.
One example is Scala, which CMake does not support. For Bazel, an existing module makes Scala integration possible in just a few straightforward steps.
The ability to customize and extend functionality is essential for meeting the diverse needs of modern software projects.
Implementing custom rules and extensions in CMake requires writing CMake scripts or macros, which can be verbose and complex. While CMake provides mechanisms for defining custom commands and targets, the process may involve low-level details and manual configuration, leading to increased development time and maintenance overhead.
Bazel takes extensibility to the next level with its native support for Starlark, a powerful and expressive scripting language. Starlark allows developers to define custom rules and extensions using a concise and readable syntax, providing a higher level of abstraction compared to CMake's imperative scripting approach.
With Starlark, implementing custom build rules, transformations, and toolchains becomes a straightforward and intuitive process. Bazel's extensibility through Starlark enables developers to tailor the build system to their specific requirements, whether it's integrating with proprietary tools, defining domain-specific build logic, or extending Bazel's capabilities to support new languages and platforms.
Syntax plays a major role in ensuring clarity and simplicity in code, which in turn facilitates easier maintenance. A well-structured syntax enables developers to express their intentions clearly, making the code easier to understand for both themselves and others who may work with it in the future.
While CMake and Bazel are popular choices for managing build configurations, a closer examination of their syntax reveals distinct differences in complexity and clarity.
CMake, with its imperative scripting language, often presents challenges for developers due to its verbose and idiosyncratic syntax. Let's consider a simple example of defining a C++ executable target in CMake:
In this example, we see several CMake commands and constructs that may be unfamiliar to developers, such as cmake_minimum_required, project, set, and add_executable.
Managing dependencies and configuring build options in CMake adds further complexity and steepens the learning curve for developers. In this example, find_package(Boost REQUIRED COMPONENTS filesystem) locates the Boost package and specifies the required component (filesystem).
Then, target_link_libraries(MyExecutable PRIVATE Boost::filesystem) links the Boost filesystem library to our executable target.
In contrast to CMake's imperative approach, Bazel adopts declarative and intuitive syntax, powered by the Starlark scripting language.
Let's rewrite the previous example using Bazel's BUILD file and Starlark syntax:
# BUILD file
cc_binary(
    name = "my_executable",
    srcs = ["main.cpp"],
    deps = ["@boost//:filesystem"],
)
In this example, we define a C++ executable target using the cc_binary rule, specifying the source files directly, where "@boost//:filesystem" declares the dependency on the Boost filesystem library.
Bazel uses labels to reference external dependencies, where "@boost" refers to the external repository that provides the Boost library.
The syntax is clean, concise, and intuitive, leveraging familiar Python-like constructs.
Furthermore, Bazel's rule-based approach promotes modularity and code reuse, enhancing maintainability and scalability in larger projects.
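The structure of a label such as "@boost//:filesystem" can be made explicit with a short Python sketch. It is a simplified parser for illustration, not Bazel's own: `parse_label` is a hypothetical helper and it only handles the `@repo//package:target`, `//package:target`, and `//package` forms.

```python
def parse_label(label):
    """Split a Bazel-style label into (repository, package, target).

    Simplified: '@repo//pkg:target', '//pkg:target', and the '//pkg'
    shorthand, where the target defaults to the last package segment.
    """
    repo = ""
    if label.startswith("@"):
        repo, sep, rest = label[1:].partition("//")
        if not sep:
            raise ValueError(f"missing '//' in label: {label!r}")
    elif label.startswith("//"):
        rest = label[2:]
    else:
        raise ValueError(f"unsupported label form: {label!r}")

    package, colon, target = rest.partition(":")
    if not colon:
        # '//lib/net' is shorthand for '//lib/net:net'.
        target = package.rsplit("/", 1)[-1]
    if not target:
        raise ValueError(f"empty target name in label: {label!r}")
    return repo, package, target


boost_dep = parse_label("@boost//:filesystem")
```

For the dependency used above, the repository is `boost`, the package is the repository root (an empty path), and the target is `filesystem`.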
While Bazel offers countless benefits for software development projects, it is not without its drawbacks.
Learning Curve: Bazel's advanced features and declarative syntax, especially with Starlark, can mean a steep learning curve for newcomers. Developers accustomed to other build systems may find it challenging to grasp Bazel's concepts and best practices. That said, introducing any new tool requires developers to invest time in learning it.
Resource Intensive: Bazel's caching mechanisms and distributed build features can consume significant system resources, particularly memory and CPU. Running Bazel builds in resource-constrained environments or on machines with limited hardware may lead to performance degradation or out-of-memory errors.
Bazel Is Strict and Explicit: It is hard to call this a real flaw, but it is important to know that Bazel is restrictive. This strictness ensures reliability and reproducibility, as Bazel leaves little room for ambiguity or deviation from defined standards.
Versioning and Compatibility: Bazel's rapid development pace and frequent releases may pose challenges for version compatibility and stability, especially in environments where strict version control is required. Upgrading Bazel to newer versions may introduce compatibility issues with existing configurations and third-party tools.
It's worth noting that the drawbacks of Bazel may be subjective and depend on specific project requirements and wider context.
However, if you are looking for a cutting-edge, holistic build system that tackles well-known problems of other build tools, introduces innovative solutions of its own, and takes the entire build and test process to the next level, Bazel is the tool to explore. This is also true if you are looking to enhance your project with monorepo expertise.
The advantages of Bazel include native support for remote execution and distributed builds, a built-in caching mechanism for faster builds, support for multiple programming languages and platforms, a declarative build language for ease of use, and an advanced dependency management system for reproducibility and scalability.