How to set up a Bazel testing configuration: The comprehensive guide for Scala and Java
Łukasz Wawrzyk
Tooling Expert
Published: Jun 29, 2023 | 15 min read
Bazel, with its extensive capabilities, provides a comprehensive framework for efficient and reliable testing, making it an ideal choice for ensuring the quality and stability of your codebase. In addition, Bazel’s focus on hermeticity makes it particularly suitable for utilizing large sets of tests, both locally and on CI servers. In short, Bazel testing provides developers with a reliable approach to validate their code.
Now, we will define, run, and configure tests in Bazel. Whether you’re new to Bazel or an experienced user looking to enhance your Bazel testing workflows, this guide will provide you with the essentials to harness the capabilities effectively.
In this article, we will delve into the key concepts of Bazel testing configurations. We will cover target definitions, test rules, and test suites, providing you with a solid foundation. Additionally, we will explore advanced techniques like test filtering and parallelization, which greatly enhance the efficiency of your test runs.
Whether you’re handling a small project or a sizable codebase, grasping these concepts will empower you to streamline your testing workflows and attain quicker feedback loops. Let’s explore Bazel’s testing capabilities and unlock the potential for seamless, efficient testing in your projects.
Tip: The complete example of the test setup described in this article can be found in an example repository here.
No matter the language, Bazel testing follows largely the same process:
Configure BUILD files: Inside your project’s directory, create a BUILD file or modify an existing one. The BUILD file defines the build targets for your project, including test targets. Specify the target definitions, dependencies, and any specific configurations for your tests.
Write Tests: Create test files using the testing framework of your choice, such as JUnit for Java or ScalaTest for Scala. Write test cases to cover different scenarios and validate the functionality of your code. Place the test files in an appropriate directory within your project.
Define Test Targets: In the BUILD file, define test targets for your tests. Specify the source files and dependencies required for each test target. You can also include additional configurations, such as test timeouts or test environment variables.
Run Tests: Use the Bazel command-line interface (CLI) to execute your tests. Navigate to the root of your project and run the appropriate Bazel command to trigger the test execution. For example, use “bazel test //path/to:testTarget” to run a specific test target.
Analyze Test Results: Bazel provides detailed test result output, including information on test successes, failures, and coverage reports. Review the test output to identify any failures or issues that need to be addressed.
Iterate and Refine: Make necessary code changes based on the test results and iterate on the testing process. Fix any failing tests and ensure that your code passes all the desired test cases.
We will start out with creating a simple test in Java since it is a popular language. We will then dive into more advanced Bazel testing using Scala since rules_scala has more features to offer. Even if Scala is not your primary language, most of the knowledge shared here applies to all Bazel rules, with some sections being JVM-specific.
Let’s start with defining our first test target for Java. For simplicity we will use JUnit 4, which is supported by Bazel out of the box. You need to have at least an empty WORKSPACE file.
Then you can create your Java file with a test class and accompanying BUILD.bazel file.
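A minimal test class might look like the following sketch (the assertion is just an illustration; the package follows from the directory layout used below):
// example/src/test/java/com/example/ExampleTest.java
package com.example;

import org.junit.Test;
import static org.junit.Assert.assertEquals;

public class ExampleTest {

    @Test
    public void sumsTwoNumbers() {
        // A trivial assertion, just to verify the target is wired up correctly
        assertEquals(4, 2 + 2);
    }
}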
# example/src/test/java/com/example/BUILD.bazel

java_test(
    name = "ExampleTest",
    srcs = ["ExampleTest.java"],
)
You can run this test with the following command:
bazel test //example/src/test/java/com/example:ExampleTest
The process is straightforward. The java_test rule shares similarities with java_binary, allowing you to specify dependencies (deps), runtime dependencies (runtime_deps), and resources as usual. Note that you don’t need to add a dependency on JUnit; it is provided automatically by the java_test rule.
The setup is so straightforward because we rely on conventions and inference. The rule’s name should match the class name and the path to the *.java file should include a java or javatests component, followed by the package’s directory structure. Not adhering to these conventions requires explicitly specifying the test_class attribute with the fully-qualified class name.
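For example, when the directory layout does not follow those conventions, the class can be spelled out explicitly (a sketch; the package name is illustrative):
java_test(
    name = "ExampleTest",
    srcs = ["ExampleTest.java"],
    test_class = "com.example.ExampleTest",
)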
To include additional tests, you can follow a similar approach of adding more Java files and targets. However, this process can be repetitive and prone to errors. Automating the generation of these targets or executing multiple tests from a single target is a separate topic that warrants its own blog post. If you’re looking for more information, you can refer to this Stack Overflow question on test grouping and this GitHub issue for JUnit 5.
Configuring multiple tests within a single target is simpler and readily supported in Scala. In this blog post, we will primarily focus on Scala tests as an example, as they are easier to configure with Bazel compared to JUnit or Spock.
The following code imports the necessary libraries and defines a Bazel test class “ExampleTest” that extends the “AnyFunSuite” trait from the “org.scalatest” library. Within the test class, there is a single test case defined using the “test” method, which asserts that an empty set has a size of 0:
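Reconstructed from that description, the test file might look like this (the file path is illustrative):
// example/src/test/scala/com/example/ExampleTest.scala
package com.example

import org.scalatest.funsuite.AnyFunSuite

class ExampleTest extends AnyFunSuite {

  test("An empty Set should have size 0") {
    assert(Set.empty.size == 0)
  }
}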
The following code snippet represents the BUILD file (named “BUILD.bazel”) for the same project. It utilises the “scala_test” rule from the “@io_bazel_rules_scala//scala:scala.bzl” library to define a test target named “tests”.
If we consider the //example/src/main as a scala_library target containing the code we intend to test, we establish a dependency on this target by utilising the deps attribute. This dependency allows us to access and use the code within our tests. The required attributes, such as name and srcs, are specified similarly for library targets.
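Putting this together, the BUILD file might look roughly like the sketch below; the exact labels depend on your workspace layout:
# example/src/test/scala/com/example/BUILD.bazel

load("@io_bazel_rules_scala//scala:scala.bzl", "scala_test")

scala_test(
    name = "tests",
    srcs = ["ExampleTest.scala"],
    deps = ["//example/src/main"],
)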
NOTE: The following configurations and best practices for Bazel testing are general to Bazel and can be used with any language, Java and Scala included. We will mention the language specifically if we deviate from the general perspective.
In most projects, a comprehensive test plan includes various types of tests, such as unit tests, integration tests, smoke tests, performance tests, stress tests, and linting. Each test type often requires distinct handling. For instance, integration tests may demand additional environment setup and longer timeouts, while running resource-intensive stress tests on a local machine might be impractical.
Tagging Test Targets for Enhanced Bazel Test Filtering
The tags attribute is a common way in Bazel testing to designate a target as a specific test type or category. Let’s have a look at an example in Scala, though the same approach works for test targets in any language:
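For instance, a unit test target could be tagged like this (a sketch; the tag names are entirely up to you):
scala_test(
    name = "tests",
    srcs = ["ExampleTest.scala"],
    deps = ["//example/src/main"],
    tags = ["unit"],
)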
To run only the tests that have a certain tag, use the --test_tag_filters flag.
Here are some examples:
$ bazel test //... --test_tag_filters=unit                            # run only unit tests
$ bazel test //... --test_tag_filters=unit,integration,lint,-external # run unit, integration and lint tests except those tagged as 'external'
$ bazel test //... --test_tag_filters=-stress                         # run all tests except the stress tests
The selection of tag names and their usage is flexible and depends on your specific needs. It’s important to note that tags are assigned to individual Bazel targets and are applicable across all *_test rules, ensuring consistency across different test frameworks such as ScalaTest, JUnit, or any other testing framework.
Use in continuous integration (CI)
Once the test targets are tagged according to their categories, it becomes effortless to configure the continuous integration (CI) system and execute each category as a separate step or independent job. This setup allows you to allocate the appropriate resources for specific use cases, schedule and monitor them separately, ensuring a streamlined testing process.
Take a look at the following example to understand how this can be implemented:
Quick Check Job: Providing Rapid Feedback for Pull Requests
The Quick Check Job aims to provide developers with feedback immediately after they submit a pull request. To make results convenient to inspect, this job is divided into two stages that can be evaluated independently:
Stage 1: Linting Tests – Ensuring Code Style Compliance
In the first stage, all linting tests are executed to verify that the code adheres to the specified style requirements. To run these tests, use the following command:
bazel test //... --test_tag_filters=lint
Stage 2: Unit Tests – Swiftly Detecting Regressions
The second stage involves running all unit tests to identify any potential regressions promptly. Use the following command to execute the unit tests:
bazel test //... --test_tag_filters=unit
Slow Job: Running Integration Tests with Careful Consideration
The Slow Job focuses on running integration tests, which typically require a longer execution time, additional resources, and a specific environment setup. While running these tests is necessary, it’s important to note that the feedback obtained from this job may not be as immediate as with other tests due to the nature of integration testing.
# setup environment here
bazel test //... --test_tag_filters=integration
Setting Test Timeouts for Efficient Resource Utilization
Bazel incorporates a timeout mechanism for each test target to prevent unnecessary resource consumption caused by tests that get stuck or take too long. You can configure timeouts for individual targets by utilizing the timeout attribute, which is an enum. For instance, it is recommended to assign a short timeout for unit tests to ensure their efficient execution.
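For example, a unit test target might declare the timeout directly (a sketch; the attribute works the same way for any *_test rule):
scala_test(
    name = "tests",
    srcs = ["ExampleTest.scala"],
    deps = ["//example/src/main"],
    timeout = "short",
)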
You’ll find example timeout margins for Bazel testing in this table:
timeout   | time limit
short     | 1 min
moderate  | 5 min
long      | 15 min
eternal   | 60 min
Customizing Test Timeouts for Enhanced Control
To have more control over test timeouts, Bazel offers the flexibility to override all timeouts using the --test_timeout=seconds flag. Additionally, you can specify individual timeouts for the four categories using the --test_timeout=seconds,seconds,seconds,seconds format. Here are a couple of examples:
Setting a uniform 30-second timeout for all tests, regardless of their size:
--test_timeout=30
Establishing more conservative timeout values: 5 seconds for short tests, 1 minute for moderate tests, 3 minutes for long tests, and 10 minutes for eternal tests:
--test_timeout=5,60,180,600
Pro Tip: To receive detailed warnings about tests with excessively generous timeouts, add the line test --test_verbose_timeout_warnings to your .bazelrc file. This will enable Bazel to highlight tests that might have incorrect timeouts. It is crucial to adjust timeouts based on these warnings, as incorrect timeouts can obscure regressions. For instance, a test that was previously fast but has become slow might go unnoticed without proper timeout management.
Bazel test sizes for resource allocation
To enhance the management of test targets, Bazel introduces the concept of test sizes. Each test target can be assigned a size, indicating its “heaviness” in terms of time and resources required for execution.
Bazel defines four test sizes: small (typically unit tests), medium (integration tests), and large or enormous (end-to-end tests).
By default, Bazel sets a timeout based on the test size, which can be customized using the timeout attribute discussed earlier. It’s important to note that the timeout applies to all tests within the target, rather than to individual tests. How the test sizes map onto your own workspace is also up to you.
When running tests locally, Bazel utilizes the test size for scheduling purposes, ensuring that the execution respects the available CPU and RAM resources specified with the --local_cpu_resources and --local_ram_resources options. This prevents overloading the local machine by running an excessive number of resource-intensive tests simultaneously.
Here are the default timeouts and assumed peak local resource usages corresponding to each test size:
Size      | RAM (MB) | CPU (cores) | Default timeout
small     | 20       | 1           | short
medium    | 100      | 1           | moderate
large     | 300      | 1           | long
enormous  | 800      | 1           | eternal
When defining test targets in Bazel, specifying the test size alone is often sufficient, as it automatically implies a corresponding timeout duration. For instance, if you designate a test as small, Bazel will assign a short timeout by default, unless you explicitly specify a different timeout.
Tip: You can further filter tests based on their size by utilizing the --test_size_filters flag. This flag follows the same syntax as the --test_tag_filters flag, allowing you to run tests based on their size selectively.
Efficient Resource Allocation in Bazel
To ensure optimal performance, Bazel strives to utilize all available computing resources, including CPU cores and RAM. By default, it aims to maximize speed and efficiency by utilizing as much CPU capacity as possible and 2/3 of the available RAM. However, in certain cases, Bazel may encounter challenges in accurately detecting available resources within a container, which can result in out-of-memory errors.
To address this issue, it is beneficial to explicitly specify the available resources using the --local_cpu_resources and --local_ram_resources flags. By specifying these flags, you can precisely allocate the desired amount of CPU cores and RAM for Bazel to utilize during the build process. For instance, using the syntax --local_cpu_resources=2 --local_ram_resources=1024 assigns 2 CPU cores and 1GB of RAM to Bazel.
Additionally, Bazel provides the flexibility to set resource allocation relative to the detected resources. For example, you can allocate half of the available CPU cores by specifying the appropriate value.
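For instance, recent Bazel versions accept the HOST_CPUS and HOST_RAM keywords followed by a multiplier; the values below are only an illustration, so check them against your Bazel version:
--local_cpu_resources=HOST_CPUS*.5
--local_ram_resources=HOST_RAM*.67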
By explicitly specifying the available resources, you can enhance Bazel’s performance, mitigate out-of-memory errors, and ensure smooth and efficient execution of your build processes.
To ensure correct usage and syntax, refer to the Bazel documentation for these flags.
Managing test targets in large projects often involves repetitive and boilerplate code. However, with the BUILD file language Starlark, a Python dialect, we can create reusable functions to streamline the process. Let’s consider an example and explore how to prepare presets for test targets.
In the following example, we define a target for typical unit tests in a Scala project:
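A sketch of what such a target might look like before any refactoring; the //common/test-utils label is a placeholder for a shared test utilities target:
scala_test(
    name = "tests",
    srcs = ["ExampleTest.scala"],
    size = "small",
    tags = ["unit"],
    deps = [
        "//example/src/main",
        "//common/test-utils",  # hypothetical shared test utilities
    ],
)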
To avoid repetitive code and simplify test target definitions, we can create a separate file called test_rules.bzl under tools/build_rules/ directory. Make sure to include an empty BUILD.bazel file in the same directory (tools/build_rules/BUILD.bazel) for Bazel to recognize it as a package.
By defining a function like unit_test, we can eliminate repetitive code and simplify the test target definitions. This function sets default values for the size and tags attributes, as well as adds the necessary dependency on the common test utilities.
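A sketch of such a function, again treating //common/test-utils as a placeholder for the shared test utilities:
# tools/build_rules/test_rules.bzl

load("@io_bazel_rules_scala//scala:scala.bzl", "scala_test")

def unit_test(**kwargs):
    # Fill in defaults typical for unit tests, while letting callers override them
    scala_test(
        size = kwargs.pop("size", "small"),
        tags = kwargs.pop("tags", ["unit"]),
        deps = kwargs.pop("deps", []) + ["//common/test-utils"],  # hypothetical shared test utilities
        **kwargs
    )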
If you used a different approach, such as named arguments with default values, you would have to enumerate and pass through every attribute you want to support. In the main BUILD.bazel file, we can now use the unit_test function:
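A sketch, assuming test_rules.bzl lives under tools/build_rules as described above:
# example/src/test/scala/com/example/BUILD.bazel

load("//tools/build_rules:test_rules.bzl", "unit_test")

unit_test(
    name = "tests",
    srcs = ["ExampleTest.scala"],
    deps = ["//example/src/main"],
)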
With this approach, we only need to specify the different attributes for each test target, while the kind of test is encoded in the function name. We no longer have to add the dependency on common test utilities manually, and it becomes easier to rename tags, change default sizes, or adjust timeouts.
In case a specific test target requires a longer timeout while still being categorized as a unit test, we can easily override the timeout:
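For example (the file name here is illustrative):
unit_test(
    name = "tests",
    srcs = ["SlowExampleTest.scala"],
    deps = ["//example/src/main"],
    timeout = "moderate",
)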
It’s worth noting that the attributes used here (timeout, size, deps, etc.) are not specific to Scala. They are applicable to all *_test rules and are compatible with Java, JUnit, and other languages and frameworks.
By utilizing presets and reusable functions, we can streamline the process of defining test targets, reduce code duplication, and ensure consistency across test configurations in our projects.
Simplifying the process of creating and maintaining BUILD files is essential to improve productivity in Bazel testing. Here are a few more tricks that can help simplify your BUILD files.
Prelude
To avoid repetitive load statements in each BUILD file, you can create a tools/build_rules/prelude_bazel file, with a corresponding BUILD.bazel file as mentioned earlier, and add commonly used load statements there. This file acts as a prelude that is automatically included at the beginning of each BUILD file in your workspace, so you can skip redundant load statements.
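For example, a prelude that exposes the unit_test macro defined earlier might contain just its load statement (a sketch):
# tools/build_rules/prelude_bazel

load("//tools/build_rules:test_rules.bzl", "unit_test")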
Default target name
It is common to have only one target per package and name it after the enclosing directory. This allows for shorter labels when declaring dependencies or running commands from the CLI. For example, instead of referring to //example/src/main:main, you can simply use //example/src/main.
To automatically compute the default target name, you can use the following function:
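One possible sketch, placed alongside the macro in test_rules.bzl:
# tools/build_rules/test_rules.bzl
def default_target_name():
    # native.package_name() returns e.g. "example/src/main"; take the last path component
    return native.package_name().split("/")[-1]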
We can then extend the unit_test function to fill in the name parameter if it is not specified.
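Building on the earlier sketch:
def unit_test(name = None, **kwargs):
    scala_test(
        name = name or default_target_name(),
        size = kwargs.pop("size", "small"),
        tags = kwargs.pop("tags", ["unit"]),
        deps = kwargs.pop("deps", []) + ["//common/test-utils"],  # hypothetical shared test utilities
        **kwargs
    )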
With these changes, your BUILD file can be simplified to:
# example/src/test/scala/com/example/BUILD.bazel

unit_test(
    srcs = ["ExampleTest.scala"],
    deps = ["//example/src/main"]
)
If you have a consistent project structure, you can even eliminate the need to specify the srcs attribute by using default sources as globs. For example, you can use glob(["*.scala"]) to automatically include all Scala files.
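A sketch of how the macro could incorporate that default:
def unit_test(name = None, srcs = None, **kwargs):
    scala_test(
        name = name or default_target_name(),
        srcs = srcs or native.glob(["*.scala"]),  # all Scala files in this package by default
        size = kwargs.pop("size", "small"),
        tags = kwargs.pop("tags", ["unit"]),
        deps = kwargs.pop("deps", []) + ["//common/test-utils"],  # hypothetical shared test utilities
        **kwargs
    )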
When running builds and tests, it can be beneficial to include the --keep_going flag, particularly in continuous integration (CI) environments. This flag allows Bazel to continue executing tasks and provides a comprehensive overview of all encountered issues that require attention. By default, Bazel operates in a fail-fast manner, which means it stops at the first error it encounters.
By using the --keep_going flag wisely, you ensure a smoother development and testing process, facilitating faster issue resolution and overall project stability.
Bazel caches all test runs, allowing you to rerun tests without any code changes in a much shorter time. Cached results are replayed along with the test summary and marked with the expression “(cached)”. When you change code, Bazel reruns only the potentially affected tests, and it does not cache failed tests by default.
To disable caching for a specific test target, add an external tag to indicate dependency on non-deterministic external resources.
You can control test caching globally using the --cache_test_results=(yes|no|auto) flag.
Using the default option ‘auto’, Bazel will re-run a test only if one of the following conditions is met:
Bazel detects changes in the test or its dependencies
The test is marked as external
Multiple test runs were requested with --runs_per_test
The test previously failed
Setting the option to no, all tests are executed unconditionally.
Setting it to yes, the caching behavior is the same as auto, except that test failures and test runs requested with --runs_per_test are also cached.
Occasionally, certain tests may be flaky, which essentially means they fail unpredictably. It’s worth debugging each failed test and fixing the root cause, and Bazel offers the helpful --runs_per_test flag for reproducing such failures.
For example: --runs_per_test=50 would run each requested test 50 times. Adding the flag --runs_per_test_detects_flakes marks tests that only fail unpredictably as flaky rather than failed. If you lack time to fix some flaky tests, you can mark them with the attribute flaky = True.
unit_test(
    srcs = ["ExampleTest.scala"],
    deps = ["//example/src/main"],
    flaky = True
)
With flaky = True, Bazel runs the test up to 3 times and marks it as failed only if it fails every time. You can also use a flag to treat all tests as flaky with the desired number of attempts: --flaky_test_attempts=2. Some developers may find it useful to enable this flag in a CI environment to cover for random, rarely failing tests.
Occasionally you have to run tests one after another because they use a shared resource, for instance services in Docker containers that are set up before the tests start. Such tests should be marked with the exclusive tag. This generally works, but in case of any problems, running with just one worker can be enforced with the --local_test_jobs=1 flag, for example: bazel test //... --test_tag_filters=exclusive --local_test_jobs=1
When tests involve network usage, it often introduces a potential cause of unpredictable failures. Fortunately, with Bazel’s sandboxing feature, it is possible to disable network access by default. To enable this restriction, add these lines to the .bazelrc file:
build --sandbox_default_allow_network=false
test --sandbox_default_allow_network=false
By enabling this feature, you can prevent unintended non-hermetic actions or tests that rely on remote services. To re-enable network access for a particular target, use the requires-network tag.
Bazel conveniently packages all resource files within a jar, making them inaccessible through standard file system operations. To read these files, it is advisable to utilize the standard JVM resources mechanism, such as getResourceAsStream. For tasks like dynamically listing files in a directory or performing similar operations, the recommended approach is to create a JarFileSystem using the FileSystems API.
Inspecting a jar file in order to copy multiple files can be challenging. To help you get started, a sketch is provided below; you can find the complete example in the example project.
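What follows is a minimal sketch in Scala (2.13+), assuming the files to copy are packaged as resources under a hypothetical "data" directory; the object and method names are purely illustrative:
// A minimal sketch: unpack resource files from the test jar into TEST_TMPDIR.
import java.net.URI
import java.nio.file.{FileSystems, Files, Path, Paths}
import java.util.Collections
import scala.jdk.CollectionConverters._

object ResourceUnpacker {

  def unpackResources(resourceDir: String = "data"): Path = {
    // Path to the empty directory that Bazel prepared for this test target
    val tmpDir = Paths.get(sys.env("TEST_TMPDIR"))

    // Resources are packaged inside the test jar; locate that jar via the class loader
    val resourceUrl = getClass.getClassLoader.getResource(resourceDir)
    val jarUri = URI.create(resourceUrl.toURI.toString.split('!').head)

    // Mount the jar as a file system and copy every regular file under the resource directory.
    // (If a file system for this jar is already open, FileSystems.getFileSystem could be used instead.)
    val jarFs = FileSystems.newFileSystem(jarUri, Collections.emptyMap[String, AnyRef]())
    try {
      val source = jarFs.getPath("/" + resourceDir)
      Files.walk(source).iterator().asScala.filter(Files.isRegularFile(_)).foreach { file =>
        val target = tmpDir.resolve(source.relativize(file).toString)
        Files.createDirectories(target.getParent)
        Files.copy(file, target)
      }
    } finally {
      jarFs.close()
    }
    tmpDir
  }
}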
When running tests with Bazel, each test is assigned a writable location, and its path can be accessed through the environment variable TEST_TMPDIR. This location serves as a suitable place for unpacking files from resources or creating temporary files.
Generally, it is permissible to use the /tmp directory as well, especially when there is a need to avoid exceeding the path length limit. However, when utilizing the /tmp directory, it becomes the responsibility of your tests to clean up after themselves, prevent conflicts with other files, and handle these files appropriately. It is also acceptable to employ mechanisms provided by test frameworks, such as the TemporaryFolder rule in JUnit, as these methods effectively manage files and ensure test isolation.
It’s important to note that there is no direct way to access the path of your Bazel workspace within tests for reading or writing files. If you find yourself in a situation where this is necessary, it is likely that you require a binary target rather than a test target. Binary targets can be executed with a working directory located within the repository.
If, for any specific reason, you truly need to disable sandboxing and allow writes to the entire filesystem, you can do so by using the --spawn_strategy=local option. However, it is recommended to refer to the documentation to learn more about sandboxing and its implications.
Generally, Bazel performs best with small, specific targets that have explicit, finely defined dependencies. This minimizes the number of targets that need to be rebuilt or retested when code changes, resulting in better cache utilization. Moreover, having more targets enables better parallel processing. When dealing with tests that generate a significant amount of logs, and the continuous integration (CI) system only displays logs for failed tests, it is much easier to navigate the output if each target contains only one test.
Likewise, timeouts play a crucial role. If each test can take up to 10 seconds and there are 100 tests in a target, you would need to set a timeout of more than 15 minutes. If one test happens to encounter a deadlock, the CI system would waste 15 minutes waiting, and you would be uncertain about which of these 100 tests would pass or fail. However, with an approach of having one target per test, each test could have a timeout of 10 seconds. As a result, 99 tests would run quickly and pass, while the problematic one would fail due to a timeout after just 10 seconds.
It may be a burden to define a separate target for each test file, yet in some cases it can be worth it. Fortunately, when using Scala, you don’t have to go through this manual process: besides the standard scala_test rule, rules_scala also provides scala_test_suite. This rule generates a target for each individual file, based on a glob or a list of files, and then consolidates them under a test_suite.
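A sketch of such a target (the labels are illustrative):
load("@io_bazel_rules_scala//scala:scala.bzl", "scala_test_suite")

scala_test_suite(
    name = "tests",
    srcs = glob(["*.scala"]),
    deps = ["//example/src/main"],
)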
Tip: It’s important to note that not all programming languages have an equivalent rule like scala_test_suite. For instance, there is no such rule available for Kotlin. However, implementing a similar rule is relatively straightforward, and you can refer to the implementation in rules_scala as a guide.
This approach has a drawback, particularly in the JVM ecosystem, because Bazel runs a separate JVM for each target without persistent workers. Initializing the JVM and test framework takes time, and the code is not optimized by the JIT (Just-In-Time) compiler. If multiple tests are grouped within a single target, they will all run in the same JVM.
An experiment demonstrated that running the same unit tests divided into 53 targets and 131 targets resulted in a time difference of 1 minute versus 2 minutes and 10 seconds.
For intensive tests, it is more reasonable to split them to leverage parallelism and caching, as the impact of spawning a new JVM is less noticeable. However, I recommend measuring the impact on your project to determine whether the trade-off is acceptable.
Note: The automatic test splitting provided by scala_test_suite doesn’t contribute significantly to caching, as all tests share dependencies. So, when editing production code for one test, all tests will be rerun. It is beneficial when you are only modifying the test itself or dealing with intermittent failures or timeouts. In such cases, a rerun will only execute the failed tests.
Bazel provides the capability to filter tests using the --test_filter flag. In the case of Scala, only class-level filtering is supported. Wildcards can be used to avoid typing the fully qualified name of a test or to run a group of tests with similar names or packages (Please refer to the Wildcard logic for more details).
$ bazel test //location/src/test --test_filter="*.InMemoryRepositorySpec"  # tests with class named "InMemoryRepositorySpec"
$ bazel test //location/src/test --test_filter="*.oauth.*"                 # tests that contain "oauth" component in their package path
It’s important to note that for Scala tests the --test_filter flag does not support method-level filtering. To achieve greater granularity, it is necessary to rely on the flags provided by the ScalaTest runner; for example, the -z argument filters by test name substring:
$ bazel test //location/src/test --test_arg='-z' --test_arg='when popped'  # run only tests with a name that contains 'when popped'
Since this limitation is Scala-specific, it is worth knowing that Java tests do support method-level filtering: you can filter methods using the path.to.Class#method syntax.
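For instance, a sketch with a Java target (the target, class, and method names are illustrative and assume the default JUnit runner):
$ bazel test //example/src/test/java/com/example:ExampleTest --test_filter=com.example.ExampleTest#sumsTwoNumbers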
By combining both of these flags, you can further refine the scope of your test execution. If you find that the results are not as expected, you have the flexibility to experiment with different filters and observe whether the desired tests are being executed or not. The --test_output flag can be particularly helpful in providing significant insights into the test execution process.
By default, Bazel provides limited output from tests. Even if a test fails, you will only see the file path containing the logs, without the exact error message. This can be frustrating as it requires manually copying the path and opening the file each time.
However, this behavior can be adjusted using the --test_output flag. There are different levels of output that can be specified:
summary level: This is the default behavior described above.
errors level: This level includes the log output of failed tests directly in the overall output. To make this level the default behavior, I recommend adding the line --test_output=errors to your .bazelrc configuration file.
streamed level: This is the most verbose level. It displays the output from both passing and failing tests as they are being executed. It also includes logs from the test discovery mechanism, making it useful for debugging purposes.
By setting the appropriate --test_output flag, you can control the amount of output displayed during test execution in Bazel.
Bazel offers support for running tests in various popular languages and test frameworks. The level of support may vary in terms of configuration requirements and granularity of test cases, but continuous improvement efforts are underway, and Bazel serves as a platform for implementing missing features.
Parallel testing is a strength of Bazel, although the granularity of test tasks may not be as fine-grained as some other tools like Maven’s Surefire plugin. However, Bazel provides options to disable parallelism entirely or selectively for specific test targets that may contend for shared resources.
Tests can be filtered based on their size, including duration and memory requirements, as well as their purpose, such as unit tests or integration tests. Moreover, arbitrary grouping of tests can also be utilized for filtering purposes.
Bazel incorporates test result caching, which avoids re-running tests when their dependencies haven’t changed. However, this behavior can be disabled if desired. Additionally, Bazel allows for the configuration of retrying flaky test cases a specified number of times, enabling more robust test execution.