Deep dive into Bazel queries: from basics to advanced use cases
Łukasz Wawrzyk
Senior Software Engineer
Published: Nov 13, 2024 | 23 min read
Whether you're a seasoned developer or just dipping your toes into software building, understanding how to use Bazel Query can transform how you manage and optimize your build processes.
This article will guide you through a notable subset of Bazelʼs querying features backed by examples. These are useful for debugging, exploration, and tooling. We’ll start with basic Bazel queries with target patterns, then we will move on to demonstrating Bazel aquery and cquery. At the end, we will provide some tips specifically for developing tools that utilize these queries.
What is a Bazel Query? It’s a feature that allows exploring and analyzing the project’s build graph within the Bazel ecosystem. Software engineers use Bazel Queries to better understand the relationships in the dependency graph and to achieve better build times and easier debugging.
In other words, Bazel Query is a tool that helps software engineers see how different parts of their projects fit together so they can build and optimize their code faster.
With its ability to delve into complex dependency relationships, Bazel Queries provide insights crucial for improving build times and debugging efficiency. They act as a window into your project’s structure, helping you see how different components interact.
Understanding Dependency Relationships
At the heart of any software project lies a web of dependencies. Using Bazel Query, developers can unravel these connections, gaining clarity on which parts of the project rely on others. This understanding helps in identifying potential bottlenecks or conflicts early in the development process. Queries are crafted using specific commands that pinpoint targets or rules within the codebase.
For instance, by executing bazel query "deps(//path/to:target)", one can list all direct and indirect dependencies associated with a particular target. Such insights allow developers to make informed decisions about optimizations or refactoring tasks necessary for efficient builds.
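To make "direct and indirect dependencies" concrete, here is a minimal Python sketch of what a deps() query computes conceptually: the transitive closure of a target over the dependency graph. The graph and its labels are made up for illustration, and Bazel's real implementation is considerably more involved.

```python
from collections import deque

# Hypothetical dependency graph: each label maps to its direct dependencies.
GRAPH = {
    "//app:main": ["//lib:greeting", "//lib:util"],
    "//lib:greeting": ["//lib:util"],
    "//lib:util": [],
}


def deps(graph, target):
    """Return the target plus everything it depends on, directly or indirectly."""
    seen = {target}
    queue = deque([target])
    while queue:
        current = queue.popleft()
        for dep in graph.get(current, []):
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return seen


print(sorted(deps(GRAPH, "//app:main")))
```

Like the real deps() function, the sketch includes the starting target itself in the result.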
Additionally, Bazel Queries can lay the groundwork for custom tools that automate and enhance your workflow, and a lot of improvements can be achieved with a single query. Some examples of such tools:
bazel-diff uses queries to identify the smallest set of targets affected by code changes, which significantly accelerates build times.
The bazel-bsp server uses cquery to find the runtime classpath before running a JVM app.
Hedronʼs compile-commands-extractor provides C language support for LSP-compatible editors such as VS Code and Vim.
The luminartech/dev-tools repository, recently shared on Bazel Slack, sets up launch configurations in VS Code for quick debugging of C targets.
fastpass enables importing Scala projects to Bloop for IntelliJ and VS Code integration.
Bazel Query is a versatile tool that developers can use in various scenarios to enhance their productivity and efficiency. Its ability to navigate complex dependency structures makes it an indispensable asset when optimizing builds, troubleshooting issues, and more.
Optimizing Build Processes
One of the primary uses of Bazel Query is in optimizing build processes. When developers make changes to code, they need to ensure that only relevant parts of the project are rebuilt, saving both time and resources. This is where Bazel Query shines:
By executing commands like "bazel query 'rdeps(//..., //path/to:target)'", teams can identify which targets depend on a modified target or file.
These insights allow them to focus rebuild efforts solely on affected targets, minimizing unnecessary work and speeding up development cycles.
Furthermore, by integrating these queries into continuous integration systems, tests can be automatically triggered only for impacted areas, maintaining high quality while reducing testing time.
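Finding what is affected by a change is a reverse-dependency question, which the query language expresses with rdeps. As a rough sketch of how a CI script might build such a query, the hypothetical helper below combines the query functions kind, rdeps and set into one expression that selects test rules depending on a list of changed labels. The helper name and the sample labels are assumptions, and a real setup would still need to map changed file paths to labels first.

```python
def affected_tests_query(changed_labels, universe="//..."):
    """Build a query for test rules that transitively depend on the changed labels."""
    if not changed_labels:
        raise ValueError("need at least one changed label")
    # set(a b c) groups several labels into one argument, separated by spaces.
    changed = " ".join(sorted(changed_labels))
    # rdeps(universe, x): targets inside `universe` that transitively depend on x.
    # kind(pattern, expr): keep only targets whose kind matches the regex.
    return 'kind(".*_test rule", rdeps({}, set({})))'.format(universe, changed)


print(affected_tests_query(["//lib:greeting.cc", "//lib:util.cc"]))
```

The resulting string can be passed as the single query argument to bazel query, so one Bazel invocation covers all changed labels.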
Troubleshooting Dependency Issues
Another critical application for Bazel Query lies in troubleshooting dependency problems within projects. Software often involves intricate networks of dependencies that might lead to conflicts or errors if not managed properly:
Developers use queries such as "bazel query 'allpaths(//path/to:start,//path/to:end)'" to trace paths between specific targets, revealing hidden relationships causing issues.
This functionality helps pinpoint problematic dependencies quickly before they escalate into larger problems during runtime.
By understanding how different parts interact through these detailed outputs, engineers can address root causes efficiently rather than merely fixing symptoms.
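Conceptually, an allpaths query keeps the part of the graph that lies on some dependency path between the two targets. A minimal Python sketch of that idea, on a made-up graph, is to intersect everything reachable from the start with everything that can reach the end. This is an illustration of the concept, not Bazel's actual algorithm.

```python
def reachable(graph, start):
    """All nodes reachable from start by following edges, including start."""
    seen, stack = {start}, [start]
    while stack:
        for nxt in graph.get(stack.pop(), []):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def allpaths(graph, start, end):
    """Nodes lying on at least one dependency path from start to end."""
    # Build the reversed graph so we can walk from `end` back to its dependents.
    reverse = {}
    for node, node_deps in graph.items():
        for dep in node_deps:
            reverse.setdefault(dep, []).append(node)
    return reachable(graph, start) & reachable(reverse, end)


# Hypothetical graph: //app reaches //c through //a and //b, while //d is unrelated.
GRAPH = {"//app": ["//a", "//b", "//d"], "//a": ["//c"], "//b": ["//c"], "//c": [], "//d": []}
print(sorted(allpaths(GRAPH, "//app", "//c")))
```

If no path exists, the intersection is empty, which mirrors an empty query result.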
Managing Large Codebases
For those managing large codebases with numerous interconnected elements, Bazel Query proves invaluable once again:
Executing commands like "bazel query '//...'" provides an overview of all targets within a repository at any given point in time.
In the following chapter, we will use TARGET_PATTERN syntax to demonstrate how Bazel queries work. These examples will show you how to navigate and understand the build targets within a project's structure.
Bazel queries and TARGET_PATTERN
Target Pattern is a syntax used to specify a set of targets. In fact, Bazel Queries are a perfect tool to demonstrate how the syntax of target patterns works.
Letʼs start with simple examples. When you issue a query in Bazel using a target pattern, Bazel interprets that pattern and gives you back a list of all the actual build targets that the pattern refers to, ensuring you have precise control over which parts of your build you're interacting with.
//foo:bar - target bar in package foo
% bazel query //src/main/scala/lib:greeting
//src/main/scala/lib:greeting
That was our first query. It didnʼt do all that much: we got the same label back. At least it confirms that such a target exists.
//foo - shorthand for //foo:foo
% bazel query //core
//core:core
We can see that Bazel expanded the target pattern to a target label.
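The shorthand rule is simple enough to mimic in a few lines. The following Python sketch uses a hypothetical helper that expands //foo to //foo:foo the way the query result suggests. It deliberately ignores wildcards, the root package and repository prefixes, so treat it as an illustration rather than a label parser.

```python
def expand_shorthand(label):
    """Expand '//foo/bar' to '//foo/bar:bar'; leave explicit ':name' labels alone."""
    if ":" in label:
        return label
    package = label.rstrip("/")
    # The implied target name is the last path segment of the package.
    return "{}:{}".format(package, package.rsplit("/", 1)[-1])


print(expand_shorthand("//core"))
print(expand_shorthand("//src/main/scala/lib:greeting"))
```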
//foo/... - recursive wildcard, all targets in package foo and any subpackages
% bazel query //src/main/scala/...
//src/main/scala/cmd:runner
//src/main/scala/lib:greeting
We found 2 targets here, in subpackages cmd and lib. There was nothing directly in package //src/main/scala. In fact it is not even a package (no BUILD file there), but with recursive wildcard, we can still use it.
//foo:all - all targets in package foo
% bazel query //src/main/scala/cmd:all
//src/main/scala/cmd:runner
We have just one target here.
//foo:* - all targets and files in package foo
It works like :all, but it also returns files, so we get both the files and the targets in the package.
% bazel query //src/main/scala/cmd:*
zsh: no matches found: //src/main/scala/cmd:*
The shell tried to expand the asterisk itself, so the pattern has to be quoted:
% bazel query '//src/main/scala/cmd:*'
//src/main/scala/cmd:BUILD
//src/main/scala/cmd:Runner.scala
//src/main/scala/cmd:runner
//src/main/scala/cmd:runner.diagnosticsproto
//src/main/scala/cmd:runner.jar
//src/main/scala/cmd:runner.sdeps
//src/main/scala/cmd:runner.statsfile
//src/main/scala/cmd:runner_MANIFEST.MF
//src/main/scala/cmd:runner_deploy.jar
With the all-encompassing wildcard, we now have multiple items: the BUILD file, Runner.scala, which is the source file, and the deploy jar as well. To better understand what each of these items is, letʼs introduce a new concept, which is the output format.
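The zsh error for the unquoted :* pattern is a reminder that the shell sees the command line before Bazel does. A tool that builds Bazel commands can sidestep this in two ways, sketched below in Python: quote the pattern with shlex.quote when composing a shell string, or pass an argument list so that no shell is involved at all.

```python
import shlex

pattern = "//src/main/scala/cmd:*"

# As a shell string: shlex.quote adds the single quotes that the shell needs.
shell_command = "bazel query " + shlex.quote(pattern)
print(shell_command)

# As an argument list (for example for subprocess.run): no shell expands the
# asterisk, so the pattern can be passed unquoted.
argv = ["bazel", "query", pattern]
print(argv)
```

The argument-list form is usually the safer default for tools, because it behaves the same regardless of which shell the user runs.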
Output Format
There are multiple ways in which Bazel can format the query output, selected with the --output option. Valid values are label, label_kind, build, minrank, maxrank, package, location, graph, xml, proto, streamed_jsonproto, and streamed_proto.
Below, you can see the example in the output format:
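As a hedged illustration of how a tool might consume one of these formats, the Python sketch below parses label_kind output, where each line holds a kind (such as "source file" or "<rule class> rule") followed by a label. The sample text is hand-written in that shape rather than captured from a real run, and the scala_binary rule kind is an assumption.

```python
# Hand-written sample in the shape of `bazel query --output=label_kind` output.
SAMPLE_OUTPUT = """\
scala_binary rule //src/main/scala/cmd:runner
source file //src/main/scala/cmd:Runner.scala
generated file //src/main/scala/cmd:runner_deploy.jar
"""


def parse_label_kind(output):
    """Split each line into (kind, label); the label is the last word on the line."""
    entries = []
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue
        kind, label = line.rsplit(" ", 1)
        entries.append((kind, label))
    return entries


for kind, label in parse_label_kind(SAMPLE_OUTPUT):
    print(label, "->", kind)
```

Splitting on the last space assumes that labels contain no spaces, which holds for the vast majority of projects.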
Bazel aquery
The action graph query, or aquery, command allows you to query for actions in the build graph. It operates on the action graph and exposes information about Actions, Artifacts and their relationships.
This command is particularly useful when you need to better understand how the build works, i.e. what commands Bazel is going to run. It is an excellent tool to learn how different rules work.
Additionally, the aquery command runs on top of a regular Bazel build and inherits the set of options available during a build.
Example of basic aquery
Below you can see a query of a cc_library target with 2 sources. There are 2 compile actions, one for each source file, and a linking step. Note that in order to make this output shorter, this example excludes the list of input and output artifacts as well as the command line.
Filter Actions by Mnemonic: mnemonic(PATTERN, TARGETS)
In this example, we will see the commands and artifacts. This is very useful for debugging compilation issues, especially with include paths or the classpath for JVM languages. Note that I filtered the output using the mnemonic function, which is exclusive to aquery.
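For tooling, the same kind of filtering can also be done on the client side after requesting a structured format. The Python sketch below extracts the command lines of actions with a given mnemonic from aquery JSON. The sample document is hand-written, and the field names (actions, targets, targetId, mnemonic, arguments) are an assumption about the jsonproto layout that should be checked against real output from your Bazel version.

```python
import json

# Hand-written sample; the field names are assumed to match aquery's jsonproto output.
SAMPLE = """
{
  "targets": [{"id": 1, "label": "//lib:greeting"}],
  "actions": [
    {"targetId": 1, "mnemonic": "CppCompile",
     "arguments": ["gcc", "-Iinclude", "-c", "lib/a.cc", "-o", "a.o"]},
    {"targetId": 1, "mnemonic": "CppLink",
     "arguments": ["gcc", "-o", "libgreeting.so", "a.o"]}
  ]
}
"""


def command_lines(aquery_json, mnemonic):
    """Return (label, arguments) pairs for every action with the given mnemonic."""
    graph = json.loads(aquery_json)
    labels = {t["id"]: t["label"] for t in graph.get("targets", [])}
    return [
        (labels.get(action.get("targetId")), action.get("arguments", []))
        for action in graph.get("actions", [])
        if action.get("mnemonic") == mnemonic
    ]


for label, args in command_lines(SAMPLE, "CppCompile"):
    print(label, " ".join(args))
```

Filtering inside the query with mnemonic(...) keeps the output small, while client-side filtering like this lets one aquery call serve several questions.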
There are 2 more new functions in aquery: inputs and outputs. We can use them to look for actions that produce or consume specific files (with a regex).
Letʼs use them to find a header file. We see a header file included with the path contrib/kafka/filters/network/source/external/request_metrics.h. Let’s say that we want to find its contents to read.
% cat contrib/kafka/filters/network/test/metrics_integration_test.cc | head
Bazel cquery
The configured query, or cquery, is a variant of the Bazel query that correctly handles select() calls and build options' effects on the build graph. It achieves this by running over the results of Bazel's analysis phase. As a comparison, a regular Bazel query runs over the results of Bazel's loading phase, before options are evaluated.
% bazel cquery //src/main/scala/cmd:runner --output=starlark --starlark:expr='"\n".join([f.path for f in providers(target)["JavaInfo"].transitive_runtime_jars.to_list()])'
Code that you can write here is similar to what you can do in aspects. As resources, I suggest looking at the cquery docs to see how its Starlark dialect differs and, for aspects, at the aspects developed in the IntelliJ Bazel Plugin and/or bazel-bsp from JetBrains.
The IDE integrations have to extract a lot of things to import projects, so they can be a valuable resource. Also note that you can use the dir function to inspect which properties or functions are available on a target, its providers, etc., and explore them this way.
Note that you can write more complex code and pass in a file that defines a format function accepting the target as an argument. Use the --starlark:file=PATH option in that case.
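As a sketch of the file-based variant, the Python helper below writes a Starlark format function to disk and builds the matching cquery argument list. The Starlark body mirrors the --starlark:expr example from this article; the guard for targets without JavaInfo, the helper name and the file name are assumptions added for illustration.

```python
import os

# Starlark source for the format function. It mirrors the --starlark:expr
# example and adds a guard for targets without JavaInfo (an assumption).
FORMAT_SOURCE = '''\
def format(target):
    p = providers(target)
    if not p or "JavaInfo" not in p:
        return ""
    jars = p["JavaInfo"].transitive_runtime_jars.to_list()
    return "\\n".join([f.path for f in jars])
'''


def cquery_with_format_file(pattern, directory):
    """Write the format file into `directory` and return the cquery argv."""
    path = os.path.join(directory, "runtime_classpath.cquery")  # hypothetical name
    with open(path, "w") as handle:
        handle.write(FORMAT_SOURCE)
    return ["bazel", "cquery", pattern, "--output=starlark", "--starlark:file=" + path]
```

Keeping the Starlark in a file avoids the nested shell quoting that the one-line expression needs, which matters once the logic grows beyond a single expression.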
Be careful as cquery does not automatically wipe the build graph from previous commands and is therefore prone to picking up results from past queries. The official documentation provides more details on this.
Debugging Transitions `--transitions=full|lite`
A cquery can be used to debug transitions, as you can see in the example below:
Here are 5 best practices that we like to implement whenever working with Bazel Queries.
Use --keep_going for robustness
rdeps in particular tends to fail on some targets during analysis. You may want to warn the user that the Bazel command partially failed.
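A minimal Python sketch of this practice is shown below. It relies on Bazel's documented exit code 3 for a query that only partially succeeded, which is what a --keep_going run returns when some inputs were broken. The function names are hypothetical, and run_query is not exercised here because it needs a Bazel workspace.

```python
import subprocess


def interpret(returncode, stdout):
    """Return (labels, partial) for a finished query, or raise on a hard failure."""
    labels = [line for line in stdout.splitlines() if line]
    if returncode == 0:
        return labels, False
    if returncode == 3:
        # Partial success: results are usable but may be incomplete.
        return labels, True
    raise RuntimeError("bazel query failed with exit code {}".format(returncode))


def run_query(expression, bazel="bazel"):
    """Run a query with --keep_going and classify the outcome."""
    proc = subprocess.run(
        [bazel, "query", "--keep_going", expression],
        capture_output=True,
        text=True,
    )
    return interpret(proc.returncode, proc.stdout)
```

A caller can then show the labels it got and add a warning whenever the partial flag is set.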
Use machine-friendly output formats
Formats like label_kind or label are usually enough, but for more complex use cases we suggest:
query: proto, streamed_proto, streamed_jsonproto, xml
aquery: proto, streamed_proto, jsonproto, textproto
cquery: proto, streamed_proto, jsonproto, textproto
We would like to highlight the XML output format for query. For manual exploration, it serves a similar purpose to the build format, but lists have one entry per line. On the other hand, there is more XML clutter, so some may prefer one or the other in terms of readability. Note that rule inputs and outputs are included here and selects are flattened. As this is XML, it is easy for tools to parse.
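To give an idea of how a tool might read the XML format, here is a small Python sketch using the standard library parser. The sample document is hand-written, and its element and attribute names (rule, class, name, rule-input, rule-output) are an assumption about the layout, so compare it with real output first. The sample also omits the XML declaration line, which real output may include and which may need separate handling.

```python
import xml.etree.ElementTree as ET

# Hand-written sample in the assumed shape of `bazel query --output=xml`.
SAMPLE = """\
<query version="2">
  <rule class="genrule" location="/ws/gen/BUILD:1:8" name="//gen:version">
    <string name="name" value="version"/>
    <list name="srcs">
      <label value="//gen:version.txt.in"/>
    </list>
    <rule-input name="//gen:version.txt.in"/>
    <rule-output name="//gen:version.txt"/>
  </rule>
</query>
"""


def summarize_rules(xml_text):
    """Map each rule label to its rule class, rule inputs and rule outputs."""
    root = ET.fromstring(xml_text)
    summary = {}
    for rule in root.iter("rule"):
        summary[rule.get("name")] = {
            "class": rule.get("class"),
            "inputs": [e.get("name") for e in rule.findall("rule-input")],
            "outputs": [e.get("name") for e in rule.findall("rule-output")],
        }
    return summary


print(summarize_rules(SAMPLE))
```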
Use --query_file for long queries
When the query is generated and is very long, you can write it to a file and pass it with --query_file=PATH as a workaround for the command-line size limit.
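One option that fits this description is Bazel's --query_file flag, which reads the query expression from a file instead of the command line. The Python sketch below writes a generated expression to a temporary file and builds the argument list. The helper name is hypothetical, and the caller is expected to delete the file after Bazel has run.

```python
import os
import tempfile


def query_via_file(expression, directory=None):
    """Write the expression to a temp file and return (argv, path)."""
    fd, path = tempfile.mkstemp(suffix=".query", dir=directory, text=True)
    with os.fdopen(fd, "w") as handle:
        handle.write(expression)
    # With --query_file the expression is not passed on the command line.
    return ["bazel", "query", "--query_file=" + path], path


# Example: a generated expression with thousands of labels.
labels = ["//pkg{}:lib".format(i) for i in range(5000)]
argv, path = query_via_file("deps(set({}))".format(" ".join(labels)))
print(argv)
os.remove(path)
```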
Consider the performance requirements of your tool
For example, executing multiple queries in a loop doesnʼt scale well, and neither does calling many queries in general. If queries are too slow for your use case, consider using aspects to dump all the information in a single bazel build run. Aspects are a useful addition to the Bazel tooling developerʼs toolbox, right next to queries, but this is a topic for another presentation.
Use Custom Output Base
Call Bazel with bazel --output_base=/tmp/my_base query … so that you do not disrupt the userʼs own interaction with Bazel when your tool runs in the background, for example as part of an IDE extension. The downside is that it uses more resources and overrides the convenience symlinks.
For those who barely know anything about querying, we hope that you will more eagerly use this tool to help in your daily work. And for those who already worked with Bazel Queries, we hope you still learned something new.
Editor’s note: this article is based on the lecture originally presented by Łukasz Wawrzyk during the 2024 BazelCon in Mountain View, California. You can watch the full version on YouTube.