How to build a simple Scala application with Bazel

The build time of a project has a significant impact on a team’s development efficiency. The larger the code base, the longer it takes to build. And the longer the build time, the worse the developer experience becomes.

While sbt is a great build tool, its design (in particular, the lack of a reliable content-addressable build cache) makes it poorly suited to large projects.

In this blog post, I will share my experience with Bazel, a build system that achieves fast builds even in Google-scale repositories, by walking through the build of simple Scala applications.

What is Bazel?

Bazel is an artifact-based build system: the build is described as a declarative list of artifacts to build, their dependencies, and a limited set of options for the build process.

This design is what lets Bazel live up to its motto, "{Fast, Correct} - Choose two."

In this section, we will learn about the nature of Bazel by looking at what an artifact-based build system is and what “{Fast, Correct} choose two” means!

Artifact-based build system

Traditional build systems like Ant and Maven are called task-based build systems. In the build configuration for a task-based build system, we describe an imperative sequence of tasks: do task A, then task B, and then task C.

On the other hand, in artifact-based build systems such as Buck, Pants, and Bazel, we describe a declarative set of artifacts to build, a list of dependencies, and limited options for the build.

So, the basic idea of Bazel is that your build is a pure function:

Sources and dependencies are input.
The artifact is an output.
There are no side effects.
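
To make the pure-function idea concrete, here is a toy sketch in Python (chosen because Starlark, Bazel's configuration language, is a Python dialect). The `build` function and its inputs are hypothetical and only model the idea: a SHA-256 digest stands in for a real compiler, and nothing here reflects how Bazel is implemented.

```python
import hashlib


def build(sources: dict[str, str], deps: list[bytes]) -> bytes:
    """Toy 'compiler': the artifact depends only on sources and deps."""
    digest = hashlib.sha256()
    for path in sorted(sources):  # sort so dict ordering never affects the result
        digest.update(path.encode())
        digest.update(sources[path].encode())
    for dep in deps:  # artifacts of dependencies are inputs too
        digest.update(dep)
    return digest.digest()  # no files written, no clock or environment read


lib = build({"Greeting.scala": "object Greeting"}, deps=[])
app_first = build({"Runner.scala": "object Runner"}, deps=[lib])
app_second = build({"Runner.scala": "object Runner"}, deps=[lib])
print(app_first == app_second)  # same inputs give the same artifact
```

Because the function reads nothing except its arguments, repeating the call can never produce a different artifact.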

For more details about the concept of artifact-based build systems, I recommend reading through Chapter 18 of Software Engineering at Google.

{Fast, Correct} choose two

To better understand this claim, we first need to understand Bazel’s hermeticity property.

A hermetic build system like Bazel returns the same output whenever it is given the same input sources and configuration.

Thanks to hermeticity, Bazel is able to provide reproducible builds: given the same inputs, Bazel always returns the same output on everyone’s computer.

And thanks to the reproducible build, Bazel can provide a remote cache feature to share the build cache within a team. With remote cache, we can build large projects fast by using the build cache shared across team members. Build reproducibility is also a critical aspect of supply chain security, required to achieve the highest level of SLSA compliance.
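
As a rough illustration of why reproducibility makes a shared cache safe, the sketch below stores artifacts under a hash of their inputs. `BuildCache` and `slow_compile` are hypothetical names, and an in-memory dictionary stands in for a remote server; Bazel's actual remote cache is considerably more involved.

```python
import hashlib


class BuildCache:
    """Toy content-addressable cache: the key is a hash of the build inputs."""

    def __init__(self):
        self._store = {}
        self.hits = 0

    def get_or_build(self, inputs: bytes, compile_fn):
        key = hashlib.sha256(inputs).hexdigest()
        if key in self._store:  # someone already built exactly these inputs
            self.hits += 1
            return self._store[key]
        artifact = compile_fn(inputs)
        self._store[key] = artifact
        return artifact


def slow_compile(inputs: bytes) -> bytes:
    # Stand-in for an expensive, deterministic compilation step.
    return b"jar:" + inputs[::-1]


shared = BuildCache()
alice = shared.get_or_build(b"Greeting.scala v1", slow_compile)  # cache miss, compiles
bob = shared.get_or_build(b"Greeting.scala v1", slow_compile)    # cache hit, no compile
print(alice == bob, shared.hits)
```

Reusing another machine's artifact is only sound if the same inputs always yield the same output, which is exactly what hermeticity provides.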

Bazel basics

In this tutorial, we are building a simple Scala application with Bazel. The complete source code is available here: https://github.com/tanishiking/bazel-tutorial-scala/tree/main/01_scala_tutorial

The project structure looks like this:

|-- WORKSPACE
`-- src
    `-- main
        `-- scala
            |-- cmd
            |   |-- BUILD.bazel
            |   `-- Runner.scala
            `-- lib
                |-- BUILD.bazel
                `-- Greeting.scala

The Bazel configuration files are WORKSPACE and BUILD.bazel files.

The WORKSPACE file is about getting stuff from the outside world into your Bazel project.
BUILD.bazel files are about what is happening inside your Bazel project.

The WORKSPACE file

The WORKSPACE file declares the external dependencies (for both Bazel and the JVM). For example, in the WORKSPACE file we download rules_scala, a Bazel extension for compiling Scala.

load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")  # Import a rule from Bazel's "standard library"
...
http_archive(  # This rule can download and import an archived repo
    name = "io_bazel_rules_scala",  # The name that will be used to reference the repo
    sha256 = "77a3b9308a8780fff3f10cdbbe36d55164b85a48123033f5e970fdae262e8eb2",
    strip_prefix = "rules_scala-20220201",  # Only files from this directory will be unpacked and imported
    type = "zip",
    url = "https://github.com/bazelbuild/rules_scala/releases/download/20220201/rules_scala-20220201.zip",
)

For more details, see the project README.

BUILD.bazel files

To define the build in Bazel, we write BUILD.bazel files.

Before jumping into the BUILD.bazel files, let’s take a quick look at the Scala files to build. This project consists of two packages, each containing a single Scala file.

// src/main/scala/lib/Greeting.scala
package lib
object Greeting {
  def sayHi = println("Hi!")
}

// src/main/scala/cmd/Runner.scala
package cmd
import lib.Greeting
object Runner {
  def main(args: Array[String]) = {
    Greeting.sayHi
  }
}

As you can see, lib.Greeting is a library module that provides the sayHi method, and cmd.Runner depends on lib.Greeting.

Next, let’s see how to write BUILD.bazel files to build these Scala sources.

scala_library

To build lib.Greeting in this example, we put the BUILD.bazel file adjacent to Greeting.scala, and we define a build target using the scala_library rule provided by rules_scala.

A rule in Bazel is a declaration of a set of instructions for building or testing code. For example, there is a set of rules for building Java programs (which Bazel supports natively). rules_scala provides a set of rules for building Scala programs.

scala_library compiles the given Scala sources and generates a JAR file.

# src/main/scala/lib/BUILD.bazel
load("@io_bazel_rules_scala//scala:scala.bzl", "scala_library")
scala_library(
    # unique identifier of this target
    name = "greeting",
    # list of Scala files to build
    srcs = ["Greeting.scala"],
)

The load statement imports the scala_library rule into the BUILD.bazel file.
scala_library is one of the build rules in Bazel; we describe what to build using rules.
The required attributes of scala_library are name and srcs.

bazel build

Now we have Scala sources to build and a BUILD.bazel configuration. Let’s build it using the bazel command line.

$ bazel build //src/main/scala/lib:greeting
...
INFO: Found 1 target...
Target //src/main/scala/lib:greeting up-to-date:
  bazel-bin/src/main/scala/lib/greeting.jar
INFO: Elapsed time: 0.152s, Critical Path: 0.00s
INFO: 1 process: 1 internal.
INFO: Build completed successfully, 1 total action

Build succeeded 🎉, but wait, what is //src/main/scala/lib:greeting?

Label

//src/main/scala/lib:greeting is something called a label in Bazel, and it points to the greeting in src/main/scala/lib/BUILD.bazel. In Bazel, we use a label to uniquely identify a build target.

A label consists of 3 components. For example, in @myrepo//my/app/main:app_binary,

@myrepo// specifies a repository name. If we omit this part, the label refers to the repository that contains this BUILD.bazel file. Therefore, we can omit the @myrepo part when referring to a label defined within the same repository.
my/app/main represents a path to the package (BUILD.bazel file) relative to the repository root.
:app_binary is a target name.

That being said, //src/main/scala/lib:greeting points to a target that is in the same workspace, defined in a BUILD.bazel file located at src/main/scala/lib, and the target name is greeting.
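
The three components can be pulled apart with a few lines of string handling. The sketch below is illustrative only: `Label` and `parse_label` are hypothetical names, and the parser covers just the absolute label forms discussed here plus Bazel's shorthand in which `//my/app` means `//my/app:app`.

```python
from typing import NamedTuple


class Label(NamedTuple):
    repo: str     # "" means "the repository containing this BUILD.bazel file"
    package: str  # path to the package, relative to the repository root
    target: str   # target name inside that package


def parse_label(label: str) -> Label:
    repo_part, sep, rest = label.partition("//")
    if not sep:
        raise ValueError(f"not an absolute label: {label!r}")
    repo = repo_part[1:] if repo_part.startswith("@") else ""
    package, colon, target = rest.partition(":")
    if not colon:
        # Shorthand: an omitted target name defaults to the last path segment.
        target = package.rsplit("/", 1)[-1]
    return Label(repo, package, target)


print(parse_label("@myrepo//my/app/main:app_binary"))
print(parse_label("//src/main/scala/lib:greeting"))
```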

Dependencies

Next, let’s build cmd.Runner. Because cmd.Runner depends on lib.Greeting, we declare a dependency between the two targets using the deps attribute.

# src/main/scala/cmd/BUILD.bazel
load("@io_bazel_rules_scala//scala:scala.bzl", "scala_binary")
scala_binary(
    name = "runner",
    main_class = "cmd.Runner",
    srcs = ["Runner.scala"],
    deps = ["//src/main/scala/lib:greeting"],
)

The differences from the previous example are:

- We use scala_binary instead of scala_library.
  - scala_binary is an executable rule. Executable rules define how to build an executable from sources. This process might contain linking dependencies or listing the classpaths of dependencies.
  - For example, the scala_binary rule builds an executable script from the sources and dependencies.
  - Once the executable has been built, you can run it using the bazel run command. This will execute the executable file.
- We add the deps attribute to list all the dependencies.
  - In this example, we add a label //src/main/scala/lib:greeting because cmd.Runner depends on lib.Greeting.
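
One consequence of declaring deps explicitly is that a build tool can derive a valid build order from the resulting graph. A minimal sketch of that idea using Python's standard `graphlib` module follows; it illustrates the concept only and is not how Bazel schedules actions.

```python
from graphlib import TopologicalSorter

# Each target maps to the set of targets listed in its deps attribute.
deps = {
    "//src/main/scala/cmd:runner": {"//src/main/scala/lib:greeting"},
    "//src/main/scala/lib:greeting": set(),
}

# static_order() yields every target after all of its dependencies.
build_order = list(TopologicalSorter(deps).static_order())
print(build_order)
```

Here the library target comes out first, so its JAR exists by the time the binary that depends on it is compiled.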

Now, we should be able to build the application by bazel build //… but it fails!

$ bazel build //src/main/scala/cmd:runner

ERROR: .../01_scala_tutorial/src/main/scala/cmd/BUILD.bazel:3:13:
in scala_binary rule //src/main/scala/cmd:runner:
target '//src/main/scala/lib:greeting' is not visible from
target '//src/main/scala/cmd:runner'.

Visibility

Bazel has a concept of visibility. By default, all targets’ visibility is private, meaning only targets within the same package (i.e. same BUILD.bazel file) can access each other.

To make lib:greeting visible from cmd, add the visibility attribute to greeting.

 scala_library(
     name = "greeting",
     srcs = ["Greeting.scala"],
+    visibility = ["//src/main/scala/cmd:__pkg__"],
 )

//src/main/scala/cmd:__pkg__ is a visibility specification that grants access to the package //src/main/scala/cmd.
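
The check behind that error message can be approximated as follows. This is a simplified model, not Bazel's implementation: `is_visible` is a hypothetical helper, package groups are not modeled, and besides `__pkg__` it handles two other forms Bazel accepts, `//visibility:public` and `__subpackages__`.

```python
def is_visible(visibility: list[str], target_pkg: str, consumer_pkg: str) -> bool:
    """Return True if a target in target_pkg may be used from consumer_pkg."""
    if consumer_pkg == target_pkg:
        return True  # targets in the same package can always see each other
    for spec in visibility:
        if spec == "//visibility:public":
            return True
        if spec == "//visibility:private":
            continue  # grants nothing beyond the same-package rule
        pkg, _, kind = spec.removeprefix("//").partition(":")
        if kind == "__pkg__" and consumer_pkg == pkg:
            return True
        if kind == "__subpackages__" and (
            pkg == "" or consumer_pkg == pkg or consumer_pkg.startswith(pkg + "/")
        ):
            return True
    return False


lib, cmd = "src/main/scala/lib", "src/main/scala/cmd"
print(is_visible([], lib, cmd))                                # default: private
print(is_visible(["//src/main/scala/cmd:__pkg__"], lib, cmd))  # after the fix
```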

Now we can build the app:

$ bazel build //src/main/scala/cmd:runner
...
INFO: Found 1 target...
Target //src/main/scala/cmd:runner up-to-date:
  bazel-bin/src/main/scala/cmd/runner.jar
  bazel-bin/src/main/scala/cmd/runner
INFO: Elapsed time: 0.146s, Critical Path: 0.01s
INFO: 1 process: 1 internal.
INFO: Build completed successfully, 1 total action

As you can see, the scala_binary rule generates a second file named runner in addition to runner.jar. It is a wrapper script for runner.jar that makes the JAR easy to run.

$ ./bazel-bin/src/main/scala/cmd/runner
Hi!

$ bazel run //src/main/scala/cmd:runner
Hi!

Tip: build multiple targets

In the examples above, we specify a target’s label and build one target, but is it possible to build multiple build targets at the same time?

The answer is yes. We can use a wildcard to select multiple targets. For example, bazel build //... builds all targets in the workspace.
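
To give a feel for how such wildcards select targets, here is a simplified matcher. `matches` is a hypothetical helper, and it covers only three pattern shapes that Bazel supports: `//...` (everything), `//some/pkg/...` (a package and everything beneath it), and `//some/pkg:all` (every target in one package). Real target patterns have more forms than this.

```python
def matches(pattern: str, label: str) -> bool:
    """Check whether an absolute label is selected by a target pattern."""
    pkg, _, name = label.removeprefix("//").partition(":")
    pat_pkg, _, pat_name = pattern.removeprefix("//").partition(":")
    if pat_pkg == "...":  # //... selects every package
        return True
    if pat_pkg.endswith("/..."):  # //foo/... selects foo and its subpackages
        root = pat_pkg[: -len("/...")]
        return pkg == root or pkg.startswith(root + "/")
    if pat_name == "all":  # //foo:all selects every target in foo
        return pkg == pat_pkg
    return (pkg, name) == (pat_pkg, pat_name)  # plain label: exact match


targets = ["//src/main/scala/lib:greeting", "//src/main/scala/cmd:runner"]
print([t for t in targets if matches("//...", t)])
print([t for t in targets if matches("//src/main/scala/lib:all", t)])
```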

Now, we have learned the basics of Bazel by building a simple application in Scala, but how do we use third-party libraries in Bazel?

External JVM dependencies

Let’s now learn how to use third-party libraries from Maven by building a simple application that parses Scala programs using scalameta and pretty prints the AST using pprint.

In this example, we will use rules_jvm_external, one of the standard rule sets for managing external JVM dependencies.

Note: we can download jars from Maven repositories using maven_jar, a rule natively supported by Bazel. However, I recommend using rules_jvm_external because it has a number of useful features over maven_jar.

The complete code example is available here: https://github.com/tanishiking/bazel-tutorial-scala/tree/main/02_scala_maven.

This project has only one Scala program.

// src/main/scala/example/App.scala
package example
import scala.meta._
object App {
  def main(args: Array[String]) = {
    pprint.pprintln(parse(args.head))
  }
  private def parse(arg: String) = {
    arg.parse[Source].get
  }
}

To download scalameta and pprint from Maven repositories, we use rules_jvm_external. So, we have to download rules_jvm_external first.

To download rules_jvm_external, copy and paste setup statements from the release page to your WORKSPACE file, like this:

http_archive(
    name = "rules_jvm_external",
    strip_prefix = "rules_jvm_external-4.5",
    sha256 = "b17d7388feb9bfa7f2fa09031b32707df529f26c91ab9e5d909eb1676badd9a6",
    url = "https://github.com/bazelbuild/rules_jvm_external/archive/refs/tags/4.5.zip",
)
...

Then list all the dependencies in the maven_install statement, which is also in the WORKSPACE file.
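
As a minimal sketch, a maven_install statement for this project might look like the following (written in Starlark, Bazel's Python dialect). The pprint coordinate is the one used later in this article; the scalameta version and the repository URL are assumptions for illustration, so check the tutorial repository for the exact values.

```python
# WORKSPACE (Starlark)
load("@rules_jvm_external//:defs.bzl", "maven_install")

maven_install(
    # Maven coordinates in group:artifact:version form.
    artifacts = [
        "com.lihaoyi:pprint_2.13:0.7.3",
        "org.scalameta:scalameta_2.13:4.5.13",  # version is an assumption
    ],
    # Repositories to resolve the artifacts from (assumed: Maven Central).
    repositories = [
        "https://repo1.maven.org/maven2",
    ],
)
```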

Now, Bazel can download the dependencies, but how can we use them?

To use the downloaded dependencies, we need to add them to the deps attribute of the build rules. rules_jvm_external automatically generates targets for the libraries under the @maven repository in the following format:

> The default label syntax for an artifact foo.bar:baz-qux:1.2.3 is @maven//:foo_bar_baz_qux https://github.com/bazelbuild/rules_jvm_external#usage
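
The naming rule quoted above can be sketched as a small function. `maven_label` is a hypothetical helper that only handles plain group:artifact:version coordinates (coordinates with a packaging or classifier are not covered), and it assumes every character other than a letter or digit becomes an underscore, which is consistent with the quoted example.

```python
import re


def maven_label(coordinate: str, repo: str = "maven") -> str:
    """Derive the default Bazel label for a group:artifact:version coordinate."""
    group, artifact = coordinate.split(":")[:2]  # the version is not part of the label
    mangled = re.sub(r"[^A-Za-z0-9]", "_", f"{group}_{artifact}")
    return f"@{repo}//:{mangled}"


print(maven_label("foo.bar:baz-qux:1.2.3"))
print(maven_label("com.lihaoyi:pprint_2.13:0.7.3"))
```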

Therefore, we can refer to com.lihaoyi:pprint_2.13:0.7.3 with the label @maven//:com_lihaoyi_pprint_2_13. So, put the following BUILD.bazel file adjacent to App.scala.

# src/main/scala/example/BUILD.bazel
load("@io_bazel_rules_scala//scala:scala.bzl", "scala_binary")
scala_binary(
    name = "app",
    main_class = "example.App",
    srcs = ["App.scala"],
    deps = [
        "@maven//:com_lihaoyi_pprint_2_13",
        "@maven//:org_scalameta_scalameta_2_13",
    ],
)

And build it!

$ bazel build //src/main/scala/example:app
...
INFO: Found 1 target...
Target //src/main/scala/example:app up-to-date:
  bazel-bin/src/main/scala/example/app.jar
  bazel-bin/src/main/scala/example/app
INFO: Elapsed time: 0.165s, Critical Path: 0.00s
INFO: 1 process: 1 internal.
INFO: Build completed successfully, 1 total action
...

$ bazel-bin/src/main/scala/example/app "object main { println(1) }"
Source(
  stats = List(
    Defn.Object(
      ...
    )
  )
)

Nice! 🎉 So, that’s how we use external JVM dependencies with rules_jvm_external.

Conclusion

In this article, we have shown how Bazel enables fast builds in large repositories and introduced the basic concepts and usage of Bazel by building a simple Scala application.

As you saw, Bazel requires users to manage a lot more build configuration than other build tools such as sbt and Maven. However, this is an acceptable tradeoff for very large projects, considering the scalable build speed that Bazel’s advantages, such as reproducible builds and remote caching, make possible.

I hope this article helps you take the first steps in getting started with Bazel.

If you want to learn more about Bazel, I recommend reading through the official Bazel starting guide and getting your hands dirty with your first Bazel project. There are many more interesting topics (such as Bazel testing) to learn about and explore!