As AI tools take over more and more of the actual coding work, a new question emerges: who's watching what they do? TraceVault is a tool designed to restore that oversight.
Krzysztof Grajek, Principal Software Developer at SoftwareMill and the lead engineer behind TraceVault, talks about why the rise of AI-generated code demands a completely new approach to trust, auditability, and documentation.
The Problem TraceVault Solves
Where did the idea for TraceVault come from, and what gap did you see in the existing developer workflow?
Krzysztof: We saw that as AI takes on more of the development work, it becomes much harder to trust and verify how code is produced.
The underlying assumption is that more and more code will be generated by AI. There's a very real possibility, especially now that models are already highly capable and keep improving, that code simply won't be typed by a developer at all. The developer will become a coordinator of AI tools, directing them to build features rather than writing anything themselves. We might reach a point where developers just file issues, bugs, and feature requests, and the code writes itself.
If developers become coordinators of AI instead of authors of code, what exactly do we lose in terms of understanding, accountability, and trust?
Krzysztof: When AI takes over the writing of code, the only artifact we're left with is the committed code sitting in a remote repository - the same as it's always been. The developer used to push code; now AI pushes code. But all the context behind it disappears. Why was this architectural decision made? Why was this technology chosen over another? Why was this problem solved this particular way? Those questions used to have answers, because a human made those decisions and could be asked. Now the AI made them, and nobody captured the reasoning.
There's also a trust dimension. When you hire a developer, they go through interviews, they work alongside you for years, and you build a relationship. You have a sense of their judgment. You trust what they ship. The moment AI takes over that cognitive load entirely, that trust relationship breaks down. You're left with a black box.
TraceVault was built to address exactly that - to track every interaction and every decision the AI makes in your codebase, and store that history centrally so it's actually useful.
What TraceVault Captures
So what does TraceVault actually capture that traditional version control systems completely miss?
Krzysztof: The first thing it does is collect the full interaction history. Every session with an AI coding tool - currently, we support Claude - is captured: what the AI wrote, what implementation plans it came up with, what the developer typed to guide it, and how many tokens were consumed. All of that lands in TraceVault automatically. Once initialized, capture is invisible and requires no manual steps from the developer.
What's particularly useful is that a lot of valuable information gets generated during an AI coding session and then simply vanishes. Claude will produce implementation plans, design documents, and large blocks of reasoning in the console - the developer reads them, responds, and then they're gone. Nobody commits them. TraceVault preserves all of that, so you can go back into a session trace and see exactly what ideas the AI had, which is genuinely useful when you're building something similar later.
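To make the idea of a preserved session trace concrete, here is a minimal sketch of the kind of record such a capture layer might store. TraceVault's actual schema is not public, so every field name here is an illustrative assumption.

```python
from dataclasses import dataclass, field, asdict
import json
import time

# Hypothetical shape of a captured AI coding session; all field names
# are illustrative, not TraceVault's real schema.
@dataclass
class SessionTrace:
    session_id: str
    developer: str
    prompts: list = field(default_factory=list)      # what the developer typed
    ai_outputs: list = field(default_factory=list)   # plans, diffs, reasoning
    tokens_used: int = 0
    started_at: float = field(default_factory=time.time)

    def record_exchange(self, prompt: str, output: str, tokens: int) -> None:
        """Append one prompt/response pair and account for its tokens."""
        self.prompts.append(prompt)
        self.ai_outputs.append(output)
        self.tokens_used += tokens

    def to_json(self) -> str:
        """Serialize the trace for central storage."""
        return json.dumps(asdict(self), sort_keys=True)

trace = SessionTrace(session_id="s-001", developer="alice")
trace.record_exchange("Add retry logic to the HTTP client",
                      "Plan: wrap requests in exponential backoff...",
                      tokens=420)
```

The point of persisting the `ai_outputs` field is exactly the problem described above: implementation plans and reasoning that would otherwise scroll past in a console become queryable artifacts.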
Policy Enforcement
Does TraceVault go beyond tracking and actually enforce how AI tools are used within a team?
Krzysztof: Yes, and that's an important part of it. You can define policies within TraceVault that are enforced on every push. For example, an organization might require that before any code is pushed, a secondary AI review tool - say, a Codex-based reviewer - must have been run on top of what Claude produced. TraceVault checks for that. If the policy wasn't satisfied, the push is blocked.
This gives organizations real control over how their developers use AI tools - which tools must be run, which actions are permitted or restricted, and how token budgets are managed. It's a way of encoding organizational requirements into the development workflow rather than relying on people to follow guidelines manually.
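A push-time policy gate of the kind described above can be sketched in a few lines. The policy names, session-metadata fields, and token budget below are all assumptions for illustration, not TraceVault's actual configuration.

```python
# Minimal sketch of a push-time policy gate. Policy names and the
# session metadata shape are illustrative assumptions.
REQUIRED_REVIEWERS = {"codex-review"}   # tools that must have run before push
MAX_SESSION_TOKENS = 200_000            # example token budget

def check_push(session_meta: dict) -> list[str]:
    """Return a list of policy violations; an empty list means the push may proceed."""
    violations = []
    ran = set(session_meta.get("review_tools_run", []))
    missing = REQUIRED_REVIEWERS - ran
    if missing:
        violations.append(f"missing required review tools: {sorted(missing)}")
    if session_meta.get("tokens_used", 0) > MAX_SESSION_TOKENS:
        violations.append("token budget exceeded")
    return violations
```

A push whose session metadata shows no secondary review tool was run would return a non-empty violation list and be blocked.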
There's also a Secret Redaction feature that checks whether any secrets, passwords, or API keys were leaked during a developer's conversation with Claude. That's a common concern and something that needs to happen automatically, before anything reaches a remote repository.
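A simple pattern-based redactor illustrates the mechanism. Real secret scanners use far larger, entropy-aware rule sets, and TraceVault's actual rules are not public; the two patterns here are examples only.

```python
import re

# Illustrative secret patterns; a production redactor would use a much
# larger, entropy-aware rule set.
SECRET_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),                      # AWS access key id
    re.compile(r"(?i)(api[_-]?key|password)\s*[:=]\s*\S+"),
]

def redact(text: str) -> str:
    """Replace anything matching a secret pattern before the trace is stored."""
    for pattern in SECRET_PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    return text

print(redact("export API_KEY=sk-12345"))   # export [REDACTED]
```

Running redaction before anything leaves the developer's machine is what keeps leaked credentials out of the central trace store and the remote repository alike.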
Auditability and Compliance
Financial institutions and other regulated industries have strict audit requirements. How does TraceVault fit into that?
Krzysztof: Every trace pushed to TraceVault is cryptographically signed - each session and each commit is signed in a way that incorporates the previous signature, so the records form a chain much like a blockchain. To alter anything in history, you would need to regenerate the entire chain from that point forward. And if anyone keeps a backup of the already-signed records, any tampering becomes immediately detectable.
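The tamper-evidence property can be demonstrated with a toy signature chain. This is a sketch only: TraceVault's actual signing scheme is not public, and the HMAC construction and in-memory key below stand in for whatever real key management (KMS/HSM) a production system would use.

```python
import hashlib
import hmac

# Toy tamper-evident chain: each entry's signature covers both its payload
# and the previous signature, so altering any historical entry invalidates
# every signature after it. Key handling is deliberately simplified.
SECRET_KEY = b"demo-key"  # in practice, sourced from a KMS/HSM

def sign_entry(payload: bytes, prev_signature: bytes) -> bytes:
    return hmac.new(SECRET_KEY, prev_signature + payload, hashlib.sha256).digest()

def build_chain(entries: list[bytes]) -> list[bytes]:
    sigs, prev = [], b"\x00" * 32  # fixed genesis value
    for payload in entries:
        prev = sign_entry(payload, prev)
        sigs.append(prev)
    return sigs

def verify_chain(entries: list[bytes], sigs: list[bytes]) -> bool:
    prev = b"\x00" * 32
    for payload, sig in zip(entries, sigs):
        expected = sign_entry(payload, prev)
        if not hmac.compare_digest(expected, sig):
            return False
        prev = sig
    return True
```

Replacing any payload after the fact breaks verification for that entry and every later one, which is why a backup of the signed records is enough to detect tampering.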
We also audit changes to TraceVault itself - adding a user, changing a role, any administrative action. So the audit trail covers both the code changes coming in from developers and everything happening within the platform.
We've mapped this against standards like SOX, PCI-DSS, and SR 11-7, which specify how long data must be retained and the integrity requirements around it. TraceVault is still a Research Preview, and the compliance features in particular are likely to evolve.
Analytics and Stories
Once you start collecting all this AI interaction data, what new insights become possible for teams?
Krzysztof: Quite a lot. TraceVault provides detailed analytics across your team: token usage and associated costs, how much time each developer spends in AI sessions, which tools they use within Claude (bash, built-in skills, MCP servers), what percentage of your codebase is AI-generated, and savings from prompt caching. You can drill down per contributor, identify patterns, and potentially surface recommendations - if one developer is consistently delivering more with fewer tokens, that's a signal worth acting on.
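Once session traces carry token counts, per-contributor analytics reduce to an aggregation over those records. The sketch below assumes a simple record shape and an illustrative per-token price; neither reflects TraceVault's actual schema or any real model pricing.

```python
from collections import defaultdict

# Toy aggregation over captured session records; field names and the
# per-token price are illustrative assumptions.
PRICE_PER_1K_TOKENS = 0.015

def per_contributor_usage(sessions: list[dict]) -> dict[str, dict]:
    totals: dict[str, dict] = defaultdict(lambda: {"tokens": 0, "sessions": 0})
    for s in sessions:
        t = totals[s["developer"]]
        t["tokens"] += s["tokens_used"]
        t["sessions"] += 1
    for t in totals.values():
        t["cost_usd"] = round(t["tokens"] / 1000 * PRICE_PER_1K_TOKENS, 4)
    return dict(totals)

sessions = [
    {"developer": "alice", "tokens_used": 12_000},
    {"developer": "alice", "tokens_used": 8_000},
    {"developer": "bob",   "tokens_used": 30_000},
]
print(per_contributor_usage(sessions))
```

The same aggregation, grouped by tool or by repository instead of by developer, yields the other breakdowns mentioned above.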
You mentioned “Stories” as a way to understand code. How does that work in practice, and why is it different from traditional documentation?
Krzysztof: Documentation is a chronic problem in software projects. Either nobody writes it, or it gets written and quickly goes out of date. Stories address that directly.
Because TraceVault has the complete trace of how any piece of code was produced, you can browse your repository within TraceVault, click on any line of code, and generate documentation for it on the spot - how that code came to be, why it was written that way, which AI sessions were responsible for it. From there, you can navigate directly into the session transcript and see exactly what the developer typed and how Claude responded. That narrative is what we call a Story. It's also the foundation for auto-generated Architecture Decision Records, giving new team members full context without having to ask anyone.
Deployment and Versions
Given how sensitive this data is, how is TraceVault deployed, and who has access to it?
Krzysztof: It's installed in-house - it is not a SaaS product. All the data stays within the client's own infrastructure, whether that's a Docker image on Kubernetes or whatever setup they prefer. In practice, once someone installs TraceVault, we have no communication with that instance whatsoever. Given that TraceVault captures the complete record of what your developers are doing with AI, that privacy guarantee is essential.
There are two versions: OSS and Enterprise. The Enterprise version will include full RBAC, SSO, and more. The OSS version currently supports a single GitHub organization - not just one repository, but one organization. GitLab support is planned and should be straightforward to add. Analytics are available in both versions for now, though that may be differentiated in future iterations.
What's Next
As AI coding tools evolve rapidly, how do you see TraceVault evolving alongside them?
Krzysztof: A few things. Expanding support to other AI coding tools beyond Claude - Cursor, Codex, and others - is on the list, and there's an emerging industry standard for trace schemas that Cursor proposed earlier this year, which Microsoft and Anthropic have signed on to. That would make broader integration more tractable. Claude currently gives us a particularly rich signal about what the developer is doing and what the model is thinking, which is why it's our primary focus.
We're also planning real-time suggestions: analyzing traces as they happen and nudging developers toward tools or skills they're not using but should be. If someone isn't using a particular MCP server that would help them, TraceVault could surface that in the moment.
Editor's note: To learn more about TraceVault and how it could fit into your organization, visit https://virtuslab.com/services/tracevault and https://tracevault.dev/