Agents for Legacy Code Migration

According to a McKinsey report, 70% of software in Fortune 500 companies was developed 20 or more years ago. Reuters reports that 95% of ATM swipes are processed by COBOL-based systems. Delta Air Lines might have lost $500 million in 2024 as a result of an IT outage, which could have been less severe had modern systems been in place.

A traditional migration of a large legacy system can cost several times the company's yearly revenue and still deliver only a limited version of the original functionality. With the help of AI, this can change: the migration process might become faster, cheaper, and smoother.

Agentic approach for codebase migration

When designing an agentic system, I always like to think about 3 rules:

Try to mimic how humans would approach the problem. LLMs, which are the core of each agent, are trained to speak and act like humans. Therefore, a human-style approach to the problem will be the most intuitive for them.
Break the problem down into as many simple pieces as possible. The process needs to make logical sense. Small tasks are less prone to errors and tend to guide the process in the right direction.
Use double verification, critical thinking, and reflection. As in human teams, a second pair of eyes helps to spot weak spots and mistakes as early as possible.

One shot. Shoot and forget. Vibe coded

Ok, but wouldn’t it be enough to just tell the agent what we want, and watch the work being done?...

Let’s try it out and see. As an example, I took the Pilot Episode game, written in C++, and asked Claude Code to rewrite it in Rust. It may not be outdated code, but it is a full standalone project that can be rewritten in a safe and secure language. I created a detailed task description and gave it to Claude Code running Sonnet 4.5.

Task: System Migration & Parity Rewrite (Legacy to Rust)

Objective: You are given the legacy codebase. Please go through it, analyze, understand the logic, note UI, UX and overall functionality of the system. Your goal is to rewrite the system in Rust. Please keep the functionality and the UI as close to the original as possible.

Scope of Work:

System Audit: Conduct a deep-dive analysis of the legacy logic, data structures, and edge cases.
Behavioral Parity: Replicate the existing functionality and business logic precisely. The goal is a seamless transition where the end-user perceives no change in behavior.
UI/UX Fidelity: Maintain the current interface layout and user experience flow, ensuring visual and interactive consistency.

Deliverables: A fully functional Rust implementation that passes all parity tests against the original system's output.

Given the prompt above, Claude Code planned the work and migrated the game accordingly, but the resulting game did not work as expected. The physics logic was broken (bullets were slower than the plane), there were no clouds and no enemies, bullets fired only as single shots, the colors were incomplete, and there were many more bugs. Simple testing and prompting the agent to fix the bugs one by one slightly improved the game, but still left many high-impact bugs unresolved, requiring human intervention.

So, NO. Vibe coding is still not good enough to migrate a large codebase, especially considering that the example game I used had only about 20k lines, and there are much larger and more complicated systems. On the other hand, as I wrote in my previous blog post Do I need an AI agent, LLM capabilities are growing rapidly, and things that are not doable today with a single agent might be achievable in the near future.

Tailor-made agentic system for code migration

I approached the task by specifying three main phases:

Plan the migration. Create up-to-date, detailed documentation of the system. Verify, update, and write more tests to fully cover system functionality. Create a migration plan, dividing it into verifiable subtasks and phases.
Code the new system according to the migration plan, documentation, tests, and legacy codebase.
Iteratively test and refine the code based on differences from the legacy system.

The key idea was to limit human interventions and to make the process as automatic as possible.

Phase 1. Create/update documentation, update and expand tests, and create a detailed migration plan.

Phase 2. Coding

The Coding agent is spawned. It writes and tests the code according to the migration plan, old codebase, project documentation and legacy code tests. With a well-designed migration plan, the work can be parallelised between multiple agents.
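One way to picture the parallelisation: if the migration plan records which subtasks depend on which, then all subtasks whose dependencies are already finished can be handed to separate coding agents at the same time. The sketch below is only an illustration using Python's standard graphlib module. The subtask names and the layout of the plan are hypothetical and are not taken from the actual migration plan.

```python
from graphlib import TopologicalSorter

# Hypothetical migration plan: subtask -> subtasks it depends on.
plan = {
    "math_utils": set(),
    "asset_loading": set(),
    "physics": {"math_utils"},
    "rendering": {"math_utils", "asset_loading"},
    "game_loop": {"physics", "rendering"},
}


def parallel_batches(plan):
    """Group subtasks into batches whose members can be coded concurrently."""
    sorter = TopologicalSorter(plan)
    sorter.prepare()  # raises CycleError if the plan has circular dependencies
    batches = []
    while sorter.is_active():
        # Every subtask returned here has all of its dependencies finished.
        ready = sorted(sorter.get_ready())
        batches.append(ready)
        sorter.done(*ready)  # mark the whole batch as finished
    return batches


batches = parallel_batches(plan)
for number, batch in enumerate(batches, start=1):
    print(f"batch {number}: {batch}")
```

Each batch could be assigned to as many coding agents as it has subtasks. A plan with long dependency chains would produce many small batches, which is one reason the quality of the plan matters for parallel work.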

Phase 3. Iterative refinements

The end-to-end tester agent investigates differences between legacy and post-migration apps. In the iterative feedback loop it cooperates with the coding agent to fix all issues and achieve system parity.
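The feedback loop can be sketched in a few lines. This is a toy model under loud assumptions: both agents are replaced by plain functions, an "app" is just a dictionary of observed behaviours, and the coding agent is assumed to fix at most two issues per round. The feature names echo the bugs from the one-shot experiment, but the values are made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class App:
    """Toy stand-in for an application: feature name -> observed behaviour."""
    behaviour: dict = field(default_factory=dict)


def find_differences(legacy, migrated):
    """Stand-in for the E2E tester agent: list features that behave differently."""
    return sorted(
        name for name, expected in legacy.behaviour.items()
        if migrated.behaviour.get(name) != expected
    )


def fix_issues(legacy, migrated, issues, per_round=2):
    """Stand-in for the coding agent: fixes only a few issues per round (assumption)."""
    for name in issues[:per_round]:
        migrated.behaviour[name] = legacy.behaviour[name]


def refine(legacy, migrated, max_iterations=20):
    """Alternate testing and fixing until parity or until the iteration cap is hit."""
    iterations = 0
    issues = find_differences(legacy, migrated)
    while issues and iterations < max_iterations:
        fix_issues(legacy, migrated, issues)
        iterations += 1
        issues = find_differences(legacy, migrated)
    return iterations, issues


# Hypothetical behaviours, for illustration only.
legacy = App({"bullet_speed": 900, "clouds": True, "enemies": True,
              "burst_fire": True, "palette": "full"})
migrated = App({"bullet_speed": 300, "clouds": False, "enemies": False,
                "burst_fire": False, "palette": "partial"})

rounds, remaining = refine(legacy, migrated)
print(f"parity after {rounds} rounds, remaining issues: {remaining}")
```

The iteration cap matters in practice: without it, a tester that keeps reporting an issue the coder cannot fix would keep the loop running, and paying, indefinitely.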

E2E tester agent importance

The most challenging and important part of the system is the E2E tester agent. It must be able to capture the system's logic, flow, and dynamics. It needs to navigate and interact with an unknown environment, capture its functionality, UI, and UX, and connect actions to their outcomes even when the two are separated by a long time.

A well-designed E2E tester agent makes it possible to create good project documentation and is invaluable for comparing the ground-truth app with the migrated app while iterating in the test-and-fix loop. It can precisely spot all differences, convert them to issues, and pass them to the coding agent for fixing.
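To make "convert differences to issues" more concrete, here is a minimal sketch. It assumes the tester has already replayed the same action script against both apps and recorded a trace of (action, observation) pairs for each. The Issue type, the trace format, and the example observations are all hypothetical. A real tester agent would produce far richer observations, such as screenshots, timings, and state dumps.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Issue:
    """One behavioural difference, phrased so a coding agent can act on it."""
    step: int
    action: str
    expected: str
    actual: str

    def describe(self):
        return (f"Step {self.step}: after '{self.action}' the legacy app shows "
                f"'{self.expected}' but the migrated app shows '{self.actual}'.")


def compare_traces(legacy_trace, migrated_trace):
    """Compare two (action, observation) traces recorded from the same script."""
    if len(legacy_trace) != len(migrated_trace):
        raise ValueError("traces must come from the same action script")
    issues = []
    pairs = zip(legacy_trace, migrated_trace)
    for step, ((action, expected), (_, actual)) in enumerate(pairs, start=1):
        if expected != actual:
            issues.append(Issue(step, action, expected, actual))
    return issues


# Hypothetical traces, for illustration only.
legacy_trace = [
    ("start game", "menu visible"),
    ("press fire", "burst of bullets"),
    ("fly right", "clouds scroll past"),
]
migrated_trace = [
    ("start game", "menu visible"),
    ("press fire", "single bullet"),
    ("fly right", "empty sky"),
]

issues = compare_traces(legacy_trace, migrated_trace)
for issue in issues:
    print(issue.describe())
```

Keeping the step number and the triggering action in each issue gives the coding agent a way to reproduce the problem instead of guessing where it came from.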

How much does it cost?

I used claude-sonnet-4-5. One refinement iteration (comparing the C++ and Rust implementations and fixing the differences with the coding agent) costs about $5. It takes about 20 iterations to achieve parity. Including phase 1 (planning) and phase 2 (coding), the overall cost is about $150 for the 20k-line project.
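The breakdown implied by these rounded figures can be worked out directly. The snippet below only rearranges the numbers quoted above. The per-1,000-lines figure is a derived value, and a single project cannot show whether cost scales linearly with codebase size.

```python
# Approximate figures quoted in the text (all in USD).
cost_per_iteration = 5
iterations = 20
total_cost = 150
lines_of_code = 20_000

# Phase 3: iterative refinement.
refinement_cost = cost_per_iteration * iterations
# Phases 1 and 2 together: whatever is left of the total.
planning_and_coding_cost = total_cost - refinement_cost
# Derived figure: cost per 1,000 lines of legacy code.
cost_per_kloc = total_cost / (lines_of_code / 1000)

print(f"refinement: ${refinement_cost}")
print(f"planning + coding: ${planning_and_coding_cost}")
print(f"per 1k lines: ${cost_per_kloc:.2f}")
```

In this run, roughly two thirds of the budget went into the refinement loop, so the number of iterations needed to reach parity is the main cost driver.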

Challenges

Achieving exact parity is usually not what we want when migrating a system. We might want to introduce new, modern schemas, use newer protocols, or change parts of the system that were buggy. Also, the legacy code might contain buggy behavior that the rest of the system relies on, and fixing it might cause other problems.

Even though the multi-agent system is capable of migrating the system while keeping its functionality, logic, UI, and UX, it may miss performance, security, quality, and memory requirements. Coding agents often produce code that is functional but messy, hard to maintain, or inefficient. In other words, the functional requirements can be met while the non-functional ones are missed.

Another challenge might be the limited context window of agents when approaching very large systems. To succeed, the migration plan needs to be well divided into clearly separable subtasks, so that the coding agent can work on small tasks with just high-level knowledge of the rest of the system. Creating documentation and a migration plan that do not themselves fit into the LLM context is a massive engineering challenge on its own.
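A very rough way to check whether subtasks fit a context budget is to estimate token counts per file and group files greedily. The sketch below rests on loud assumptions: it uses a crude "about 4 characters per token" heuristic instead of a real tokenizer, the file names and sizes are invented, and it ignores dependencies between files. A real split should follow module boundaries first and use a size check like this only as a sanity test.

```python
def estimate_tokens(text):
    """Crude heuristic (assumption): roughly 4 characters per token."""
    return max(1, len(text) // 4)


def pack_into_subtasks(files, budget):
    """Greedily group files into subtasks whose estimated size stays within budget."""
    subtasks, current, used = [], [], 0
    for name, source in files.items():
        size = estimate_tokens(source)
        if size > budget:
            raise ValueError(f"{name} alone exceeds the budget and must be split further")
        if used + size > budget:
            # The current subtask is full: close it and start a new one.
            subtasks.append(current)
            current, used = [], 0
        current.append(name)
        used += size
    if current:
        subtasks.append(current)
    return subtasks


# Hypothetical files with placeholder contents, for illustration only.
files = {
    "physics.cpp": "x" * 4000,   # about 1000 estimated tokens
    "render.cpp": "x" * 6000,    # about 1500
    "input.cpp": "x" * 2000,     # about 500
    "audio.cpp": "x" * 8000,     # about 2000
}

subtasks = pack_into_subtasks(files, budget=3000)
print(subtasks)
```

The budget should sit well below the model's actual context limit, because the agent also needs room for the documentation excerpt, the relevant tests, and its own output.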

Legacy systems usually contain multiple interdependent modules. When approaching the migration, one should take into account the communication between those modules as well as the organization's overall system architecture and business goals. Migrating a single module without the broader strategic context might miss the point of the migration.

Finally, testing a game, a UI-based application, or a library is doable with an LLM agent, but testing an embedded system or a green-screen terminal application is not straightforward.

Conclusions

Migrating legacy code is a complex and costly process, often deterring even the biggest companies from updating their systems to modern languages and libraries. However, by leveraging AI-driven agentic workflows, a big part of the process can already be automated. While single coding agents are still not powerful enough to lead the whole migration process, a carefully designed multi-agent system can already significantly speed up codebase migration. The key to success is the E2E testing agent, which traverses the system environment, captures its logic, functionality, UI, and UX, and provides valid feedback to the coding agent so it can perform fixes.