This month, we used AI to stress-test four repeatable patterns:
- first, we ran a full Java 8 to 21 migration to compare Claude Code with Cursor, measuring time, cost, and where each shines,
- then we scaled agents on a large codebase by right-sizing context with Cline’s Focus Chains,
- then, we put two specialized agents into a review loop (React dev × TypeScript reviewer) and added a scribe to keep sessions coherent,
- finally, we did rapid PoCs with Cursor (agentic mode).
Curious about more details? Dive in!
I migrated a small Java 1.8 project to Java 21 using two AI assistants - Claude Code (VS Code plugin) and Cursor (VS Code-based editor), following the same playbook: set the target, let the AI scan the codebase, iterate toward modern idioms (var, switch expressions, try-with-resources, pattern matching, text blocks), then compile, run, and compare. The first pass didn’t boot cleanly, so I dropped into a human-in-the-loop debug cycle. With Claude, it felt like pairing with a diligent junior: it added debug prints, checked whether methods were actually invoked, searched for the right call sites, and proposed logic fixes. My job was to run the app, paste logs back, and keep the loop moving—log → analysis → fix → test—until the build finally ran.
Where AI helped me the most
Day-to-day, Claude maintained context in longer sessions, proactively debugged, and produced suggestions that matched Java 21 conventions, which made the migration feel collaborative and precise. Cursor also generated a to-do list and tried to run and inspect logs, but it often stalled, felt less interactive, and seemed to struggle with full-project context - better suited to quick, targeted edits than end-to-end migrations.
The numbers reflect that:
Claude took ~3–4 hours and about $28.92 in tokens (business plan), while Cursor finished in roughly an hour and stayed within its free monthly limit.
- If I’m doing holistic migrations or cross-cutting refactors, I’ll reach for Claude.
- For small fixes on a predictable subscription, Cursor does the job.
Cline is moving fast, and on large codebases, the big win has been how Focus Chains and Context Summarization tame context. I tested this approach on my NES emulator, kNES, carving narrow focus corridors down to exactly the files and tests I was touching, then pinning a compact summary of the non-negotiables (APIs, contracts, timing rules, recent decisions).
Auto-selection is a fine starting point, but reliability came from human-in-the-loop curation: include, exclude, override. Net effect: right-sized input, shorter prompts, and agents that act smarter because they only see what matters and always remember what must hold.
Where AI helped me the most
Once the context is clean, I fan it out to multiple agents or mix models to compare and combine outputs without adding noise. This is where Cline’s upgrades shine: Focus Chains keep each agent scoped, and Context Summarization keeps them aligned, resulting in faster convergence, tidier diffs, and fewer regressions on kNES.
Additionally, when I need to prepare a shareable bundle of the codebase for web chats, detroittommy879/aicodeprep-gui turned out to be a decent tool for the job.
This month, I tested Claude Code agents. I made two agents with almost the same prompts but different expertise: one acted as a TypeScript expert that does deep code reviews and cares about type safety in the codebase, and the other was more of a React developer. In the main context, I asked Claude Code to orchestrate their work and run a review loop: the React agent changes the code, and the TypeScript agent reviews it and requests changes if needed.
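To make the loop concrete, here is a hypothetical before/after of the kind of change it tends to produce (the component and prop names are invented for illustration, not taken from the real project): the React agent drafts a component with loosely typed props, and the TypeScript reviewer asks for explicit types before approving.

```tsx
import * as React from "react";

// Draft from the "React developer" agent: it renders, but the props are untyped.
export function OrderSummary(props: any) {
  return <div>{props.items.length} items, total {props.total}</div>;
}

// After the "TypeScript reviewer" agent's pass: explicit prop types and no `any`.
interface OrderItem {
  id: string;
  name: string;
  price: number;
}

interface OrderSummaryProps {
  items: OrderItem[];
  total: number;
}

export function OrderSummaryReviewed({ items, total }: OrderSummaryProps) {
  return (
    <div>
      {items.length} items, total {total.toFixed(2)}
    </div>
  );
}
```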
Where AI helped me the most
I noticed this method gave me more consistent and better-quality code. Because automatic summarization does not destroy the agents' prompts, I didn’t need to keep reminding the LLM about type safety, quality, or the general rules.
Another tip I can share is having a “scribe” agent. I tell it to keep a log of all the work done. Thanks to that, restoring context when starting a new session is much easier.
Over the last few weeks, I've been working on a handful of proofs of concept within the greenfield part of a larger project. From the very beginning, I've used Cursor (agentic mode) to create the scaffolding and working prototypes of new services, prompting my way from a rough application shell to a nearly complete solution.
Where AI helped me the most
That approach let me iterate rapidly and present a result to both the tech team and the business, at a speed I hadn't seen before - several hours for something that would probably have taken me up to a few days if crafted manually.
It's not without issues, though.
Where AI still fails
First, it's very tempting to go down the rabbit hole of the solution coded by the AI agent and quickly lose track of good coding practices or architectural patterns. The agent will most likely not care about things like "ports and adapters" unless explicitly told to. By default, it will just try to produce a working solution, with little attention to established patterns. That's fine for a proof of concept, but not the way to go for production code.
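For the record, this is roughly what I mean by "ports and adapters" - a minimal TypeScript sketch (names invented for illustration) where the core logic depends only on an interface it owns, and the infrastructure detail lives in a separate adapter. An agent won't structure code this way unless the prompt demands it.

```typescript
// Port: the dependency the core logic needs, expressed as an interface it owns.
interface Order {
  id: string;
  status: "new" | "confirmed";
}

interface OrderRepository {
  findById(id: string): Promise<Order | undefined>;
  save(order: Order): Promise<void>;
}

// Core use case: depends only on the port, knows nothing about HTTP or databases.
class ConfirmOrder {
  constructor(private readonly orders: OrderRepository) {}

  async execute(orderId: string): Promise<void> {
    const order = await this.orders.findById(orderId);
    if (!order) throw new Error(`Order ${orderId} not found`);
    await this.orders.save({ ...order, status: "confirmed" });
  }
}

// Adapter: one concrete implementation of the port (here, just in memory).
class InMemoryOrderRepository implements OrderRepository {
  private readonly store = new Map<string, Order>();

  async findById(id: string): Promise<Order | undefined> {
    return this.store.get(id);
  }

  async save(order: Order): Promise<void> {
    this.store.set(order.id, order);
  }
}
```

Swapping the in-memory adapter for a real database implementation then doesn't touch ConfirmOrder at all.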
Second, one still has to review the code carefully. It happened to me several times that when the coding agent ran into an issue it could not solve easily, after a few iterations it would resort to what I call "unfastening the seatbelts": working around compiler errors (TypeScript folks: x as any, YOLO!) and linter issues (disabling eslint for the next line... or the entire file!). However, such cases can mostly be prevented with Cursor rules.
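As an illustration (the snippet is invented, not lifted from the actual PoC), this is the kind of seatbelt-unfastening I mean, next to the fix a reviewer should insist on:

```typescript
const rawInput = '{"name":"Ada","email":"ada@example.com"}';

// What the agent sometimes falls back to: silence the tools instead of fixing the types.
// eslint-disable-next-line @typescript-eslint/no-explicit-any
const user = JSON.parse(rawInput) as any;
console.log(user.nmae); // typo still compiles and silently prints undefined

// What a careful review asks for instead: model the shape and validate it.
interface User {
  name: string;
  email: string;
}

function parseUser(raw: string): User {
  const parsed: unknown = JSON.parse(raw);
  if (
    typeof parsed === "object" &&
    parsed !== null &&
    typeof (parsed as { name?: unknown }).name === "string" &&
    typeof (parsed as { email?: unknown }).email === "string"
  ) {
    return parsed as User;
  }
  throw new Error("Invalid user payload");
}

console.log(parseUser(rawInput).name); // "Ada"
```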
Third, "you are so right". Have you ever caught the AI producing utterly dumb output, be it code or an explanation of how certain things work? The moment you tell it it's wrong (and why), the very first line of its response is most likely going to be along the lines of "certainly, you are right (...)". This is always a "bruh" moment for me - why not admit up front that it just didn't know?
These are just a few cases. Don't get me wrong, though - the AI companion is a significant, positive change in developers’ work. It's just that, as the old saying goes, "with great power comes great responsibility", and I think that line is more relevant now than ever.
Tried agentic AI or prompt engineering on a real task? We’d love your story for the next issue. Drop it in the comments or ping us on your channel of choice. We’ll feature the most interesting submissions in the series 🚀
Contact page
Socials: X, Mastodon, Bluesky