I’ve always claimed there’s no better way to learn than by building something from scratch… and the second-best way is reading through someone else’s code 😁.
At VirtusLab, we recently had a sobering thought - our “collection” of starred projects had grown to massive proportions, without bringing real value to us or the wider community. So we decided to change that: add a bit of regularity and become chroniclers of these open-source gems. That way, we’ll better understand them and discover the ones where we can actually contribute.
Every Wednesday, we pick one trending repository from the past week and give it attention by preparing a tutorial, article, or code review – learning from its creators in the process. We focus on whatever piques our interest: it could be a tool, a library, or anything the community deems worth publishing. One simple rule applies – it has to be a new or lesser-known project, not the big, widely recognized giants that rack up thousands of stars after a major update.
Today, we’re diving into a fresh project from GitHub engineers: github/spec-kit.
Introduction: The End of “Vibe Coding”
We’ve all been there. That frustrating yet strangely familiar dance with an AI assistant we call “vibe coding.” It starts with a simple prompt, followed by a series of increasingly desperate attempts to clarify what we actually want to achieve. The model generates code that looks correct, compiles, but subtly misses the point - ignoring key architectural constraints or simply failing to grasp the broader project context. It’s a chaotic, unpredictable process that doesn’t scale well.

In response to this chaos, Spec-Driven Development (SDD) was born—a disciplined, engineering-first approach that doesn’t reject AI but instead seeks to harness its power in a predictable way. Rather than treating AI as a magical black box, SDD forces us back to the roots of good engineering craft: precisely defining requirements before writing a single line of code.
github/spec-kit is GitHub’s official open-source toolkit designed to put this methodology into practice. It is not a language model, but a carefully curated collection of templates, scripts, and a command-line interface (CLI) that works with a variety of AI agents, such as GitHub Copilot, Claude, and Gemini.
In this article, we’ll break spec-kit down to its core components. We’ll explore its philosophy, architecture, and the powerful engineering patterns behind it to understand how GitHub is trying to transform chaotic “vibe coding” into a structured software development process.
Philosophy: From Code as Truth to Intent as Truth
To understand why spec-kit was created, we need to step back and look at the fundamental problem it solves. Traditionally, in software engineering, the ultimate source of truth has been the code. Documentation is often outdated, and the original business requirements get lost in the maze of implementation decisions. spec-kit proposes a radical shift in this perspective: the source of truth should not be the code itself, but the durable, versioned, and human-readable intent behind that code.
This project is a direct response to the limitations of spontaneous prompting. It addresses the problem of mismatched assumptions and lack of shared context, which plague development teams. By forcing requirements to be clearly defined from the very beginning, it creates a verifiable contract describing how the code should behave. This contract—the specification—becomes the “lingua franca” of the entire process, reducing ambiguity, guesswork, and errors when the AI agent moves into implementation.
This paradigm shift is the heart of Spec-Driven Development. It marks the transition from treating a specification as a one-off artifact to making it “executable”—a document that directly drives code generation, testing, and validation.
The following table synthesizes this fundamental shift, contrasting the chaotic ad-hoc approach with the structured methodology of SDD. The comparison makes it clear that SDD is not a cosmetic fix, but a profound change in workflow, the developer’s role, and the very nature of project artifacts.
| Aspect | "Vibe Coding" (Ad-Hoc Approach) | Spec-Driven Development (with Spec-Kit) |
|---|---|---|
| Source of Truth | The developer’s fleeting thought; the last typed prompt. | Versioned spec.md and plan.md files. |
| Process | Unstructured, iterative trial and error. | A closed, four-phase process: Specify → Plan → Tasks → Implement. |
| Outcome | Often unpredictable, non-idiomatic code that just “looks right.” | Verifiable, consistent code that respects architectural constraints. |
| Developer’s Role | Prompt engineer, AI output debugger. | Architect, specification author, and validator of AI-generated artifacts. |
| Scalability | Hard to scale beyond small tasks; context is easily lost. | Designed for entire features and projects; context is managed and preserved. |
Architecture: A Four-Phase Symphony of Creation
At first glance, one might look for spec-kit’s architecture in its Python source code. Yet the true innovation and structure of this project do not lie in complex classes or modules, but in the rigorously defined, sequential workflow it enforces. The methodology is the architecture. The CLI tool is merely the orchestrator of this process, creating a tangible, auditable trail from the high-level intent (spec.md) all the way to the concrete implementation. This mirrors traceability, a classic goal of mature software engineering disciplines, now applied to the world of AI.

This process is divided into four closed phases. The key principle: you never advance to the next stage until the current one has been fully validated by a human. This is the main control mechanism that brings order to an otherwise potentially chaotic interaction with AI.
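The gating rule can be pictured as a tiny state machine. The sketch below is only an illustration of the principle, not spec-kit's implementation: the `Workflow` class and its methods are hypothetical names, and "approval" stands in for whatever human review a team actually performs.

```python
from dataclasses import dataclass, field

# Phase names follow the article; the ordering is what enforces the gate.
PHASES = ["specify", "plan", "tasks", "implement"]


@dataclass
class Workflow:
    """Tracks which phases a human reviewer has approved (hypothetical helper)."""

    approved: set = field(default_factory=set)

    def approve(self, phase: str) -> None:
        if phase not in PHASES:
            raise ValueError(f"unknown phase: {phase}")
        self.approved.add(phase)

    def can_start(self, phase: str) -> bool:
        # A phase may start only when every earlier phase has been approved.
        index = PHASES.index(phase)
        return all(earlier in self.approved for earlier in PHASES[:index])


workflow = Workflow()
print(workflow.can_start("plan"))  # blocked: "specify" is not approved yet
workflow.approve("specify")
print(workflow.can_start("plan"))  # allowed once the spec has been signed off
```

The point of the design is that the check is structural: skipping ahead is impossible by construction rather than discouraged by convention.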
Phase 1: Specify – Defining the "What" and the "Why"
Everything begins with capturing the essence of the problem. This phase is about framing requirements from the user’s perspective. It does not deal with the tech stack or application design. Instead, it focuses on goals, anti-goals (what we deliberately exclude), personas, user journeys, and acceptance criteria.
- Artifact: The main product of this phase is the spec.md file. This is not a static document but a “living artifact” that evolves with the project.
- Process: The developer runs the /specify command with a general description of the functionality. Guided by spec-kit templates, the AI agent generates a detailed specification. The engineer’s role is to verify, refine, and approve it.
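Part of the "verify" step can be made mechanical. As a rough sketch, a reviewer could check that a draft specification covers the elements listed above before reading it in detail. The section headings and the `missing_sections` helper are assumptions made for illustration; spec-kit's own templates may name and organize these sections differently.

```python
import re

# Section names mirror the elements named in the text; the headings are assumed.
REQUIRED_SECTIONS = [
    "Goals",
    "Anti-goals",
    "Personas",
    "User journeys",
    "Acceptance criteria",
]


def missing_sections(spec_markdown: str) -> list:
    """Return required sections that have no matching Markdown heading."""
    headings = {
        match.group(1).strip().lower()
        for match in re.finditer(r"^#+[ \t]+(.+)$", spec_markdown, flags=re.MULTILINE)
    }
    return [name for name in REQUIRED_SECTIONS if name.lower() not in headings]


# Illustrative draft, not taken from the spec-kit documentation.
DRAFT = """\
# Taskify
## Goals
Let a small team track work on a shared board.
## Acceptance criteria
- A user can move a card between columns.
"""

print(missing_sections(DRAFT))
```

A check like this only confirms that a section exists, not that its content is any good, so it complements human review rather than replacing it.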
Phase 2: Plan – Designing the "How"
Once the specification is validated, the focus shifts to technical considerations. In this phase, the architecture, tech stack, data models, API contracts, and non-functional requirements are defined. Here, architectural decisions are codified in a machine-readable way.
- Artifacts: A set of documents, such as plan.md, data-model.md, and api-spec.json, are created. These files represent the technical plan for implementation.
- Process: The developer runs the /plan command. Using the approved spec.md as context, the AI agent proposes a detailed technical plan. Once again, human verification and refinement are critical.
Phase 3: Tasks – Decomposing the Plan
A high-level technical plan is too large to hand over to an AI agent for a one-shot implementation. The third phase breaks it down into small, atomic, verifiable, and executable tasks.
- Artifact: The result is a tasks.md file or an equivalent task list.
- Process: The /tasks command analyzes plan.md and generates a granular list of steps to structure the implementation process. Each task should be small enough that its output (the generated code) is straightforward to review.
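To make the idea of an atomic task list concrete, here is a minimal sketch that reads a checklist and picks the next unfinished item. The line format (`- [ ] T001 ...`) and the helper names are assumptions chosen for the example; the actual layout of a generated tasks.md may differ.

```python
from __future__ import annotations

import re
from dataclasses import dataclass

# Assumed line shape: "- [ ] T001 Create the project skeleton"
TASK_LINE = re.compile(r"^- \[(?P<done>[ xX])\] (?P<id>T\d+) (?P<title>.+)$")


@dataclass
class Task:
    id: str
    title: str
    done: bool


def parse_tasks(markdown: str) -> list[Task]:
    """Collect checklist items; lines that do not match are ignored."""
    tasks = []
    for line in markdown.splitlines():
        match = TASK_LINE.match(line.strip())
        if match:
            tasks.append(Task(match["id"], match["title"], match["done"] != " "))
    return tasks


def next_task(tasks: list[Task]) -> Task | None:
    """Return the first unfinished task, or None when everything is done."""
    return next((task for task in tasks if not task.done), None)


# Illustrative content only.
SAMPLE = """\
# Tasks
- [x] T001 Create the project skeleton
- [ ] T002 Add the task data model
- [ ] T003 Expose a list endpoint
"""

tasks = parse_tasks(SAMPLE)
print(next_task(tasks))
```

Handing the agent one item at a time keeps each generated change small enough to review in isolation.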
Phase 4: Implement – AI-Guided Execution
Only now, with solid foundations in the form of the specification, plan, and task list, does actual coding begin. Yet the bulk of the code is not written by the developer.
- Process: The engineer systematically feeds the AI agent with individual tasks from the list. With full context from the earlier phases, the agent generates code for each small, well-defined problem. The developer’s role is to verify the generated code, run tests, and integrate it with the rest of the project.
Patterns and Techniques: The Engineer’s Toolbox
Let’s now move from the philosophy and process architecture to the concrete patterns and techniques that make spec-kit such a powerful tool. This is where the real value lies for the engineer who wants to understand how everything works under the hood.
Pattern 1: Intent as Code – Declarative programming through spec.md
spec-kit treats natural language written in Markdown files as a form of declarative programming. Instead of writing imperative code that tells the computer how to do something, the developer declares the desired outcome in spec.md. They describe user journeys and acceptance criteria, and the system (human + AI) is responsible for translating this declaration into working code.
This is analogous to the declarative prompt style from the LangExtract example a few editions ago, but applied at a much higher level of abstraction—not just to data extraction, but to the entire software development process.
Below is a hypothetical yet realistic excerpt of a spec.md file for the “Taskify” application, inspired by examples from the documentation:
Pattern 2: Architectural Guardrails – Persistent Context with constitution.md
One of the biggest challenges when working with LLMs is their limited memory and tendency toward “context drift”—forgetting the initial instructions during long conversations. spec-kit solves this problem elegantly with the memory/constitution.md file.
This file serves as a mechanism for defining stable, non-negotiable rules and constraints for the entire project. These might include rules such as “always use .NET Aspire and Postgres”, “all API endpoints must have unit tests”, or “the interface must comply with our design system”. The file is automatically attached to the context in key phases (especially Plan), acting as a set of guardrails that keep the AI agent on track.
This is a powerful architectural pattern. Instead of relying on the model to “remember” critical decisions, we externalize them into a durable, versioned document. constitution.md becomes a form of long-term memory for the AI agent—a fixed anchor that prevents it from drifting off course due to the temporary context of a single task.
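The externalized-memory idea can be sketched in a few lines: rather than trusting the model to recall the rules, the rules are read from disk and placed in front of every request. The `build_prompt` function and the prompt layout are hypothetical, and a temporary file stands in for memory/constitution.md; how spec-kit actually injects the file is not shown here.

```python
import tempfile
from pathlib import Path


def build_prompt(constitution_path: Path, task: str) -> str:
    """Prefix a task with the project's fixed rules, read fresh from disk."""
    rules = constitution_path.read_text(encoding="utf-8").strip()
    return f"## Project rules (non-negotiable)\n{rules}\n\n## Current task\n{task}\n"


# Demo: a temporary file stands in for memory/constitution.md.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "constitution.md"
    path.write_text("- All API endpoints must have unit tests\n", encoding="utf-8")
    prompt = build_prompt(path, "Add the /tasks endpoint")

print(prompt)
```

Because the rules are re-read on every call, editing the versioned file changes the agent's constraints everywhere at once.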
Example content of constitution.md:
Specification generation
Task generation
Plan generation
Task generation
Beneath this simple surface, the CLI manages the entire directory structure (.github, docs, memory, specs, templates), ensuring that the right files are created in the right places and that context flows correctly between the different phases.
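As a rough picture of that bookkeeping, the sketch below creates the directories named above under a given root. It is an assumption-laden illustration, not the CLI's code: the `scaffold` function is hypothetical, and the real tool may create a different layout depending on the version and the chosen agent.

```python
import tempfile
from pathlib import Path

# Directory names as listed in the text; the exact layout may vary.
DIRECTORIES = [".github", "docs", "memory", "specs", "templates"]


def scaffold(root: Path) -> list:
    """Create the directories if they are missing and return their paths."""
    created = []
    for name in DIRECTORIES:
        path = root / name
        path.mkdir(parents=True, exist_ok=True)  # safe to run repeatedly
        created.append(path)
    return created


# Demo in a throwaway directory so nothing is written to a real project.
with tempfile.TemporaryDirectory() as tmp:
    paths = scaffold(Path(tmp))
    names = sorted(p.name for p in paths if p.is_dir())

print(names)
```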
Pattern 4: Conversation Templating – The Prompt Engineering Engine
At the heart of spec-kit’s flexibility and its support for multiple agents is the templates directory. Instead of hardcoding prompts, the project uses an advanced templating system. It includes separate template packages for different AI agents (Claude, Copilot, Gemini) and even for different shell environments (POSIX and PowerShell).
Each template file (e.g., spec-template.md) contains markers and instructions that the CLI dynamically fills with context from constitution.md, the user’s prompt, and the contents of existing files, before sending the final, precisely constructed prompt to the language model.
This pattern illustrates a mature approach to prompt engineering. It treats prompts not as one-off incantations, but as versioned, reusable, and parameterizable assets. This transforms prompt engineering from an art into a repeatable engineering discipline.
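A parameterizable prompt can be approximated with Python's standard `string.Template`. This is a simplified stand-in for the idea, not spec-kit's template format: the template text, the placeholder names, and `render_prompt` are all invented for the example.

```python
from string import Template

# Hypothetical template; real spec-kit templates use their own markers.
SPEC_TEMPLATE = Template(
    "Follow these project rules:\n$constitution\n\n"
    "Write a specification for:\n$feature\n"
)


def render_prompt(constitution: str, feature: str) -> str:
    """Fill the placeholders; substitute() raises KeyError if one is missing."""
    return SPEC_TEMPLATE.substitute(constitution=constitution, feature=feature)


prompt = render_prompt(
    constitution="- All API endpoints must have unit tests",
    feature="A task board for the Taskify application",
)
print(prompt)
```

Keeping the template as a separate, versioned asset means a wording change can be reviewed in a diff like any other code change.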
Collision with Reality: Community Feedback and Practical Limitations
No expert report would be complete without a look at the practical realities of using the tool. spec-kit is not a silver bullet, and its adoption comes with certain challenges - clearly visible in GitHub community discussions.
- Overkill for small tasks: Many users point out that for minor changes or bug fixes, going through the full four-phase SDD process feels bureaucratic and inefficient. The tool shines brightest when building new features or refactoring large parts of a system.

- Context is king (and it’s hard): The effectiveness of spec-kit depends heavily on the developer’s ability to manage context. If the specification is unclear or the plan too vague, the AI agent will generate chaotic code or unnecessary files. The responsibility for providing precise, well-filtered context rests with the developer.
- Redefining work, not eliminating it: spec-kit doesn’t reduce the amount of work - it changes it. Time spent writing implementation code drops drastically, but the time invested in planning, writing specifications, and validation increases. The balance shifts from ~80% coding to ~50% planning, 20% coding, and 30% verification. Not every team may be ready for this change.
- Still experimental: GitHub itself labels spec-kit an “experiment.” The project is under active development, with frequent releases and an engaged community reporting issues and suggesting improvements. It is not yet a fully mature, polished product.
Key Takeaways: The Evolution of the Engineer’s Role in the AI Era
Analyzing spec-kit offers invaluable lessons about what the future of our profession might look like. It is not just a tool, but a manifesto for a new way of thinking about software creation.
- From coder to architect: The engineer’s primary role shifts from writing implementation details to defining intent, architecture, and constraints. The most valuable activity becomes crafting a crystal-clear spec.md and a thoughtful constitution.md.
- The power of staged refinement: spec-kit embodies a fundamental principle of solving complex problems: break the big problem into smaller ones, validate each step, and use automation (in this case, AI) to execute them. It replaces the risky “one-shot” strategy with iterative confidence-building.
- Reliability engineering for AI: SDD can be seen as a form of reliability engineering applied to the AI-driven development process. Closed phases, explicit checkpoints, and persistent context are mechanisms designed to tame the nondeterministic nature of LLMs and ensure predictable, high-quality results.
- Prompt engineering as a discipline: This project exemplifies how prompts should be treated as first-class engineering artifacts—versioned, templated, and managed with the same care as source code.
Conclusion: The Future is Specified
github/spec-kit and the Spec-Driven Development methodology are far more than just another gadget for programmers. They represent a bold step toward a mature, predictable, and scalable way of building software with the help of artificial intelligence. The project demonstrates how the raw power of language models can be combined with the discipline and rigor of traditional software engineering.
spec-kit suggests that the future of software engineering may lie less in mastering a specific programming language and more in the art of conducting precise, structured, and verifiable dialogues with AI.
⭐ A fully deserved GitHub star from me 😉.





