Tags: spec-kit, spec-driven-development, ai-agent, claude-code, vibe-coding, ai

[Spec Kit Part 1] Why Spec-Driven Development — Moving Beyond the Limits of Vibe Coding

AI-generated code is fast, but it falls apart the moment you make your second request. We look at what led to Spec-Driven Development (SDD) and GitHub Spec Kit, both of which treat the spec as the source of truth.

Data Dynamics · June 11, 2026 · 10 min read

You've probably built a first screen in 30 seconds with an AI coding tool. It's exhilarating. But the moment you make a second request, the part that worked perfectly in the first one quietly breaks. By the third request, neither you nor the AI can explain what the code even looks like anymore. This post is about why that "wall at the second request" appears, and how the Spec-Driven Development (SDD) approach proposed by GitHub Spec Kit tears that wall down.

What you'll learn in this post

  • The 3 failure modes that make "vibe coding" collapse as a project grows
  • The core shift in SDD that makes the spec — not the code — the source of truth
  • The SDD workflow at a glance: Spec → Plan → Tasks → Implement
  • Where SDD meets and where it diverges from TDD, BDD, and prompt engineering
  • The example and roadmap we'll build together across this series (7 parts in total)

This is Part 1 of the Spec Kit series. The series as a whole follows a single example project from 0 to 1 along the SDD workflow.


1. The Thrill, and What Comes After

The term "vibe coding" refers to a style where you toss your intent at the AI in natural language and it spits out code on its own. For prototypes, one-off scripts, and weekend hackathons, it's hard to beat. The problem is that this approach is stateless. Every request depends only on the immediately preceding conversational context, and that context grows blurrier the longer it gets.

In small projects, this isn't a problem. But the moment the codebase grows, requirements pile up, and you come back a few days later to add a feature — the moment it becomes "real software" — vibe coding starts to fall apart in three ways.

Failure Mode 1 — Context Loss

An AI's context window is finite. As the conversation grows long, the decisions you agreed on early ("auth via JWT," "errors in this format") get pushed out of the window. You say "the way we decided earlier," but for the model "earlier" no longer exists. As a result, you either re-explain the same decisions every time, or you forget to, and consistency breaks.
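
To make the mechanism concrete, here is a minimal sketch that assumes a window counted in messages rather than tokens. The `visible_context` helper, the window size of 4, and the sample conversation are all hypothetical; real agents manage context in more sophisticated ways, but the effect on early decisions is similar.

```python
from collections import deque

# Assumed size for illustration; real windows are measured in tokens, not messages.
WINDOW_SIZE = 4


def visible_context(messages, window_size=WINDOW_SIZE):
    """Return the messages that still fit in a fixed-size window."""
    window = deque(maxlen=window_size)
    for message in messages:
        # Once the window is full, appending drops the oldest entry.
        window.append(message)
    return list(window)


# Hypothetical conversation: two early decisions, then a run of requests.
conversation = [
    "decision: auth via JWT",
    "decision: errors use a shared format",
    "request: build the login screen",
    "request: add a settings page",
    "request: add password reset",
    "request: do auth the way we decided earlier",
]

context = visible_context(conversation)
jwt_still_visible = any("JWT" in message for message in context)
print(jwt_still_visible)  # False: the JWT decision was pushed out of the window
```

Both decisions at the start of the conversation are gone by the time the last request refers back to them.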

Failure Mode 2 — Consistency Drift

Even with the same intent, the output changes if the wording of the prompt shifts even slightly. The AI that created columns in snake_case yesterday creates them in camelCase today. Folder structure, error-handling patterns, and naming conventions drift subtly with each request, and as these tiny discrepancies accumulate, the codebase starts to look "as if several people wrote it separately."
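
Drift of this kind is easy to detect once you look for it. The following is a rough sketch, not a real linter: the `naming_styles` helper, the two regular expressions, and the column names are assumptions made up for this example.

```python
import re

# Simplified patterns for illustration; single-word names match neither
# pattern and are treated as ambiguous.
SNAKE_CASE = re.compile(r"^[a-z]+(_[a-z0-9]+)+$")
CAMEL_CASE = re.compile(r"^[a-z]+([A-Z][a-z0-9]*)+$")


def naming_styles(column_names):
    """Return the set of naming styles found among the given names."""
    styles = set()
    for name in column_names:
        if SNAKE_CASE.match(name):
            styles.add("snake_case")
        elif CAMEL_CASE.match(name):
            styles.add("camelCase")
    return styles


# Hypothetical columns: some generated "yesterday", one generated "today".
columns = ["user_id", "created_at", "lastLogin", "email"]
styles = naming_styles(columns)
print(sorted(styles))  # ['camelCase', 'snake_case']
```

More than one style in the result means the convention has already drifted. The check only works because someone wrote the convention down, which is the point the rest of this post builds toward.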

Failure Mode 3 — Unverifiability

This is the most critical problem. When the spec lives only in someone's head (or in a conversation that has since scrolled away), there is no reference point for judging "does this code work correctly?" The AI may produce plausible-looking code, but there's no document to check it against to see whether it satisfies the original intent. Reviews come to rely on "does it feel okay," and a bug can no longer be called a bug once someone asks, "wait, were we supposed to do it that way?"

The real problem with vibe coding isn't code quality — it's that there is no source of truth. When there's nowhere written down stating what's correct, the faster you build, the faster you lose your way.


2. The Shift in Thinking — The Spec, Not the Code, Is the Truth

In traditional development, a spec document is usually written once and forgotten. It's merely a starting line you reference before writing code; once the code is finished, the document goes stale and only the code survives. In other words, the code is the source of truth and the spec is its shadow.

Spec-Driven Development inverts this relationship.

In SDD, the spec is a living document. To change a feature, you change the spec first, not the code, and the code is then regenerated and verified (with the AI's help) against that spec. This structurally addresses the three failure modes we saw earlier.

| Failure Mode | The SDD Solution |
|---|---|
| Context Loss | Every decision is recorded in a spec file, so it never disappears out of the context window |
| Consistency Drift | A constitution and the spec enforce the same standard at every step |
| Unverifiability | A reference for cross-checking (analyze) code against the spec always exists |

The point isn't "let's write more documents." It's that to get the AI to write good code, you have to lock the intent you give it into a form a human can review. The spec is a contract shared by humans and AI.
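
Once intent is locked into files, part of the review can even become mechanical. The sketch below checks that every requirement in a spec is referenced by at least one task. The FR-### ID scheme, the sample spec and task text, and the `uncovered_requirements` helper are all assumptions for illustration; this is not how /speckit.analyze works internally.

```python
import re

# Hypothetical spec and task list; the FR-### ID scheme is assumed for this sketch.
SPEC_TEXT = """
FR-001: Alert when a pipeline's data is stale.
FR-002: Alert when row counts drop unexpectedly.
FR-003: Send alerts to a configurable channel.
"""

TASKS_TEXT = """
T001 Implement freshness check (FR-001)
T002 Implement alert delivery (FR-003)
"""

REQUIREMENT_ID = re.compile(r"FR-\d{3}")


def uncovered_requirements(spec_text, tasks_text):
    """Return requirement IDs that appear in the spec but in no task, sorted."""
    specified = set(REQUIREMENT_ID.findall(spec_text))
    covered = set(REQUIREMENT_ID.findall(tasks_text))
    return sorted(specified - covered)


missing = uncovered_requirements(SPEC_TEXT, TASKS_TEXT)
print(missing)  # ['FR-002']
```

The check itself is trivial. What matters is that it has something to run against, which a conversation that has scrolled away cannot provide.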


3. GitHub Spec Kit — A Toolbox for SDD

Spec Kit is an open-source toolkit released by GitHub, and it consists of two pillars that make SDD actually runnable.

  1. The specify CLI — a bootstrap tool that installs SDD scaffolding into your project. It downloads and sets up templates matching the AI agent you use (Claude Code, Copilot, Gemini, etc.).
  2. A set of slash commands — workflow commands you invoke inside your AI agent in the form /speckit.*. Each command handles one stage of SDD.

What's especially notable is that Spec Kit supports more than 30 AI coding agents, so switching agents does not lock you in. Because specs, plans, and tasks remain as text files, your assets stay intact even when you change tools. (In this series, we'll do our hands-on work based on the Claude Code integration familiar to our blog's readers.)

The SDD Workflow at a Glance

The core flow Spec Kit defines is Spec → Plan → Tasks → Implement, with quality gates attached before and after.

Here is each command's role in a single line. (Detailed hands-on coverage comes in Parts 3–6 of the series.)

| Stage | Command | One-line Summary |
|---|---|---|
| Constitution | /speckit.constitution | Establishes the principles that run through the project: code quality, testing, UX, performance, etc. |
| Specify | /speckit.specify | Leaves out the tech stack and focuses on the what/why (requirements, user stories) |
| Clarify | /speckit.clarify | Fills the gaps in the spec with a sequence of questions |
| Plan | /speckit.plan | Designs the how: tech stack, architecture, and so on |
| Tasks | /speckit.tasks | Breaks the plan down into a dependency-ordered task list |
| Analyze | /speckit.analyze | Cross-checks for contradictions and omissions among spec, plan, and tasks |
| Checklist | /speckit.checklist | Generates a tailored quality checklist that verifies requirements and clarity |
| Implement | /speckit.implement | Executes the tasks in order to produce actual code |
| Converge | /speckit.converge | Compares artifacts against the codebase and recovers remaining work as tasks |
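
The "dependency-ordered task list" in the Tasks row can be pictured as a topological sort. The sketch below uses Python's standard `graphlib` module (Python 3.9+). The task names and their dependencies are made up for illustration, and this is not a description of how /speckit.tasks works internally.

```python
from graphlib import TopologicalSorter

# Hypothetical tasks: each key maps to the set of tasks it depends on.
task_dependencies = {
    "T001 define alert schema": set(),
    "T002 implement freshness check": {"T001 define alert schema"},
    "T003 implement alert delivery": {"T001 define alert schema"},
    "T004 wire checks to delivery": {
        "T002 implement freshness check",
        "T003 implement alert delivery",
    },
}


def dependency_order(dependencies):
    """Return the tasks in an order where each task follows its dependencies."""
    return list(TopologicalSorter(dependencies).static_order())


ordered = dependency_order(task_dependencies)
for task in ordered:
    print(task)
```

T001 always comes first and T004 always comes last. T002 and T003 do not depend on each other, so either may come before the other.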

Key insight: SDD does not "generate it all in one shot." The very structure of splitting it into multiple stages and having a human verify at each one (multi-step refinement) is the source of quality.


4. Where SDD Meets and Where It Diverges

When you first encounter SDD, you might wonder, "isn't this just TDD/BDD?" or "isn't this just writing good prompts?" Placing it on the map reduces the confusion.

| Approach | Source of Truth | Emphasis | Relationship to SDD |
|---|---|---|---|
| Prompt engineering | None (volatile conversation) | Quality of a single response | SDD is the higher-level system that locks prompts into files so they're reusable and verifiable |
| TDD | Test code | Pins down "is the behavior correct" in code | SDD derives tests and tasks from the spec (mutually complementary) |
| BDD | Scenarios (Given/When/Then) | Describes behavior in natural language | Shares the same grain as the user stories in an SDD spec; SDD connects planning and execution to it |
| SDD | The spec file | Consistent traceability from intent → plan → tasks → code | The umbrella that ties the three above together into an AI workflow |

In short, SDD doesn't so much replace existing techniques as orchestrate them. It's closer to an operating system that threads the reproducibility of prompts, the verifiability of TDD, and the readability of BDD onto a single reference point — the spec — so that AI agents work consistently.


5. Where SDD Fits Well and Where It's Overkill

Not every task needs a constitution from the start. To introduce a tool honestly, you have to state its limits too.

  • Good fit: features you'll revise multiple times, code a team handles together, production features with ambiguity in the requirements, work spanning days to weeks. In other words, anything where a "second request" is sure to come.
  • Potential overkill: throwaway scripts, 10-line tasks where reading the code is faster than reading a spec, the earliest stages of exploratory prototyping.

The stages of SDD are quality gates, not mandatory tolls. Spec Kit itself recommends /speckit.constitution, /speckit.clarify, and /speckit.checklist as "gates you turn on for work with meaningful ambiguity." Knowing when to turn these gates on and off according to the weight of the work is exactly the skill of using SDD well.


6. What We'll Build Together in This Series

Concepts alone don't stick. So this series builds a single example project — a real-time data quality monitoring service — from start to finish along the SDD workflow. It monitors the freshness, integrity, and anomalies of data pipelines and sends alerts — an example close to our domain.

| Part | Topic | Progress on the Example |
|---|---|---|
| Part 1 (this post) | Why SDD | Problem definition and the big picture |
| Part 2 | Getting Started with Spec Kit | specify install & init, Claude Code integration, project structure |
| Part 3 | Constitution | Establishing a project constitution for a data team |
| Part 4 | Specify & Clarify | Specifying the monitoring service's requirements and removing ambiguity |
| Part 5 | Plan & Tasks | Architecture design, task breakdown, and consistency analysis |
| Part 6 | Implement & Converge | Actual implementation, GitHub issue integration, completion verification |
| Part 7 | A Hands-on Retrospective | A 0→1 end-to-end case study and a roundup of pitfalls |

Wrapping Up

Vibe coding is fast, but the faster it is, the faster it loses its way. The secret to not getting lost isn't a smarter AI — it's having a source of truth shared by AI and humans: a verifiable spec. Spec-Driven Development puts that spec at the center of the workflow, and GitHub Spec Kit gives you the toolbox to actually run that workflow.

In Part 2, we'll stop talking and start moving our hands. We'll install the specify CLI, lay down SDD scaffolding on an empty project, and get all the way to integrating with Claude Code — in 5 minutes.
