Tags: spec-kit, spec-driven-development, constitution, ai-agent, claude-code, ai

[Spec Kit Part 3] Constitution — Imprinting Your Team's Principles on the AI

The single file that flows into every Spec Kit phase: constitution.md. We walk through how a data engineering team writes the constitution for the dq-monitor project, and what separates a good principle from a useless one — with a full reusable example.

Data Dynamics · June 13, 2026 · 15 min read

In Part 2 we used the specify CLI to scaffold the dq-monitor project with SDD and wired it up to Claude Code. Now it's time to draw the first line on the blank canvas — and the curious thing about SDD is that the first line is neither the spec nor the code, but the constitution. The constitution is a file where you write down, once, what "good code" means for this project, and every /speckit.* phase that follows reads it before doing anything. This post is about writing that constitution well, grounded in the reality of a data engineering team.

What you'll learn in this post

Why the constitution (.specify/memory/constitution.md) is the highest-leverage file in the whole series

How to draft a constitution with the /speckit.constitution command

"Verifiable principles" vs "feel-good slogans" — what separates good principles from bad ones

A complete, ready-to-use constitution example for a data engineering team

How the constitution interlocks with the clarify, checklist, and analyze gates

This is Part 3 of the Spec Kit series. Part 1 covered the philosophy of SDD and Part 2 covered installation and project structure; Part 3 fills in the constitution, the first real step of the workflow.

1. What the constitution is, and why it governs every phase

The Spec Kit workflow flows in this order:

Constitution → Specification → Clarification → Planning
→ Task Breakdown → Implementation → Convergence

The constitution comes first. But it isn't merely "the first step" — it's the step that underlies all the others. When writing the spec, planning, breaking down tasks, and implementing code, the AI agent reads the constitution first and reasons on top of it.

The constitution file lives at this path:

.specify/memory/constitution.md

The word memory in that path is telling. The constitution is the project's long-term memory. In Part 1 we named "context loss" as the first failure mode of vibe coding: as the conversation grows, decisions agreed on early get pushed out of the context window. The constitution is the structural fix for exactly that. Pin decisions like "we annotate types 100% of the time" or "schemas never break silently" into a file, and those decisions survive all the way to the 50th request.

The constitution is loaded into every /speckit.* step. Spec files are created one per feature, plans fork per feature too — but the constitution is the single file that applies across all phases and all features. Change one line and its effect is multiplied across every downstream artifact. That is why it's the highest-leverage file in the entire series.

Concretely, the constitution holds "the principles that run through this project." There are four canonical axes:

Axis	What the constitution pins down	What happens if you don't
Code Quality	Style, typing, lint gates	The AI writes code with a different convention every request
Testing Standards	Coverage bar, which tests are mandatory	Code with tests and code without tests get mixed together
UX Consistency	Error format, tone and structure of logs and alerts	The same situation gets rendered differently every time
Performance	Latency and throughput budgets (SLA)	Only the word "fast" exists, with no bar to catch regressions

The key point is that the constitution defines not what to build but how well to build it. Functional requirements belong in the spec, which Part 4 covers. The constitution is the layer above: it sets the quality floor that must hold no matter which feature you build.

2. Running `/speckit.constitution`

You can write the constitution by hand, but Spec Kit provides a dedicated command that lets the AI agent draft it for you. Inside Claude Code, you invoke it like this:

/speckit.constitution

Run the bare command and the agent asks "which principles should I focus on?"; give it an accompanying instruction and it drafts immediately. The best practice is to provide a prompt that names the focus areas.

/speckit.constitution
 
Create principles focused on code quality, testing standards,
user experience consistency, and performance requirements.
We're a data engineering team building a real-time data quality
monitoring service (dq-monitor).

Given this prompt, the agent moves roughly through this sequence:

[Diagram: the agent's drafting sequence, with human review as step 4]

The point to stress here is step 4 (the human review). The agent's draft is a starting line, not a finish line. As we saw in Part 1, SDD quality comes not from "one-shot generation" but from "step-by-step human verification." The constitution is the first verification point, and any principle waved through here propagates its fuzziness to every downstream phase.

So when you receive the draft, the first question to ask is a single one: "Can a machine or a reviewer judge whether this principle was violated?" If the answer is "no," that principle needs a rewrite. That's exactly what the next section is about.

3. Good principles vs bad principles — verifiability is everything

The most common way a constitution fails is not the "wrong principle" but the "unverifiable principle." A sentence like "we pursue clean code" is something nobody disagrees with, and something nobody can violate either. A principle whose violation can't be judged does not constrain the AI at all: whatever code it emits, no counterexample ever proves "this isn't clean."

Good principles are specific and checkable. A human should be able to read one and split it into "pass/fail," and ideally a machine can judge it too. The table below shows the difference.

Area	Bad principle (vague, unverifiable)	Good principle (specific, verifiable)
Code Quality	"Code should be easy to read"	"Every public function has type annotations, and `ruff` and `mypy --strict` pass with zero warnings"
Testing	"Write enough tests"	"New/changed code has ≥80% line coverage; every data schema has a contract test"
Performance	"Alerts should be fast"	"p95 latency from anomaly detection to alert dispatch ≤5s; a single instance handles 10k events/sec"
Data Contracts	"Manage schemas well"	"Schema changes bump the version number; backward-incompatible changes can't merge without a migration note"
Observability	"Log things properly"	"All logs are structured JSON including `trace_id`. Every alert must be traceable back to its causal metric"
Security	"Care about security"	"Secrets never appear in code/logs and are injected only via env vars or a secret manager. Service accounts hold least privilege"

What the right-hand column shares is the presence of numbers, tool names, and explicit prohibitions. Expressions like p95 5s, 80% coverage, mypy --strict, and can't merge eliminate the gray zone. When the AI reads the constitution to write code, when a human reviews a PR, and when /speckit.checklist later generates items — all of them can judge against the same bar.
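
To see why a number removes the gray zone, here is a minimal sketch of how a budget like "p95 ≤5s" reduces to a pass/fail check. The helper names `percentile` and `within_budget` are hypothetical (they are not part of Spec Kit), and the nearest-rank percentile definition is an assumption; your metrics stack may compute percentiles differently.

```python
import math


def percentile(samples: list[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample such that at least
    pct% of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("percentile() needs at least one sample")
    ordered = sorted(samples)
    # Rank is 1-based; clamp to 1 so pct=0 still returns the minimum.
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def within_budget(latencies_s: list[float], p95_budget_s: float = 5.0) -> bool:
    """True when the p95 of the measured latencies fits the budget.

    The 5.0s default mirrors the example principle in the table above.
    """
    return percentile(latencies_s, 95) <= p95_budget_s
```

A principle like "alerts should be fast" has no equivalent function: there is nothing to compare against, so neither a reviewer nor a CI job can return `False`.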

Self-check when writing a principle: if the sentence doesn't immediately evoke a test case, a lint rule, or a PR review comment, the principle is still at the slogan stage. Make it one notch more concrete.
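
The self-check can even be approximated mechanically. The sketch below is a deliberately crude heuristic, assuming that a verifiable principle usually contains a number, a backticked tool name, or an explicit prohibition. The function names and the word list are hypothetical, and passing this check does not prove a principle is actually enforceable.

```python
import re

# Signals that tend to appear in checkable principles (an assumption,
# based on the "numbers, tool names, explicit prohibitions" rule of thumb).
_NUMBER = re.compile(r"\d")
_TOOL = re.compile(r"`[^`]+`")
_PROHIBITION = re.compile(
    r"\b(?:never|cannot|can't|must not|may not|forbidden)\b", re.IGNORECASE
)


def verifiability_signals(principle: str) -> list[str]:
    """Return which kinds of concrete signal the sentence contains."""
    signals = []
    if _NUMBER.search(principle):
        signals.append("number")
    if _TOOL.search(principle):
        signals.append("tool")
    if _PROHIBITION.search(principle):
        signals.append("prohibition")
    return signals


def looks_like_slogan(principle: str) -> bool:
    """A sentence with no concrete signal is probably still a slogan."""
    return not verifiability_signals(principle)
```

Run against the table above, "Code should be easy to read" is flagged as a slogan, while "New/changed code has ≥80% line coverage" is not. Treat a flag as a prompt to rewrite, not as a verdict.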

4. A complete constitution example for a data engineering team

Now let's look at a constitution, in full, that a data engineering team building dq-monitor could actually use. Below is a reusable asset you can drop straight into .specify/memory/constitution.md and adjust just the numbers to fit your team. (The "verifiability" principle from Section 3 is applied to every clause.)

# dq-monitor Constitution
 
This constitution is the top-level set of principles applied to all
specs, plans, tasks, and implementation of the real-time data quality
monitoring service (dq-monitor). Every `/speckit.*` phase generates and
verifies its artifacts against this document. On conflict between
clauses, the higher-priority (lower number) wins.
 
Version: 1.0.0
Ratified: 2026-06-23
Last Amended: 2026-06-23
 
## 1. Code Quality
 
- 1.1 Language standard: Service code uses Python 3.12+, and every public
      function, method, and module boundary carries type annotations.
- 1.2 Static gates: `ruff check` (lint), `ruff format --check` (format),
      and `mypy --strict` must all pass with zero warnings for CI to go
      green. Code that fails the gates cannot be merged.
- 1.3 Function complexity: A single function does not exceed cyclomatic
      complexity 10. If exceeded, decompose it or leave an explicit
      exception note.
- 1.4 Dependencies: Adding a new runtime dependency requires at least one
      line in the PR description giving the reason and the alternatives
      considered. It must come with a lock-file update.
 
## 2. Testing Standards
 
- 2.1 Coverage: Test coverage of new/changed lines is ≥80%. A PR that
      lowers overall coverage must state the reason.
- 2.2 Contract tests: Every input data schema and outbound alert payload
      has a contract test. Schemas are pinned by fixtures; if a fixture
      and the real code diverge, the test fails.
- 2.3 Determinism: Tests do not depend on external networks or the real
      wall clock. Time is an injectable clock; external systems are
      replaced by fakes/mocks.
- 2.4 Regression tests: A bug-fix PR must include a regression test that
      would have failed before the fix.
 
## 3. Data Contract Discipline
 
- 3.1 Schema versions: Every data schema carries an explicit version
      (semver).
- 3.2 No silent breaking change: A backward-incompatible schema change
      cannot merge without (a) a major version bump, (b) a migration note,
      and (c) an explicit approval label.
- 3.3 Compatibility first: Prefer additive changes (adding fields); field
      removal, type changes, and meaning changes go through a deprecation
      window.
- 3.4 Validation location: Data is schema-validated immediately at the
      system boundary (ingestion point). Records that fail validation are
      not silently dropped — they are dead-lettered and counted in metrics.
 
## 4. Observability
 
- 4.1 Structured logs: All logs are structured JSON, including at minimum
      `timestamp`, `level`, `service`, `trace_id`, and `message`.
- 4.2 Traceability: A single event-processing flow must be linkable from
      ingestion to alert by a single `trace_id`.
- 4.3 Metrics: Processing latency, queue backlog, anomaly counts, and
      alert dispatch results are exposed as metrics (e.g. Prometheus).
- 4.4 Traceable alerts: Every alert must be traceable back to the causal
      metric/record that triggered it. Unsubstantiated alerts are forbidden.
 
## 5. Performance / SLA
 
- 5.1 Alert latency budget: Latency from anomaly detection to alert
      dispatch stays within 5s at p95 and 10s at p99.
- 5.2 Throughput: A single worker instance reliably handles at least
      10,000 events/sec (without backpressure).
- 5.3 Performance regression gate: Changes that may affect performance
      attach benchmark results to the PR. Changes that break the 5.1/5.2
      budgets cannot merge.
- 5.4 Resource limits: Under normal load, worker memory stays under the
      configured ceiling; on overshoot the worker applies backpressure
      rather than running out of memory (OOM).
 
## 6. Security
 
- 6.1 Secrets: API keys, tokens, and passwords never appear in plaintext
      in code, config files, or logs. They are injected only via env vars
      or a secret manager.
- 6.2 Least privilege: Every service account/token holds only the minimum
      scope required. Broad (admin/wildcard) grants require justification
      in the PR.
- 6.3 Don't trust input: Data from outside is untrusted and is processed
      or stored only after validation and normalization (ties to 3.4).
- 6.4 Dependency security: Dependencies with known vulnerabilities (e.g.
      `pip-audit` high severity) must be resolved before merge or carry an
      explicit risk-acceptance record.
 
## Governance
 
- This constitution is a living document. It is reviewed quarterly or on
  major architectural change; on amendment, the version (semver) and the
  amendment date are updated.
- No spec, plan, or task may conflict with this constitution. If a
  conflict is unavoidable, amend the constitution first, then proceed.
- Principles must be verifiable. When adding a clause, also write "how a
  violation is judged."

The intent of this example isn't "copy it verbatim" but to show how each clause was written to be verifiable. Every item carries numbers, tools, or explicit prohibitions, and the governance section pins the constitution down as a living document rather than write-once. Your team just swaps the numbers (80%, p95 5s, 10k/sec) and tool names (ruff, mypy, pip-audit) to fit your stack.
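
To show what "verifiable" buys you in practice, here is a minimal sketch of how clause 3.2 (no silent breaking change) could be judged by a script. Everything in it is an assumption made for illustration: schemas are modeled as plain field-name-to-type dictionaries, and `is_breaking`, `major`, and `violates_clause_3_2` are hypothetical helpers, not part of Spec Kit. It only catches removed fields and changed types; a change in a field's meaning still needs a human reviewer.

```python
def is_breaking(old: dict[str, str], new: dict[str, str]) -> bool:
    """A change is backward-incompatible if any existing field was removed
    or changed type. Purely additive changes are compatible."""
    return any(name not in new or new[name] != typ for name, typ in old.items())


def major(version: str) -> int:
    """Major component of a semver string such as '1.4.0'."""
    return int(version.split(".")[0])


def violates_clause_3_2(
    old_schema: dict[str, str],
    old_version: str,
    new_schema: dict[str, str],
    new_version: str,
    *,
    has_migration_note: bool,
    has_approval_label: bool,
) -> bool:
    """True when a breaking change lacks any of the three conditions the
    clause names: major bump, migration note, approval label."""
    if not is_breaking(old_schema, new_schema):
        return False
    bumped = major(new_version) > major(old_version)
    return not (bumped and has_migration_note and has_approval_label)
```

Because the clause lists its three conditions explicitly, the check is a three-way `and`. A clause worded as "manage schemas well" could not be translated this way.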

5. How the constitution interlocks with later gates

The constitution isn't a document you write and forget; it's the reference point that later quality gates consult. Spec Kit recommends turning on /speckit.constitution, /speckit.clarify, and /speckit.checklist as quality gates for production work or work with meaningful ambiguity. Each of these gates takes the constitution as its reference.

[Diagram: how the quality gates reference the constitution]

Concretely, here's how each gate uses the constitution:

Gate	How it references the constitution
`/speckit.clarify`	When questioning gaps in the spec, it probes for missing decisions against the bars the constitution sets (e.g. SLA, schema versioning policy)
`/speckit.checklist`	Converts constitution clauses into verifiable check items. A clause like "p95 5s" becomes a single checklist line
`/speckit.analyze`	Cross-checks not only whether spec/plan/tasks contradict each other, but whether any constitution-violating decision has crept in
`/speckit.implement`	When generating code, it applies the constitution's code-quality, testing, and security principles as the bar

The implication is clear. One line in the constitution is later amplified into many checklist items and analysis warnings. Conversely, a principle you omit from the constitution is caught by no gate, because gates don't invent bars that aren't in the constitution. This is where the "write it verifiably" point from Section 3 pays off: the more concrete a clause, the more of its verification can be automated.
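
The "one clause, one checklist line" idea can be illustrated with a few lines of Python. This is not how /speckit.checklist works internally (the agent generates its checklist, and we make no claim about its implementation). It is only a sketch, assuming clauses follow the `- N.M Title: text` layout of the example in Section 4, with hypothetical helpers `parse_clauses` and `to_checklist`.

```python
import re

# Matches the first line of a clause such as "- 5.1 Alert latency budget: ..."
_CLAUSE = re.compile(r"^- (\d+\.\d+) ([^:]+):\s*(.*)$")


def parse_clauses(constitution_md: str) -> list[tuple[str, str, str]]:
    """Extract (clause id, title, text) triples, joining wrapped lines."""
    found: list[list[str]] = []
    current: list[str] | None = None
    for line in constitution_md.splitlines():
        match = _CLAUSE.match(line)
        if match:
            current = [match.group(1), match.group(2).strip(), match.group(3).strip()]
            found.append(current)
        elif current is not None and line.startswith(" ") and line.strip():
            # Indented continuation line of the clause being read.
            current[2] += " " + line.strip()
        else:
            # Heading, blank line, or unnumbered bullet ends the clause.
            current = None
    return [(cid, title, text) for cid, title, text in found]


def to_checklist(constitution_md: str) -> list[str]:
    """Render each clause as one unchecked checklist line."""
    return [
        f"- [ ] {cid} {title}: {text}"
        for cid, title, text in parse_clauses(constitution_md)
    ]
```

Note what the parser cannot do: an unnumbered, vague sentence yields no checklist line at all. That is the mechanical version of "a principle you omit is caught by no gate."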

6. Anti-patterns — three ways to ruin a constitution

Knowing how to write a good constitution matters as much as knowing how to avoid the common failures. Here are three anti-patterns that recur in the field.

Anti-pattern 1 — Copy-pasting a generic constitution

This is when you paste a whole "model constitution" found online. The problem is that the value of a constitution lies in that team's specific decisions. "10k events/sec" and "p95 5s" are numbers born from dq-monitor's domain — they can't be written into a generic constitution. Someone else's constitution can be a good table of contents but never good content. Borrow the structure, but fill in every number and prohibition yourself.

Anti-pattern 2 — Principles nobody enforces

If you write "80% coverage" but there's no coverage gate in CI, that principle is decoration. Writing it verifiably (Section 3) is necessary but not sufficient — it only gains force once you actually wire it into a gate. For each clause, ask "where is the mechanism that enforces this?" (a CI job, a lint rule, a /speckit.checklist item, a PR review rule). A clause with no enforcement mechanism is a candidate to either enforce or delete at the next retro.
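
One lightweight way to ask "where is the mechanism that enforces this?" for every clause is to keep the answer as data and list the gaps. The sketch below assumes a hand-maintained registry; the `Clause` type, the `unenforced` helper, and mechanism labels such as `ci:coverage-gate` are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Clause:
    clause_id: str
    summary: str
    # Names of the mechanisms that enforce the clause: a CI job, a lint
    # rule, a checklist item, a PR review rule. Empty means decoration.
    enforced_by: tuple[str, ...] = ()


def unenforced(clauses: list[Clause]) -> list[str]:
    """Clause ids with no enforcement mechanism: candidates to either
    enforce or delete at the next retro."""
    return [c.clause_id for c in clauses if not c.enforced_by]


# Illustrative registry; the mechanism labels are made up.
REGISTRY = [
    Clause("1.2", "ruff and mypy --strict pass", ("ci:lint", "ci:typecheck")),
    Clause("2.1", "coverage of new/changed lines >= 80%"),
    Clause("6.4", "no high-severity known vulnerabilities", ("ci:pip-audit",)),
]
```

With this registry, `unenforced(REGISTRY)` returns `["2.1"]`: the coverage clause is written verifiably but nothing checks it yet.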

Anti-pattern 3 — Treating it as a write-once document

This is writing the constitution once and forgetting it. But projects grow: the initial SLA turns out to be unrealistic, or a new security requirement appears. The constitution must be a living document — that's why the governance section in the example above spells out "review quarterly, bump the version on amendment." Amending the constitution is not defeat but learning. Just keep SDD's direction when you do: fix the constitution first, not the code.
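
The "bump the version on amendment" rule is itself mechanical enough to script. Here is a minimal sketch, assuming the `Version:` and `Last Amended:` header lines used in the Section 4 example. The helpers `bump_version` and `amend` are hypothetical, and which kind of change counts as major, minor, or patch is a policy your team would need to define.

```python
import re
from datetime import date


def bump_version(version: str, kind: str) -> str:
    """Bump a semver string; lower-order components reset to zero."""
    major, minor, patch = (int(part) for part in version.split("."))
    if kind == "major":
        return f"{major + 1}.0.0"
    if kind == "minor":
        return f"{major}.{minor + 1}.0"
    if kind == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown bump kind: {kind!r}")


def amend(constitution_md: str, kind: str, today: date) -> str:
    """Return the text with Version bumped and Last Amended set to today.
    The Ratified line is left untouched."""
    text, count = re.subn(
        r"^Version: (\d+\.\d+\.\d+)$",
        lambda m: f"Version: {bump_version(m.group(1), kind)}",
        constitution_md,
        count=1,
        flags=re.MULTILINE,
    )
    if count != 1:
        raise ValueError("no 'Version: X.Y.Z' line found")
    text, count = re.subn(
        r"^Last Amended: \d{4}-\d{2}-\d{2}$",
        f"Last Amended: {today.isoformat()}",
        text,
        count=1,
        flags=re.MULTILINE,
    )
    if count != 1:
        raise ValueError("no 'Last Amended: YYYY-MM-DD' line found")
    return text
```

Raising on a missing header is a deliberate choice here: a constitution that has lost its version line should fail loudly rather than be amended silently.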

The three anti-patterns in one sentence: a constitution must be ours, not someone else's; enforced, not decorative; and alive, not embalmed.

Wrapping up

By length, the constitution is a one-screen file; by leverage, it's the heaviest file in the whole series — because every later phase of spec, plan, tasks, and implementation reads it before starting. So the secret to writing it well is just one thing: write it verifiably. Put in numbers, tools, and explicit prohibitions, so that humans, the AI, and automated gates can all judge against the same bar.

In Part 4, on top of the constitution's floor, we finally write down what to build. We'll follow the process of specifying dq-monitor's requirements with /speckit.specify and filling the gaps in that spec with /speckit.clarify. If the constitution decided "how well to build," the spec decides "what to build." Where the two meet, real SDD begins.
