[Spec Kit Part 3] Constitution — Imprinting Your Team's Principles on the AI
The single file that flows into every Spec Kit phase: constitution.md. We walk through how a data engineering team writes the constitution for the dq-monitor project, and what separates a good principle from a useless one — with a full reusable example.
In Part 2 we used the specify CLI to scaffold the dq-monitor project with SDD and wired it up to Claude Code. Now it's time to draw the first line on the blank canvas — and the curious thing about SDD is that the first line is neither the spec nor the code, but the constitution. The constitution is a file where you write down, once, what "good code" means for this project, and every /speckit.* phase that follows reads it before doing anything. This post is about writing that constitution well, grounded in the reality of a data engineering team.
What you'll learn in this post
- Why the constitution (.specify/memory/constitution.md) is the highest-leverage file in the whole series
- How to draft a constitution with the /speckit.constitution command
- "Verifiable principles" vs "feel-good slogans": what separates good principles from bad ones
- A complete, ready-to-use constitution example for a data engineering team
- How the constitution interlocks with the clarify, checklist, and analyze gates
This is Part 3 of the Spec Kit series. Part 1 covered the philosophy of SDD and Part 2 covered installation and project structure; Part 3 fills in the constitution, the first real step of the workflow.
1. What the constitution is, and why it governs every phase
The Spec Kit workflow flows in this order:
Constitution → Specification → Clarification → Planning
→ Task Breakdown → Implementation → Convergence
The constitution comes first. But it isn't merely "the first step" — it's the step that underlies all the others. When writing the spec, planning, breaking down tasks, and implementing code, the AI agent reads the constitution first and reasons on top of it.
The constitution file lives at this path:
.specify/memory/constitution.md
The word memory in that path is telling. The constitution is the project's long-term memory. In Part 1 we named "context loss" as the first failure mode of vibe coding: as the conversation grows, decisions agreed on early get pushed out of the context window. The constitution is the structural fix for exactly that. Pin decisions like "we annotate types 100% of the time" or "schemas never break silently" into a file, and those decisions survive all the way to the 50th request.
The constitution is loaded into every /speckit.* step. Spec files are created one per feature, and plans fork per feature too, but the constitution is the single file that applies across all phases and all features. Change one line and its effect is multiplied across every downstream artifact. That is why it's the highest-leverage file in the entire series.
Concretely, the constitution holds "the principles that run through this project." There are four canonical axes:
| Axis | What the constitution pins down | What happens if you don't |
|---|---|---|
| Code Quality | Style, typing, lint gates | The AI writes code with a different convention every request |
| Testing Standards | Coverage bar, which tests are mandatory | Code with tests and code without tests get mixed together |
| UX Consistency | Error format, tone and structure of logs and alerts | The same situation gets rendered differently every time |
| Performance | Latency and throughput budgets (SLA) | Only the word "fast" exists, with no bar to catch regressions |
The key point is that the constitution defines not what to build but how well to build it. Functional requirements belong in the spec, which Part 4 covers. The constitution is the layer above: it lays down the floor of quality that must hold no matter which feature you build.
2. Running /speckit.constitution
You can write the constitution by hand, but Spec Kit provides a dedicated command that lets the AI agent draft it for you. Inside Claude Code, you invoke it like this:
/speckit.constitution
Run the bare command and the agent either asks "which principles should I focus on?" or, given an accompanying instruction, drafts immediately. The best practice is to provide a prompt that names the focus areas.
/speckit.constitution
Create principles focused on code quality, testing standards,
user experience consistency, and performance requirements.
We're a data engineering team building a real-time data quality
monitoring service (dq-monitor).
Given this prompt, the agent drafts the constitution and hands the draft to you for review.
The point to stress here is the human review. The agent's draft is a starting line, not a finish line. As we saw in Part 1, SDD quality comes not from "one-shot generation" but from "step-by-step human verification." The constitution is the first verification point, and any principle waved through here propagates its fuzziness to every downstream phase.
So when you receive the draft, ask one question first: "Can a machine or a reviewer judge whether this principle was violated?" If the answer is "no," that principle needs a rewrite. The next section covers exactly that.
3. Good principles vs bad principles — verifiability is everything
The most common way a constitution fails is not the "wrong principle" but the "unverifiable principle." A sentence like "we pursue clean code" is something nobody disagrees with, and nobody can violate it either. A principle whose violation can't be judged does not constrain the AI at all: whatever code it emits, no counterexample can ever prove "this isn't clean."
Good principles are specific and checkable. A human should be able to read one and sort any given change into pass or fail, and ideally a machine can judge it too. The table below shows the difference.
| Area | Bad principle (vague, unverifiable) | Good principle (specific, verifiable) |
|---|---|---|
| Code Quality | "Code should be easy to read" | "Every public function has type annotations, and ruff and mypy --strict pass with zero warnings" |
| Testing | "Write enough tests" | "New/changed code has ≥80% line coverage; every data schema has a contract test" |
| Performance | "Alerts should be fast" | "p95 latency from anomaly detection to alert dispatch ≤5s; a single instance handles 10k events/sec" |
| Data Contracts | "Manage schemas well" | "Schema changes bump the version number; backward-incompatible changes can't merge without a migration note" |
| Observability | "Log things properly" | "All logs are structured JSON including trace_id. Every alert must be traceable back to its causal metric" |
| Security | "Care about security" | "Secrets never appear in code/logs and are injected only via env vars or a secret manager. Service accounts hold least privilege" |
What the right-hand column shares is the presence of numbers, tool names, and explicit prohibitions. Expressions like p95 5s, 80% coverage, mypy --strict, and can't merge eliminate the gray zone. When the AI reads the constitution to write code, when a human reviews a PR, and when /speckit.checklist later generates items — all of them can judge against the same bar.
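As a minimal sketch of what "the same bar" can look like, a clause such as "p95 ≤5s" reduces to a few lines of Python. The function names and the sample latencies are hypothetical, and nearest-rank is only one common percentile definition; your metrics backend may compute p95 differently.

```python
def p95(samples: list[float]) -> float:
    """Nearest-rank 95th percentile of the given latencies (seconds)."""
    if not samples:
        raise ValueError("need at least one latency sample")
    ordered = sorted(samples)
    # 1-based rank = ceil(0.95 * n), done in integers to avoid float rounding
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def within_budget(samples: list[float], budget_s: float = 5.0) -> bool:
    """True when the p95 latency stays inside the budget."""
    return p95(samples) <= budget_s


# Hypothetical detection-to-dispatch latencies, in seconds
latencies = [0.8] * 18 + [4.2, 9.5]
budget_ok = within_budget(latencies)
```

A benchmark job could fail the build whenever `within_budget` returns False, which is a gate the word "fast" can never give you.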
Self-check when writing a principle: if the sentence doesn't immediately evoke a test case, a lint rule, or a PR review comment, the principle is still at the slogan stage. Make it one notch more concrete.
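The self-check can be roughed out in code as well. The sketch below is a crude heuristic and not a Spec Kit feature: it flags a principle as a slogan when it contains no number, no known tool name, and no prohibition word. The marker lists and function names are assumptions made up for illustration.

```python
import re

# Hypothetical marker lists; extend them for your own stack.
TOOL_NAMES = ("ruff", "mypy", "pip-audit", "pytest")
PROHIBITIONS = ("never", "cannot", "can't", "must", "forbidden")


def evidence(principle: str) -> list[str]:
    """Return which kinds of concrete markers appear in the principle."""
    text = principle.lower()
    found: list[str] = []
    if re.search(r"\d", text):
        found.append("number")
    if any(tool in text for tool in TOOL_NAMES):
        found.append("tool")
    if any(re.search(rf"\b{re.escape(word)}\b", text) for word in PROHIBITIONS):
        found.append("prohibition")
    return found


def is_slogan(principle: str) -> bool:
    """A principle with no concrete marker at all is treated as a slogan."""
    return not evidence(principle)


flagged = is_slogan("Code should be easy to read")
```

Passing this check does not prove a principle is verifiable; failing it is a strong hint that the sentence is still at the slogan stage.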
4. A complete constitution example for a data engineering team
Now let's look at a constitution, in full, that a data engineering team building dq-monitor could actually use. Below is a reusable asset you can drop straight into .specify/memory/constitution.md and adjust just the numbers to fit your team. (The "verifiability" principle from Section 3 is applied to every clause.)
# dq-monitor Constitution
This constitution is the top-level set of principles applied to all
specs, plans, tasks, and implementation of the real-time data quality
monitoring service (dq-monitor). Every `/speckit.*` phase generates and
verifies its artifacts against this document. On conflict between
clauses, the higher-priority (lower number) wins.
Version: 1.0.0
Ratified: 2026-06-23
Last Amended: 2026-06-23
## 1. Code Quality
- 1.1 Language standard: Service code uses Python 3.12+, and every public
function, method, and module boundary carries type annotations.
- 1.2 Static gates: `ruff check` (lint), `ruff format --check` (format),
and `mypy --strict` must all pass with zero warnings for CI to go
green. Code that fails the gates cannot be merged.
- 1.3 Function complexity: A single function does not exceed cyclomatic
complexity 10. If exceeded, decompose it or leave an explicit
exception note.
- 1.4 Dependencies: Adding a new runtime dependency requires at least one
line in the PR description giving the reason and the alternatives
considered. It must come with a lock-file update.
## 2. Testing Standards
- 2.1 Coverage: Test coverage of new/changed lines is ≥80%. A PR that
lowers overall coverage must state the reason.
- 2.2 Contract tests: Every input data schema and outbound alert payload
has a contract test. Schemas are pinned by fixtures; if a fixture
and the real code diverge, the test fails.
- 2.3 Determinism: Tests do not depend on external networks or the real
wall clock. Time is an injectable clock; external systems are
replaced by fakes/mocks.
- 2.4 Regression tests: A bug-fix PR must include a regression test that
would have failed before the fix.
## 3. Data Contract Discipline
- 3.1 Schema versions: Every data schema carries an explicit version
(semver).
- 3.2 No silent breaking change: A backward-incompatible schema change
cannot merge without (a) a major version bump, (b) a migration note,
and (c) an explicit approval label.
- 3.3 Compatibility first: Prefer additive changes (adding fields); field
removal, type changes, and meaning changes go through a deprecation
window.
- 3.4 Validation location: Data is schema-validated immediately at the
system boundary (ingestion point). Records that fail validation are
not silently dropped — they are dead-lettered and counted in metrics.
## 4. Observability
- 4.1 Structured logs: All logs are structured JSON, including at minimum
`timestamp`, `level`, `service`, `trace_id`, and `message`.
- 4.2 Traceability: A single event-processing flow must be linkable from
ingestion to alert by a single `trace_id`.
- 4.3 Metrics: Processing latency, queue backlog, anomaly counts, and
alert dispatch results are exposed as metrics (e.g. Prometheus).
- 4.4 Traceable alerts: Every alert must be traceable back to the causal
metric/record that triggered it. Unsubstantiated alerts are forbidden.
## 5. Performance / SLA
- 5.1 Alert latency budget: Latency from anomaly detection to alert
dispatch does not exceed p95 of 5s and p99 of 10s.
- 5.2 Throughput: A single worker instance reliably handles at least
10,000 events/sec (without backpressure).
- 5.3 Performance regression gate: Changes that may affect performance
attach benchmark results to the PR. Changes that break the 5.1/5.2
budgets cannot merge.
- 5.4 Resource limits: Under normal load, worker memory stays under the
configured ceiling; on overshoot it absorbs via backpressure rather
than OOM.
## 6. Security
- 6.1 Secrets: API keys, tokens, and passwords never appear in plaintext
in code, config files, or logs. They are injected only via env vars
or a secret manager.
- 6.2 Least privilege: Every service account/token holds only the minimum
scope required. Broad (admin/wildcard) grants require justification
in the PR.
- 6.3 Don't trust input: Data from outside is untrusted and is processed
or stored only after validation and normalization (ties to 3.4).
- 6.4 Dependency security: Dependencies with known vulnerabilities (e.g.
`pip-audit` high severity) must be resolved before merge or carry an
explicit risk-acceptance record.
## Governance
- This constitution is a living document. It is reviewed quarterly or on
major architectural change; on amendment, the version (semver) and the
amendment date are updated.
- No spec, plan, or task may conflict with this constitution. If a
conflict is unavoidable, amend the constitution first, then proceed.
- Principles must be verifiable. When adding a clause, also write "how a
violation is judged."
The intent of this example isn't "copy it verbatim" but to show how each clause was written to be verifiable. Every item carries numbers, tools, or explicit prohibitions, and the governance section pins the constitution down as a living document rather than write-once. Your team just swaps the numbers (80%, p95 5s, 10k/sec) and tool names (ruff, mypy, pip-audit) to fit your stack.
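To see how one clause turns into a pass/fail check, here is a minimal sketch of clause 3.2 (no silent breaking change) as a merge gate. The `SchemaChange` shape and the function names are hypothetical, and deciding whether a change is breaking is assumed to happen elsewhere, for example in a schema diff step.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SchemaChange:
    """Facts about a proposed schema change (hypothetical shape)."""
    old_version: str           # semver string, e.g. "1.4.0"
    new_version: str
    breaking: bool             # assumed to be decided elsewhere
    has_migration_note: bool
    has_approval_label: bool


def major(version: str) -> int:
    """Return the major component of a semver string."""
    return int(version.split(".")[0])


def violations(change: SchemaChange) -> list[str]:
    """List the parts of clause 3.2 the change fails; empty means it may merge."""
    if not change.breaking:
        return []
    problems: list[str] = []
    if major(change.new_version) <= major(change.old_version):
        problems.append("3.2(a) major version not bumped")
    if not change.has_migration_note:
        problems.append("3.2(b) migration note missing")
    if not change.has_approval_label:
        problems.append("3.2(c) approval label missing")
    return problems


# A breaking change shipped as a minor bump with no migration note
silent_break = SchemaChange("1.4.0", "1.5.0", True, False, True)
problems = violations(silent_break)
```

Because the clause names three explicit conditions, the check can report exactly which one failed instead of a vague "schema problem."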
5. How the constitution interlocks with later gates
The constitution isn't a document you write and forget; it's the reference point that later quality gates consult. Spec Kit recommends turning on /speckit.constitution, /speckit.clarify, and /speckit.checklist as quality gates for production work or work with meaningful ambiguity. Each of these gates points back to the constitution.
Concretely, here's how each gate uses the constitution:
| Gate | How it references the constitution |
|---|---|
| /speckit.clarify | When questioning gaps in the spec, it probes for missing decisions against the bars the constitution sets (e.g. SLA, schema versioning policy) |
| /speckit.checklist | Converts constitution clauses into verifiable check items. A clause like "p95 5s" becomes a single checklist line |
| /speckit.analyze | Cross-checks not only whether spec/plan/tasks contradict each other, but whether any constitution-violating decision has crept in |
| /speckit.implement | When generating code, it applies the constitution's code-quality, testing, and security principles as the bar |
The implication is clear. One line in the constitution is amplified later into many checklist items and analysis warnings. Conversely, a principle you omit from the constitution is caught by no gate — gates don't invent bars that aren't in the constitution. So the "write it verifiably" point stressed in Section 3 finally pays off here: the more concrete a clause, the more automatic verification it reduces to.
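The "one clause, one check item" reduction can be illustrated with a small parser. This is not how /speckit.checklist is implemented; it is a sketch that assumes clauses follow the "- N.M Title: text" layout used in the example constitution, and every name in it is hypothetical.

```python
import re

# Matches clause lines such as "- 5.1 Alert latency budget: Latency from ..."
CLAUSE = re.compile(r"^- (\d+\.\d+) ([^:]+): (.*)$")

SAMPLE = """\
## 5. Performance / SLA
- 5.1 Alert latency budget: Latency from anomaly detection to alert
  dispatch does not exceed p95 of 5s and p99 of 10s.
- 5.2 Throughput: A single worker instance reliably handles at least
  10,000 events/sec (without backpressure).
"""


def parse_clauses(constitution: str) -> list[dict[str, str]]:
    """Collect numbered clauses, joining wrapped continuation lines."""
    found: list[dict[str, str]] = []
    open_clause = False
    for line in constitution.splitlines():
        match = CLAUSE.match(line)
        if match:
            number, title, text = match.groups()
            found.append({"id": number, "title": title, "text": text})
            open_clause = True
        elif open_clause and line.strip() and not line.startswith(("#", "- ")):
            found[-1]["text"] += " " + line.strip()  # wrapped clause text
        else:
            open_clause = False  # heading, blank line, or unnumbered bullet
    return found


def to_checklist(constitution: str) -> list[str]:
    """Render each clause as one unchecked checklist line."""
    return [
        f"- [ ] {c['id']} {c['title']}: {c['text']}"
        for c in parse_clauses(constitution)
    ]


items = to_checklist(SAMPLE)
```

A slogan-style clause would still produce a line here, but nobody could tick it off, which is why concreteness in the constitution matters before any gate runs.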
6. Anti-patterns — three ways to ruin a constitution
Knowing how to write a good constitution matters as much as knowing how to avoid the common failures. Here are three anti-patterns that recur in the field.
Anti-pattern 1 — Copy-pasting a generic constitution
This is when you paste a whole "model constitution" found online. The problem is that the value of a constitution lies in that team's specific decisions. "10k events/sec" and "p95 5s" are numbers born from dq-monitor's domain — they can't be written into a generic constitution. Someone else's constitution can be a good table of contents but never good content. Borrow the structure, but fill in every number and prohibition yourself.
Anti-pattern 2 — Principles nobody enforces
If you write "80% coverage" but there's no coverage gate in CI, that principle is decoration. Writing it verifiably (Section 3) is necessary but not sufficient — it only gains force once you actually wire it into a gate. For each clause, ask "where is the mechanism that enforces this?" (a CI job, a lint rule, a /speckit.checklist item, a PR review rule). A clause with no enforcement mechanism is a candidate to either enforce or delete at the next retro.
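One lightweight way to ask that question for every clause is to keep a map from clause numbers to enforcement mechanisms and list the clauses that have none. The sketch below assumes such a map is maintained by hand; the entries and names are hypothetical examples, not output from any Spec Kit command.

```python
# Hypothetical map: clause number -> mechanisms that actually enforce it
ENFORCEMENT: dict[str, list[str]] = {
    "1.2": ["CI: ruff check", "CI: mypy --strict"],
    "2.1": [],                     # written down, but no coverage gate in CI yet
    "5.3": ["PR review rule"],
    "10.1": [],                    # placeholder clause with nothing behind it
}


def clause_key(clause: str) -> tuple[int, ...]:
    """Sort '10.1' after '2.1' by comparing the numeric parts."""
    return tuple(int(part) for part in clause.split("."))


def unenforced(mapping: dict[str, list[str]]) -> list[str]:
    """Return clauses with no enforcement mechanism, in clause order."""
    missing = [clause for clause, mechanisms in mapping.items() if not mechanisms]
    return sorted(missing, key=clause_key)


retro_candidates = unenforced(ENFORCEMENT)
```

The resulting list is the agenda for the next retro: each clause on it either gets a gate or gets deleted.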
Anti-pattern 3 — Treating it as a write-once document
This is writing the constitution once and forgetting it. But projects grow: the initial SLA turns out to be unrealistic, or a new security requirement appears. The constitution must be a living document — that's why the governance section in the example above spells out "review quarterly, bump the version on amendment." Amending the constitution is not defeat but learning. Just keep SDD's direction when you do: fix the constitution first, not the code.
The three anti-patterns in one sentence: a constitution must be ours, not someone else's; enforced, not decorative; and alive, not embalmed.
Wrapping up
By length, the constitution is a one-screen file; by leverage, it's the heaviest file in the whole series — because every later phase of spec, plan, tasks, and implementation reads it before starting. So the secret to writing it well is just one thing: write it verifiably. Put in numbers, tools, and explicit prohibitions, so that humans, the AI, and automated gates can all judge against the same bar.
In Part 4, on top of the constitution's floor, we finally write down what to build. We'll follow the process of specifying dq-monitor's requirements with /speckit.specify and filling the gaps in that spec with /speckit.clarify. If the constitution decided "how well to build," the spec decides "what to build." Where the two meet, real SDD begins.