Session 3: The Component Spec Layer

Series: What If a Design System Could Self QA?
Author: C. Maldonado
Date: March 2026

Where We Picked Up

Phases 1 and 2 were done. Tokens flow from Figma to code. Three QA scripts catch accessibility regressions, breaking changes, and cascading impacts on every PR. The pipeline knows about tokens — individual values like colors and spacing. What it doesn't know about is components — the things that use those tokens.

Phase 3 is where that changes. The Master Spec describes it as "Claude Code as intelligent middleware," but before Claude Code can reason about components, something has to exist that describes components in a structured way. That's the component specification layer — the real deliverable of Phase 3.

I knew this going in. What I didn't know was how to build it.

Part 1: The Wrong Starting Point

My first instinct was to look at code. KWI uses MUI (Material UI) for its React component library, themed with our design tokens. So I thought: let me look at how MUI components are structured, understand the theming API, and design the spec schema around that.

Claude pushed back — politely, but clearly. And then I pushed back on Claude. The question I asked was: if the spec is supposed to be abstract and portable across frameworks, why would we derive it from a specific framework's component model?

That was the first real architectural decision of the session. The spec schema should be shaped by what Figma knows about a component, not by how any framework implements it. Figma is the source of truth. The spec is the Figma-truth layer. How MUI (or Tailwind, or Web Components) consumes that spec is a separate concern — the binding layer.

This reframing changed the order of operations:

1. Look at the Figma MCP to understand what component data Figma can give us
2. Design the spec schema around that (the abstract "recipe card")
3. Then look at the KWI repo to build the MUI-specific binding that consumes the spec

I almost started at step 3 and worked backwards. Glad I didn't.

Part 2: What the Figma MCP Tells Us

The Figma MCP server (the plugin that lets Claude talk to Figma) has tools that map almost exactly to the layers we need:

get_metadata returns node IDs, names, types, positions, sizes → component identity
get_variable_defs returns colors, spacing, typography variables used by a selection → token bindings
get_design_context returns full design context and can generate code → the bridge to implementation
get_code_connect_map and add_code_connect_map read and write mappings between Figma nodes and code components → the binding layer
create_design_system_rules generates rule files for translating designs into framework-specific code → framework binding config

This was a big moment. We don't have to invent the extraction pipeline from scratch. The MCP already knows how to pull component structure and token usage out of Figma. Our spec schema is the structured intermediate format that sits between what these tools extract and what any framework binding consumes.

The flow:

Figma component
  → MCP extracts (metadata + variables + design context)
    → Component Spec (our schema — the "recipe card")
      → Framework Binding (MUI theme override, Tailwind classes, etc.)
        → Code

The spec doesn't replace the MCP — it's the stable, version-controlled artifact that the MCP populates.

Part 3: Schema vs. Instance

I had to learn what a schema actually is during this session. Not ashamed to say it — I'm a designer, and the word "schema" was floating around without me having a concrete grip on it.

Here's how I understand it now: a schema is a template with rules. Like a medical form. It tells you: "put your name here (text), your date of birth here (date), check one box for gender." You can't put a phone number where the date goes. A schema does the same thing for data.

The component spec schema says: "every component must have a name, a list of variants, a set of states, a set of slots (anatomy), and accessibility requirements." That's the blank form — universal, framework-agnostic, works for any org.

The component spec instance is the filled-out form. KWI's Button spec says: "primary variant uses {color.brand.primary} for background." Another org's Button spec has the same structure but different token paths. The schema is the shape. The instance is the content.

This distinction resolved a conflict I was feeling between "the spec needs to be abstract" and "we need to make org-specific changes." Both are true — they just live at different levels. The schema is abstract. The instance is specific. No contradiction.
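To make the form analogy concrete, here is a minimal sketch in JavaScript. It is not the real component-spec.schema.json, which is a JSON Schema document. The `schema` object, the `validate` and `typeOf` helpers, and the `accessibility` content are simplified assumptions for illustration. Only the field list and the `{color.brand.primary}` binding come from the session.

```javascript
// Hypothetical, simplified "blank form": field name -> expected type.
// The real schema is a JSON Schema file; this only mimics the idea.
const schema = {
  name: "string",
  variants: "array",
  states: "object",
  slots: "array",
  accessibility: "object",
};

// Distinguish arrays and null from plain objects.
function typeOf(value) {
  if (Array.isArray(value)) return "array";
  if (value === null) return "null";
  return typeof value;
}

// Returns a list of problems; an empty list means the instance fits the form.
function validate(instance, schema) {
  const problems = [];
  for (const [field, expected] of Object.entries(schema)) {
    const value = instance[field];
    if (value === undefined) {
      problems.push(`missing field: ${field}`);
      continue;
    }
    const actual = typeOf(value);
    if (actual !== expected) {
      problems.push(`${field} should be ${expected}, got ${actual}`);
    }
  }
  return problems;
}

// A "filled-out form": same shape, org-specific content.
const kwiButton = {
  name: "Button",
  variants: [{ name: "primary", tokens: { background: "{color.brand.primary}" } }],
  states: { interaction: ["default", "hover", "focus", "active"] },
  slots: ["label"],
  accessibility: { role: "button" }, // illustrative content only
};

// A form filled out wrong: a string where the list of variants belongs.
const brokenButton = { name: "Button", variants: "primary" };
```

Calling `validate(kwiButton, schema)` returns an empty list, while `validate(brokenButton, schema)` reports the wrong type for `variants` plus the three missing fields. Another org's Button would pass the same check with entirely different token paths.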

Part 4: The Button Spec — First Draft and the Destructive Debate

To test whether the schema fields were right, we filled out a Button spec using KWI's actual tokens. The first draft had four variants: primary, secondary, ghost, and danger.

I caught the problem immediately. In my Figma system, "destructive" isn't a standalone visual flavor. A primary button can be destructive. A secondary button can be destructive. A ghost button can be destructive. It's a behavioral modifier — it says "this action is irreversible," not "this button is red." Making it a variant bakes in an assumption that destructive always looks one specific way.

The fix: split states into two categories.

Interaction states (default, hover, focus, active): user-driven, and modeled in the spec as mutually exclusive, one at a time. In the spec's model a button is never hovered and pressed at once; it resolves to a single interaction state.

Behavioral states (destructive, disabled, loading) — modifiers that combine with any variant AND any interaction state. A primary + destructive + hover button is a real thing. The behavioral state provides tokenOverrides that replace the variant's base tokens.

This is actually the more abstract truth, not a KWI-specific decision. Any design system that treats "destructive" as a standalone variant is conflating emphasis level with behavioral meaning. They happen to overlap visually (red button), but they're orthogonal concepts.

The distinction matters for the spec because it determines what's composable. If destructive is a variant, you get four fixed options: primary, secondary, ghost, danger. If destructive is a behavioral state, each of the three variants can be destructive or not: primary × destructive, secondary × destructive, ghost × destructive, alongside their non-destructive forms. That's six combinations before disabled, loading, or the interaction states enter the picture. More expressive, more accurate, more true to how design systems actually work.
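The composition described above can be sketched in a few lines. This is an illustration, not the contents of button.spec.json. Keying `tokenOverrides` by variant is an assumption, and every token path except `{color.brand.primary}` is made up. Interaction states are left out to keep the sketch short.

```javascript
// Hypothetical spec fragment. Token paths other than {color.brand.primary}
// are invented for this example.
const buttonSpec = {
  variants: {
    primary: { background: "{color.brand.primary}", text: "{color.text.onBrand}" },
    secondary: { background: "{color.surface.default}", text: "{color.brand.primary}" },
    ghost: { background: "transparent", text: "{color.brand.primary}" },
  },
  behavioralStates: {
    destructive: {
      // Assumption: overrides are listed per variant.
      tokenOverrides: {
        primary: { background: "{color.danger.default}" },
        secondary: { text: "{color.danger.default}" },
        ghost: { text: "{color.danger.default}" },
      },
    },
  },
};

// Start from the variant's base tokens, then layer each active behavioral
// state's overrides on top, in the order given.
function resolveTokens(spec, variant, behavioral = []) {
  let tokens = { ...spec.variants[variant] };
  for (const state of behavioral) {
    const overrides = spec.behavioralStates[state]?.tokenOverrides?.[variant] ?? {};
    tokens = { ...tokens, ...overrides };
  }
  return tokens;
}

// Enumerate every variant with and without the destructive modifier.
const combos = Object.keys(buttonSpec.variants).flatMap((variant) => [
  { variant, destructive: false },
  { variant, destructive: true },
]);
```

`resolveTokens(buttonSpec, "primary", ["destructive"])` swaps only the background and keeps the variant's text token, which is the point: destructive modifies a variant rather than replacing it.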

Part 5: Atomic Design in the Schema

The spec has a composition section that describes how components relate to each other — what a Button can contain (Icons) and what it can live inside (ButtonGroup, Toolbar, Form, Card, Dialog). This is Brad Frost's atomic design: atoms compose into molecules, molecules into organisms.

The rule for where the boundary falls: if it's reusable on its own, it gets its own spec. If it only lives inside one parent, it's a slot. An Icon appears in Buttons, Cards, Nav items — own spec. A DropdownOption only exists inside a Dropdown — slot. A TextField appears standalone and inside Forms — own spec.

This keeps the spec count manageable. You're not writing specs for every sub-element. Just for the reusable building blocks.
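The boundary rule is simple enough to write down as a function. In this sketch the `usage` map and its `parents` and `standalone` fields are hypothetical bookkeeping, not part of the schema. The three examples are the ones from the paragraph above.

```javascript
// Hypothetical usage map: element -> where it appears.
// "standalone" marks elements that are also used on their own.
const usage = {
  Icon: { parents: ["Button", "Card", "NavItem"], standalone: false },
  DropdownOption: { parents: ["Dropdown"], standalone: false },
  TextField: { parents: ["Form"], standalone: true },
};

// Reusable on its own or across several parents -> own spec.
// Lives inside exactly one parent -> slot of that parent.
function classify({ parents, standalone }) {
  return standalone || parents.length > 1 ? "spec" : "slot";
}

const decisions = Object.fromEntries(
  Object.entries(usage).map(([name, info]) => [name, classify(info)])
);
```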

Part 6: What's Deferred

Three things came up that we're explicitly parking:

Responsive behavior. Buttons go full-width on mobile in our system. But the schema doesn't have a breakpoint-aware token override mechanism yet. This needs design work — it's not a field we can just add.

Animation/motion. Hover transitions, focus ring appearance, loading spinners. The token system doesn't have a motion category yet. Need that before the spec can reference it.

Dark mode. The token architecture doesn't include dark mode primitives. The schema might need a theme/mode dimension, or dark mode might be handled purely at the token layer with aliases. Decision pending.

All three are real needs. None of them block the first deliverable.

Part 7: 3B — The Binding Layer and the Gaps It Exposed

With the spec schema proven, I needed to prove the next layer: can this abstract spec actually produce real framework code?

I shared the KWI repo's theme files — BackOfficeTheme.ts, backOfficeTokens.ts, pallete.ts. These are MUI createTheme() calls with palette entries and component styleOverrides. The question: can we read our button spec, apply an MUI binding, and output something that looks like what KWI already has?

The answer was yes, mostly. The generator (generate-mui-theme.mjs) reads the button spec, the MUI binding file, and the token JSONs, then outputs a valid createTheme() snippet with palette entries, size overrides, disabled handling, and focus styles.
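What follows is a rough sketch of what a generator like this does, not the code of generate-mui-theme.mjs. The input shapes, the `resolve` and `generateTheme` helpers, and the size token value are assumptions. The output is a plain object shaped like the options passed to MUI's createTheme(), so the sketch runs without MUI installed.

```javascript
// Hypothetical inputs. Shapes are assumptions; the 24px value is made up.
const tokens = {
  color: { brand: { primary: "#1166dd" } },
  font: { family: { sans: "Helvetica Neue" } },
  size: { button: { lg: { paddingX: "24px" } } },
};

const spec = {
  baseTokens: { fontFamily: "{font.family.sans}", textTransform: "uppercase" },
  variants: [{ name: "primary", tokens: { background: "{color.brand.primary}" } }],
  sizes: [{ name: "lg", tokens: { paddingX: "{size.button.lg.paddingX}" } }],
};

const binding = {
  palette: { primary: "primary" }, // spec variant -> MUI palette key
  sizes: { lg: "sizeLarge" },      // spec size -> MuiButton styleOverrides key
};

// Replace a "{a.b.c}" reference with the value at that path.
// Literals pass through; unknown paths come back as undefined.
function resolve(value, tokens) {
  const match = /^\{(.+)\}$/.exec(value);
  if (!match) return value;
  return match[1].split(".").reduce((node, key) => node?.[key], tokens);
}

function generateTheme(spec, binding, tokens) {
  const theme = {
    palette: {},
    components: { MuiButton: { styleOverrides: { root: {} } } },
  };
  const overrides = theme.components.MuiButton.styleOverrides;

  // baseTokens apply to every button, so they land on the root slot.
  for (const [prop, value] of Object.entries(spec.baseTokens)) {
    overrides.root[prop] = resolve(value, tokens);
  }
  // Variants feed the palette through the binding's mapping.
  for (const variant of spec.variants) {
    const key = binding.palette[variant.name];
    if (key) theme.palette[key] = { main: resolve(variant.tokens.background, tokens) };
  }
  // Sizes become size-specific style overrides.
  for (const size of spec.sizes) {
    const key = binding.sizes[size.name];
    if (key) overrides[key] = { paddingInline: resolve(size.tokens.paddingX, tokens) };
  }
  return theme;
}

const theme = generateTheme(spec, binding, tokens);
```

The spec never mentions `sizeLarge` or `palette`. Those names live only in `binding`, which is what would be swapped out for a different framework.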

But the "mostly" matters. Reviewing the generated output against KWI's actual theme exposed five gaps:

Gap 1: textTransform. Our spec had no concept of it. KWI's buttons are uppercase. We added baseTokens — a new section in the schema for properties that apply universally regardless of variant or size. That's where font-family, text-transform, and letter-spacing live.

Gap 2: Font family mismatch. The generator output Inter; KWI uses Helvetica Neue. This wasn't a spec bug — it was a token resolution issue. The spec references {font.family.sans}, which resolves to whatever the org's typography tokens define. Different org, different font. The spec is doing its job.

Gap 3: Hardcoded opacity. The disabled state had "opacity": "0.5" — a literal value with no Figma origin. I initially "fixed" this by creating an interaction.json token file with opacity.disabled and opacity.loading tokens. That was wrong.

Gap 4: The opacity token was invented. This was the session's most important correction. I caught myself inventing a token that doesn't exist in Figma. In a design system, when a button is disabled, the visual treatment is defined by the designer through color variables — colors that already have the reduced alpha baked in. A disabled button's background isn't "the primary color at 50% opacity." It's a specific color token like {color.state.disabled.bg} that the designer chose in Figma. Creating a standalone opacity token breaks the pipeline's fundamental rule: everything traces back to Figma. If Figma doesn't define it, it doesn't exist in the token layer.

We deleted the interaction.json file and updated the disabled/loading states to reference color tokens instead.
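The "everything traces back to Figma" rule can be checked mechanically. This is a minimal sketch of such a check, not a script that exists in the repo. `figmaTokenPaths` stands in for whatever list of token paths the Figma export produces, and `collectRefs` and `findUntraceable` are hypothetical helpers. It only checks `{path}` references, so a hardcoded literal like the original "0.5" still needs a human eye.

```javascript
// Hypothetical: the set of token paths that came out of Figma variables.
const figmaTokenPaths = new Set([
  "color.brand.primary",
  "color.state.disabled.bg",
]);

// Walk any JSON-like value and collect every "{path}" reference.
function collectRefs(node, refs = []) {
  if (typeof node === "string") {
    const match = /^\{(.+)\}$/.exec(node);
    if (match) refs.push(match[1]);
  } else if (node && typeof node === "object") {
    for (const child of Object.values(node)) collectRefs(child, refs);
  }
  return refs;
}

// References the spec uses that Figma never defined.
function findUntraceable(spec, knownPaths) {
  return collectRefs(spec).filter((path) => !knownPaths.has(path));
}

// The invented token, and the corrected version that uses a color token.
const before = { states: { disabled: { opacity: "{opacity.disabled}" } } };
const after = { states: { disabled: { background: "{color.state.disabled.bg}" } } };
```

Run against `before`, the check reports `opacity.disabled` as untraceable. Run against `after`, it reports nothing.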

Gap 5: Optional vs. mandatory fields. KWI doesn't define focus ring styling; we did. The schema already supports this (focus tokens aren't required), but it raised the question of how the generator handles missing tokens. Answer: if the org's tokens don't resolve, the generator skips that section and the framework's defaults apply.
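The skip behavior can be sketched like this. It assumes an all-or-nothing rule per section, which the session did not spell out. `buildSection` and the `{color.focus.ring}` path are hypothetical, and `focusVisible` is used here as the MuiButton styleOverrides key for keyboard focus.

```javascript
// An org's tokens with no focus ring defined.
const tokens = { color: { brand: { primary: "#1166dd" } } };

// Replace a "{a.b.c}" reference with the value at that path.
// Unknown paths come back as undefined.
function resolve(value, tokens) {
  const match = /^\{(.+)\}$/.exec(value);
  if (!match) return value;
  return match[1].split(".").reduce((node, key) => node?.[key], tokens);
}

// Build one style section. Returns null if any reference fails to resolve,
// so the caller can omit the section and let the framework default apply.
function buildSection(sectionTokens, tokens) {
  const out = {};
  for (const [prop, value] of Object.entries(sectionTokens)) {
    const resolved = resolve(value, tokens);
    if (resolved === undefined) return null;
    out[prop] = resolved;
  }
  return out;
}

const root = buildSection({ backgroundColor: "{color.brand.primary}" }, tokens);
const focus = buildSection({ outlineColor: "{color.focus.ring}" }, tokens);

// Only sections that fully resolved make it into the output.
const styleOverrides = {};
if (root) styleOverrides.root = root;
if (focus) styleOverrides.focusVisible = focus;
```

Here `styleOverrides` ends up with a `root` entry and no `focusVisible` entry, so MUI's own focus styling would be left alone.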

These weren't failures — they were exactly what a proof of concept is supposed to surface. The binding layer works. The gaps it exposed made the spec better.

Part 8: The Phase 3 Shape

Phase 3 is bigger than one deliverable. It breaks into two parts:

Part 1: Prove the architecture (one Button, vertical slice)

3A — Spec schema format (the blank form — what fields, what structure) ✓
3B — Binding format (how the spec maps to MUI, proven with one Button) ✓
3C — Extraction pipeline (Figma MCP data → auto-drafted spec) ✓

Part 2: Scale to the component library (many components)

3D — Bulk spec generation (Claude reads Figma components, drafts specs)
3E — Spec validation (are the auto-generated specs correct and complete?)
3F — Binding generation at scale

Part 1 is the proof of concept. Part 2 is the product. We're not perfecting — we're shipping.

Part 9: 3C — Figma Extraction (The Scary Part)

This was the part that felt like jumping off a cliff. Everything up to now was structured — JSON schemas, binding maps, generator scripts. This was: point the Figma MCP at a real component and see what comes back.

I dropped in a link to KWI's Button component in Figma. Three MCP tool calls later, I was looking at real data:

get_variable_defs returned the actual Figma variables bound to that Button instance: semantic/surface/primary (#1166dd), semantic/content/onColor (#ffffff), fontFamily/Base (Helvetica Neue), fontWeight/medium (500), and a composite typography token called Merx/button/medium with font size, line height, and letter spacing.

get_design_context returned something even more useful — generated React code showing the Button's internal structure (container → state-layer → label), plus literal style values (uppercase text-transform, 4px border radius, 16px horizontal padding, cursor: pointer). And the variant properties from Figma: Style=Filled, State=Active, Show Icon=False, Icon Position=Hidden, Hover=False.

That variant property list is the extraction pipeline's cheat code. It tells us the component set has five axes: Style, State, Show Icon, Icon Position, Hover. Those map directly to our spec structure — Style maps to variants, State and Hover map to interaction states, Show Icon and Icon Position map to icon slots.
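A small sketch of that mapping step, assuming the variant properties arrive as a single comma-separated string like the one quoted above. The actual shape of the MCP response may differ. `parseVariantProps`, `axisToSection`, and `groupAxes` are hypothetical names, not taken from extract-figma-spec.mjs.

```javascript
// Assumed input format: "Axis=Value" pairs separated by commas.
const rawVariant =
  "Style=Filled, State=Active, Show Icon=False, Icon Position=Hidden, Hover=False";

// Turn the string into an { axis: value } object.
function parseVariantProps(raw) {
  return Object.fromEntries(
    raw.split(",").map((pair) => {
      const [axis, value] = pair.split("=").map((part) => part.trim());
      return [axis, value];
    })
  );
}

// Hypothetical routing table: Figma axis -> spec section.
const axisToSection = {
  Style: "variants",
  State: "interactionStates",
  Hover: "interactionStates",
  "Show Icon": "slots",
  "Icon Position": "slots",
};

// Group axes by the spec section they feed. Axes with no route are kept
// under "unmapped" so they can be reviewed by hand.
function groupAxes(props) {
  const grouped = {};
  for (const [axis, value] of Object.entries(props)) {
    const section = axisToSection[axis] ?? "unmapped";
    if (!grouped[section]) grouped[section] = {};
    grouped[section][axis] = value;
  }
  return grouped;
}

const props = parseVariantProps(rawVariant);
const grouped = groupAxes(props);
```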

The extraction script (extract-figma-spec.mjs) takes this raw MCP data and transforms it into a spec draft. Running the script with --demo against the KWI Button data produced a draft that validates against our schema. It captured:

Component identity with the Figma node ID and file key populated
The primary (Filled) variant with real token bindings from Figma variables
baseTokens: Helvetica Neue, uppercase, 0.15px letter spacing
Slots: label, iconLeading, iconTrailing (inferred from variant properties)
One size entry from the instance dimensions
Placeholder interaction states inferred from the State and Hover axes

What it can't extract: behavioral states (destructive, disabled, loading), accessibility requirements, composition rules. Those are design decisions that live in my head, not in Figma's data model. The extraction does the tedious part; I do the judgment part.

The Figma variable naming (semantic/surface/primary) doesn't match our token path naming (color.brand.primary). That's expected — every org names differently. The extraction script maps between them. That mapping is itself a piece of org-specific configuration, like the binding layer.
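The mapping can be as plain as a lookup table. In this sketch only the first `nameMap` entry is taken from the session. The other two entries and the `mapVariables` helper are hypothetical. Unknown names are reported rather than guessed, in keeping with the rule that nothing gets invented.

```javascript
// Org-specific config: Figma variable name -> token path.
const nameMap = {
  "semantic/surface/primary": "color.brand.primary",
  "semantic/content/onColor": "color.text.onBrand", // hypothetical target path
  "fontFamily/Base": "font.family.sans",            // hypothetical pairing
};

// Split extracted variables into mapped token references and leftovers.
function mapVariables(figmaVars, map) {
  const mapped = {};
  const unmapped = [];
  for (const name of Object.keys(figmaVars)) {
    const path = map[name];
    if (path) mapped[name] = `{${path}}`;
    else unmapped.push(name);
  }
  return { mapped, unmapped };
}

// A subset of what get_variable_defs returned for the KWI Button.
const figmaVars = {
  "semantic/surface/primary": "#1166dd",
  "semantic/content/onColor": "#ffffff",
  "fontWeight/medium": "500",
};

const result = mapVariables(figmaVars, nameMap);
```

Here `fontWeight/medium` has no entry in the table, so it lands in `result.unmapped` and waits for someone to add it to the config.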

The moment the draft validated against the schema was the moment Phase 3 Part 1 proved out. Figma data in → structured spec draft out → schema validation passes. The vertical slice works end to end.

What I Learned

The spec derives from Figma, not from code. I almost designed the schema around MUI's theming API. Starting from "what does Figma know about a component" produces a spec that any framework binding can consume.

Schema vs. instance resolves the abstract-vs-specific tension. The schema (structure) is universal. The instance (content) is org-specific. Same form, different answers.

Destructive is a state, not a variant. A structural decision that affects composability. Getting it right means the schema models how design systems actually work — primary × destructive, secondary × destructive, ghost × destructive.

Never invent tokens that don't trace back to Figma. I tried to create standalone opacity tokens. That's not how our Figma file defines these states: opacity lives inside color tokens as alpha. If Figma didn't define it, it doesn't go in the pipeline. This is now a hard rule in the design-system-ops skill.

The binding layer is where framework-specific knowledge lives. The abstract spec doesn't know what MuiButton-sizeLarge means. The binding maps spec.sizes[name=lg] to that MUI selector. Different framework, different binding, same spec.

baseTokens fills a real gap. Properties like font-family and text-transform don't vary by variant or size — they're universal to the component. They needed a home in the spec, and baseTokens is it.

Figma's MCP is half the extraction pipeline. I expected to build everything from scratch. Instead, three tool calls gave us component identity, token bindings, internal structure, style values, and variant properties. The extraction script is a transformer, not a crawler.

The extraction can't replace design judgment. It gets the "what" (structure, tokens, dimensions) but not the "why" (when to use destructive, what accessibility means for this component, how it composes with other components). That's the human part.

Figma variable naming ≠ token path naming. semantic/surface/primary in Figma becomes {color.brand.primary} in the token pipeline. Every org has its own Figma naming convention. The mapping is org-specific config.

Artifacts

components/component-spec.schema.json — JSON Schema 2020-12 rulebook. Defines: component identity, variants, baseTokens, sizes, states (interaction + behavioral), slots, accessibility, composition, qa, backlog.
components/specs/button.spec.json — v3. Three variants, split states, baseTokens for font-family/text-transform, disabled/loading reference color tokens with alpha.
components/specs/button.draft.json — Auto-drafted from Figma MCP extraction. Validates against schema. Shows what automated extraction can capture vs. what needs human input.
components/bindings/mui.binding.json — MUI binding. Palette mapping, size selectors, variant mapping (primary→contained, secondary→outlined, ghost→text).
scripts/validate-spec.mjs — Validates spec instances against the schema.
scripts/generate-mui-theme.mjs — Reads spec + binding + tokens → MUI createTheme() output.
scripts/extract-figma-spec.mjs — Transforms raw Figma MCP data into a component spec draft.

Open Threads (Phase 3 Part 2 and beyond)

Bulk spec generation across multiple components
KWI repo integration as a proof-of-concept demo
Responsive behavior mechanism for the schema
Motion/animation token category
Dark mode strategy (token layer vs. spec layer)
A generalizable design system skill beyond org-specific ops

Phase 3 Part 1 complete. Three proofs, one Button, vertical slice: schema validates (3A), MUI binding generates real createTheme() output (3B), Figma extraction produces a valid spec draft (3C). The pipeline's hardest rule: if Figma doesn't define it, it doesn't exist. Next up: Phase 3 Part 2 — scaling from one component to many.

Built with: Figma, Tokens Studio, GitHub Actions, Style Dictionary v4, Claude Code, the Figma MCP, and an unreasonable amount of patience for JSON nesting.
