> ⚠️ DRAFT STATUS: This RFC documents the token schema structure implemented in PR #644 and proposes it as the standard for future token work. Open for feedback and review.

---

# RFC: Token Schema Structure and Validation System

Status: Draft - Implementation Complete  
Author: Garth Braithwaite  
DACI: [To be assigned]  
Implementation: PR #644  
Related: DNA-1485, RFC #624, RFC #625, RFC #626

---

## Executive Summary

This RFC proposes a comprehensive schema structure for all Spectrum design tokens, transforming hyphen-delimited token names into structured JSON objects with full validation capabilities. This provides the foundation for advanced tooling, including token recommendations, automated documentation, and cross-platform transformation.

Problem: The current token structure uses hyphen-delimited names with implicit meaning, no validation of naming conventions, and limited ability to query or analyze tokens systematically. This makes it difficult to build tooling, enforce governance, or provide semantic guidance.

Solution: Implement a structured token format with JSON Schema validation, controlled vocabularies (enums), semantic analysis capabilities, and perfect round-trip conversion. A complete implementation is provided in PR #644.

Results: All 2,338 tokens across 8 files were successfully parsed and validated, with a 100% regeneration rate and 82% schema validation coverage.

---

## Background & Context

### Origin
- DNA-1485: Initiative to improve and expand design data schemas
- August 2025 Onsite: Discussion of data system improvements and tooling needs
- Token Recommendation Requirements: Need structured data for semantic token suggestions
- Documentation Generation: Need queryable token structure for automated docs

### Current Token Format
```json
{
  "text-to-visual-50": {
    "$schema": "https://opensource.adobe.com/spectrum-design-data/schemas/token-types/dimension.json",
    "value": "4px",
    "uuid": "f1bc4c85-c0dc-44bf-a156-54707f3626e9"
  }
}
```

Limitations:
- Token name meaning is implicit (what does "text-to-visual" mean?)
- No validation of naming conventions
- Can't query "show me all spacing tokens"
- Can't analyze semantic complexity
- Difficult to track token references
- No structured relationship data

### Design Data System Vision
From the onsite presentation:
- System of systems: Foundation → Platform → Product
- Easy to reason about: Clear structure and relationships
- Governance and iteration: Enforceable standards
- Partner collaboration: Shared understanding of token structure

---

## Proposal

### Structured Token Format

Transform token names into structured objects with full semantic information:

```json
{
  "id": "f1bc4c85-c0dc-44bf-a156-54707f3626e9",
  "$schema": "https://opensource.adobe.com/spectrum-design-data/schemas/token-types/dimension.json",
  "value": "4px",
  "name": {
    "original": "text-to-visual-50",
    "structure": {
      "category": "spacing",
      "property": "spacing",
      "spaceBetween": {
        "from": "text",
        "to": "visual"
      },
      "index": "50"
    },
    "semanticComplexity": 1
  },
  "validation": {
    "isValid": true,
    "errors": []
  }
}
```

### Token Categories

Nine primary token categories were identified across all tokens:

1. `spacing` - Space-between relationships (`text-to-visual-50`)
2. `component-property` - Component-specific properties (`button-height-100`)
3. `generic-property` - Global properties (`corner-radius-100`)
4. `semantic-alias` - Reference tokens (`accent-color-100` → `{blue-800}`)
5. `color-base` - Base color palette (`blue-800`)
6. `color-scale` - Color scales with modifiers (`blue-800`, `transparent-blue-800`)
7. `gradient-color` - Gradient stops (`gradient-stop-1-red`)
8. `typography-base` - Font properties (`bold-font-weight`, `sans-font-family`)
9. `special` - Edge cases requiring custom schemas (composite tokens, platform-specific)
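As a minimal sketch of what the structured format buys a consumer, the example token above can be interpreted directly from `name.structure` without re-parsing the hyphenated name (the `describe` helper is illustrative, not part of PR #644):

```javascript
// Sketch: reading semantic fields from a structured token.
// The token shape mirrors the structured example above.
const token = {
  id: "f1bc4c85-c0dc-44bf-a156-54707f3626e9",
  value: "4px",
  name: {
    original: "text-to-visual-50",
    structure: {
      category: "spacing",
      property: "spacing",
      spaceBetween: { from: "text", to: "visual" },
      index: "50",
    },
    semanticComplexity: 1,
  },
};

// Describe a token from its structure rather than its name string.
function describe(t) {
  const s = t.name.structure;
  if (s.category === "spacing" && s.spaceBetween) {
    return `space between ${s.spaceBetween.from} and ${s.spaceBetween.to} (${t.value})`;
  }
  return `${s.category} token (${t.value})`;
}

console.log(describe(token)); // "space between text and visual (4px)"
```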
### Schema Architecture

#### Base Schema Hierarchy

```
base-token.json (foundation for all tokens)
├── regular-token.json (single-value tokens)
│   ├── spacing-token.json
│   ├── component-property-token.json
│   ├── generic-property-token.json
│   ├── semantic-alias-token.json
│   ├── color-base-token.json
│   ├── color-scale-token.json
│   ├── gradient-color-token.json
│   └── typography-base-token.json
│
└── scale-set-token.json (desktop/mobile variants)
    ├── spacing-scale-set-token.json
    ├── component-property-scale-set-token.json
    ├── generic-property-scale-set-token.json
    ├── color-set-token.json
    ├── color-scale-scale-set-token.json
    └── semantic-alias-color-set-token.json
```

#### Enum Schemas (Controlled Vocabularies)

12 enum schemas define the allowed values for token name parts:

1. `components.json` - 80 component names (button, checkbox, field, etc.)
2. `anatomy-parts.json` - 249 anatomy parts (control, track, text, visual, etc.)
3. `properties.json` - 376 properties (height, width, size, spacing, etc.)
4. `sizes.json` - 19 numeric scale indices (0, 25, 50, 75, 100-1500)
5. `component-options.json` - 10 options (small, medium, large, quiet, compact, etc.)
6. `states.json` - 5 UI states (hover, down, focus, disabled, selected)
7. `colors.json` - 23 base color names (blue, red, green, gray, etc.)
8. `color-modifiers.json` - 7 modifiers (transparent, static, etc.)
9. `color-indices.json` - 15 color scale indices (100-1400)
10. `platforms.json` - 2 platform identifiers (android, ios)
11. `themes.json` - 3 theme names (light, dark, wireframe)
12. `relationship-connectors.json` - 1 spacing connector ("to")

Total controlled vocabulary: 800+ values, ensuring consistency.
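The effect of a controlled vocabulary can be sketched as a simple membership check per name part. The vocabularies below are abbreviated samples for illustration, not the full enum schema contents:

```javascript
// Sketch: enum-driven validation of token name parts.
// Each part of a parsed name is checked against its vocabulary.
const enums = {
  components: ["button", "checkbox", "field"], // sample of the 80 names
  sizes: ["0", "25", "50", "75", "100", "200"], // sample of the 19 indices
};

function checkPart(kind, value) {
  return enums[kind].includes(value)
    ? { valid: true }
    : { valid: false, error: `"${value}" not in ${kind} vocabulary` };
}

console.log(checkPart("components", "checkbox")); // valid
console.log(checkPart("sizes", "350")); // invalid: 350 is not an allowed index
```

In the real system these checks come from AJV evaluating the enum schemas; the point here is only that every name part resolves against a closed list.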
### Semantic Complexity Metric

Measures how much semantic context a token provides (0-3+):

```javascript
// Calculated based on semantic fields in name.structure
semanticComplexity = countOf([
  component,
  property,
  anatomyPart,
  spaceBetween,
  referencedToken,
  options,
  state,
  calculation,
  platform
])
```

Examples:
- `gray-100`: complexity 0 (base palette, no semantic context)
- `background-color-default`: complexity 1 (semantic alias with property)
- `button-background-color-default`: complexity 2 (component + property + alias)
- `button-control-background-color-hover`: complexity 3+ (component + anatomy + property + state)

Use Case: Token recommendation systems can suggest more semantically specific tokens:
- "You used `blue-800`, consider `accent-color-100` (more semantic)"
- "Consider `button-background-color-default` (most specific for this use case)"
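A naive version of this count can be sketched as a presence check over the semantic fields. Note this is a conceptual illustration only: the actual scoring in PR #644 applies its own weighting rules, so treat the function below as a sketch of the idea, not the implementation:

```javascript
// Sketch: count semantic fields present in name.structure.
// The real metric in PR #644 may weight fields differently.
const SEMANTIC_FIELDS = [
  "component", "property", "anatomyPart", "spaceBetween",
  "referencedToken", "options", "state", "calculation", "platform",
];

function naiveComplexity(structure) {
  return SEMANTIC_FIELDS.filter((field) => field in structure).length;
}

// A base-palette token carries no semantic fields:
const bluBase = { category: "color-base", color: "blue", index: "800" };
console.log(naiveComplexity(bluBase)); // 0
```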
### Validation Strategy

#### Schema-Driven Validation
- Validator: AJV with JSON Schema Draft 2020-12
- Enums: All token name parts validated against controlled vocabularies
- References: Semantic aliases validated for correct token references
- Structure: Each category has specific structural requirements

#### Validation Levels
- 100%: Perfect schema coverage (color-palette, semantic-color-palette, icons)
- 95%+: Well validated, with minor edge cases (typography)
- 75-90%: Good coverage, with known special tokens (color-aliases, color-component, layout)
- 70-85%: Complex tokens with many special cases (layout-component)

#### Current Validation Results

| File | Tokens | Match | Valid | Rate |
|------|--------|-------|-------|------|
| color-palette.json | 372 | 100% | 372 | 100% |
| semantic-color-palette.json | 94 | 100% | 94 | 100% |
| icons.json | 79 | 100% | 79 | 100% |
| typography.json | 312 | 100% | 297 | 95.2% |
| color-aliases.json | 169 | 100% | 150 | 88.8% |
| color-component.json | 73 | 100% | 56 | 76.7% |
| layout.json | 242 | 100% | 180 | 74.4% |
| layout-component.json | 997 | 100% | 701 | 70.3% |
| Total | 2,338 | 100% | 1,929 | 82.5% |

### Anonymous Token Array Structure

Tokens are stored as an array of objects (not keyed by name).

Why:
- Enables tokens with identical names across themes
- Perfect round-trip conversion (the name is reconstructed from structure)
- Maintains all original metadata (uuid, deprecated flags, etc.)
- Supports future multi-dimensional token spaces

Before (keyed by name):
```json
{
  "blue-800": { "value": "#1473E6", "uuid": "..." }
}
```

After (anonymous array):
```json
[
  {
    "id": "550e8400-e29b-41d4-a716-446655440000",
    "value": "#1473E6",
    "name": {
      "original": "blue-800",
      "structure": { "category": "color-base", "color": "blue", "index": "800" },
      "semanticComplexity": 0
    }
  }
]
```

### Round-Trip Verification

Critical Requirement: The structured format must perfectly regenerate the original token names.

Implementation:
- Handlebars templates for each token category
- One template per category (`spacing-token.hbs`, `component-property-token.hbs`, etc.)
- Automated comparison of original vs. regenerated names

Results: 100% match rate (2,338/2,338 tokens) - zero data loss

Example Template (`spacing-token.hbs`):
```handlebars
{{#if component}}{{component}}-{{/if}}{{spaceBetween.from}}-to-{{spaceBetween.to}}{{#if index}}-{{index}}{{/if}}{{#if options}}{{#each options}}-{{this}}{{/each}}{{/if}}
```
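The spacing template's logic, and the comparator step that diffs regenerated names against stored originals, can be mirrored in plain JavaScript as a self-contained sketch (the real pipeline uses the Handlebars templates; `regenerateSpacingName` is an illustrative stand-in):

```javascript
// Sketch: regenerate a spacing token name from its structure,
// mirroring the spacing-token.hbs template shown above.
function regenerateSpacingName(s) {
  let name = "";
  if (s.component) name += `${s.component}-`;
  name += `${s.spaceBetween.from}-to-${s.spaceBetween.to}`;
  if (s.index) name += `-${s.index}`;
  for (const option of s.options ?? []) name += `-${option}`;
  return name;
}

const token = {
  name: {
    original: "text-to-visual-50",
    structure: { spaceBetween: { from: "text", to: "visual" }, index: "50" },
  },
};

// The comparator step: regenerate, then diff against the stored original.
const regenerated = regenerateSpacingName(token.name.structure);
console.log(regenerated === token.name.original); // true — the round trip holds
```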
---

## Implementation

### Complete Implementation: PR #644

Package 1: `packages/structured-tokens/`
- 8 structured token files (all 2,338 tokens)
- 25+ JSON schemas for validation
- 12 enum schemas for controlled vocabularies
- Production-ready, fully documented

Package 2: `tools/token-name-parser/`
- Token parser (converts names → structure)
- Schema validator (validates against JSON schemas)
- Name regenerator (converts structure → names)
- Name comparator (verifies round-trip accuracy)
- 8 Handlebars templates
- 19 passing tests (100% test coverage)
- 20+ documentation files

### Parser Capabilities

Pattern Detection:
- Spacing tokens with space-between relationships
- Component properties with anatomy parts and options
- Generic properties with compound names (`drop-shadow-x`, `corner-radius`)
- Semantic aliases with reference tracking
- Color tokens (base, scale, gradient) with theme sets
- Typography tokens with font properties
- Compound patterns (multi-word components: "radio-button"; options: "extra-large")

Example Parsing:

Input: `checkbox-control-size-small`
```json
{
  "category": "component-property",
  "component": "checkbox",
  "anatomyPart": "control",
  "property": "size",
  "options": ["small"]
}
```

Input: `text-to-visual-compact-medium`
```json
{
  "category": "spacing",
  "property": "spacing",
  "spaceBetween": { "from": "text", "to": "visual" },
  "options": ["compact", "medium"]
}
```

Input: `accent-color-100` (references `{blue-800}`)
```json
{
  "category": "semantic-alias",
  "property": "accent-color-100",
  "referencedToken": "blue-800",
  "notes": "Semantic alias providing contextual naming"
}
```

### Usage Examples

#### Query Tokens by Category
```javascript
const spacingTokens = tokens.filter(t => t.name.structure.category === 'spacing');
// Returns: All 461 spacing tokens
```

#### Find High-Complexity Tokens
```javascript
const semanticTokens = tokens.filter(t => t.name.semanticComplexity >= 2);
// Returns: Tokens with strong semantic context for recommendations
```

#### Track Token References
```javascript
const aliases = tokens.filter(t =>
  t.name.structure.category === 'semantic-alias' &&
  t.name.structure.referencedToken === 'blue-800'
);
// Returns: All tokens that reference blue-800
```

#### Validate Token Names
```javascript
const invalid = tokens.filter(t => !t.validation.isValid);
// Returns: Tokens that don't match schemas (need attention)
```
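The same filter-style queries compose into simple reports. As a runnable sketch (with a tiny illustrative token set, not the real data), a per-category count like the ones cited in this RFC falls out of one pass over the array:

```javascript
// Sketch: summarize a structured token array by category.
// The tokens below are minimal stand-ins following the structured format.
const tokens = [
  { name: { structure: { category: "spacing" }, semanticComplexity: 1 } },
  { name: { structure: { category: "color-base" }, semanticComplexity: 0 } },
  { name: { structure: { category: "spacing" }, semanticComplexity: 2 } },
];

const countsByCategory = {};
for (const t of tokens) {
  const category = t.name.structure.category;
  countsByCategory[category] = (countsByCategory[category] ?? 0) + 1;
}

console.log(countsByCategory); // { spacing: 2, "color-base": 1 }
```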
---

## Benefits & Use Cases

### 1. Token Recommendation Systems
Enabled by the semantic complexity metric and reference tracking.

Use Case: An IDE plugin suggests semantic alternatives:
```
// Developer types: color: #1473E6
// Plugin suggests:
// - blue-800 (exact match, complexity: 0)
// - accent-color-100 (semantic alias, complexity: 1) ✓ Recommended
// - button-background-color-default (component-specific, complexity: 2)
```

### 2. Automated Documentation Generation
Enabled by structured data and a queryable format.

Use Case: Generate a token catalog by category:
```markdown
# Spacing Tokens

## Text-to-Visual Spacing
- text-to-visual-50: 4px
- text-to-visual-100: 8px
- text-to-visual-200: 12px
```

### 3. Design System Governance
Enabled by schema validation and controlled vocabularies.

Use Case: CI/CD validation of token PRs:
```bash
$ pnpm validate-tokens
✓ All token names follow conventions
✗ Error: "button-height-350" - index 350 not in allowed sizes
✗ Error: "unknow-component-size" - component not in allowed list
```

### 4. Cross-Platform Token Transformation
Enabled by the structured format and perfect round-trip conversion.

Use Case: Transform tokens for different platforms:
```javascript
// Web: --spectrum-button-background-default
// iOS: ButtonBackgroundDefault
// Android: button_background_default
// All from the same structured source
```
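The platform naming conventions in use case 4 above can be sketched as three small transforms over the same structured name parts (the transform functions are illustrative, not the PR #644 implementation):

```javascript
// Sketch: derive platform-specific names from one structured source.
// `parts` would come from the token's name.structure in practice.
const parts = ["button", "background", "default"];

const toWeb = (p) => `--spectrum-${p.join("-")}`;                              // CSS custom property
const toIOS = (p) => p.map((w) => w[0].toUpperCase() + w.slice(1)).join("");   // PascalCase
const toAndroid = (p) => p.join("_");                                          // snake_case

console.log(toWeb(parts));     // "--spectrum-button-background-default"
console.log(toIOS(parts));     // "ButtonBackgroundDefault"
console.log(toAndroid(parts)); // "button_background_default"
```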
### 5. Token Migration & Deprecation
Enabled by reference tracking and semantic analysis.

Use Case: Identify tokens to migrate:
```javascript
// Find all tokens referencing deprecated blue-800
const affectedTokens = findReferences('blue-800');
// Plan migration path: blue-800 → blue-900
// Update all 23 semantic aliases automatically
```

### 6. Foundation for Future RFCs
Directly enables other proposed RFCs:

- RFC #626 (Sourcemaps): Structured tokens provide the UUIDs and reference tracking needed for sourcemaps
- RFC #625 (Authoring): Schema validation enforces authoring rules in CI/CD
- RFC #624 (Multi-platform): The structured format can express platform extensions/overrides

---

## Alternatives Considered

### Alternative 1: Keep Hyphenated Names Only
Pros: No change; existing tooling works
Cons: Can't build advanced tooling, no governance, limited querying
Decision: Rejected - doesn't meet future needs

### Alternative 2: Use DTCG Format Directly
Pros: Standard format, external tool support
Cons: Doesn't capture Spectrum-specific semantics (anatomy, space-between); loses semantic complexity
Decision: Considered for the future (RFC #627 proposes DTCG as an additional output)

### Alternative 3: Object with Names as Keys
Pros: Familiar structure, easy lookup by name
Cons: Can't have duplicate names across themes; harder round-trip
Decision: Rejected - the anonymous array provides more flexibility

### Alternative 4: AI/LLM-Based Parsing
Pros: Could handle more edge cases
Cons: Non-deterministic, harder to validate, slower
Decision: Rejected - rule-based parsing with schemas is more reliable

---

## Migration & Adoption

### Phase 1: Non-Breaking Addition (Complete in PR #644)
- ✅ Structured tokens live alongside original tokens
- ✅ No changes to existing token files in `packages/tokens/src/`
- ✅ New package: `packages/structured-tokens/`
- ✅ Tooling in: `tools/token-name-parser/`

### Phase 2: Tooling Integration (Next)
- Build token recommendation MCP
- Integrate validation into CI/CD
- Create documentation generator
- Implement sourcemaps (RFC #626)

### Phase 3: Authoring Workflow (Future)
- Define token authoring process using schemas
- Implement validation gates
- Roll out to team (RFC #625)
### Phase 4: Platform Transformation (Future)
- Use structured tokens for platform-specific builds
- Implement multi-platform support (RFC #624)
- Generate platform-specific formats

No breaking changes to existing token consumers.

---

## Success Metrics

### Achieved in PR #644:
- ✅ 100% Token Coverage: All 2,338 tokens parsed (8/8 files)
- ✅ 100% Regeneration Rate: Perfect round-trip conversion
- ✅ 82.5% Validation Rate: 1,929/2,338 tokens fully validated
- ✅ 100% Test Pass Rate: 19/19 tests passing
- ✅ Zero Breaking Changes: Original tokens unchanged

### Future Success Metrics:
- 90%+ Validation Rate: Improve with additional schemas for edge cases
- Developer Adoption: Token recommendation tool usage
- CI/CD Integration: Automated validation in PR checks
- Documentation Generation: Auto-generated token docs
- Platform Support: All platforms using the structured format

---

## Known Limitations & Future Work

### 455 Special Tokens (19.5%)
Tokens that regenerate correctly but need additional schemas.

Categories:
- Composite Typography (15 tokens): `component-xs-regular` bundles multiple font properties
- Drop Shadow Composites (4 tokens): `drop-shadow-emphasized` has a complex structure
- Component Opacity (17 tokens): `swatch-border-opacity` holds direct opacity values
- Multiplier Tokens (various): `button-minimum-width-multiplier` is calculation-based
- Platform-Specific (2 tokens): `android-elevation` needs a platform schema

Future Schemas Needed:
1. `typography-composite-token.json`
2. `drop-shadow-composite-token.json`
3. `multiplier-token.json`
4. Platform-specific token schemas
Impact: These tokens work correctly (100% regeneration) but show as "special" in validation reports.

### Edge Cases
- Some anatomy parts are compound words (`focus-indicator`, `side-label-character-count`)
- Component options sometimes stack (`compact-extra-large`)
- Platform-specific tokens need additional categorization
- Multi-dimensional tokens (modes beyond light/dark) are not fully modeled

### Performance Considerations
- The parser processes 2,338 tokens in <2 seconds
- Schema validation adds ~500ms
- Acceptable for CI/CD and development use
- Not intended for runtime use in products

---

## Open Questions

1. Schema Evolution: How do we version schemas as the token structure evolves?
2. Special Token Threshold: At what validation rate do we consider "special" tokens acceptable?
3. DTCG Alignment: Should we align more closely with the DTCG format? (See RFC #627)
4. Multi-Dimensional Tokens: How do we model modes beyond light/dark/wireframe?
5. Platform Extensions: How do platform-specific tokens integrate with this structure? (See RFC #624)
---

## Related Work & References

### GitHub Discussions
- RFC #624: Token Structure Changes for Multi-Platform Support - This RFC provides the foundation for structured platform extensions
- RFC #625: Token Authoring Workflow and Process - Schema validation enables authoring workflow enforcement
- RFC #626: Design Token Sourcemaps and Traceability - Structured tokens provide the UUIDs and references needed for sourcemaps
- Discussion #297: Composite tokens for Drop Shadow and Typography - Identified composite token patterns
- Discussion #507: Composite token proof of concept (s2 typography) - Context for composite typography tokens

### Jira Tickets
- DNA-1485: Initiate a project to improve and expand design data schemas - Parent initiative

### Implementation
- PR #644: Add structured token parser and comprehensive schema system - Complete implementation

### Documentation (in PR #644)
- `FINAL_PROJECT_SUMMARY.md` - Complete project overview
- `ICONS_RESULTS.md` - Icons parsing results (100% validation)
- `TYPOGRAPHY_RESULTS.md` - Typography parsing results (95.2% validation)
- `LAYOUT_COMPONENT_RESULTS.md` - Layout component results (70.3% validation)
- `COLOR_FINAL_RESULTS.md` - All color files summary
- `SEMANTIC_COMPLEXITY.md` - Semantic complexity metric documentation
- `ROUND_TRIP_VERIFICATION.md` - Round-trip conversion verification
- 13 additional documentation files

---

## Decision Points

### For Approval
1. Accept the structured token format as defined in this RFC
2. Accept the schema architecture (base schemas + category schemas + enums)
3. Accept the semantic complexity metric as the standard measure
4. Accept an 80%+ validation rate as the success criterion (with special tokens documented)
5. Accept the anonymous token array over a keyed object structure

### For Discussion
1. Schema versioning strategy - how do we evolve schemas?
2. Special token threshold - what % is acceptable long-term?
3. DTCG alignment - how closely should we align with the DTCG format?
4. Governance - who approves new categories, enums, and schema changes?

---

## Next Steps

### Immediate (Post-Approval)
1. Merge PR #644 - Make structured tokens available
2. Close DNA-1485 - Mark the schema improvement initiative complete
3. Update related RFCs - Reference this as foundational work

### Short-term (1-2 months)
1. Implement RFC #626 - Build the sourcemap system on this foundation
2. Create special token schemas - Reduce the "special" category from 19.5% to <5%
3. Build a token recommendation POC - Demonstrate the value of semantic complexity

### Medium-term (3-6 months)
1. Integrate into CI/CD - Automated validation of token PRs
2. Documentation generator - Auto-generate token catalogs
3. Token authoring workflow - Use schemas to guide token creation (RFC #625)

### Long-term (6-12 months)
1. Multi-platform support - Extend for platform-specific tokens (RFC #624)
2. DTCG export - Generate DTCG format as an additional output
3. Design tool integration - A Figma plugin using structured data

---

## Appendix

### Appendix A: Complete Token Category Definitions

See the full documentation in PR #644:
- `packages/structured-tokens/schemas/` - All schema definitions
- `packages/structured-tokens/schemas/enums/` - All enum definitions
- `tools/token-name-parser/templates/` - Regeneration templates

### Appendix B: Validation Reports

Complete validation reports are available in PR #644:
- `tools/token-name-parser/output/[filename]-validation-report.json`

### Appendix C: Parser Implementation

Full parser source:
- `tools/token-name-parser/src/parser.js` - Token name parsing logic (838 lines)
- `tools/token-name-parser/src/validator.js` - Schema validation (242 lines)
- `tools/token-name-parser/src/name-regenerator.js` - Name regeneration (98 lines)

### Appendix D: Test Coverage

All tests passing:
- `tools/token-name-parser/test/parser.test.js`
- `tools/token-name-parser/test/name-regenerator.test.js`
- `tools/token-name-parser/test/name-comparator.test.js`
- `tools/token-name-parser/test/semantic-complexity.test.js`

---

## Feedback & Discussion

Please provide feedback on:
1. Schema architecture - Is the hierarchy clear and extensible?
2. Token categories - Are the 9 categories comprehensive?
3. Semantic complexity - Is this metric useful for your use cases?
4. Validation threshold - Is 82% acceptable with documented special tokens?
5. Anonymous array structure - Better than a keyed object?
6. Special tokens - Should we create more schemas or accept them as edge cases?

This RFC is open for discussion and feedback before moving to approval.