
Decision memos, incident playbooks, SLO templates, and architecture reviews for engineering leaders.


laugiov/cto-operating-system


CTO Operating System

Decision memos, incident playbooks, SLO templates, and architecture reviews. Use them. Adapt them. Ship.


TL;DR

  • What: Templates for decisions, incidents, SLOs, security reviews, R&D governance
  • Who: Engineering leaders (CTO, VP Eng, Head of Security)
  • Inside: Guides, templates, anonymized examples
  • How: Copy, adapt, ship
  • Philosophy: Decisions get memos. SLOs drive priority. Security reviews gate launches.
  • License: MIT

Evaluate in 10 Minutes

| Time | File | What You'll See |
|------|------|-----------------|
| 2 min | Decision Memo Template | How decisions get documented |
| 2 min | Postmortem Template | How incidents get reviewed |
| 3 min | DM-001: Reliability vs Features | A real trade-off with numbers |
| 3 min | R&D Lab Guide | How experiments get governed |

Full walkthrough: 10-Minute Guide


Start in 7 Days

  1. Day 1: Pick one cadence — adopt the Weekly Exec Review format
  2. Day 2: Write your first Decision Memo for a pending technical choice
  3. Day 3: Define 2-3 SLOs for your most critical service using the SLO Guide
  4. Day 4: Run one Architecture Review on an upcoming project
  5. Day 5: Set up incident communication templates in your Slack/Teams
  6. Day 6: Draft your first Quarterly Plan skeleton
  7. Day 7: Review and adjust — keep what works, discard what doesn't
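
For Day 3, it can help to write the SLOs down as data before filling in the catalog template. The sketch below is one possible shape, not the format used by the SLO Guide or `slo-catalog.md`: the `Slo` class, its field names, and the example service `checkout-api` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slo:
    """One SLO entry. All field names are illustrative assumptions."""
    service: str
    indicator: str      # what is measured, e.g. "availability"
    target: float       # fraction of good minutes, e.g. 0.999
    window_days: int    # rolling evaluation window

    def __post_init__(self) -> None:
        if not 0.0 < self.target < 1.0:
            raise ValueError("target must be a fraction strictly between 0 and 1")
        if self.window_days <= 0:
            raise ValueError("window_days must be positive")

    def allowed_bad_minutes(self) -> float:
        """Minutes per window that may miss the indicator at this target."""
        return (1.0 - self.target) * self.window_days * 24 * 60


# Hypothetical SLOs for one critical service.
slos = [
    Slo("checkout-api", "availability", 0.999, 30),
    Slo("checkout-api", "latency under 300 ms", 0.99, 30),
]
for slo in slos:
    print(f"{slo.service} / {slo.indicator}: "
          f"{slo.allowed_bad_minutes():.1f} bad minutes allowed per {slo.window_days} days")
```

Stating the target as a fraction over a fixed window makes the allowed number of bad minutes explicit, which is usually the figure people argue about when an SLO is first proposed.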

Topics Covered

| Area | What's Included |
|------|-----------------|
| Decisions | Memos with options, trade-offs, rollback plans |
| Reliability | SLOs, error budgets, postmortems |
| Security | Threat models, risk acceptance, exception tracking |
| R&D/AI | Experiment cards, kill criteria, model governance |
| Operations | Weekly reviews, escalation rules, on-call targets |
| Org | Hiring scorecards, onboarding checklists |
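
Exception tracking and experiment kill criteria both come down to a dated record that someone has to check. As a rough illustration only, the sketch below flags records whose date has passed. The `TrackedItem` type, its fields, and the sample entries are hypothetical and are not taken from the repository's templates.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class TrackedItem:
    """A security exception or an experiment; fields are illustrative."""
    kind: str        # "exception" or "experiment"
    name: str
    owner: str
    expires: date    # expiry date (exception) or kill date (experiment)


def overdue(items: list[TrackedItem], today: date) -> list[TrackedItem]:
    """Return items whose date has passed, oldest first."""
    return sorted((i for i in items if i.expires < today), key=lambda i: i.expires)


# Made-up sample records for demonstration.
items = [
    TrackedItem("exception", "legacy TLS on batch host", "platform", date(2024, 3, 1)),
    TrackedItem("experiment", "search reranker trial", "ml-lab", date(2024, 6, 30)),
]
for item in overdue(items, today=date(2024, 4, 15)):
    print(f"OVERDUE {item.kind}: {item.name} (owner: {item.owner}, due {item.expires})")
```

A check like this could run before a monthly governance review so that expired exceptions and overrun experiments are discussed explicitly rather than renewed by default.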

Hiring Relevance

| Role | Start Here | Key Sections |
|------|------------|--------------|
| CTO | 30-60-90 Plan | Operating rhythm, Decision memos, Org design |
| Head of Engineering | Operating Rhythm | SLOs, Incident management, Quarterly planning |
| Head of Security Engineering | Security-by-Design | Risk acceptance, Security reviews, Governance |
| Platform Security Architect | Architecture Review | Security review template, Exception requests |

Principles

  1. SLOs decide priority — Error budgets determine what ships. No budget = reliability work first.
  2. Decisions get memos — Every significant choice: context, options, rationale, owner.
  3. Security gates launches — Architecture reviews required. Exceptions tracked with expiry dates.
  4. Experiments have kill dates — Hypothesis, success criteria, time box. No infinite projects.
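
The first principle can be made concrete with a little arithmetic. The sketch below is a minimal illustration, assuming a request-based SLO where the error budget is the number of failed requests the target tolerates. The function names and the two-way "features" vs "reliability" policy are assumptions for this example, not something the templates define.

```python
def error_budget_remaining(target: float, good: int, total: int) -> float:
    """Fraction of the error budget still unspent (negative when overspent)."""
    if not 0.0 < target < 1.0:
        raise ValueError("target must be a fraction strictly between 0 and 1")
    if total <= 0:
        return 1.0  # no traffic in the window, so no budget consumed
    allowed_bad = (1.0 - target) * total   # failures the SLO tolerates
    actual_bad = total - good              # failures observed
    return 1.0 - actual_bad / allowed_bad


def next_priority(remaining: float) -> str:
    """Hypothetical policy: no budget left means reliability work first."""
    return "features" if remaining > 0 else "reliability"


# Example: 99.9% target, 600 failures out of 1,000,000 requests.
remaining = error_budget_remaining(target=0.999, good=999_400, total=1_000_000)
print(f"budget remaining: {remaining:.0%} -> prioritize {next_priority(remaining)}")
```

With these numbers the target tolerates 1,000 failures and 600 have occurred, so about 40% of the budget remains and feature work can continue. A real policy would likely add thresholds between the two extremes, for example slowing releases once most of the budget is spent.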

Non-Goals

  • Not a blog — No thought leadership essays; only actionable templates and examples
  • Not vendor-specific — No proprietary tools or cloud-specific implementations
  • Not exhaustive — Covers common scenarios. Adapt for edge cases.
  • Not prescriptive — Adapt to your organization's size, culture, and constraints
  • Not a compliance framework — Complements but doesn't replace SOC2, ISO27001, etc.

Contents

Guides (docs/)

| File | Description |
|------|-------------|
| 00-evaluate-in-10min.md | Start here |
| 01-cto-30-60-90.md | First 90 days as CTO/Head of Engineering |
| 02-operating-rhythm.md | Weekly, monthly, quarterly cadences |
| 03-slo-sla-error-budgets.md | Reliability framework fundamentals |
| 04-incident-management.md | Incident response from detection to postmortem |
| 05-roadmap-arbitrage-framework.md | Prioritization when everything is urgent |
| 06-org-design-and-hiring.md | Team structures and hiring practices |
| 07-security-by-design-leadership.md | Embedding security in engineering culture |
| 08-innovation-ai-rnd-lab.md | Running an internal R&D/AI lab with governance |
| 09-decision-memos-playbook.md | Writing effective architectural decisions |

Templates (templates/)

| File | Description |
|------|-------------|
| weekly-exec-review.md | Weekly executive status meeting agenda |
| monthly-tech-governance.md | Monthly technical governance review |
| quarterly-planning.md | Quarterly OKR and roadmap planning |
| decision-memo.md | Technical decision documentation |
| postmortem.md | Blameless incident postmortem |
| incident-comms-internal.md | Internal incident communication |
| incident-comms-exec.md | Executive incident briefing |
| slo-catalog.md | Service Level Objectives registry |
| risk-acceptance.md | Formal risk acceptance documentation |
| exception-request.md | Security/policy exception request |
| architecture-review.md | Architecture review checklist |
| security-review.md | Security design review |
| hiring-scorecard.md | Structured interview evaluation |
| onboarding-30-days.md | New hire onboarding checklist |
| lab-proposal-onepager.md | R&D experiment proposal |
| experiment-card.md | Experiment tracking card |
| model-eval-report.md | AI/ML model evaluation report |

Examples (examples/)

| File | Description |
|------|-------------|
| DM-001-stabilize-vs-features.md | Trade-off: stability investment vs. feature velocity |
| DM-002-platform-security-investment.md | Multi-year security platform investment case |
| DM-003-ai-lab-governance.md | Establishing AI lab governance framework |
| INC-001-incident-comms-internal.md | Internal comms during major incident |
| INC-001-incident-comms-exec.md | Executive briefing during incident |
| INC-001-postmortem.md | Complete postmortem example |
| Q-001-quarterly-plan-example.md | Quarterly planning output example |

License

This project is licensed under the MIT License.
