antonbabenko · ayogun · Jan 20, 2026
diff --git a/CLAUDE.md b/CLAUDE.md
@@ -18,13 +18,14 @@ This repository contains a **Claude Code skill** - executable documentation that
 terraform-skill/
 ├── .claude-plugin/
 │   └── marketplace.json                  # Marketplace and plugin metadata
-├── SKILL.md                              # Core skill file (~524 lines)
+├── SKILL.md                              # Core skill file (~645 lines)
 ├── references/                           # Reference files (progressive disclosure)
 │   ├── ci-cd-workflows.md                # CI/CD templates (~473 lines)
 │   ├── code-patterns.md                  # Code patterns & modern features (~859 lines)
 │   ├── module-patterns.md                # Module best practices (~1,126 lines)
-│   ├── quick-reference.md                # Command cheat sheets (~600 lines)
+│   ├── quick-reference.md                # Command cheat sheets (~800 lines)
 │   ├── security-compliance.md            # Security guidance (~470 lines)
+│   ├── state-management.md               # State management guide (~1,867 lines)
 │   └── testing-frameworks.md             # Testing guides (~563 lines)
 ├── README.md                             # For GitHub/marketplace users
 ├── CLAUDE.md                             # For contributors (YOU ARE HERE)
@@ -36,8 +37,8 @@ terraform-skill/
 | File | Audience | Purpose |
 |------|----------|---------|
 | `.claude-plugin/marketplace.json` | Claude Code | Marketplace and plugin metadata |
-| `SKILL.md` | Claude Code | Core skill (~524 lines, ~4.4K tokens) |
-| `references/*.md` | Claude Code | Reference files loaded on demand (6 files, ~26K tokens) |
+| `SKILL.md` | Claude Code | Core skill (~645 lines, ~5.4K tokens) |
+| `references/*.md` | Claude Code | Reference files loaded on demand (7 files, ~39K tokens) |
 | `README.md` | End users | Installation, usage examples, what it covers |
 | `CLAUDE.md` | Contributors | Development guidelines, architecture decisions |
 | `LICENSE` | Everyone | Apache 2.0 legal terms |
@@ -65,20 +66,21 @@ Response: Code following best practices
 ### Token Budget
 
 - **Metadata (YAML frontmatter):** ~100 tokens - always loaded
-- **Core SKILL.md:** ~4,400 tokens - loaded on activation
+- **Core SKILL.md:** ~5,400 tokens - loaded on activation
 - **Reference files:** Individual estimates (loaded on demand only):
   - ci-cd-workflows.md: ~2,300 tokens
   - code-patterns.md: ~5,100 tokens
   - module-patterns.md: ~7,000 tokens
-  - quick-reference.md: ~3,800 tokens
+  - quick-reference.md: ~5,000 tokens
   - security-compliance.md: ~2,500 tokens
+  - state-management.md: ~11,700 tokens
   - testing-frameworks.md: ~3,400 tokens
-- **Target:** Aim for under 500 lines for main SKILL.md (current: 524 lines - comprehensive core guidance)
+- **Target:** Aim for under 500 lines for main SKILL.md (current: 645 lines - includes essential state management summary)
 
 **Our Architecture:**
-- SKILL.md: 524 lines, ~4.4K tokens (comprehensive core guidance)
-- Reference files: 6 files totaling 4,091 lines, ~26K tokens
-- Progressive disclosure: ~56-70% token reduction for typical queries (vs loading all content)
+- SKILL.md: 645 lines, ~5.4K tokens (comprehensive core guidance with state management)
+- Reference files: 7 files totaling 6,158 lines, ~39K tokens
+- Progressive disclosure: ~50-65% token reduction for typical queries (vs loading all content)
 
 ## Content Philosophy
 
@@ -316,15 +318,16 @@ As Terraform evolves, balance:
 - **Detail** vs **Scannability**
 - **Examples** vs **Reference**
 
-**Current Status:** SKILL.md is at 524 lines, slightly above the suggested 500-line target. This is justified by:
-- Comprehensive decision matrices (testing, count vs for_each)
+**Current Status:** SKILL.md is at 645 lines, above the suggested 500-line target. This is justified by:
+- Comprehensive decision matrices (testing, count vs for_each, state organization)
 - Essential quick reference tables
+- State management fundamentals (critical gap filled)
 - Version-specific guidance (multiple Terraform versions)
 - Progressive disclosure architecture minimizes token cost
 
-The extra 24 lines provide significant value while maintaining scannability. Future updates should prioritize reference file expansion over core skill growth.
+The additional content provides significant value, particularly the state management section, which addresses one of the most critical aspects of Terraform usage. Future updates should prioritize reference file expansion over core skill growth.
 
-Current sweet spot: ~4.4K tokens for core SKILL.md, with 6 reference files (~26K tokens) providing deep-dive content on demand. Total coverage: ~30.4K tokens structured for progressive disclosure.
+Current sweet spot: ~5.4K tokens for core SKILL.md, with 7 reference files (~39K tokens) providing deep-dive content on demand. Total coverage: ~44.4K tokens structured for progressive disclosure.
 
 ### Long-term Vision
 

diff --git a/README.md b/README.md
@@ -19,6 +19,12 @@ Comprehensive Terraform and OpenTofu best practices skill for Claude Code. Get i
 - Versioning strategies
 - Public vs private module patterns
 
+🗄️ **State Management**
+- Remote backend configuration (S3, Azure, GCS, Terraform Cloud)
+- State locking and security patterns
+- Multi-team state isolation strategies
+- State migration and recovery procedures
+
 🔄 **CI/CD Integration**
 - GitHub Actions workflows
 - GitLab CI examples
@@ -76,6 +82,9 @@ Claude will automatically use the skill when working with Terraform/OpenTofu cod
 **Create a module with tests:**
 > "Create a Terraform module for AWS VPC with native tests"
 
+**Set up remote state:**
+> "Configure S3 backend with DynamoDB locking for Terraform state"
+
 **Review existing code:**
 > "Review this Terraform configuration following best practices"
 
@@ -85,6 +94,9 @@ Claude will automatically use the skill when working with Terraform/OpenTofu cod
 **Testing strategy:**
 > "Help me choose between native tests and Terratest for my modules"
 
+**State management:**
+> "How should I organize state files for a multi-team environment?"
+
 ## What It Covers
 
 ### Testing Strategy Framework
@@ -118,6 +130,7 @@ Decision matrices for:
 - Policy-as-code patterns
 - Secrets management
 - State file security
+- State backend encryption
 - Compliance scanning workflows
 
 ### Common Patterns & Anti-patterns

diff --git a/SKILL.md b/SKILL.md
@@ -1,6 +1,6 @@
 ---
 name: terraform-skill
-description: Use when working with Terraform or OpenTofu - creating modules, writing tests (native test framework, Terratest), setting up CI/CD pipelines, reviewing configurations, choosing between testing approaches, debugging state issues, implementing security scanning (trivy, checkov), or making infrastructure-as-code architecture decisions
+description: Use when working with Terraform or OpenTofu - creating modules, writing tests (native test framework, Terratest), setting up CI/CD pipelines, reviewing configurations, choosing between testing approaches, debugging state issues, managing remote state backends, implementing security scanning (trivy, checkov), or making infrastructure-as-code architecture decisions
 license: Apache-2.0
 metadata:
   author: Anton Babenko
@@ -21,6 +21,8 @@ Comprehensive Terraform and OpenTofu guidance covering testing, modules, CI/CD,
 - Implementing CI/CD for infrastructure-as-code
 - Reviewing or refactoring existing Terraform/OpenTofu projects
 - Choosing between module patterns or state management approaches
+- Configuring remote state backends and state locking
+- Migrating state between backends or troubleshooting state issues
 
 **Don't use this skill for:**
 - Basic Terraform/OpenTofu syntax questions (Claude knows this)
@@ -396,6 +398,147 @@ checkov -d .
 **For detailed security guidance, see:**
 - **[Security & Compliance Guide](references/security-compliance.md)** - Trivy/Checkov integration, secrets management, state file security, compliance testing
 
+## State Management
+
+### Critical Principle: Never Use Local State in Production
+
+**Always use remote backends for:**
+- ✅ Team collaboration with automatic locking
+- ✅ State versioning and backup
+- ✅ Encryption at rest and in transit
+- ✅ Audit logging
+
+### Remote Backend Quick Setup
+
+**AWS S3 (Recommended):**
+
+*Terraform 1.11+ (Recommended):*
+```hcl
+# backend.tf - Native S3 locking, no DynamoDB needed
+terraform {
+  backend "s3" {
+    bucket        = "my-terraform-state"
+    key           = "prod/vpc/terraform.tfstate"
+    region        = "us-east-1"
+    encrypt       = true
+    use_lock_file = true  # Native S3 locking
+  }
+}
+```
+
+*Pre-1.11 or Legacy DynamoDB locking:*
+```hcl
+# backend.tf
+terraform {
+  backend "s3" {
+    bucket         = "my-terraform-state"
+    key            = "prod/vpc/terraform.tfstate"
+    region         = "us-east-1"
+    encrypt        = true
+    dynamodb_table = "terraform-state-lock"
+  }
+}
+```
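+
+The S3 backend's DynamoDB locking expects the table to already exist with a string partition key named `LockID`. A minimal sketch of that table, assuming it is created in a separate bootstrap configuration (the resource name `terraform_lock` and on-demand billing are illustrative choices, not requirements):
+
+```hcl
+# bootstrap/lock-table.tf (hypothetical file and resource name)
+resource "aws_dynamodb_table" "terraform_lock" {
+  name         = "terraform-state-lock" # must match dynamodb_table in the backend block
+  billing_mode = "PAY_PER_REQUEST"      # assumption: on-demand billing suits low lock traffic
+  hash_key     = "LockID"               # the S3 backend requires this exact key name
+
+  attribute {
+    name = "LockID"
+    type = "S"
+  }
+}
+```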
+
+**Other backends:**
+- **Azure Storage** - Built-in blob lease locking
+- **Google Cloud Storage** - Built-in locking via a lock object stored next to the state
+- **Terraform Cloud** - Fully managed with remote execution
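+
+For comparison, a sketch of what the Azure and GCS backend blocks might look like. All resource group, storage account, container, and bucket names below are placeholders, and each block belongs in its own root module (a configuration can declare only one backend):
+
+```hcl
+# backend.tf for an Azure-hosted root module (placeholder names)
+terraform {
+  backend "azurerm" {
+    resource_group_name  = "rg-terraform-state"
+    storage_account_name = "mytfstateaccount"
+    container_name       = "tfstate"
+    key                  = "prod/vpc/terraform.tfstate"
+  }
+}
+
+# backend.tf for a GCP-hosted root module (placeholder names)
+terraform {
+  backend "gcs" {
+    bucket = "my-terraform-state"
+    prefix = "prod/vpc"
+  }
+}
+```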
+
+### State Organization Patterns
+
+| Pattern | Use When | Example Path |
+|---------|----------|--------------|
+| **Per Environment** | Different teams per env | `prod/terraform.tfstate`, `staging/terraform.tfstate` |
+| **Per Component** | Independent lifecycles | `prod/vpc/`, `prod/eks/`, `prod/rds/` |
+| **Hybrid** (Recommended) | Both benefits | `prod/networking/`, `prod/compute/`, `staging/networking/` |
+
+**Decision guide:**
+- **Split state when:** Different teams, different update frequencies, >500 resources
+- **Combine state when:** Tightly coupled resources, <100 resources, same lifecycle
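+
+As a rough illustration of the hybrid pattern, each root module gets its own backend `key` built from environment and component. The directory names below are hypothetical and the bucket reuses the example name from the backend setup above:
+
+```hcl
+# environments/prod/networking/backend.tf (hypothetical layout)
+terraform {
+  backend "s3" {
+    bucket        = "my-terraform-state"
+    key           = "prod/networking/terraform.tfstate"
+    region        = "us-east-1"
+    encrypt       = true
+    use_lock_file = true
+  }
+}
+
+# environments/prod/compute/backend.tf (separate root module, separate state)
+terraform {
+  backend "s3" {
+    bucket        = "my-terraform-state"
+    key           = "prod/compute/terraform.tfstate"
+    region        = "us-east-1"
+    encrypt       = true
+    use_lock_file = true
+  }
+}
+```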
+
+### Cross-State Data Sharing
+
+**Use terraform_remote_state for component dependencies:**
+
+```hcl
+# Consumer module references producer state
+data "terraform_remote_state" "networking" {
+  backend = "s3"
+
+  config = {
+    bucket = "my-terraform-state"
+    key    = "prod/networking/terraform.tfstate"
+    region = "us-east-1"
+  }
+}
+
+resource "aws_instance" "app" {
+  # Other required arguments (ami, instance_type) omitted for brevity
+  subnet_id = data.terraform_remote_state.networking.outputs.private_subnet_ids[0]
+}
+```
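+
+`terraform_remote_state` only exposes root-module outputs, so the producer configuration has to declare the value explicitly. A minimal sketch of the producer side, assuming the networking configuration defines its private subnets as `aws_subnet.private` with `count` (a hypothetical resource name):
+
+```hcl
+# prod/networking/outputs.tf (producer side)
+output "private_subnet_ids" {
+  description = "IDs of the private subnets, consumed by other states"
+  value       = aws_subnet.private[*].id # hypothetical resource defined elsewhere in this module
+}
+```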
+
+### Essential State Commands
+
+```bash
+# View resources in state
+terraform state list
+
+# Show resource details
+terraform state show aws_instance.web
+
+# Rename resource (refactoring)
+terraform state mv aws_instance.old aws_instance.new
+
+# Remove from state (keep resource)
+terraform state rm aws_instance.old
+
+# Import existing resource
+terraform import aws_instance.web i-1234567890abcdef0
+
+# Detect drift
+terraform plan -refresh-only
+
+# Backup state
+terraform state pull > backup.tfstate
+```
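+
+Several of these commands have declarative equivalents that go through plan and code review instead of editing state directly: `moved` blocks (Terraform 1.1+), `import` blocks (1.5+), and `removed` blocks (1.7+). A sketch of each, shown as independent examples; `aws_instance.legacy` is a hypothetical address:
+
+```hcl
+# Rename a resource (alternative to `terraform state mv`)
+moved {
+  from = aws_instance.old
+  to   = aws_instance.new
+}
+
+# Import an existing resource (alternative to `terraform import`)
+import {
+  to = aws_instance.web
+  id = "i-1234567890abcdef0"
+}
+
+# Remove from state but keep the real resource (alternative to `terraform state rm`)
+removed {
+  from = aws_instance.legacy
+
+  lifecycle {
+    destroy = false
+  }
+}
+```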
+
+### State Locking
+
+**How it works:**
+- Lock acquired before state operations
+- Prevents concurrent modifications
+- Automatically released on completion
+
+**Handling stuck locks:**
+
+```bash
+# First check who holds the lock (Who, Created, ID) in the lock error message.
+# Only force-unlock if the operation crashed and you verified it is no longer running.
+terraform force-unlock LOCK_ID
+```
+
+### State Security Best Practices
+
+✅ **DO:**
+- Use encryption at rest (KMS)
+- Enable state versioning
+- Restrict IAM access per environment
+- Use write-only arguments for secrets (Terraform 1.11+)
+- Reference secrets from external sources (Secrets Manager)
+- Enable audit logging
+
+❌ **DON'T:**
+- Hardcode secrets in variable defaults or `.tfvars` files (values passed to resources are written to state)
+- Use local state for teams/production
+- Share state files via git
+- Force-unlock without verification
+- Skip backups
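+
+A minimal sketch of how the versioning and encryption items above might be applied to an S3 state bucket, assuming AWS provider v4+ (separate bucket sub-resources) and the default AWS-managed KMS key. The resource name `state` is illustrative:
+
+```hcl
+resource "aws_s3_bucket" "state" {
+  bucket = "my-terraform-state"
+}
+
+# Keep prior state versions for recovery
+resource "aws_s3_bucket_versioning" "state" {
+  bucket = aws_s3_bucket.state.id
+
+  versioning_configuration {
+    status = "Enabled"
+  }
+}
+
+# Encrypt at rest with KMS (assumption: default aws/s3 key; set kms_master_key_id for a custom key)
+resource "aws_s3_bucket_server_side_encryption_configuration" "state" {
+  bucket = aws_s3_bucket.state.id
+
+  rule {
+    apply_server_side_encryption_by_default {
+      sse_algorithm = "aws:kms"
+    }
+  }
+}
+
+# State files should never be publicly readable
+resource "aws_s3_bucket_public_access_block" "state" {
+  bucket                  = aws_s3_bucket.state.id
+  block_public_acls       = true
+  block_public_policy     = true
+  ignore_public_acls      = true
+  restrict_public_buckets = true
+}
+```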
+
+**For comprehensive state management guidance, see:**
+- **[State Management Guide](references/state-management.md)** - Remote backends, locking, security, migration, multi-team patterns, recovery strategies
+
 ## Version Management
 
 ### Version Constraint Syntax
@@ -445,6 +588,7 @@ terraform plan
 | Provider functions | 1.8+ | Provider-specific data transformation |
 | Cross-variable validation | 1.9+ | Validate relationships between variables |
 | Write-only arguments | 1.11+ | Secrets never stored in state |
+| S3 native lock-file | 1.11+ | State locking without DynamoDB |
 
 ### Quick Examples
 
@@ -505,6 +649,7 @@ This skill uses **progressive disclosure** - essential information is in this ma
 - **[Module Patterns](references/module-patterns.md)** - Module structure, variable/output best practices, ✅ DO vs ❌ DON'T patterns
 - **[CI/CD Workflows](references/ci-cd-workflows.md)** - GitHub Actions, GitLab CI templates, cost optimization, automated cleanup
 - **[Security & Compliance](references/security-compliance.md)** - Trivy/Checkov integration, secrets management, compliance testing
+- **[State Management](references/state-management.md)** - Remote backends, state locking, security, migration, multi-team patterns, recovery
 - **[Quick Reference](references/quick-reference.md)** - Command cheat sheets, decision flowcharts, troubleshooting guide
 
 **How to use:** When you need detailed information on a topic, reference the appropriate guide. Claude will load it on demand to provide comprehensive guidance.