diff --git a/README.md b/README.md
index 5dc1323..df44968 100644
--- a/README.md
+++ b/README.md
@@ -49,6 +49,12 @@ Ralph reads these specs and builds the entire project autonomously.
 |----------|-------------|------------|
 | [react-native-app](specs/mobile/react-native-app.md) | Cross-platform mobile app | Intermediate |
+### SEO & AEO
+| Template | Description | Difficulty |
+|----------|-------------|------------|
+| [aeo-toolkit](specs/seo/aeo-toolkit.md) | Answer Engine Optimization with llms.txt, AI crawlers, citations | Advanced |
+| [seo-toolkit](specs/seo/seo-toolkit.md) | Technical SEO with metadata, sitemaps, Core Web Vitals | Intermediate |
+
 ### Tools
 | Template | Description | Difficulty |
 |----------|-------------|------------|
diff --git a/specs/seo/aeo-toolkit.md b/specs/seo/aeo-toolkit.md
new file mode 100644
index 0000000..9a9b153
--- /dev/null
+++ b/specs/seo/aeo-toolkit.md
@@ -0,0 +1,159 @@
+# AEO Toolkit — Answer Engine Optimization
+
+Build Answer Engine Optimization into an existing project so AI crawlers (GPTBot, ClaudeBot, PerplexityBot) can discover, parse, and cite the site's content.
+
+## Overview
+
+An AEO (Answer Engine Optimization) implementation that Ralph drops into the user's existing project. Ralph first reads `package.json` to detect the framework (Next.js, Nuxt, Astro, Remix, Express, or static), then generates only the files that match that stack. The toolkit covers the core AEO primitives: `robots.txt` with AI crawler directives, `llms.txt` / `llms-full.txt` per the llmstxt.org spec, AI-optimized sitemaps, structured data (JSON-LD) tuned for answer extraction, and a CLI auditor that scores the site's AEO readiness.
+
+Most AI crawlers do not execute JavaScript, so all critical content must be in the initial HTML response. This is the fundamental constraint the toolkit addresses.
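As an illustration of the crawler directives involved, the generated `robots.txt` under a `block-training` policy might look like the sketch below. Which crawlers count as training-only is a site policy choice, and `example.com` is a placeholder; this is not verbatim toolkit output.

```txt
# Hypothetical block-training output: crawlers used primarily for model
# training are blocked; answer/browsing agents and ordinary bots are allowed.
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: PerplexityBot
Allow: /

User-agent: ChatGPT-User
Allow: /

User-agent: *
Allow: /

Sitemap: https://example.com/sitemap.xml
```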
+
+## Features
+
+- Framework auto-detection from `package.json`
+- `robots.txt` with AI crawler directives (GPTBot, ClaudeBot, PerplexityBot, Google-Extended, CCBot)
+- `llms.txt` and `llms-full.txt` generation per llmstxt.org spec
+- AI-optimized XML sitemap with priority scoring
+- JSON-LD structured data (Article, FAQ, HowTo, Product, Organization)
+- Bot detection middleware (identifies AI crawlers by user-agent)
+- Markdown endpoint support (`.md` versions of pages for LLM consumption)
+- AEO audit CLI that scores a site 0-100
+- Entity consistency checker
+
+## Tasks
+
+### Task 1: Detect Framework and Setup
+
+- [ ] Read `package.json` to detect framework (next, nuxt, astro, remix, express)
+- [ ] Create `aeo.config.ts` with Zod-validated schema (site name, URL, crawler policies)
+- [ ] Create `src/lib/aeo/` directory for core utilities
+- [ ] Install dependencies (zod, unified/remark, xml2js)
+
+### Task 2: robots.txt with AI Crawler Directives
+
+- [ ] Generate `robots.txt` using the detected framework's routing pattern
+- [ ] Include directives for: GPTBot, ChatGPT-User, ClaudeBot, Claude-Web, PerplexityBot, Google-Extended, CCBot, Meta-ExternalAgent, Bytespider, Applebot-Extended
+- [ ] Support three policies via config: `allow-all`, `block-training`, `selective`
+- [ ] Add Sitemap and Crawl-delay directives
+- [ ] Write tests
+
+### Task 3: llms.txt and llms-full.txt
+
+- [ ] Generate `llms.txt` from config content map (H1 title, blockquote summary, H2 link sections)
+- [ ] Generate `llms-full.txt` with full site documentation (company overview, products, audience)
+- [ ] Serve both files via the framework's routing system
+- [ ] Add `lastUpdated` timestamp
+- [ ] Write tests
+
+### Task 4: AI-Optimized Sitemap
+
+- [ ] Generate XML sitemap with priority scoring (landing pages > docs > blog > archives)
+- [ ] Add `lastmod` timestamps and `changefreq` hints
+- [ ] Reference sitemap in robots.txt
+- [ ] Write tests
+
+### Task 5: Structured Data (JSON-LD)
+
+- [ ] Create JSON-LD generator functions: Article, FAQ, HowTo, Product, Organization, BreadcrumbList
+- [ ] Create component/helper to inject `
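The generator functions Task 5 calls for can be sketched as pure functions that build schema.org objects and serialize them. The function name and input shape below are assumptions for illustration, not the toolkit's actual API:

```typescript
// Sketch of one Task 5 generator: an Article JSON-LD builder.
// Property names follow schema.org; `articleJsonLd` and ArticleInput
// are hypothetical, invented for this example.
interface ArticleInput {
  headline: string;
  url: string;
  datePublished: string; // ISO 8601 date
  authorName: string;
}

function articleJsonLd(input: ArticleInput): string {
  const data = {
    "@context": "https://schema.org",
    "@type": "Article",
    headline: input.headline,
    url: input.url,
    datePublished: input.datePublished,
    author: { "@type": "Person", name: input.authorName },
  };
  // The serialized object is conventionally embedded in a
  // <script type="application/ld+json"> tag in the page head.
  return JSON.stringify(data);
}

console.log(
  articleJsonLd({
    headline: "What is Answer Engine Optimization?",
    url: "https://example.com/blog/what-is-aeo",
    datePublished: "2025-01-15",
    authorName: "Jane Doe",
  })
);
```

Keeping the generators as plain string-returning functions lets the same code serve Next.js, Astro, or Express routes; only the injection helper is framework-specific.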
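The bot detection middleware listed under Features reduces, at its core, to a user-agent substring check. The helper below is a minimal sketch under that assumption; the token list mirrors Task 2, and `detectAiCrawler` is an invented name, not the toolkit's real API:

```typescript
// Minimal sketch of AI-crawler detection by user-agent substring.
// Token list mirrors the spec's Task 2 crawler list; the function
// name and shape are hypothetical.
const AI_CRAWLER_TOKENS = [
  "GPTBot", "ChatGPT-User", "ClaudeBot", "Claude-Web",
  "PerplexityBot", "Google-Extended", "CCBot",
  "Meta-ExternalAgent", "Bytespider", "Applebot-Extended",
] as const;

function detectAiCrawler(userAgent: string): string | null {
  const ua = userAgent.toLowerCase();
  // Return the first known crawler token found in the UA string,
  // or null for ordinary browser traffic.
  return AI_CRAWLER_TOKENS.find((t) => ua.includes(t.toLowerCase())) ?? null;
}
```

A framework middleware would call this on each request and could, for example, serve the `.md` endpoint version of a page to identified AI crawlers.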