Description
📋 Issue Type
Bug Fix - Translation Completeness
🎯 Objective
Fix 48 committee-reports articles (4 dates × 12 languages) that contain English section headings and body paragraphs instead of content in the target language.
📊 Current State
The following committee-reports articles contain English content (headings and body text) despite being designated for non-English languages:
Affected Articles (4 dates × 12 languages = 48 articles)
| Date | Languages Affected |
|---|---|
| 2026-02-16 | da, no, fi, de, fr, es, nl, ar, he, ja, ko, zh |
| 2026-02-17 | da, no, fi, de, fr, es, nl, ar, he, ja, ko, zh |
| 2026-02-18 | da, no, fi, de, fr, es, nl, ar, he, ja, ko, zh |
| 2026-02-24 | da, no, fi, de, fr, es, nl, ar, he, ja, ko, zh |
Specific English Content Found
- Section headings in English: `<h2>What to Watch</h2>`, `<h2>What to Watch in the Coming Weeks</h2>`
- Body paragraphs in English: full analytical paragraphs about committee proceedings
- English phrases in body: "Chamber debate tactics", "amendment proposals from opposition parties", "Expected vote outcome"
- Meta keywords in English: `content="committee, reports, betänkanden, Ukraine aid, data protection..."`
Example (Danish article with English content)
File: news/2026-02-16-committee-reports-da.html
- Title/description: ✅ Correctly in Danish
- H2 headings: ❌ "What to Watch" (should be "Hvad skal man holde øje med")
- Body paragraphs: ❌ Multiple English paragraphs
- Keywords: ❌ English keywords
🚀 Desired State
All 48 articles fully translated into their target language:
- Section headings translated using CONTENT_LABELS equivalents
- Body paragraphs rewritten in the target language
- Meta keywords translated to target language
- `data-translate` markers removed if present
🔧 Implementation Approach
Recommended Strategy: Re-generate or Batch-Translate
Option A (Preferred): Use the existing scripts/generate_committee_articles.py translation system to regenerate the affected articles with proper translations.
Option B: Create a targeted fix script similar to scripts/fix-mixed-language-descriptions.py that:
- Scans `news/2026-02-{16,17,18,24}-committee-reports-{lang}.html`
- Identifies English headings and replaces them with CONTENT_LABELS equivalents
- Translates English body paragraphs to the target language
- Localizes meta keywords
- Validates the result with `scripts/validate-news-translations.ts`
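The heading step of Option B could look roughly like the sketch below. It is an illustration only: `HEADING_TRANSLATIONS` is a hypothetical stand-in for the real CONTENT_LABELS data (it holds just the Danish heading quoted in this issue), and the function names are made up for the example rather than taken from the existing scripts.

```python
import re

# Hypothetical stand-in for CONTENT_LABELS. Only the Danish entry quoted in
# this issue is filled in; real values should come from the content-labels files.
HEADING_TRANSLATIONS = {
    "da": {"What to Watch": "Hvad skal man holde øje med"},
}

# English headings reported in the affected articles.
ENGLISH_HEADINGS = ("What to Watch", "What to Watch in the Coming Weeks")

# Matches an <h2> element, capturing its attributes and inner text.
H2_RE = re.compile(r"<h2([^>]*)>(.*?)</h2>", re.IGNORECASE | re.DOTALL)


def find_english_headings(html: str) -> list[str]:
    """Return the known English headings that still appear as <h2> text."""
    return [
        m.group(2).strip()
        for m in H2_RE.finditer(html)
        if m.group(2).strip() in ENGLISH_HEADINGS
    ]


def replace_headings(html: str, lang: str) -> str:
    """Swap English <h2> text for the translation, when one is known."""
    table = HEADING_TRANSLATIONS.get(lang, {})

    def swap(match: re.Match) -> str:
        text = match.group(2).strip()
        if text in table:
            return f"<h2{match.group(1)}>{table[text]}</h2>"
        # No translation available: leave the heading untouched so it is
        # still reported by find_english_headings().
        return match.group(0)

    return H2_RE.sub(swap, html)
```

Headings without a known translation are deliberately left in place, so a second pass with `find_english_headings` shows what still needs attention. Body paragraphs and meta keywords need real translation and are out of scope for a regex pass like this.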
Files to Fix (48 total)
news/2026-02-16-committee-reports-{da,no,fi,de,fr,es,nl,ar,he,ja,ko,zh}.html
news/2026-02-17-committee-reports-{da,no,fi,de,fr,es,nl,ar,he,ja,ko,zh}.html
news/2026-02-18-committee-reports-{da,no,fi,de,fr,es,nl,ar,he,ja,ko,zh}.html
news/2026-02-24-committee-reports-{da,no,fi,de,fr,es,nl,ar,he,ja,ko,zh}.html
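For scripting, the brace patterns above can be expanded into explicit paths. A minimal sketch, assuming the script runs from the repository root and that `news/` is the article directory as shown in this issue; the function name is hypothetical.

```python
from itertools import product
from pathlib import Path

DATES = ["2026-02-16", "2026-02-17", "2026-02-18", "2026-02-24"]
LANGS = ["da", "no", "fi", "de", "fr", "es", "nl", "ar", "he", "ja", "ko", "zh"]


def affected_files(news_dir: str = "news") -> list[Path]:
    """Expand the 4 dates x 12 languages into the 48 article paths."""
    return [
        Path(news_dir) / f"{date}-committee-reports-{lang}.html"
        for date, lang in product(DATES, LANGS)
    ]


if __name__ == "__main__":
    files = affected_files()
    print(f"{len(files)} files expected")
    # Report any expected article that is not present on disk.
    for path in files:
        if not path.exists():
            print(f"missing: {path}")
```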
🤖 Recommended Agent
agent:news-journalist: has expertise in the article generation system and translation pipeline, and can use MCP tools to regenerate articles with proper translations. The content-generator agent could also assist with batch translation.
✅ Acceptance Criteria
- All 48 committee-reports articles have headings in the target language
- All body paragraphs are translated (no English paragraphs in non-EN files)
- Meta keywords are localized per language
- `data-translate="true"` markers eliminated
- `npx tsx scripts/validate-news-translations.ts` passes for all fixed files
- HTML validation passes (`htmlhint`)
- RTL languages (ar, he) maintain correct text direction
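Some of these criteria can be spot-checked mechanically before running the full validator. The sketch below is a rough pre-check, not a replacement for `scripts/validate-news-translations.ts`. It assumes text direction is declared as `dir="rtl"` on the `<html>` element, which this issue does not confirm, and `check_article` is a hypothetical helper name.

```python
import re

RTL_LANGS = {"ar", "he"}

# data-translate="true" attribute left over from generation.
MARKER_RE = re.compile(r"\sdata-translate\s*=\s*[\"']true[\"']", re.IGNORECASE)

# Assumption: direction is set via dir="rtl" on the <html> element.
HTML_RTL_RE = re.compile(r"<html\b[^>]*\bdir\s*=\s*[\"']rtl[\"']", re.IGNORECASE)

# The English heading reported in this issue (covers both variants).
ENGLISH_H2_RE = re.compile(r"<h2[^>]*>\s*What to Watch", re.IGNORECASE)


def check_article(html: str, lang: str) -> list[str]:
    """Return a list of acceptance-criteria problems found in one article."""
    problems = []
    if MARKER_RE.search(html):
        problems.append("data-translate marker present")
    if ENGLISH_H2_RE.search(html):
        problems.append("English 'What to Watch' heading present")
    if lang in RTL_LANGS and not HTML_RTL_RE.search(html):
        problems.append('missing dir="rtl" on <html>')
    return problems
```

An empty list means only that none of these three patterns matched. Untranslated body paragraphs and keywords still need the validator or a manual review.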
📚 References
- Content labels for heading translations: `scripts/data-transformers/constants/content-labels-part1.ts`, `content-labels-part2.ts`
- Translation dictionary: `scripts/translation-dictionary.ts`
- Existing fix script pattern: `scripts/fix-mixed-language-descriptions.py`
- Committee article generator: `scripts/generate_committee_articles.py`
🏷️ Labels
type:bug, component:i18n, component:news, translation, priority-high, component:content