85 changes: 85 additions & 0 deletions docs/branch-notes/steps1-3-hardening.md
@@ -0,0 +1,85 @@
# Branch Handoff: `codex/steps1-3-hardening`

## Purpose
This branch hardens steps 1-3 of the pipeline (spec/scenario/sample validation and sampler consistency) and related persona consistency checks.

## Commits Included (in order)
1. `969ea9d` - fix(scenario): include extended attributes in validator references (#117)
2. `de2298c` - fix(scenario): validate condition literals against categorical options (#118)
3. `0755fae` - fix(cli): add persona validation path and robust validate type detection (#110)
4. `72af9a4` - fix(sampler): reconcile household-derived attributes after assignment (#114)
5. `696c81c` - refactor(sampler): make partner correlation policy-driven with metadata fallback (#123)
6. `84a4ef3` - feat(validator): detect ambiguous categorical/boolean modifier overlaps (#122)
7. `f85393d` - fix(sampler): surface modifier condition eval failures with strict/permissive modes (#121)
8. `ee7a262` - feat(sample): enforce expression constraints and promoted warning gates without new flags (#119 #120)
9. `b790597` - fix(persona): apply semantic context to avoid unemployed/occupation contradictions (#113)

## Exact Fixes by Area

### Scenario Validation
- Validator now resolves references against both base population attributes and scenario `extended_attributes`.
- Validator now checks condition string literals against known categorical options, preventing case/value drift (e.g., invalid enum labels).
- Timeline exposure rule conditions now get the same attribute/literal validation checks.
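
A minimal sketch of the reference and literal checks described above, assuming conditions are plain strings with `attr == 'literal'` comparisons. The function name, the regex, and the dict-of-options shape are illustrative assumptions, not the validator's actual API.

```python
import re

# Matches comparisons such as: employment == 'employed'  or  region != "north"
_COMPARISON = re.compile(r"(\w+)\s*(?:==|!=)\s*['\"]([^'\"]*)['\"]")


def check_condition_literals(condition, base_attrs, extended_attrs):
    """Return a list of problems found in a condition string (hypothetical helper).

    base_attrs / extended_attrs map attribute name -> list of categorical options.
    """
    # Extended attributes count as valid references alongside base attributes.
    known = {**base_attrs, **extended_attrs}
    problems = []
    for name, literal in _COMPARISON.findall(condition):
        if name not in known:
            problems.append(f"unknown attribute: {name}")
        elif literal not in known[name]:
            problems.append(f"invalid option {literal!r} for {name}")
    return problems
```

With `employment` limited to `employed`/`unemployed`, a condition comparing against `'Employed'` would be reported as an invalid option, which is the case-drift scenario the second bullet mentions.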

### CLI Validation
- `extropy validate` spec-type detection is more robust (population vs scenario vs persona).
- Persona config validation path added with structural checks and context-aware checks when scenario/population can be resolved.

### Sampling / Household Integrity
- Post-sampling reconciliation pass aligns household-derived fields to actual sampled household composition:
  - household size
  - has-children flags
  - children count fields (when present)
  - marital consistency for partnered households/dependent agents
- Sampling stats are recomputed after reconciliation so summary stats reflect final values.
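
The reconciliation pass can be pictured roughly as follows. This is a sketch under assumed field names (`role`, `household_size`, `has_children`, `children_count`), none of which are confirmed by the branch; marital consistency is left out for brevity.

```python
def reconcile_household(members):
    """Overwrite household-derived fields from the actual member list.

    `members` is a list of agent dicts belonging to one sampled household.
    All field names here are hypothetical.
    """
    children = sum(1 for m in members if m.get("role") == "dependent")
    for m in members:
        m["household_size"] = len(members)
        m["has_children"] = children > 0
        # Only touch count fields that the spec actually defines.
        if "children_count" in m:
            m["children_count"] = children
    return members
```

Any summary statistics would then be recomputed from the reconciled agents rather than from the values drawn during sampling.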

### Partner Correlation Generalization
- Added explicit `partner_correlation_policy` support in population models.
- Correlation algorithm resolution is now policy/metadata driven first, with legacy-name fallback for backward compatibility.
- Added semantic warning when `partner_correlated` attrs lack policy/semantic metadata.
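
One way to read the resolution order above is: explicit policy, then semantic metadata, then a legacy lookup by attribute name with a warning. The sketch below assumes that order; the lookup tables, policy names, and the `semantic_type` key are invented for illustration, and only `partner_correlation_policy` and `PARTNER_POLICY` come from this branch.

```python
# Hypothetical lookup tables; real policy names may differ.
SEMANTIC_POLICIES = {"age": "gaussian_offset", "ethnicity": "same_group_rate"}
LEGACY_NAME_POLICIES = {"age": "gaussian_offset", "race_ethnicity": "same_group_rate"}


def resolve_partner_policy(name, attr):
    """Return (policy, warning) for a partner-correlated attribute dict."""
    explicit = attr.get("partner_correlation_policy")
    if explicit:
        return explicit, None
    semantic = attr.get("semantic_type")
    if semantic in SEMANTIC_POLICIES:
        return SEMANTIC_POLICIES[semantic], None
    # Legacy fallback keeps older specs working but is flagged for cleanup.
    warning = f"PARTNER_POLICY: {name} has no policy or semantic metadata"
    return LEGACY_NAME_POLICIES.get(name), warning
```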

### Semantic Validator Enhancements
- Added overlap analysis for categorical/boolean modifiers (`MODIFIER_OVERLAP`, `MODIFIER_OVERLAP_EXCLUSIVE`).
- Added partner-policy completeness warnings (`PARTNER_POLICY`).
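
As a rough illustration of overlap analysis, the sketch below treats each modifier as a mapping from attribute name to the set of categorical values it matches, and reports pairs that can fire for the same agent. This representation is an assumption, and the distinction between `MODIFIER_OVERLAP` and `MODIFIER_OVERLAP_EXCLUSIVE` is not modeled here.

```python
from itertools import combinations


def find_overlaps(modifiers):
    """Return index pairs of modifiers whose conditions can match the same agent.

    modifiers: list of {attribute_name: set_of_matching_values} (hypothetical shape).
    """
    overlaps = []
    for (i, a), (j, b) in combinations(enumerate(modifiers), 2):
        shared = a.keys() & b.keys()
        # Two modifiers are disjoint only if some shared attribute has no common value.
        if all(a[k] & b[k] for k in shared):
            overlaps.append((i, j))
    return overlaps
```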

### Sampler Failure Visibility / Strictness
- Modifier condition evaluation failures are now surfaced with strict/permissive behavior:
  - strict mode (default): fail sampling
  - permissive mode (`--skip-validation`): collect warnings
- `sample` now enforces expression constraints in normal mode (without `--skip-validation`).
- The `CONDITION_VALUE`, `MODIFIER_OVERLAP_EXCLUSIVE`, and `PARTNER_POLICY` warning categories are promoted to blocking errors in `sample` unless `--skip-validation` is passed.
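
The strict/permissive split might look like the sketch below. The `SamplingError` class here is a local stand-in, `eval_condition` is a hypothetical helper, and the bare `eval` is only for illustration; it is not a safe expression evaluator and is not claimed to be what the sampler uses.

```python
class SamplingError(Exception):
    """Local stand-in for the sampler's error type."""


def eval_condition(condition, agent, strict, warnings):
    """Evaluate a modifier condition against an agent dict (hypothetical helper)."""
    try:
        # Illustration only: evaluate with no builtins and the agent as locals.
        return bool(eval(condition, {"__builtins__": {}}, dict(agent)))
    except Exception as exc:
        message = f"condition {condition!r} failed: {exc}"
        if strict:
            # Strict mode: abort sampling on the first failure.
            raise SamplingError(message) from exc
        # Permissive mode: record the failure and treat the condition as not met.
        warnings.append(message)
        return False
```

In permissive mode the collected messages play the role of the condition warnings that `sample` reports after sampling completes.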

### Persona Rendering Consistency
- Added semantic-context phrase override so non-working agents are not rendered with contradictory active-employment phrasing.
- Simulation persona generation now passes semantic metadata to renderer.
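
A small sketch of the phrase override idea, assuming hypothetical `employment_status` and `occupation` fields and made-up status labels; the renderer's real templates and metadata are not shown in this branch note.

```python
# Hypothetical labels for agents who should not get active-employment phrasing.
NON_WORKING_STATUSES = {"unemployed", "retired", "not_in_labor_force"}


def occupation_phrase(agent):
    """Render an occupation sentence that respects employment status."""
    occupation = agent.get("occupation")
    if agent.get("employment_status") in NON_WORKING_STATUSES:
        if occupation:
            return f"I am not currently working; my background is in {occupation}."
        return "I am not currently working."
    return f"I work as a {occupation}." if occupation else ""
```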

## Test Coverage Added/Updated
- `tests/test_scenario_validator.py`
- `tests/test_cli.py`
- `tests/test_household_sampling.py`
- `tests/test_validator.py`
- `tests/test_sampler.py`
- `tests/test_persona_renderer.py`

## Current Known Gaps (not fixed in this branch)
- Spec overlap volume can be high; overlap warnings require spec-side cleanup for deterministic behavior.
- Household type labels can still be semantically mismatched in some edge cases (`couple_with_kids`/`single_parent` labels vs zero dependents after reconciliation).
- Some implausible demographic combinations remain spec-driven (not engine bugs), e.g. age/education/employment combinations where conditional coverage is incomplete.

## Issue Tracking Guidance (keep issues open for merge coordination)
Do **not** close issues until the branch is merged and verified in the shared integration branch.
Suggested per-issue state update:
- Add comment: "Implemented on `codex/steps1-3-hardening`, pending integration verification".
- Link commit hash(es) above.
- Keep status open with label like `ready-for-merge-test`.

Mapped issues on this branch:
- #110, #113, #114, #117, #118, #119, #120, #121, #122, #123

## Merge Safety Notes
- This branch intentionally increases strictness in sampling validation; expect older specs to fail faster.
- If another branch modifies validator/sampler internals, resolve conflicts by preserving:
  - extended-attr aware scenario validation
  - post-household reconciliation pass
  - strict/permissive condition handling
  - expression constraint enforcement behavior
70 changes: 70 additions & 0 deletions extropy/cli/commands/sample.py
@@ -193,6 +193,56 @@ def sample_command(
                out.text("[dim]Use --skip-validation to sample anyway[/dim]")
            raise typer.Exit(out.finish())
    else:
        promoted_warning_categories = {
            "CONDITION_VALUE",
            "MODIFIER_OVERLAP_EXCLUSIVE",
            "PARTNER_POLICY",
        }
        promoted_warnings = [
            w
            for w in validation_result.warnings
            if w.category in promoted_warning_categories
        ]
        if promoted_warnings:
            out.set_data(
                "promoted_warning_categories",
                sorted(promoted_warning_categories),
            )
            out.set_data(
                "promoted_warnings",
                [
                    {
                        "location": w.location,
                        "category": w.category,
                        "message": w.message,
                        "suggestion": w.suggestion,
                    }
                    for w in promoted_warnings
                ],
            )

            if skip_validation:
                out.warning(
                    f"Spec has {len(promoted_warnings)} promoted warning(s) - skipping validation"
                )
            else:
                out.error(
                    f"Merged spec has {len(promoted_warnings)} promoted warning(s)",
                    exit_code=ExitCode.VALIDATION_ERROR,
                )
                if not agent_mode:
                    for warn in promoted_warnings[:5]:
                        out.text(
                            f" [red]✗[/red] [{warn.category}] {warn.location}: {warn.message}"
                        )
                    if len(promoted_warnings) > 5:
                        out.text(
                            f" [dim]... and {len(promoted_warnings) - 5} more[/dim]"
                        )
                    out.blank()
                    out.text("[dim]Use --skip-validation to sample anyway[/dim]")
                raise typer.Exit(out.finish())

        if validation_result.warnings:
            out.success(
                f"Spec validated with {len(validation_result.warnings)} warning(s)"
@@ -205,6 +255,8 @@ def sample_command(
    sampling_start = time.time()
    result = None
    sampling_error = None
    strict_condition_errors = not skip_validation
    enforce_expression_constraints = not skip_validation

    show_progress = count >= 100 and not agent_mode

@@ -238,6 +290,8 @@ def on_progress(current: int, total: int):
                on_progress=on_progress,
                household_config=household_config,
                agent_focus_mode=agent_focus_mode,
                strict_condition_errors=strict_condition_errors,
                enforce_expression_constraints=enforce_expression_constraints,
            )
        except SamplingError as e:
            sampling_error = e
@@ -251,6 +305,8 @@ def on_progress(current: int, total: int):
                seed=seed,
                household_config=household_config,
                agent_focus_mode=agent_focus_mode,
                strict_condition_errors=strict_condition_errors,
                enforce_expression_constraints=enforce_expression_constraints,
            )
        except SamplingError as e:
            sampling_error = e
@@ -262,6 +318,8 @@ def on_progress(current: int, total: int):
                seed=seed,
                household_config=household_config,
                agent_focus_mode=agent_focus_mode,
                strict_condition_errors=strict_condition_errors,
                enforce_expression_constraints=enforce_expression_constraints,
            )
        except SamplingError as e:
            sampling_error = e
@@ -282,6 +340,18 @@ def on_progress(current: int, total: int):
        sampling_time_seconds=sampling_elapsed,
    )

    if result.stats.condition_warnings:
        warning_count = len(result.stats.condition_warnings)
        out.warning(
            f"{warning_count} modifier condition evaluation warning(s) encountered during sampling"
        )
        out.set_data("condition_warning_count", warning_count)
        if report and not agent_mode:
            for warning in result.stats.condition_warnings[:3]:
                out.text(f" [yellow]⚠[/yellow] {warning}")
            if warning_count > 3:
                out.text(f" [dim]... and {warning_count - 3} more[/dim]")

    # Report
    if agent_mode or report:
        out.set_data("stats", format_sampling_stats_for_json(result.stats, merged_spec))