Description
Summary
Scenario validation checks expression syntax and unknown attribute names, but does not validate that compared string literals are valid categorical options for the referenced attribute.
This allows case or value mismatches (e.g., urban_rural == 'urban' when the declared option is Urban) to pass validation and silently no-op at runtime.
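A minimal illustration of the silent no-op. This is a sketch only: it assumes conditions are evaluated as Python expressions over an agent's attributes, which may not match the real evaluator. The `agent` record and the `matches` helper are hypothetical.

```python
# Hypothetical agent record; the attribute's declared option is "Urban".
agent = {"urban_rural": "Urban"}


def matches(condition: str, agent: dict) -> bool:
    # Assumption: a `when` condition is evaluated as a Python expression
    # with the agent's attributes as local names.
    return bool(eval(condition, {"__builtins__": {}}, dict(agent)))


# The mis-cased literal raises nothing; it simply never matches.
print(matches("urban_rural == 'urban'", agent))
print(matches("urban_rural == 'Urban'", agent))
```

The first condition is syntactically valid and references a known attribute, so both existing checks pass, yet the rule can never fire for any agent.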
Why This Matters
This is a major source of "looks valid but behavior is wrong" bugs. Incorrect conditions don’t throw hard errors during simulation; they just fail to match and flatten dynamics.
Current Behavior (Code)
In /Users/adithyasrinivasan/Projects/extropy/extropy/scenario/validator.py:
- syntax check: validate_expression_syntax(...)
- reference check: extract_names_from_expression(...) vs known attrs
- no check that literals used in comparisons are present in attribute option domains
By contrast, population semantic validation already has this concept via AST comparison extraction.
Proposed Fix
- Add AST-based comparison extraction for scenario when clauses:
  - seed_exposure.rules[].when
  - timeline exposure rules: timeline[].exposure_rules[].when (if present)
  - spread.share_modifiers[].when
- For each (attribute, compared_string_values) pair:
  - if attribute is categorical with known options, require literal values to match one of those options
  - invalid literals should be ERROR (not warning) because rule is effectively broken
- Handle list membership checks (in [...]) and single comparisons (==, !=).
- Add tests covering:
  - exact match pass
  - case mismatch fail
  - nonexistent value fail
  - non-categorical attributes skipped
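A rough sketch of the proposed check, using only the standard `ast` module. The names `extract_comparisons` and `find_invalid_literals`, and the `options_by_attr` mapping (attribute name to its set of categorical options), are hypothetical and not taken from the existing validator; the real implementation would presumably reuse the population validator's extraction.

```python
import ast
from typing import Iterator


def _is_str(node: ast.AST) -> bool:
    return isinstance(node, ast.Constant) and isinstance(node.value, str)


def extract_comparisons(expr: str) -> Iterator[tuple[str, list[str]]]:
    """Yield (attribute, string_literals) for ==, != and in / not in."""
    tree = ast.parse(expr, mode="eval")
    for node in ast.walk(tree):
        # Only simple, single-operator comparisons are inspected.
        if not isinstance(node, ast.Compare) or len(node.ops) != 1:
            continue
        op, left, right = node.ops[0], node.left, node.comparators[0]
        if isinstance(op, (ast.Eq, ast.NotEq)):
            # Accept both `attr == 'x'` and `'x' == attr`.
            if isinstance(left, ast.Name) and _is_str(right):
                yield left.id, [right.value]
            elif isinstance(right, ast.Name) and _is_str(left):
                yield right.id, [left.value]
        elif isinstance(op, (ast.In, ast.NotIn)):
            if isinstance(left, ast.Name) and isinstance(
                right, (ast.List, ast.Tuple, ast.Set)
            ):
                values = [e.value for e in right.elts if _is_str(e)]
                if values:
                    yield left.id, values


def find_invalid_literals(
    expr: str, options_by_attr: dict[str, set[str]]
) -> list[tuple[str, str]]:
    """Return (attribute, literal) pairs that are not declared options."""
    invalid = []
    for attr, values in extract_comparisons(expr):
        options = options_by_attr.get(attr)
        if options is None:
            # Non-categorical (or unknown) attributes are skipped here.
            continue
        invalid.extend((attr, v) for v in values if v not in options)
    return invalid


# Usage covering the four test cases listed above.
opts = {"urban_rural": {"Urban", "Rural"}}
print(find_invalid_literals("urban_rural == 'Urban'", opts))
print(find_invalid_literals("urban_rural == 'urban'", opts))
print(find_invalid_literals("urban_rural in ['Urban', 'Suburban']", opts))
print(find_invalid_literals("age == 'old'", opts))
```

Each returned pair would map to one ERROR-level validation issue. Matching is deliberately exact (case-sensitive), since that is what the runtime comparison does.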
Acceptance Criteria
- Invalid categorical literals in scenario conditions are caught at validation time.
- Common mismatch classes (case/style/legacy tokens) no longer survive to runtime.
- Validation message includes valid option set for quick fix.
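One possible shape for the validation message, as a sketch. The function name and wording are hypothetical. The "Did you mean" hint is an optional extra beyond the criteria above; it uses a case-insensitive lookup first and falls back to `difflib.get_close_matches` from the standard library.

```python
import difflib
from typing import Iterable


def format_invalid_literal(attr: str, value: str, options: Iterable[str]) -> str:
    ordered = sorted(options)
    # Prefer a case-insensitive hit, since case mismatch is the common failure.
    by_folded = {o.casefold(): o for o in ordered}
    suggestion = by_folded.get(value.casefold())
    if suggestion is None:
        # Otherwise fall back to a fuzzy match, if any is close enough.
        close = difflib.get_close_matches(value, ordered, n=1)
        suggestion = close[0] if close else None
    msg = (
        f"'{value}' is not a valid option for '{attr}'. "
        f"Valid options: {', '.join(ordered)}."
    )
    if suggestion is not None:
        msg += f" Did you mean '{suggestion}'?"
    return msg


print(format_invalid_literal("urban_rural", "urban", {"Urban", "Rural"}))
```

Sorting the options keeps the message stable across runs, which makes it easier to assert on in tests.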
Pipeline Impact
Reduces recursive debug loops by catching scenario-domain mismatches before sampling/simulation.