Open
Summary
Partner correlation currently uses hardcoded attribute-name checks (age, race_ethnicity variants, country) to select correlation behavior. This is brittle and not generalizable across arbitrary specs.
Why This Matters
Behavior should depend on declared semantics/policy, not string names. Renaming equivalent attributes can silently change sampling dynamics.
Current Behavior (Code)
In /Users/adithyasrinivasan/Projects/extropy/extropy/population/sampler/households.py:
- attr_name == "age" -> gaussian offset (line ~102)
- attr_name in ("race_ethnicity", "ethnicity", "race") -> same-group rates (line ~114)
- attr_name == "country" -> same-country logic (line ~127)
Proposed Fix
- Introduce policy-driven partner correlation resolution:
  - policy keyed by metadata (semantic_type, identity_type, explicit partner policy)
  - deterministic resolver chooses algorithm (gaussian_offset, same_group_rate, same_value_probability)
- Keep backwards compatibility by mapping legacy names to policy in one migration layer, then deprecate name branching.
- Update validation to ensure required policy metadata exists for scope=partner_correlated attributes.
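A minimal sketch of what such a resolver might look like, not the existing extropy code. The `AttributeMeta` dataclass, the `resolve_partner_policy` function, the lookup tables, and the metadata values (for example `semantic_type="age"` or `identity_type="citizenship"`) are all hypothetical names chosen for illustration. Mapping the legacy `country` branch to `same_value_probability` is likewise an assumption. Only the three algorithm names and the legacy attribute names come from this issue.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class PartnerAlgorithm(Enum):
    """Algorithm names taken from the proposal above."""
    GAUSSIAN_OFFSET = "gaussian_offset"
    SAME_GROUP_RATE = "same_group_rate"
    SAME_VALUE_PROBABILITY = "same_value_probability"


@dataclass(frozen=True)
class AttributeMeta:
    """Hypothetical view of an attribute's declared metadata."""
    name: str
    semantic_type: Optional[str] = None
    identity_type: Optional[str] = None
    partner_policy: Optional[str] = None  # explicit policy, highest priority


# Hypothetical metadata-to-algorithm tables; the keys are illustrative values.
SEMANTIC_TYPE_TO_POLICY = {"age": PartnerAlgorithm.GAUSSIAN_OFFSET}
IDENTITY_TYPE_TO_POLICY = {
    "race_ethnicity": PartnerAlgorithm.SAME_GROUP_RATE,
    "citizenship": PartnerAlgorithm.SAME_VALUE_PROBABILITY,
}

# Single migration layer: the only place legacy attribute names appear.
# The algorithm chosen for "country" is an assumption.
LEGACY_NAME_TO_POLICY = {
    "age": PartnerAlgorithm.GAUSSIAN_OFFSET,
    "race_ethnicity": PartnerAlgorithm.SAME_GROUP_RATE,
    "ethnicity": PartnerAlgorithm.SAME_GROUP_RATE,
    "race": PartnerAlgorithm.SAME_GROUP_RATE,
    "country": PartnerAlgorithm.SAME_VALUE_PROBABILITY,
}


def resolve_partner_policy(
    attr: AttributeMeta, allow_legacy: bool = True
) -> Optional[PartnerAlgorithm]:
    """Pick an algorithm in a fixed order: explicit policy, semantic type,
    identity type, then (optionally) the legacy name mapping."""
    if attr.partner_policy is not None:
        # Raises ValueError for an unknown policy string.
        return PartnerAlgorithm(attr.partner_policy)
    if attr.semantic_type in SEMANTIC_TYPE_TO_POLICY:
        return SEMANTIC_TYPE_TO_POLICY[attr.semantic_type]
    if attr.identity_type in IDENTITY_TYPE_TO_POLICY:
        return IDENTITY_TYPE_TO_POLICY[attr.identity_type]
    if allow_legacy and attr.name in LEGACY_NAME_TO_POLICY:
        return LEGACY_NAME_TO_POLICY[attr.name]
    return None
```

With this shape, an attribute named `years_old` that declares `semantic_type="age"` resolves to the same algorithm as a legacy `age` attribute, and passing `allow_legacy=False` is one way the name mapping could later be switched off.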
Tests
- equivalent semantics with different attribute names produce equivalent partner-correlation behavior
- missing policy metadata is flagged in strict mode
- legacy specs continue to work with migration mapping
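The strict-mode case could be exercised against a validator along these lines. This is a sketch under assumptions: the function name `validate_partner_metadata`, the dict-shaped attribute specs, and the metadata key names are hypothetical and may not match how extropy represents specs or reports validation issues.

```python
from typing import Any, Dict, List

# Hypothetical metadata keys; any one of them is treated as sufficient policy input.
POLICY_KEYS = ("partner_policy", "semantic_type", "identity_type")


def validate_partner_metadata(
    attributes: List[Dict[str, Any]], strict: bool = False
) -> List[str]:
    """Return one message per partner_correlated attribute that declares no
    policy metadata. In strict mode, raise ValueError if any are found."""
    problems: List[str] = []
    for attr in attributes:
        if attr.get("scope") != "partner_correlated":
            continue  # only partner-correlated attributes need a policy
        if not any(attr.get(key) for key in POLICY_KEYS):
            name = attr.get("name", "<unnamed>")
            problems.append(
                f"{name}: partner_correlated attribute has no policy metadata"
            )
    if strict and problems:
        raise ValueError("; ".join(problems))
    return problems
```

In non-strict mode the messages are returned so a caller can log them as warnings while legacy specs keep working through the migration mapping.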
Acceptance Criteria
- No runtime behavior depends on hardcoded attribute name strings.
- Partner correlation is fully metadata/policy-driven.
Pipeline Impact
Critical for generalization across diverse populations and scenario-extended schemas.