Implement ratio-of-means variance for per-acre estimates #70

Merged
mihiarc merged 4 commits into main from feat/exact-bp-variance on Feb 7, 2026


@mihiarc (Owner) commented on Feb 7, 2026

Summary

  • Implement correct ratio-of-means variance from Bechtold & Patterson (2005) Section 4.2 for all per-acre SE calculations: V(R) = (1/X²) × [V(Y) + R²×V(X) - 2R×Cov(Y,X)]
  • Fix the x_i=1.0 bug in carbon_pools.py: the code used pl.lit(1.0) instead of sum(CONDPROP_UNADJ) per plot, which made the covariance term meaningless
  • Add 17 unit tests covering hand-calculated verification, backward compatibility (constant x_i matches old formula), positive covariance reducing variance, grouped ratio variance, and non-negativity property tests

The old formula se_acre = se_total / total_area treated the denominator as a known constant, ignoring covariance between Y (tree attribute) and X (area). Since they're estimated from the same sample plots and positively correlated, the old formula overestimated per-acre SE.
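The formula above can be sketched as a small standalone function. This is an illustration only, not the code in variance.py: the name ratio_of_means_variance, its argument names, and the clamp at zero are assumptions, and the inputs are taken to be already-estimated totals, variances, and covariance. The numbers are made up for the example.

```python
import math


def ratio_of_means_variance(y_total, x_total, var_y, var_x, cov_yx):
    """Variance of R = Y / X (hypothetical helper, not the PR's function).

    V(R) = (1 / X^2) * [V(Y) + R^2 * V(X) - 2 * R * Cov(Y, X)]
    """
    if x_total == 0:
        return math.nan  # the ratio is undefined when the area total is zero
    r = y_total / x_total
    v_r = (var_y + r**2 * var_x - 2.0 * r * cov_yx) / x_total**2
    # Assumption: clamp small negative values that can arise from rounding.
    return max(v_r, 0.0)


# Illustrative numbers only (not taken from the PR or from EVALIDator).
y_total, x_total = 200.0, 100.0
var_y, var_x, cov_yx = 400.0, 25.0, 80.0

# Old approach: denominator treated as a known constant.
old_se_acre = math.sqrt(var_y) / x_total

# New approach: full ratio-of-means variance.
new_se_acre = math.sqrt(
    ratio_of_means_variance(y_total, x_total, var_y, var_x, cov_yx)
)
```

With these inputs the positive covariance outweighs the R²×V(X) term, so new_se_acre comes out below old_se_acre. Setting var_x and cov_yx to zero reproduces the old formula.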

Files changed

| File | Change |
| --- | --- |
| variance.py | Add calculate_ratio_of_means_variance() + exact B&P and simplified helpers; extend grouped functions to compute V(X), Cov(Y,X) |
| base.py | Update _calculate_grouped_multi_metric_variance() and _calculate_overall_multi_metric_variance() |
| grm_base.py | Update _calculate_grm_variance() grouped and ungrouped paths |
| carbon_pools.py | Fix x_i bug, add B&P columns to all_plots, use ratio variance |
| test_variance_formulas.py | Add TestRatioOfMeansVariance class with 17 tests |

Test plan

  • uv run pytest tests/unit/test_variance_formulas.py -v — 76/76 pass (59 existing + 17 new)
  • uv run pytest tests/unit/ -v — 703/703 pass, 0 regressions
  • uv run pytest tests/validation/ -v — EVALIDator comparison (SE values should improve slightly)
  • Spot-check: volume estimation se_acre should decrease slightly for positively correlated data
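The spot-check and the backward-compatibility property can be illustrated on per-plot values. The sketch below assumes simple random sampling for brevity, whereas the PR works per stratum and estimation unit. All function names and data here are hypothetical.

```python
from statistics import mean


def sample_cov(a, b):
    """Unbiased sample covariance; sample_cov(a, a) is the sample variance."""
    a_bar, b_bar = mean(a), mean(b)
    return sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b)) / (len(a) - 1)


def ratio_variance(y, x):
    """Ratio-of-means variance from per-plot y_i and x_i (assumed SRS)."""
    n = len(y)
    y_bar, x_bar = mean(y), mean(x)
    r = y_bar / x_bar
    v_y = sample_cov(y, y) / n
    v_x = sample_cov(x, x) / n
    c_yx = sample_cov(y, x) / n
    return (v_y + r**2 * v_x - 2.0 * r * c_yx) / x_bar**2


def old_variance(y, x):
    """Old approach: variance of the numerator over a 'known' denominator."""
    return (sample_cov(y, y) / len(y)) / mean(x) ** 2


# Made-up plot data: y_i is a tree attribute, x_i the plot's area proportion.
y = [10.0, 20.0, 30.0, 40.0]
x_area = [0.3, 0.5, 0.8, 1.0]   # varies with y, so Cov(Y, X) > 0
x_const = [1.0] * len(y)        # the x_i = 1.0 case: V(X) = Cov(Y, X) = 0

v_const = ratio_variance(y, x_const)  # collapses to the old formula
v_new = ratio_variance(y, x_area)     # covariance term reduces the variance
v_old = old_variance(y, x_area)
```

With constant x_i the V(X) and Cov(Y,X) terms vanish, so the ratio variance equals the old formula. This is why a literal 1.0 hides the covariance entirely. With x_i that rises alongside y_i, v_new falls well below v_old.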

Replace simplified variance formula V = Σ_h EXPNS² × s²_yh × n_h with
the exact Bechtold & Patterson (2005) formula that includes the V2
post-stratification correction term, matching rFIA's unitVar():

  V_EU = (A²/n) × Σ_h W_h × s²_yh/n_h + (A²/n²) × Σ_h (1-W_h) × s²_yh/n_h

This corrects a 1-3% discrepancy with EVALIDator that grows when
stratum allocation is non-proportional (W_h ≠ n_h/n).

- Load POP_ESTN_UNIT and compute STRATUM_WGT in data_loading.py
- Rewrite both variance functions with exact V1+V2 per estimation unit
- Propagate B&P columns through all_plots selection in base.py/grm_base.py
- Add 8 new tests including hand-calculated values and V2 > 0 verification
- Falls back to simplified formula when B&P columns are absent

Replace the simplified se_acre = se_total / total_area formula with
the correct ratio-of-means variance from B&P (2005) Section 4.2:

  V(R) = (1/X²) × [V(Y) + R²×V(X) - 2R×Cov(Y,X)]

The old formula treated the denominator (total area X) as a known
constant, ignoring the covariance between Y and X. Since they are
estimated from the same plots and positively correlated, the old
formula overestimated per-acre SE.

Changes:
- Add calculate_ratio_of_means_variance() with exact B&P and
  simplified fallback paths to variance.py
- Extend grouped variance functions to compute V(X) and Cov(Y,X)
  alongside V(Y) for ratio variance in se_acre
- Update callers in base.py, grm_base.py, carbon_pools.py
- Fix carbon_pools.py x_i=1.0 bug (now uses sum(CONDPROP_UNADJ))
- Add 17 unit tests including hand-calculated verification,
  backward compatibility, and non-negativity property tests

@mihiarc changed the title from "Implement exact B&P post-stratified variance formula (V1 + V2)" to "Implement ratio-of-means variance for per-acre estimates" on Feb 7, 2026

Prevents inf/NaN propagation in downstream variance calculations
if a POP_ESTN_UNIT record has null or zero P1PNTCNT_EU.

The exact B&P post-stratified variance formula uses s²_h directly:
  V = (A²/n) × Σ W_h × s²_h + (A²/n²) × Σ (1-W_h) × s²_h

The previous implementation incorrectly divided s²_h by n_h_design
(P2POINTCNT), which underestimated variance by a factor of ~n_h
(typically 15-100x). This caused SE values to be far below EVALIDator
benchmarks for TPA, volume, biomass, and GRM estimators.

Fix applied to both _calculate_exact_bp_variance (domain total) and
_calculate_exact_bp_ratio_variance (ratio-of-means). Unit test
hand-calculations updated to match corrected formula. All 703 unit
tests and EVALIDator validation tests now pass.
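A minimal sketch of the corrected formula quoted in this commit message, alongside the earlier variant that divided s²_h by n_h. The function names, argument layout, and numbers are assumptions made for illustration. They are not the bodies of _calculate_exact_bp_variance or _calculate_exact_bp_ratio_variance.

```python
def exact_bp_variance(area, n, strata):
    """Return (V1, V2) for one estimation unit.

    strata: list of (W_h, s2_h) pairs, where W_h is the stratum weight and
    s2_h the within-stratum sample variance.
    V = (A^2 / n) * sum(W_h * s2_h) + (A^2 / n^2) * sum((1 - W_h) * s2_h)
    """
    v1 = (area**2 / n) * sum(w_h * s2_h for w_h, s2_h in strata)
    v2 = (area**2 / n**2) * sum((1.0 - w_h) * s2_h for w_h, s2_h in strata)
    return v1, v2


def buggy_variance(area, n, strata_with_n):
    """Earlier variant, for comparison only: s2_h wrongly divided by n_h."""
    v1 = (area**2 / n) * sum(w * s2 / n_h for w, s2, n_h in strata_with_n)
    v2 = (area**2 / n**2) * sum(
        (1.0 - w) * s2 / n_h for w, s2, n_h in strata_with_n
    )
    return v1 + v2


# Illustrative estimation unit: two strata given as (W_h, s2_h, n_h).
area, n = 1000.0, 50
strata = [(0.6, 4.0, 30), (0.4, 9.0, 20)]

v1, v2 = exact_bp_variance(area, n, [(w, s2) for w, s2, _ in strata])
v_exact = v1 + v2
v_buggy = buggy_variance(area, n, strata)
```

For these inputs V1 is 120000 and V2 is 2800, while the variant that divides by n_h gives a total roughly 23 times smaller. That gap is consistent with the underestimation by a factor of about n_h described above.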
@mihiarc merged commit 23b43cd into main on Feb 7, 2026
0 of 3 checks passed
@mihiarc deleted the feat/exact-bp-variance branch on February 7, 2026 14:53