Fix bugs and clean up technical debt by mihiarc · Pull Request #69 · mihiarc/pyfia

mihiarc · 2026-02-06T23:14:01Z

Summary

Commit 1: Critical review bug fixes and technical debt

Fix version mismatch: __init__.py was 1.1.0b7, now matches pyproject.toml (1.2.3)
Fix FIA.__exit__ no-op: Added close() method to properly disconnect DB backend on context manager exit
Remove pandas dependency: Removed from both core and optional deps — zero pandas imports exist in src/pyfia/
Fix hardcoded year 2023: format_output now uses _extract_evaluation_year() instead of config.get("year", 2023)
Preserve AREA_TOTAL column: Was dropped in compute_per_acre_values despite being documented as output
Fix lru_cache memory leak: Replaced @lru_cache on instance method get_stratification_data with manual instance-level cache
Migrate carbon_pools to AggregationResult pattern: Eliminated last estimator using old instance-state pattern; removed dead code
Document variance formula: Added algebraic identity explaining equivalence of W_h²×n_h and textbook N_h²/n_h forms

Commit 2: Area variance fix, deprecated code removal, example updates

Fix Issue Variance underestimated for rare categories when using grp_by #68: Area variance severely underestimated for rare categories when using grp_by. Now cross-joins all stratum plots with group values so non-matching plots contribute y=0
Fix LazyFrame/DataFrame concat type mismatch in batch_query_by_values
Fix EVALIDator client using incorrect JSONDecodeError for empty responses
Pass area_domain to get_cond_columns in all estimators so domain filter columns are loaded
Remove deprecated assign_forest_type_group (use add_forest_type_group)
Remove 5 outdated example scripts, modernize 3 remaining ones
Add descriptive aliases for EVALIDator estimate type codes
Add 4 tests for rare category variance (Issue Variance underestimated for rare categories when using grp_by #68)

Test plan

All 677 unit tests pass (21 skipped — optional spatial deps)
ruff check and ruff format pass
Run validation tests against EVALIDator to confirm estimates unchanged
Verify FIA context manager properly disconnects with close()
Test area variance with grouped estimates against EVALIDator SE values

- Fix version mismatch: __init__.py now matches pyproject.toml (1.2.3) - Fix FIA.__exit__ no-op: added close() to properly disconnect DB backend - Remove pandas dependency: not used anywhere in src/pyfia/ - Fix hardcoded year 2023 default in format_output - Preserve AREA_TOTAL column in aggregation output (was dropped prematurely) - Replace lru_cache on instance method with manual cache to prevent GC leak - Migrate carbon_pools to AggregationResult pattern, remove dead code - Add algebraic identity documentation to variance formula - Use NotImplementedError for aggregate_results (not abstract) to support estimators like AreaChangeEstimator that override estimate() entirely All 677 unit tests pass.

…mples Bug fixes: - Fix Issue #68: Area variance severely underestimated for rare categories when using grp_by. Now cross-joins all stratum plots with group values so non-matching plots contribute y=0 to variance calculation. - Fix LazyFrame/DataFrame type mismatch in batch_query_by_values concat - Fix EVALIDator client using incorrect JSONDecodeError for empty responses - Pass area_domain to get_cond_columns in all estimators so domain filter columns are loaded from the database - Update area_change calculate_variance signature for AggregationResult Cleanup: - Remove deprecated assign_forest_type_group (use add_forest_type_group) - Remove 5 outdated example scripts, modernize 3 remaining ones - Add descriptive aliases for EVALIDator estimate type codes - Fix data_reader type annotation for MotherDuck path Tests: - Add 4 tests for rare category variance (Issue #68) - Update classification tests to use add_forest_type_group directly - Update reference table download test for bundled zip handling - Improve biomass validation skip reason documentation

The base class was fixed to use _extract_evaluation_year() but the area estimator overrides format_output and still had pl.lit(2023). Now uses the same pattern as all other estimators.

This reverts commit 6e7ad9c.

mihiarc added 3 commits February 6, 2026 18:13

Fix hardcoded year 2023 in area estimator format_output

2da8f66

The base class was fixed to use _extract_evaluation_year() but the area estimator overrides format_output and still had pl.lit(2023). Now uses the same pattern as all other estimators.

mihiarc merged commit 6e7ad9c into main Feb 7, 2026
3 checks passed

mihiarc deleted the dev/critical-review branch February 7, 2026 11:40

mihiarc added a commit that referenced this pull request Feb 7, 2026

Revert "Fix bugs and clean up technical debt (#69)"

644d860

This reverts commit 6e7ad9c.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix bugs and clean up technical debt#69

Fix bugs and clean up technical debt#69
mihiarc merged 3 commits intomainfrom
dev/critical-review

mihiarc commented Feb 6, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

mihiarc commented Feb 6, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Commit 1: Critical review bug fixes and technical debt

Commit 2: Area variance fix, deprecated code removal, example updates

Test plan

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

mihiarc commented Feb 6, 2026 •

edited

Loading