Description
Use Case
When linking pyfia biomass estimates to external plot-level models (e.g., harvest probability predictions keyed by PLT_CN + CONDID), users need to know which FIA plot-conditions contribute to each aggregated estimate. Currently, biomass() and other estimators aggregate across plots, and there's no way to retrieve plot-condition level results or identify which plots contributed to a group.
Concrete Example (from pyNCA)
pyNCA's four-phase timber valuation uses a harvest probability model that produces predictions at the FIA plot-condition level (PLT_CN, CONDID). To compute biomass-weighted harvest probabilities for inventory groups (e.g., all Loblolly stands aged 25 in Georgia), we need to know which plot-conditions contribute to each group and their relative biomass weights.
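The weighting described above can be sketched as follows. This is a minimal illustration, not pyNCA's actual code: the function name `biomass_weighted_probability`, the row layout, and all numbers are assumptions made up for the example. It assumes per-plot-condition biomass weights are already available as dictionaries keyed by the FIA column names, and that the external model's predictions are keyed by (PLT_CN, CONDID).

```python
from collections import defaultdict


def biomass_weighted_probability(plot_weights, harvest_prob, group_cols):
    """Biomass-weighted mean harvest probability per inventory group.

    plot_weights: iterable of dicts with PLT_CN, CONDID, biomass_weight
                  and the grouping columns (hypothetical layout).
    harvest_prob: dict mapping (PLT_CN, CONDID) -> predicted probability.
    group_cols:   columns that define an inventory group.
    """
    weighted_sum = defaultdict(float)
    weight_total = defaultdict(float)
    for row in plot_weights:
        key = (row["PLT_CN"], row["CONDID"])
        if key not in harvest_prob:
            continue  # no external prediction for this plot-condition
        group = tuple(row[c] for c in group_cols)
        weight = row["biomass_weight"]
        weighted_sum[group] += weight * harvest_prob[key]
        weight_total[group] += weight
    # Skip groups whose total weight is zero to avoid division by zero.
    return {
        g: weighted_sum[g] / weight_total[g]
        for g in weight_total
        if weight_total[g] > 0
    }


# Made-up plot-condition rows and predictions, for illustration only.
plot_weights = [
    {"PLT_CN": 1001, "CONDID": 1, "STDAGE": 25, "OWNGRPCD": 40,
     "FORTYPCD": 161, "biomass_weight": 3000.0},
    {"PLT_CN": 1002, "CONDID": 1, "STDAGE": 25, "OWNGRPCD": 40,
     "FORTYPCD": 161, "biomass_weight": 1000.0},
    {"PLT_CN": 1003, "CONDID": 2, "STDAGE": 40, "OWNGRPCD": 10,
     "FORTYPCD": 162, "biomass_weight": 500.0},
]
harvest_prob = {(1001, 1): 0.2, (1002, 1): 0.6, (1003, 2): 0.1}

result = biomass_weighted_probability(
    plot_weights, harvest_prob, ["STDAGE", "OWNGRPCD", "FORTYPCD"]
)
```

The only pyfia-specific requirement is the first input: a table with one row per plot-condition that carries both the grouping columns and a biomass weight.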
Currently we work around this with supplementary SQL:
# Workaround: separate query to get plot-condition weights per group
plot_weights_sql = f"""
SELECT c.PLT_CN, c.CONDID, c.STDAGE, c.OWNGRPCD, c.FORTYPCD,
SUM(t.DRYBIO_AG * t.TPA_UNADJ) as biomass_weight
FROM COND c
JOIN TREE t ON c.PLT_CN = t.PLT_CN AND c.CONDID = t.CONDID
WHERE c.COND_STATUS_CD = 1 AND t.STATUSCD = 1
GROUP BY c.PLT_CN, c.CONDID, c.STDAGE, c.OWNGRPCD, c.FORTYPCD
"""
This duplicates much of pyfia's internal data loading and filtering logic, and bypasses the stratified estimation framework entirely.
Proposed Change
Add PLT_CN and CONDID as supported grouping columns so that estimators can return plot-condition level results.
Option A: Add to existing grouping column lists
In pyfia/estimation/columns.py:
# Could add to COND_GROUPING_COLUMNS (CONDID is already in BASE_COND_COLUMNS)
COND_GROUPING_COLUMNS = [
    "OWNGRPCD",
    "FORTYPCD",
    ...
    "PLT_CN",  # NEW: enable plot-level grouping
    "CONDID",  # NEW: enable condition-level grouping
]
This would allow:
from pyfia import biomass
# Plot-condition level biomass estimates
df = biomass(db, grp_by=["PLT_CN", "CONDID", "STDAGE", "OWNGRPCD", "FORTYPCD"])
# Returns one row per plot-condition with biomass estimate
Option B: Separate plot_detail=True parameter
Add a parameter to estimators that returns plot-condition level detail alongside the aggregated estimate:
df = biomass(db, grp_by=["STDAGE", "OWNGRPCD"], plot_detail=True)
# Returns additional columns: PLT_CN, CONDID, plot_biomass_weight
Option C: New utility function
from pyfia.estimation import plot_contributions
# Get plot-condition contributions for a given domain
contributions = plot_contributions(
    db,
    estimator="biomass",
    grp_by=["STDAGE", "OWNGRPCD", "FORTYPCD"],
    area_domain="FORTYPCD IN (161, 162, 163)"
)
# Returns: PLT_CN, CONDID, group_key_columns, biomass_weight
Context
- PLT_CN is already in BASE_TREE_COLUMNS and BASE_COND_COLUMNS (used for joins)
- CONDID is already in BASE_COND_COLUMNS
- The issue is that get_tree_columns() only adds columns from TREE_GROUPING_COLUMNS and get_cond_columns() only adds from COND_GROUPING_COLUMNS
- Since PLT_CN/CONDID aren't in those lists, grp_by=["PLT_CN", "CONDID"] silently drops them
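The silent drop can be illustrated with a simplified allow-list filter. This is only a sketch of the behavior described above, not pyfia's actual implementation: `select_grouping_columns` is a hypothetical stand-in for the column-selection helpers, and the list below is an abbreviated, assumed copy of COND_GROUPING_COLUMNS.

```python
# Abbreviated, assumed stand-in for the list in pyfia/estimation/columns.py.
COND_GROUPING_COLUMNS = ["OWNGRPCD", "FORTYPCD"]


def select_grouping_columns(grp_by, allowed):
    """Hypothetical helper: keep only requested columns on the allow-list.

    Columns that are not allowed are dropped without any warning, which
    mirrors the behavior described in the list above.
    """
    return [col for col in grp_by if col in allowed]


requested = ["PLT_CN", "CONDID", "OWNGRPCD"]

# Current behavior: PLT_CN and CONDID disappear from the grouping.
silent = select_grouping_columns(requested, COND_GROUPING_COLUMNS)

# With Option A, the two identifiers are on the allow-list and survive.
extended = select_grouping_columns(
    requested, COND_GROUPING_COLUMNS + ["PLT_CN", "CONDID"]
)
```

Under these assumptions, `silent` contains only OWNGRPCD, while `extended` keeps all three requested columns in their original order.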
Impact
This would enable direct integration between pyfia estimates and external plot-level models without requiring users to write raw SQL that duplicates pyfia's internal logic. The main downstream use case is linking FIA inventory data to harvest probability predictions, growth models, or other plot-level attributes from external sources.