Support PLT_CN/CONDID as grouping columns for plot-condition level estimates #71

@mihiarc

Use Case

When linking pyfia biomass estimates to external plot-level models (e.g., harvest probability predictions keyed by PLT_CN + CONDID), users need to know which FIA plot-conditions contribute to each aggregated estimate. Currently, biomass() and other estimators aggregate across plots, and there's no way to retrieve plot-condition level results or identify which plots contributed to a group.

Concrete Example (from pyNCA)

pyNCA's four-phase timber valuation uses a harvest probability model that produces predictions at the FIA plot-condition level (PLT_CN, CONDID). To compute biomass-weighted harvest probabilities for inventory groups (e.g., all Loblolly stands aged 25 in Georgia), we need to know which plot-conditions contribute to each group and their relative biomass weights.
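The weighting step described above can be sketched as follows. This is a minimal illustration, not pyNCA or pyfia code: the function name `weighted_harvest_probability`, the group labels, and all numbers are hypothetical, and the inputs are assumed to be plain dictionaries keyed by (PLT_CN, CONDID).

```python
from collections import defaultdict

# Hypothetical inputs, all keyed by (PLT_CN, CONDID). Values are made up.
biomass_weight = {("1001", 1): 120.0, ("1002", 1): 80.0, ("1003", 2): 50.0}
harvest_prob = {("1001", 1): 0.10, ("1002", 1): 0.30, ("1003", 2): 0.20}

# Hypothetical mapping from each plot-condition to its inventory group.
group_of = {
    ("1001", 1): ("loblolly", 25),
    ("1002", 1): ("loblolly", 25),
    ("1003", 2): ("slash", 30),
}


def weighted_harvest_probability(weights, probs, groups):
    """Return the biomass-weighted mean harvest probability per group."""
    numerator = defaultdict(float)
    denominator = defaultdict(float)
    for key, weight in weights.items():
        # Skip plot-conditions that lack a prediction or a group assignment.
        if key not in probs or key not in groups:
            continue
        group = groups[key]
        numerator[group] += weight * probs[key]
        denominator[group] += weight
    # Groups with zero total weight are omitted to avoid division by zero.
    return {g: numerator[g] / denominator[g] for g in denominator if denominator[g] > 0}


result = weighted_harvest_probability(biomass_weight, harvest_prob, group_of)
```

The calculation only works if every weight can be traced back to a (PLT_CN, CONDID) pair, which is exactly the information the aggregated estimates do not expose.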

Currently we work around this with supplementary SQL:

# Workaround: separate query to get plot-condition weights per group
plot_weights_sql = """
SELECT c.PLT_CN, c.CONDID, c.STDAGE, c.OWNGRPCD, c.FORTYPCD,
       SUM(t.DRYBIO_AG * t.TPA_UNADJ) as biomass_weight
FROM COND c
JOIN TREE t ON c.PLT_CN = t.PLT_CN AND c.CONDID = t.CONDID
WHERE c.COND_STATUS_CD = 1 AND t.STATUSCD = 1
GROUP BY c.PLT_CN, c.CONDID, c.STDAGE, c.OWNGRPCD, c.FORTYPCD
"""

This duplicates much of pyfia's internal data loading and filtering logic, and bypasses the stratified estimation framework entirely.
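For reference, here is a self-contained sketch of what the workaround query returns, run against a toy in-memory SQLite database. The table layouts are reduced to the columns the query touches and the rows are invented; the `ORDER BY` clause is added only to make the output deterministic. The weights are raw sums of DRYBIO_AG * TPA_UNADJ, with no stratum adjustment or expansion factors applied.

```python
import sqlite3

# Same query as the workaround, plus ORDER BY for a stable row order.
PLOT_WEIGHTS_SQL = """
SELECT c.PLT_CN, c.CONDID, c.STDAGE, c.OWNGRPCD, c.FORTYPCD,
       SUM(t.DRYBIO_AG * t.TPA_UNADJ) AS biomass_weight
FROM COND c
JOIN TREE t ON c.PLT_CN = t.PLT_CN AND c.CONDID = t.CONDID
WHERE c.COND_STATUS_CD = 1 AND t.STATUSCD = 1
GROUP BY c.PLT_CN, c.CONDID, c.STDAGE, c.OWNGRPCD, c.FORTYPCD
ORDER BY c.PLT_CN, c.CONDID
"""

conn = sqlite3.connect(":memory:")
# Toy schemas: only the columns referenced by the query.
conn.executescript("""
CREATE TABLE COND (PLT_CN TEXT, CONDID INTEGER, STDAGE INTEGER,
                   OWNGRPCD INTEGER, FORTYPCD INTEGER, COND_STATUS_CD INTEGER);
CREATE TABLE TREE (PLT_CN TEXT, CONDID INTEGER, STATUSCD INTEGER,
                   DRYBIO_AG REAL, TPA_UNADJ REAL);
""")
conn.executemany("INSERT INTO COND VALUES (?, ?, ?, ?, ?, ?)", [
    ("1001", 1, 25, 40, 161, 1),
    ("1001", 2, 10, 40, 161, 2),  # COND_STATUS_CD != 1, filtered out
    ("1002", 1, 25, 40, 161, 1),
])
conn.executemany("INSERT INTO TREE VALUES (?, ?, ?, ?, ?)", [
    ("1001", 1, 1, 100.0, 6.0),
    ("1001", 1, 1, 50.0, 6.0),
    ("1001", 1, 2, 70.0, 6.0),   # STATUSCD != 1, filtered out
    ("1001", 2, 1, 30.0, 6.0),   # belongs to the filtered-out condition
    ("1002", 1, 1, 200.0, 6.0),
])

rows = conn.execute(PLOT_WEIGHTS_SQL).fetchall()
conn.close()

for row in rows:
    print(row)
```

Each output row is one plot-condition with its grouping attributes and summed weight, which is the shape the proposed change would ideally produce from within pyfia.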

Proposed Change

Add PLT_CN and CONDID as supported grouping columns so that estimators can return plot-condition level results.

Option A: Add to existing grouping column lists

In pyfia/estimation/columns.py:

# Could add to COND_GROUPING_COLUMNS (CONDID is already in BASE_COND_COLUMNS)
COND_GROUPING_COLUMNS = [
    "OWNGRPCD",
    "FORTYPCD",
    ...
    "PLT_CN",   # NEW: enable plot-level grouping
    "CONDID",   # NEW: enable condition-level grouping
]

This would allow:

from pyfia import biomass

# Plot-condition level biomass estimates
df = biomass(db, grp_by=["PLT_CN", "CONDID", "STDAGE", "OWNGRPCD", "FORTYPCD"])
# Returns one row per plot-condition with biomass estimate

Option B: Separate plot_detail=True parameter

Add a parameter to estimators that returns plot-condition level detail alongside the aggregated estimate:

df = biomass(db, grp_by=["STDAGE", "OWNGRPCD"], plot_detail=True)
# Returns additional columns: PLT_CN, CONDID, plot_biomass_weight

Option C: New utility function

from pyfia.estimation import plot_contributions

# Get plot-condition contributions for a given domain
contributions = plot_contributions(
    db,
    estimator="biomass",
    grp_by=["STDAGE", "OWNGRPCD", "FORTYPCD"],
    area_domain="FORTYPCD IN (161, 162, 163)"
)
# Returns: PLT_CN, CONDID, group_key_columns, biomass_weight

Context

  • PLT_CN is already in BASE_TREE_COLUMNS and BASE_COND_COLUMNS (used for joins)
  • CONDID is already in BASE_COND_COLUMNS
  • The issue is that get_tree_columns() only adds columns from TREE_GROUPING_COLUMNS and get_cond_columns() only adds from COND_GROUPING_COLUMNS
  • Since PLT_CN/CONDID aren't in those lists, passing grp_by=["PLT_CN", "CONDID"] silently drops them: no error is raised and the results are not grouped by plot-condition
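The behavior described in the bullets above can be illustrated with a small stand-in. This is not pyfia's actual implementation: `split_grouping_columns` and `require_grouping_columns` are hypothetical helpers, and the allow-list is an abbreviated, assumed version of COND_GROUPING_COLUMNS.

```python
# Abbreviated, assumed stand-in for the list in pyfia/estimation/columns.py.
COND_GROUPING_COLUMNS = ["OWNGRPCD", "FORTYPCD", "STDAGE"]


def split_grouping_columns(grp_by, allowed):
    """Split requested columns into those kept and those dropped by an allow-list."""
    kept = [col for col in grp_by if col in allowed]
    dropped = [col for col in grp_by if col not in allowed]
    return kept, dropped


def require_grouping_columns(grp_by, allowed):
    """Stricter variant: raise instead of silently dropping unknown columns."""
    kept, dropped = split_grouping_columns(grp_by, allowed)
    if dropped:
        raise ValueError(f"Unsupported grouping columns: {dropped}")
    return kept


# With the current allow-list, the plot-condition keys never reach the grouping.
kept, dropped = split_grouping_columns(
    ["PLT_CN", "CONDID", "STDAGE"], COND_GROUPING_COLUMNS
)

# Under Option A the keys would be on the allow-list and pass through.
extended = COND_GROUPING_COLUMNS + ["PLT_CN", "CONDID"]
kept_after, dropped_after = split_grouping_columns(
    ["PLT_CN", "CONDID", "STDAGE"], extended
)
```

Whichever option is chosen, raising on unsupported columns (as in the stricter variant) would at least make the current limitation visible to callers.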

Impact

This would enable direct integration between pyfia estimates and external plot-level models without requiring users to write raw SQL that duplicates pyfia's internal logic. The main downstream use case is linking FIA inventory data to harvest probability predictions, growth models, or other plot-level attributes from external sources.
