Release v0.6.0 · singularity-energy/open-grid-emissions

v0.6.0 of OGE includes new data for 2023, a major methodological update, and various other enhancements and bug fixes.

2023 Data Release and Early Release Capability

OGE now includes data for 2023, based on the final release data from EIA and EPA.
In addition, OGE can now ingest "Early Release" data from the EIA, which is typically available several months before the final data is released each autumn.

Aggregating Subplant data rather than Plant data

OGE includes data at the "plant" level as well as at the "fleet" and region levels (see our documentation for more on these aggregations). While all emissions calculations happened at the generator or "subplant" level, we previously aggregated subplant data to the plant level, and then used the aggregated plant-level data to aggregate further to the fleet and region levels. While this made the latter aggregations more computationally feasible, it could produce irregularities and inconsistencies in the fleet and region data when a plant burned multiple fuels. We now use subplant data as the basis of all fleet-level aggregations instead.

Consider the example of the now-retired Meramec plant (ID 2104) in Missouri, which had 2 natural gas steam turbines and 2 conventional coal boilers:

In 2022, its final year of operation, this plant burned slightly more natural gas than coal (by heat content), so it was categorized as a natural gas plant.
Previously, since we were determining fleets based on plant data, the emissions from the entire plant (including the 2 coal generators) would have been aggregated into the natural gas fleet. However, this means that the average emissions for the natural gas fleet in this region would include some coal emissions, and thus be higher than typical natural gas fleet emissions.
Now that we use subplants as the basis for the fleet aggregations, the two natural gas generators at Meramec are aggregated to the natural gas fleet, and the two coal generators at Meramec are aggregated to the coal fleet.
This new approach more closely matches, in our understanding, how balancing authorities generally aggregate fleet data, using generators as the basis for these aggregations rather than plants.
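
The difference between the two approaches can be sketched roughly as follows. This is an illustration only, not the OGE pipeline code: the field names (`fuel_mmbtu`, `co2_lb`, etc.) and all numbers are made up for the example, and only the plant ID comes from the text above.

```python
from collections import defaultdict

# Hypothetical subplant records for a single plant that burns two fuels.
# Field names and values are illustrative assumptions, not actual OGE data.
subplants = [
    {"plant_id": 2104, "subplant_id": 1, "fuel": "natural_gas", "fuel_mmbtu": 300.0, "co2_lb": 35000.0},
    {"plant_id": 2104, "subplant_id": 2, "fuel": "natural_gas", "fuel_mmbtu": 250.0, "co2_lb": 29000.0},
    {"plant_id": 2104, "subplant_id": 3, "fuel": "coal", "fuel_mmbtu": 260.0, "co2_lb": 54000.0},
    {"plant_id": 2104, "subplant_id": 4, "fuel": "coal", "fuel_mmbtu": 240.0, "co2_lb": 50000.0},
]


def fleet_totals_by_plant(records):
    """Previous approach: the whole plant goes to the fleet of its primary fuel."""
    heat_by_plant_fuel = defaultdict(lambda: defaultdict(float))
    for r in records:
        heat_by_plant_fuel[r["plant_id"]][r["fuel"]] += r["fuel_mmbtu"]
    # Primary fuel = fuel with the largest heat input at the plant.
    primary_fuel = {
        plant: max(fuels, key=fuels.get)
        for plant, fuels in heat_by_plant_fuel.items()
    }
    totals = defaultdict(float)
    for r in records:
        totals[primary_fuel[r["plant_id"]]] += r["co2_lb"]
    return dict(totals)


def fleet_totals_by_subplant(records):
    """New approach: each subplant goes to the fleet of its own fuel."""
    totals = defaultdict(float)
    for r in records:
        totals[r["fuel"]] += r["co2_lb"]
    return dict(totals)


print(fleet_totals_by_plant(subplants))     # {'natural_gas': 168000.0}
print(fleet_totals_by_subplant(subplants))  # {'natural_gas': 64000.0, 'coal': 104000.0}
```

In the plant-based version, the coal emissions end up in the natural gas fleet because gas has the slightly larger heat input; in the subplant-based version, each fuel's emissions stay with its own fleet.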

In addition to affecting the fleet totals, this change also affects the hourly profile imputation process, since the residual hourly profiles will now be determined based on the updated fleet definitions.

Now that subplant data is being used more extensively throughout the pipeline, OGE also contains two new data outputs:

Subplant-level results data at the annual and monthly resolutions (in addition to the existing plant-level output data)
Subplant-specific attributes table that lists the primary fuel, nameplate capacity, and primary prime mover for each subplant.

For more details on these changes, see: #395

Expanded and enhanced EPA-EIA crosswalking

The subplant-level aggregation revealed a number of previously-uncaught issues with our existing mapping between EPA plant/unit IDs and EIA plant/generator IDs:

The EPA-EIA mapping is not static over time: the relationship between an EPA ID and an EIA ID can change from one year to the next, sometimes changing multiple times over the nearly 20-year historical period covered by OGE. In fact, some mappings change one year and then change back to the original mapping several years later! To address this, OGE now includes a "start year" and "end year" for each mapping, and only uses the mapping that is valid for the current year.
The existing power sector data crosswalk published by the EPA is missing a number of newer mappings (since 2018), as well as many mappings for earlier years in the 2000s. We were able to expand these mappings using data that already exists in CAMPD's facility database.
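
As a rough sketch of the "start year" and "end year" idea, a year-aware lookup might look like the following. The row layout, the field names, the IDs, and the choice of inclusive year bounds are all assumptions made for illustration; they are not taken from the OGE crosswalk tables.

```python
# Hypothetical crosswalk rows for one EPA unit whose EIA generator mapping
# changes and later reverts. All IDs and years are made up for illustration.
CROSSWALK = [
    {"plant_id_epa": 1, "unit_id_epa": "1", "generator_id_eia": "GEN1", "start_year": 2005, "end_year": 2011},
    {"plant_id_epa": 1, "unit_id_epa": "1", "generator_id_eia": "GEN2", "start_year": 2012, "end_year": 2017},
    {"plant_id_epa": 1, "unit_id_epa": "1", "generator_id_eia": "GEN1", "start_year": 2018, "end_year": 2023},
]


def mappings_for_year(crosswalk, year):
    """Return only the rows valid in `year` (bounds assumed to be inclusive)."""
    return [
        row for row in crosswalk
        if row["start_year"] <= year <= row["end_year"]
    ]


for year in (2010, 2015, 2020):
    valid = mappings_for_year(CROSSWALK, year)
    print(year, [row["generator_id_eia"] for row in valid])
# 2010 ['GEN1']
# 2015 ['GEN2']
# 2020 ['GEN1']
```

Filtering the crosswalk to the data year before joining keeps a unit from matching a mapping that was only valid in a different period.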

Ultimately, this update includes about 350 new mappings between EPA and EIA IDs. Without these mappings, the generation and emissions from a subplant could be double-counted if the unit reports data to both the EPA and EIA, since these would have been previously identified as separate subplants.

We also found that, across various EPA datasets, unit IDs with leading zeros (e.g., "001") were inconsistently having those zeros removed, which sometimes resulted in incomplete matches between datasets. To address this, we now strip all leading zeros from EPA unit IDs to ensure consistent mapping.
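
A minimal sketch of this kind of normalization is shown below. The function name is hypothetical, and the handling of an ID made up entirely of zeros (kept as "0" rather than becoming an empty string) is an assumption, not something the release notes specify.

```python
def strip_leading_zeros(unit_id: str) -> str:
    """Remove leading zeros from a unit ID so that "001" and "1" compare equal.

    Assumption: an ID consisting only of zeros is kept as "0" instead of
    being reduced to an empty string.
    """
    stripped = unit_id.lstrip("0")
    return stripped if stripped else "0"


# The same unit as it might appear in two different datasets.
print(strip_leading_zeros("001") == strip_leading_zeros("1"))  # True
print(strip_leading_zeros("0A1"))  # A1
print(strip_leading_zeros("10"))   # 10 (trailing zeros are untouched)
```

Applying the same normalization to every dataset before merging means the join keys agree regardless of how each source formatted the ID.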

Data usability enhancements

In the annual, plant-level results file, we now include plant attributes (such as name, location, capacity, and fuel) so that these files are easier to use and filter in Excel, without needing to work with them programmatically.

Other Improvements

The subplant-level aggregations also revealed that a number of subplants only include steam output data from CEMS, but no generation data. Examining these units revealed that these boilers may only be used for steam production (for district steam systems, for example) and not power production, so these are once again being dropped from the dataset until we can get further clarification from EPA on how to interpret this data.
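
One way to picture this filter is sketched below. The field names (`gross_generation_mwh`, `steam_load_1000_lb`) and the records are hypothetical, and treating a missing generation value the same as zero is an assumption; the actual criteria used in the pipeline may differ.

```python
# Hypothetical subplant-level CEMS totals; names and values are illustrative.
cems_subplants = [
    {"plant_id": 1, "subplant_id": 1, "gross_generation_mwh": 1200.0, "steam_load_1000_lb": 0.0},
    {"plant_id": 1, "subplant_id": 2, "gross_generation_mwh": None, "steam_load_1000_lb": 850.0},
    {"plant_id": 2, "subplant_id": 1, "gross_generation_mwh": 400.0, "steam_load_1000_lb": 300.0},
]


def is_steam_only(record):
    """True if the subplant reports steam output but no generation.

    Assumption: a missing (None) generation value counts as no generation.
    """
    generation = record.get("gross_generation_mwh") or 0.0
    steam = record.get("steam_load_1000_lb") or 0.0
    return steam > 0 and generation == 0


kept = [r for r in cems_subplants if not is_steam_only(r)]
print([(r["plant_id"], r["subplant_id"]) for r in kept])  # [(1, 1), (2, 1)]
```

Subplants that report both steam and generation are kept; only those with steam output and no generation are dropped.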

What's Changed

Updates code to zip files for upload by @grgmiller in #388
Update EPA/EIA crosswalk of plant 55641 by @grgmiller in #389
Enable running OGE pipeline with Early Release data and PUDL nightly builds by @grgmiller in #390
Update data export notebook and small bug fixes by @grgmiller in #391
Update year to 2023 by @grgmiller in #393
Update reference tables by @rouille in #394
Aggregate fleet data by subplant, not plant by @grgmiller in #395
Remove extra comma in energy source groups csv file by @rouille in #397
Expand EPA-EIA crosswalk manual table and create primary fuel manual table by @rouille in #398
Clean up warnings for 2023 data pipeline by @grgmiller in #400
Refactor function writing power sector data by @rouille in #401
Strip leading zeros from CEMS emission_unit_id_epa by @grgmiller in #402
Expand eia-epa crosswalk by @rouille in #403
Final Cleanup by @grgmiller in #407
v0.6.0 by @grgmiller in #408

Full Changelog: v0.5.0...v0.6.0
