v0.6.0 of OGE includes new data for 2023, a major methodological update, and various other enhancements and bug fixes.
2023 Data Release and Early Release Capability
OGE now includes data for 2023, based on the final release data from EIA and EPA.
In addition, OGE can now ingest "Early Release" data from the EIA, which is typically available several months before the final release data published each autumn.
Aggregating Subplant data rather than Plant data
OGE includes data at the "plant" level as well as at the "fleet" and region levels (see our documentation for more on these aggregations). Although all emissions calculations were performed at the generator or "subplant" level, we previously aggregated subplant data to the plant level and then used that plant-level data to aggregate further to the fleet and region levels. This made the latter aggregations more computationally feasible, but it could produce irregularities and inconsistencies in the fleet and region data when a plant burned multiple fuels. Instead, we now use subplant data as the basis of all fleet-level aggregations as well.
Consider the example of the now-retired Meramec plant (ID 2104) in Missouri, which had 2 natural gas steam turbines and 2 conventional coal boilers:
- In 2022, its final year of operation, this plant burned slightly more natural gas than coal (by heat content), so it was categorized as a natural gas plant.
- Previously, since we were determining fleets based on plant data, the emissions from the entire plant (including the 2 coal generators) would have been aggregated into the natural gas fleet. However, this means that the average emissions for the natural gas fleet in this region would include some coal emissions, and thus be higher than typical natural gas fleet emissions.
- Now that we use subplants as the basis for the fleet aggregations, the two natural gas generators at Meramec are aggregated to the natural gas fleet, and the two coal generators at Meramec are aggregated to the coal fleet.
- This new approach more closely matches, in our understanding, how balancing authorities generally aggregate fleet data, using generators as the basis for these aggregations rather than plants.
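The difference between the two approaches can be sketched in a few lines of plain Python. This is only an illustration patterned on the Meramec example: the plant ID, field names, and all heat and emissions values below are made up, and the functions are hypothetical helpers rather than code from the OGE pipeline.

```python
from collections import defaultdict

# Made-up records for one hypothetical plant with two gas and two coal subplants.
# Field names and values are illustrative only, not OGE outputs.
subplants = [
    {"plant_id": 9999, "subplant_id": 1, "fuel": "natural_gas", "heat_mmbtu": 300.0, "co2_lb": 35000.0},
    {"plant_id": 9999, "subplant_id": 2, "fuel": "natural_gas", "heat_mmbtu": 250.0, "co2_lb": 30000.0},
    {"plant_id": 9999, "subplant_id": 3, "fuel": "coal", "heat_mmbtu": 250.0, "co2_lb": 52000.0},
    {"plant_id": 9999, "subplant_id": 4, "fuel": "coal", "heat_mmbtu": 250.0, "co2_lb": 52000.0},
]


def primary_fuel(records):
    """Return the fuel with the largest total heat input across the records."""
    heat_by_fuel = defaultdict(float)
    for r in records:
        heat_by_fuel[r["fuel"]] += r["heat_mmbtu"]
    return max(heat_by_fuel, key=heat_by_fuel.get)


def fleet_totals_by_plant(records):
    """Plant-based approach: all of a plant's emissions go to its primary-fuel fleet."""
    by_plant = defaultdict(list)
    for r in records:
        by_plant[r["plant_id"]].append(r)
    totals = defaultdict(float)
    for plant_records in by_plant.values():
        fleet = primary_fuel(plant_records)
        totals[fleet] += sum(r["co2_lb"] for r in plant_records)
    return dict(totals)


def fleet_totals_by_subplant(records):
    """Subplant-based approach: each subplant's emissions go to its own fuel's fleet."""
    totals = defaultdict(float)
    for r in records:
        totals[r["fuel"]] += r["co2_lb"]
    return dict(totals)


plant_based = fleet_totals_by_plant(subplants)
subplant_based = fleet_totals_by_subplant(subplants)
```

With these made-up numbers, the plant-based totals put all emissions in the natural gas fleet, while the subplant-based totals split them between the natural gas and coal fleets.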
In addition to affecting the fleet totals, this change also affects the hourly profile imputation process, since the residual hourly profiles will now be determined based on the updated fleet definitions.
Now that subplant data is being used more extensively through the pipeline, OGE also contains two new data outputs:
- Subplant-level results data at the annual and monthly resolutions (in addition to the existing plant-level output data)
- Subplant-specific attributes table that lists the primary fuel, nameplate capacity, and primary prime mover for each subplant.
For more details on these changes, see: #395
Expanded and enhanced EPA-EIA crosswalking
The subplant-level aggregation revealed a number of previously-uncaught issues with our existing mapping between EPA plant/unit IDs and EIA plant/generator IDs:
- The EPA-EIA mapping is not static over time: the relationship between an EPA ID and EIA ID can change from one year to the next, sometimes changing multiple times over the nearly 20-year historical period covered by OGE. In fact, some mappings change one year, and then change back to the original mapping several years later! To address this, OGE now includes a "start year" and "end year" for each mapping, and only uses the mapping that is valid for the current year.
- The existing power sector data crosswalk published by the EPA is missing a number of newer mappings (since 2018), as well as many mappings for earlier years in the 2000s. We were able to expand these mappings using data that already exists in CAMPD's facility database.
Ultimately, this update includes about 350 new mappings between EPA and EIA IDs. Without these mappings, the generation and emissions from a subplant could be double-counted if the unit reports data to both the EPA and EIA, since these would have been previously identified as separate subplants.
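As a rough sketch of how a time-bounded crosswalk can be applied, the snippet below keeps only the mappings valid for a given data year. The `Mapping` type, its field names, and the sample IDs and years are all hypothetical. Treating the end year as inclusive, and using `None` for a mapping that is still in effect, are assumptions made for this example and not necessarily how OGE stores its crosswalk.

```python
from typing import List, NamedTuple, Optional


class Mapping(NamedTuple):
    """One EPA-to-EIA mapping with its validity window (hypothetical schema)."""
    epa_unit_id: str
    eia_generator_id: str
    start_year: int
    end_year: Optional[int]  # assumption: None means the mapping is still valid


# Made-up crosswalk in which a unit's mapping changes, then later changes back.
crosswalk = [
    Mapping("1", "GEN1", 2005, 2011),
    Mapping("1", "GEN2", 2012, 2016),
    Mapping("1", "GEN1", 2017, None),
]


def mappings_for_year(rows: List[Mapping], year: int) -> List[Mapping]:
    """Return only the mappings whose validity window contains the year."""
    return [
        m
        for m in rows
        if m.start_year <= year and (m.end_year is None or year <= m.end_year)
    ]


valid_2014 = mappings_for_year(crosswalk, 2014)
valid_2020 = mappings_for_year(crosswalk, 2020)
```

Filtering by year before joining EPA and EIA data means that a mapping that changed and later reverted resolves to a single generator in any given year.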
We also found that, across various EPA datasets, units with IDs starting with leading zeros (e.g., "001") were inconsistently having those leading zeros removed, sometimes resulting in incomplete matches between datasets. To address this, we now strip all leading zeros from EPA unit IDs to ensure consistent mapping.
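A minimal sketch of this kind of normalization is shown below. The function name is hypothetical, and keeping an all-zero ID as "0" rather than an empty string is an assumption made for this example, not something confirmed by the OGE code.

```python
def normalize_unit_id(unit_id: str) -> str:
    """Strip surrounding whitespace and leading zeros from a unit ID string."""
    stripped = unit_id.strip().lstrip("0")
    # Assumption: an ID made only of zeros is kept as "0" instead of becoming empty.
    return stripped or "0"


# The same unit reported with and without leading zeros now shares one key.
raw_ids = ["001", "1", "0A1", "10"]
normalized = [normalize_unit_id(u) for u in raw_ids]
```

Applying the same normalization to every dataset before matching means "001" and "1" resolve to the same unit, while interior and trailing zeros (as in "10") are left untouched.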
Data usability enhancements
In the annual, plant-level results file, we now include plant attributes (such as name, location, capacity, and fuel) to make these files easier to use and filter in Excel, without needing to work with them programmatically.
Other Improvements
The subplant-level aggregations also revealed that a number of subplants only include steam output data from CEMS, but no generation data. Examining these units revealed that these boilers may only be used for steam production (for district steam systems, for example) and not power production, so these are once again being dropped from the dataset until we can get further clarification from EPA on how to interpret this data.
What's Changed
- Updates code to zip files for upload by @grgmiller in #388
- Update EPA/EIA crosswalk of plant 55641 by @grgmiller in #389
- Enable running OGE pipeline with Early Release data and PUDL nightly builds by @grgmiller in #390
- Update data export notebook and small bug fixes by @grgmiller in #391
- Update year to 2023 by @grgmiller in #393
- Update reference tables by @rouille in #394
- Aggregate fleet data by subplant, not plant by @grgmiller in #395
- Remove extra comma in energy source groups csv file by @rouille in #397
- Expand EPA-EIA crosswalk manual table and create primary fuel manual table by @rouille in #398
- Clean up warnings for 2023 data pipeline by @grgmiller in #400
- Refactor function writing power sector data by @rouille in #401
- Strip leading zeros from CEMS `emission_unit_id_epa` by @grgmiller in #402
- Expand eia-epa crosswalk by @rouille in #403
- Final Cleanup by @grgmiller in #407
- v0.6.0 by @grgmiller in #408
Full Changelog: v0.5.0...v0.6.0