Use serial update for each upgrades data frame #186

Merged Sep 11, 2024 (49 commits)
Commits
292c149
Successful initial testing.
rHorsey Jan 22, 2024
6a6a949
Updates for successful medium test.
rHorsey Jan 23, 2024
5dee2c7
3x performance bump - more possible with batching.
rHorsey Jan 23, 2024
4bff2d7
Working with 1M, added in pseudorandom option.
rHorsey Feb 1, 2024
ee7bd7d
Mostly finished code updates for precomputed samples. Validating.
rHorsey Feb 8, 2024
e60d2ab
Fully working 100k, 1M, and 2M precomputed samples.
rHorsey Feb 14, 2024
d6f3307
Additional unused args
rHorsey Mar 19, 2024
ff97aff
Merge branch 'rhorsey/bsb-23-10-upgrade' into rHorsey/sampling-v2
rHorsey Mar 19, 2024
a5e363d
Merge branch 'main' into rHorsey/sampling-v2
rHorsey Mar 20, 2024
7e7c086
More options_lookup changes.
rHorsey Mar 22, 2024
be67b1f
Refactored sampling code to no longer follow buildstockbatch paradigms.
rHorsey Apr 4, 2024
6141e26
Updating sqft enumerations for geospatial join.
rHorsey Apr 4, 2024
14d3609
Merge branch 'rHorsey/sampling-v2' of http://github.com/nrel/comstock…
rHorsey Apr 4, 2024
1a5d329
sampling fixes to run on pc
amylebar Apr 9, 2024
43d6a9b
Initial step at processing one upgrade at a time
asparke2 Jun 5, 2024
e719266
save test py script here
wenyikuang Jun 7, 2024
7ac725f
fix the metadata_index issue, the results are matched
wenyikuang Jun 13, 2024
952d066
Modify export function with lazy frame api.
wenyikuang Jun 18, 2024
838d203
update
wenyikuang Jun 21, 2024
eda0a3b
update, working on syntax
wenyikuang Jun 24, 2024
d213a59
Fix logic in metadata index, use self.upgrade_id as primary key
wenyikuang Jun 25, 2024
c0134a0
use lazy frame till seaborn to save memory
wenyikuang Jun 25, 2024
ad05005
more assert to make it safer
wenyikuang Jun 25, 2024
b94540a
working on all in lazy frame
wenyikuang Jun 28, 2024
38fbdca
Move add columns to constructor
wenyikuang Jul 11, 2024
79a7ad0
check in the test file
wenyikuang Jul 11, 2024
a60bbf4
the plotting goes through
wenyikuang Jul 11, 2024
1d37c28
use lazy frame in cbecs and its comparison
wenyikuang Jul 12, 2024
0f94f12
let ami/cbecs/eia and their comparison adopt lazyframe
wenyikuang Jul 17, 2024
ee0713c
single test upgrade
wenyikuang Jul 17, 2024
e4ff8f2
debug the scale issue
wenyikuang Jul 22, 2024
3a0c12b
fixing scale
wenyikuang Jul 23, 2024
e4e9aab
Fixed the export_data_and_enumeration_dictionary funct for iterating …
wenyikuang Jul 24, 2024
3a94647
Move weighting outside of init method
asparke2 Jul 24, 2024
36c1893
Separate the unweighted and weighted columns initialization out.
wenyikuang Jul 31, 2024
059d94a
Fixed the un-matched columns.
wenyikuang Aug 6, 2024
0851fba
Cleaned up logging and rename functions and variables.
wenyikuang Aug 6, 2024
8cd96f2
lazyframe plotter wrapper
wenyikuang Aug 7, 2024
d6ec850
Refactored comstock_measure_comparison with fine grained way.
wenyikuang Aug 10, 2024
c2d6d12
Optimized performance for plotting! with comstock vs cbecs lazyframe …
wenyikuang Aug 13, 2024
0ae614d
Updated AMI/CBEC/EIA with plotter class.
wenyikuang Aug 14, 2024
d806261
Fixed the eia vs comstock comparison plotting and bugs in cbecs vs …
wenyikuang Aug 16, 2024
32a03d0
Up to date with sampling v2 state.
rHorsey Sep 6, 2024
8d376cd
National and Geospatial writes working.
rHorsey Sep 8, 2024
c6da741
Removing unneeded profile.
rHorsey Sep 8, 2024
cd63d64
Merge branch 'main' into postproc_per_upgrade
rHorsey Sep 8, 2024
1d45003
Merge pull request #207 from NREL/rHorsey/sampling-v2
rHorsey Sep 8, 2024
ec34a3c
Fixed the syntax error from comstock.py
wenyikuang Sep 10, 2024
2731285
Fixing CA CZ enumerations bug in options lookup.
rHorsey Sep 11, 2024
1,133 changes: 414 additions & 719 deletions national/housing_characteristics/options_lookup.tsv

Large diffs are not rendered by default.

23 changes: 19 additions & 4 deletions postprocessing/compare_comstock_to_cbecs.py.template
@@ -12,12 +12,12 @@ def main():
# ComStock run
comstock = cspp.ComStock(
s3_base_dir='eulp/euss_com', # If run not on S3, download results_up**.parquet manually
- comstock_run_name='baseline_vav_mdp_adjust2', # Name of the run on S3
- comstock_run_version='baseline_vav_mdp_adjust2', # Use whatever you want to see in plot and folder names
+ comstock_run_name='cycle_4_sampling_test_rand_985932_20240321', # Name of the run on S3
+ comstock_run_version='new_sampling_test', # Use whatever you want to see in plot and folder names
comstock_year=2018, # Typically don't change this
- athena_table_name='baseline_vav_mdp_adjust2', # Typically same as comstock_run_name or None
+ athena_table_name='rand_985932_20240321', # Typically same as comstock_run_name or None
truth_data_version='v01', # Typically don't change this
- buildstock_csv_name='buildstock.csv', # Download buildstock.csv manually
+ buildstock_csv_name='rand_985932_sampling_buildstock.csv', # Download buildstock.csv manually
acceptable_failure_percentage=0.9, # Can increase this when testing and high failure are OK
drop_failed_runs=True, # False if you want to evaluate which runs failed in raw output data
color_hex='#0072B2', # Color used to represent this run in plots
@@ -26,17 +26,32 @@ def main():
include_upgrades=False, # False if not looking at upgrades
upgrade_ids_to_skip=[] # Use [1, 3] etc. to exclude certain upgrades
)

# Stock Estimation for Apportionment:
stock_estimate = cspp.Apportion(
stock_estimation_version='2024R2', # Only updated when a new stock estimate is published
truth_data_version='v01' # Typically don't change this
)

# Scale ComStock run to CBECS 2018 AND remove non-ComStock buildings from CBECS
comstock.add_weights_aportioned_by_stock_estimate(apportionment=stock_estimate)
comstock.create_national_aggregation()
comstock.create_geospatially_resolved_aggregations(comstock.STATE_ID, pretty_geo_col_name='state_id')
comstock.create_geospatially_resolved_aggregations(comstock.COUNTY_ID, pretty_geo_col_name='county_id')

# CBECS
cbecs = cspp.CBECS(
cbecs_year=2018, # 2012 and 2018 currently available
truth_data_version='v01', # Typically don't change this
color_hex='#009E73', # Color used to represent CBECS in plots
reload_from_csv=False # True if CSV already made and want faster reload times
)

# TODO Update past here including ensuring we can still apply CBECS weights on top of previous weights.

# Scale ComStock run to CBECS 2018 AND remove non-ComStock buildings from CBECS
comstock.add_national_scaling_weights(cbecs, remove_non_comstock_bldg_types_from_cbecs=True)
comstock.calculate_weighted_columnal_values()
comstock.export_to_csv_wide()

# Make a comparison by passing in a list of CBECs and ComStock runs to compare
1 change: 1 addition & 0 deletions postprocessing/comstockpostproc/__init__.py
@@ -4,6 +4,7 @@
from .cbecs import CBECS
from .eia import EIA
from .ami import AMI
+ from .comstock_apportionment import Apportion
from .comstock_to_cbecs_comparison import ComStockToCBECSComparison
from .comstock_measure_comparison import ComStockMeasureComparison
from .comstock_to_eia_comparison import ComStockToEIAComparison
2 changes: 2 additions & 0 deletions postprocessing/comstockpostproc/ami.py
@@ -140,6 +140,8 @@ def __init__(self, truth_data_version, color_hex=NamingMixin.COLOR_AMI, reload_f
self.ami_timeseries_data = pd.read_csv(file_path, low_memory=False, index_col='timestamp', parse_dates=True)
else:
self.calculate_ami_aggregates()

assert isinstance(self.ami_timeseries_data, pd.DataFrame)

def download_truth_data(self):
# AMI data
7 changes: 7 additions & 0 deletions postprocessing/comstockpostproc/cbecs.py
@@ -7,6 +7,7 @@
import logging
import numpy as np
import pandas as pd
import polars as pl

from comstockpostproc.naming_mixin import NamingMixin
from comstockpostproc.units_mixin import UnitsMixin
@@ -76,6 +77,12 @@ def __init__(self, cbecs_year, truth_data_version, color_hex=NamingMixin.COLOR_C
for c in self.data.columns:
logger.debug(c)

assert isinstance(self.data, pd.DataFrame)
logging.info(f'Created {self.dataset_name} with {len(self.data)} rows')
self.data = self.data.astype(str)
self.data = pl.from_pandas(self.data).lazy()
assert isinstance(self.data, pl.LazyFrame)

def download_data(self):
# CBECS microdata
file_name = f'CBECS_{self.year}_microdata.csv'