-
Notifications
You must be signed in to change notification settings - Fork 28
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ICF Off-model calculator - Refactor 1-3, Feature 1-3 #74
Changes from 9 commits
ed3a52d
4b82f9d
0f257d6
14caf82
905dc62
602d894
2dc3db8
89888b6
975c28c
31ff7de
917f176
7da9305
2e981d0
bf072f1
e963d2b
6e54c5c
2b5a967
ee612ac
e3a9aa5
4e08374
481423c
d222bc2
9f16274
ba76fb7
d2a7be8
694d5d0
ea35544
c6e9277
99594a2
805f2ab
dad1744
2149dab
1d6420c
0e5c49d
f615ea6
092361b
a43a568
4554dec
63baa74
43f3e78
78f9432
12c9219
4055639
07c9ca8
82a95df
8cf5b2a
0de092f
3dedb95
65fa495
f101f04
19806bb
db29fd6
d1dab58
b18724e
872d376
07dc954
29ed84c
974205f
ede0ace
4f1e9b5
7aa3777
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,38 @@ | ||
Output by X:/travel-model-one-master/utilities/RTP/Emissions/Off Model Calculators/BikeShare.R on Thu Feb 01 10:43:17 2024 | ||
"index","year","category","directory","variable","value" | ||
"2005-SB375 Base year-total_employment",2005,"SB375 Base year","2005_TM160_IPA_02","total_employment",3575933 | ||
"2005-SB375 Base year-total_population",2005,"SB375 Base year","2005_TM160_IPA_02","total_population",7096469 | ||
"2015-Previous base year-total_employment",2015,"Previous base year","2015_TM160_IPA_03","total_employment",3861318 | ||
"2015-Previous base year-total_population",2015,"Previous base year","2015_TM160_IPA_03","total_population",7581396 | ||
"2023-Base year-total_employment",2023,"Base year","2023_TM160_IPA_42","total_employment",4064307 | ||
"2023-Base year-total_population",2023,"Base year","2023_TM160_IPA_42","total_population",7847766 | ||
"2025-Previous Plan-total_employment",2025,"Previous Plan","2025_TM152_FBP_Plus_22","total_employment",4147691 | ||
"2025-Previous Plan-total_population",2025,"Previous Plan","2025_TM152_FBP_Plus_22","total_population",8231265 | ||
"2035-IPA-total_employment",2035,"IPA","2035_TM160_IPA_09","total_employment",4854742 | ||
"2035-IPA-total_employment",2035,"IPA","2035_TM160_IPA_09_minusModePref","total_employment",4854742 | ||
"2035-IPA-total_employment",2035,"IPA","2035_TM160_IPA_09_PBA50aoc","total_employment",4854742 | ||
"2035-IPA-total_employment",2035,"IPA","2035_TM160_IPA_09_PBA50ixex","total_employment",4854742 | ||
"2035-IPA-total_employment",2035,"IPA","2035_TM160_IPA_09_PBA50landuse","total_employment",4834513 | ||
"2035-IPA-total_employment",2035,"IPA","2035_TM160_IPA_10","total_employment",4854742 | ||
"2035-IPA-total_employment",2035,"IPA","2035_TM160_IPA_10_plusEN7","total_employment",4854742 | ||
"2035-IPA-total_employment",2035,"IPA","2035_TM160_IPA_11","total_employment",4854742 | ||
"2035-IPA-total_employment",2035,"IPA","2035_TM160_IPA_11_network2023","total_employment",4854742 | ||
"2035-IPA-total_employment",2035,"IPA","2035_TM160_IPA_12","total_employment",4854742 | ||
"2035-IPA-total_employment",2035,"IPA","2035_TM160_IPA_14","total_employment",4854742 | ||
"2035-IPA-total_employment",2035,"IPA","2035_TM160_IPA_15","total_employment",4854742 | ||
"2035-IPA-total_employment",2035,"IPA","2035_TM160_IPA_16","total_employment",4854742 | ||
"2035-IPA-total_population",2035,"IPA","2035_TM160_IPA_09","total_population",8476332 | ||
"2035-IPA-total_population",2035,"IPA","2035_TM160_IPA_09_minusModePref","total_population",8476332 | ||
"2035-IPA-total_population",2035,"IPA","2035_TM160_IPA_09_PBA50aoc","total_population",8476332 | ||
"2035-IPA-total_population",2035,"IPA","2035_TM160_IPA_09_PBA50ixex","total_population",8476332 | ||
"2035-IPA-total_population",2035,"IPA","2035_TM160_IPA_09_PBA50landuse","total_population",9002950 | ||
"2035-IPA-total_population",2035,"IPA","2035_TM160_IPA_10","total_population",8476332 | ||
"2035-IPA-total_population",2035,"IPA","2035_TM160_IPA_10_plusEN7","total_population",8476332 | ||
"2035-IPA-total_population",2035,"IPA","2035_TM160_IPA_11","total_population",8476332 | ||
"2035-IPA-total_population",2035,"IPA","2035_TM160_IPA_11_network2023","total_population",8476332 | ||
"2035-IPA-total_population",2035,"IPA","2035_TM160_IPA_12","total_population",8476332 | ||
"2035-IPA-total_population",2035,"IPA","2035_TM160_IPA_14","total_population",8476332 | ||
"2035-IPA-total_population",2035,"IPA","2035_TM160_IPA_15","total_population",8476332 | ||
"2035-IPA-total_population",2035,"IPA","2035_TM160_IPA_16","total_population",8476332 | ||
"2035-Previous Plan-total_employment",2035,"Previous Plan","2035_TM152_FBP_Plus_24","total_employment",4834513 | ||
"2035-Previous Plan-total_population",2035,"Previous Plan","2035_TM152_FBP_Plus_24","total_population",9002950 |
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Similarly -- what is this? I would think this should go on Box? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @lmz the IPA_TM2 folder saved in the inputs is the folder we received as an example of what the R script created (previous step). I am not touching this folder design because I'm not updating the R script. I'm only using it as the expected input sample folder. Once we complete all the updates and run a demo, we can remove all the files that are unnecessary in the process, if applicable. My understanding is that we only need the files in IPA_TM2/ModelData. Ideally, we'd like the R script to reflect that. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,54 @@ | ||
import openpyxl | ||
|
||
from helper.calcs import Calc | ||
|
||
class Bikeshare(Calc): | ||
|
||
def __init__(self, *args, **kwargs): | ||
super().__init__(*args, **kwargs) | ||
self.masterWbName="PBA50+_OffModel_Bikeshare" | ||
self.dataFileName="Model Data - Bikeshare" | ||
|
||
def write_runid_to_mainsheet(self): | ||
# get variables location in calculator | ||
Calc.get_variable_locations(self) | ||
|
||
# add run_id to 'Main sheet' | ||
newWorkbook = openpyxl.load_workbook(self.new_workbook_file) | ||
mainsheet = newWorkbook['Main sheet'] | ||
|
||
# Select Main sheet variables | ||
vMS=self.v['Main sheet'] | ||
|
||
# Write run name and year | ||
mainsheet[vMS['Run_directory_2035']] = Calc.get_ipa(self, self.runs[0])[0] | ||
mainsheet[vMS['Run_directory_2050']] = Calc.get_ipa(self, self.runs[1])[0] | ||
mainsheet[vMS['year_a']] = Calc.get_ipa(self, self.runs[0])[1] | ||
mainsheet[vMS['year_b']] = Calc.get_ipa(self, self.runs[1])[1] | ||
|
||
|
||
# save file | ||
newWorkbook.save(self.new_workbook_file) | ||
|
||
if self.verbose: | ||
print(f"Main sheet updated with {self.runs} in location\n{self.new_workbook_file}") | ||
|
||
def update_calculator(self): | ||
|
||
# Step 1: Create run and copy files | ||
Calc.copy_workbook(self) | ||
|
||
# Step 2: load and filter model data of selected runs | ||
modelData, metaData=Calc.get_model_data(self) | ||
|
||
# Step 3: add model data of selected runs to 'Model Data' sheet | ||
Calc.write_model_data_to_excel(self,modelData,metaData) | ||
|
||
# Step 4: | ||
self.write_runid_to_mainsheet() | ||
|
||
## Step 5: open/close Excel, autosave | ||
# todo | ||
# | ||
# Step 6: log runs in master | ||
# todo |
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Please rename this to be something more descriptive; suggest "OffModelCalculator" rather than "Calc". Also, please add documentation for the class variables and methods. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Also, let's just make the args that are passed into the constructor explicit, rather than hiding them in "args" -- there's only two... There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @lmz sounds good. I've integrated this change in a new commit. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,117 @@ | ||
''' | ||
Class for the common calculator methods. | ||
''' | ||
import shutil | ||
import pandas as pd | ||
import re | ||
|
||
from helper import (mons,runs) | ||
|
||
|
||
class Calc: | ||
|
||
def __init__(self, args, verbose=False): | ||
self.runs = [args.model_run_id_2035, args.model_run_id_2050] | ||
self.modelDataPath, self.masterFilePath = mons.get_directory_constants(args.d) | ||
self.masterWbName="" | ||
self.dataFileName="" | ||
self.verbose=verbose | ||
self.varsDir=mons.get_vars_directory(args.d) | ||
self.v=Calc.get_variable_locations(self) | ||
|
||
def copy_workbook(self): | ||
# Start run | ||
newWbFilePath=runs.createNewRun(self.runs) | ||
|
||
# make a copy of the workbook | ||
master_workbook_file = f"{self.masterFilePath}/{self.masterWbName}.xlsx" | ||
self.new_workbook_file = f"{newWbFilePath}/{self.masterWbName}__{self.runs[0]}__{self.runs[1]}.xlsx" | ||
|
||
|
||
shutil.copy2(master_workbook_file, self.new_workbook_file) | ||
|
||
if self.verbose: | ||
print(master_workbook_file) | ||
print(self.new_workbook_file) | ||
|
||
return self.new_workbook_file | ||
|
||
def get_model_metadata(self): | ||
metaData=pd.read_csv( | ||
f"{self.modelDataPath}/{self.dataFileName}.csv", | ||
nrows=0) | ||
|
||
if self.verbose: | ||
print(f"Model Data (R Script) metadata:\n{metaData.columns[0]}") | ||
|
||
return metaData.columns[0] | ||
|
||
def get_model_data(self): | ||
# Get Model Data as df | ||
rawData=pd.read_csv( | ||
f"{self.modelDataPath}/{self.dataFileName}.csv", | ||
skiprows=1) | ||
|
||
filteredData=rawData.loc[rawData.directory.isin(self.runs)] | ||
|
||
# Get metadata from model data | ||
metaData=Calc.get_model_metadata(self) | ||
|
||
if self.verbose: | ||
print("Unique directories:") | ||
print(rawData['directory'].unique()) | ||
|
||
return filteredData, metaData | ||
|
||
def get_ipa(self, arg): | ||
|
||
pattern = r"(\d{4})_(TM\d{3})_(.*)" | ||
matches = re.search(pattern, arg) | ||
|
||
if matches: | ||
ipa = matches.group(3) | ||
year = matches.group(1) | ||
|
||
if self.verbose: | ||
print(ipa) | ||
|
||
return [ipa, int(year)] | ||
|
||
def write_model_data_to_excel(self, data, meta): | ||
|
||
with pd.ExcelWriter(self.new_workbook_file, engine='openpyxl', mode = 'a', if_sheet_exists = 'replace') as writer: | ||
# add metadata | ||
meta=pd.DataFrame([meta]) | ||
meta.to_excel(writer, | ||
sheet_name='Model Data', | ||
index=False, | ||
header=False, | ||
startrow=0, startcol=0) | ||
with pd.ExcelWriter(self.new_workbook_file, engine='openpyxl', mode = 'a', if_sheet_exists = 'overlay') as writer: | ||
# add model data | ||
# this only works with pandas=1.4.3 or later; in earlier version, it will not overwrite sheet, but add new one with sheet name 'Model Data1' | ||
data.to_excel(writer, | ||
sheet_name='Model Data', | ||
index=False, | ||
startrow=1, startcol=0) | ||
|
||
if self.verbose: | ||
print(f"Metadata: {meta}") | ||
|
||
def get_variable_locations(self): | ||
|
||
allVars=pd.read_excel(self.varsDir) | ||
calcVars=allVars.loc[allVars.Workbook.isin([self.masterWbName])].drop(columns=['Workbook','Description']) | ||
groups=set(calcVars.Sheet) | ||
self.v={} | ||
for group in groups: | ||
self.v.setdefault(group,dict()) | ||
self.v[group]=dict(zip(calcVars['Variable Name'],calcVars['Location'])) | ||
|
||
if self.verbose: | ||
print("Calculator variables and locations in Excel:") | ||
print(self.v) | ||
|
||
|
||
|
||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. What does "mons" mean? If these methods are for the Calc class, can they just be static methods of the class? Ditto for methods in There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @lmz "mons" was a shorter name for commons. I will rename the file as commons.py. Here I want to add helper methods that are common between multiple calculators were transformations are needed. I can definitely keep the methods in "commons" and "runs" in the same file. However, I wanted to have the OffModelCalculator class to include only the methods that are necessary in the high-level process. In this way, I can follow the update process in a more clear-cut way. Would that be okay with you? |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,75 @@ | ||
''' | ||
This module includes common functions within the main script. | ||
''' | ||
|
||
def get_directory_constants(dirType): | ||
''' | ||
This function extracts the corresponding relative or absolute paths | ||
used in the external or mtc options. | ||
''' | ||
# directory file paths (input, models, outputs) | ||
if dirType=="external": | ||
from templates.external import ( | ||
MODEL_DATA_BOX_DIR, | ||
OFF_MODEL_CALCULATOR_DIR, | ||
) | ||
else: | ||
from templates.mtc import ( | ||
MODEL_DATA_BOX_DIR, | ||
OFF_MODEL_CALCULATOR_DIR, | ||
) | ||
|
||
return MODEL_DATA_BOX_DIR, OFF_MODEL_CALCULATOR_DIR | ||
|
||
def get_vars_directory(dirType): | ||
# directory file paths (input, models, outputs) | ||
if dirType=="external": | ||
from templates.external import ( | ||
VARS | ||
) | ||
else: | ||
from templates.mtc import ( | ||
VARS | ||
) | ||
|
||
return VARS | ||
|
||
# method depricated, now file names added inside subclasses | ||
def get_master_and_data_file_names(calcChoice): | ||
|
||
if calcChoice=='bike_share': | ||
MASTER_WORKBOOK_NAME="PBA50+_OffModel_Bikeshare" | ||
MODEL_DATA_FILE_NAME="Model Data - Bikeshare" | ||
|
||
elif calcChoice=='car_share': | ||
MASTER_WORKBOOK_NAME="PBA50+_OffModel_Carshare" | ||
MODEL_DATA_FILE_NAME="Model Data - Carshare" | ||
|
||
elif calcChoice=='ebike': | ||
MASTER_WORKBOOK_NAME="PBA50+_OffModel_EBIKE" | ||
MODEL_DATA_FILE_NAME=None # Needs model data name | ||
|
||
elif calcChoice=='regional_charger': | ||
MASTER_WORKBOOK_NAME="PBA50+_OffModel_RegionalCharger" | ||
MODEL_DATA_FILE_NAME=None # Needs model data name | ||
|
||
elif calcChoice=='regional_charger_accii': | ||
MASTER_WORKBOOK_NAME="PBA50+_OffModel_RegionalCharger_ACCII" | ||
MODEL_DATA_FILE_NAME=None # Needs model data name | ||
|
||
elif calcChoice=='targeted_trans_alt': | ||
MASTER_WORKBOOK_NAME="PBA50+_OffModel_TargetedTransAlt" | ||
MODEL_DATA_FILE_NAME="Model Data - Targeted Transportation Alternatives" | ||
|
||
elif calcChoice=='vanpools': | ||
MASTER_WORKBOOK_NAME="PBA50+_OffModel_Vanpools" | ||
MODEL_DATA_FILE_NAME="Model Data - Employer Shuttles" # confirm name | ||
|
||
elif calcChoice=='vehicle_buyback': | ||
MASTER_WORKBOOK_NAME="PBA50+_OffModel_VehicleBuyback" | ||
MODEL_DATA_FILE_NAME=None # Needs model data name | ||
|
||
else: | ||
raise ValueError("Choice not in options. Check the calculator name is correct.") | ||
|
||
return MASTER_WORKBOOK_NAME, MODEL_DATA_FILE_NAME |
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The terminology is confusing here... what a "NewRun"? Internally, we think of model runs being Travel Model Runs, but we wouldn't be creating a new run here. Suggest some clarity -- if not in the method naming, through method documentation. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @lmz sounds good! I've added the method documentation. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,39 @@ | ||
import os | ||
from templates.external import ( | ||
OFF_MODEL_CALCULATOR_DIR_OUTPUT | ||
) | ||
|
||
def getNextFilePath(output_folder, run): | ||
|
||
lastRunId=0 | ||
for f in os.listdir(output_folder): | ||
fileNameList=f.split("__") | ||
if f"{fileNameList[0]}__{fileNameList[1]}"== run: | ||
if int(fileNameList[2])>lastRunId: | ||
lastRunId=int(fileNameList[2]) | ||
|
||
else: | ||
continue | ||
|
||
return lastRunId + 1 | ||
|
||
def createNewRun(model_run_ids_list, verbose=False): | ||
|
||
runName=model_run_ids_list[0]+"__"+model_run_ids_list[0] | ||
pathToRun=f"{OFF_MODEL_CALCULATOR_DIR_OUTPUT}/{runName}__0" | ||
|
||
if not os.path.exists(pathToRun): | ||
runNameNumber=runName+"__0" | ||
os.makedirs(pathToRun) | ||
|
||
else: | ||
runID=getNextFilePath(OFF_MODEL_CALCULATOR_DIR_OUTPUT,runName) | ||
runNameNumber=f"{runName}__{runID}" | ||
pathToRun=f"{OFF_MODEL_CALCULATOR_DIR_OUTPUT}/{runNameNumber}" | ||
os.makedirs(pathToRun) | ||
|
||
if verbose: | ||
print(f"New run created: {runNameNumber}") | ||
print(f"Location: {pathToRun}") | ||
|
||
return pathToRun |
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. What is this? Does it need to be excel? It makes it hard to see diffs, so if it could be a CSV instead, that would be preferrable. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @lmz this file was created with the intention that the Off Model Calculator manager could update the Calculators in the folder "models" and then, update the Variable_locations Excel file with the added/updated cell locations. We're using excel as a UI to make it easier to update. The end goal is for the manager to update the calculators and be able to run the script even when cells have changed location in the individual calculators. Now, the Off Model Calculators latest version will be in the folder "models" to be used as the source of truth. Those files will keep a log of the runs that have been executed. |
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This folder (templates) has three files which makes you click around and the paths could just be done using an if statement without this kind of complexity. A simple if would make it clear what variables need to be defined for each environment and this structure is more error prone. Suggest deprecating this, and just inlining this logic somewhere else that's logical. Also, please use pathlib for paths rather than strings with There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @lmz Thanks for the suggestion, I will keep them together instead in multiple locations and propose a new location for this logic. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,17 @@ | ||
|
||
import os | ||
|
||
# from update_offmodel_calculator_workbooks_with_TM_output import ABS_DIRNAME | ||
# ABS_DIRNAME=r"C:\Users\63330\Documents\projects\MTC\travel-model-one\utilities\RTP\Emissions\Off Model Calculators".replace("\\","/") | ||
ABS_DIRNAME=os.path.join(os.path.dirname(__file__),"..").replace("\\","/") | ||
# Input data paths | ||
BOX_DIR = ABS_DIRNAME+r"\data\input\IPA_TM2".replace("\\","/") | ||
MODEL_DATA_BOX_DIR = BOX_DIR+"/ModelData" | ||
|
||
# Models | ||
OFF_MODEL_CALCULATOR_DIR = ABS_DIRNAME+r"\models".replace("\\","/") | ||
# Output | ||
OFF_MODEL_CALCULATOR_DIR_OUTPUT = ABS_DIRNAME+r"\data\output".replace("\\","/") | ||
|
||
# Variables locations | ||
VARS=ABS_DIRNAME+r"\models\Variable_locations.xlsx".replace("\\","/") |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,21 @@ | ||
|
||
import os | ||
|
||
''' | ||
Import the absolute paths used within the MTC team. | ||
E.g. links to box or other local directories. | ||
''' | ||
# Input data paths | ||
BOX_DIR = 'C:\Users\{}\Box\Plan Bay Area 2050+\Blueprint\Off-Model\PBA50+ Off-Model'\ | ||
.format(os.environ.get('USERNAME')) | ||
MODEL_DATA_BOX_DIR = os.path.join(BOX_DIR, 'model_data_all') | ||
|
||
# Models | ||
OFF_MODEL_CALCULATOR_DIR = os.path.join( | ||
BOX_DIR, 'DBP_v2', 'PBA50+ Off-Model Calculators') | ||
|
||
# Outputs | ||
OFF_MODEL_CALCULATOR_DIR_OUTPUT = OFF_MODEL_CALCULATOR_DIR | ||
|
||
# Variables locations | ||
VARS=r"utilities\RTP\Emissions\Off Model Calculators\models\Variable_locations.xlsx".replace("\\","/") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think these files should live on Box rather than in GitHub. It's likely our workflow will include a number of iterations on model runs and there's no need to repeatedly commit data files to GitHub since they don't serve any purpose here, and Box keeps versions/timestamps.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lmz I agree, the purpose of these files (at least having a couple of them as samples) is to give flexibility to people who don't have access to BOX to run the whole script as a demo. This is helpful for me or people on our team. In this way, anyone can demo the script and provide improvements later. The end goal is that for all MTC team, they'll automatically use the paths in the mtc.py file which point directly to BOX. Additionally, it's a way to clearly state as a sample the shape in which we are expecting to receive the inputs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's trivial to give other folks access to Box; please request if you need others to have access. We generally keep data files on Box and not on GitHub, and I'd prefer to keep it that way; having it both ways adds confusion.