first implementation of steel model with revised structure #8
Changes from all commits: a0f0a45, 8576c9e, 16bf241, 1c5cdf3, 4024e7a
Changes to the common stock extrapolation (`extrapolate_stock` / `gdp_regression`):

```diff
@@ -50,6 +50,9 @@ def extrapolate_stock(
     if curve_strategy == "GDP_regression":
         gdp_regression(historic_stocks_pc.values, parameters['gdppc'].values, stocks_pc.values)
+    elif curve_strategy == 'Exponential_GDP_regression':
+        gdp_regression(historic_stocks_pc.values, parameters['gdppc'].values, stocks_pc.values,
+                       fitting_function_type='exponential')
     else:
         raise RuntimeError(f"Prediction strategy {curve_strategy} is not defined. "
                            f"It needs to be 'GDP_regression'.")

@@ -61,7 +64,7 @@ def extrapolate_stock(
     return StockArray(**dict(stocks))


-def gdp_regression(historic_stocks_pc, gdppc, prediction_out):
+def gdp_regression(historic_stocks_pc, gdppc, prediction_out, fitting_function_type='sigmoid'):
     shape_out = prediction_out.shape
     assert len(shape_out) == 3, "Prediction array must have 3 dimensions: Time, Region, Good"
     pure_prediction = np.zeros_like(prediction_out)

@@ -71,23 +74,48 @@ def gdp_regression(historic_stocks_pc, gdppc, prediction_out):
     for i_region in range(shape_out[1]):
         for i_good in range(shape_out[2]):

-            def fitting_function(prms):
+            def sigmoid_fitting_function(prms):
                 return (
                     prms[0] / (1. + np.exp(prms[1]/gdppc[:n_historic,i_region]))
                 ) - historic_stocks_pc[:,i_region,i_good]
                 # Lagrangian multiplier to ensure matching last point:
                 # + prms[2] * prms[0] / (1. + np.exp(prms[1]/gdppc[i_lh,0])) - stocks_pc[-1,0,i] )
-            prms_out = least_squares(
-                fitting_function,
-                x0=np.array([2.*gdppc[i_lh,i_region], historic_stocks_pc[-1,i_region,i_good]]),
-                gtol=1.e-12
-            )
-            assert prms_out.success
-
-            pure_prediction[:,i_region,i_good] = prms_out.x[0] / (
-                1. + np.exp(prms_out.x[1]/gdppc[:,i_region])
-            )
+
+            def exponential_fitting_function(prms):
+                return (
+                    prms[0] * (1 - np.exp(-prms[1]*gdppc[:n_historic,i_region]))
+                ) - historic_stocks_pc[:,i_region,i_good]
+
+            if fitting_function_type == 'sigmoid':
+                prms_out = least_squares(
+                    sigmoid_fitting_function,
+                    x0=np.array([2.*gdppc[i_lh,i_region], historic_stocks_pc[-1,i_region,i_good]]),
+                    gtol=1.e-12
+                )
+
+                pure_prediction[:, i_region, i_good] = prms_out.x[0] / (
+                    1. + np.exp(prms_out.x[1] / gdppc[:, i_region])
+                )
+            elif fitting_function_type == 'exponential':
+                prms_out = least_squares(
+                    exponential_fitting_function,
+                    x0=np.array([400, 1]),
+                    # TODO These values are model dependent and shouldn't be hard coded here
+                    # TODO for now, these values are placeholders that work for the steel model but might produce
+                    # TODO unrealistic predictions.
+                    # TODO Possibly, the x0 guesses can be passed as a parameter to this function, see PR 8 discussion.
+                    gtol=1.e-12
+                )
+
+                pure_prediction[:, i_region, i_good] = (prms_out.x[0] *
+                                                        (1 - np.exp(-prms_out.x[1]*gdppc[:,i_region])))
+            else:
+                raise RuntimeError(f'Fitting function type needs to be either \n'
+                                   f'sigmoid or exponential, not {fitting_function_type}.')
+
+            assert prms_out.success

             # def fitting_function(prms):
             # return 2.*gdppc[i_lh,0] / (1. + np.exp(prms[0]/gdppc[cfg.i_historic,0])) - stocks_pc[:,0,i]
             # prms_out = least_squares(fitting_function, x0=np.array([stocks_pc[-1,0,i]]))
```
Review discussion on the exponential initial guess `x0=np.array([400, 1])`:

Reviewer: Is this supposed to be the same x0 as used in the sigmoid case?

Author: No, I think they should be different, at least the second one. The first one should be the saturation level, which I guess should be similar with both approaches, but the way it is calculated in the sigmoid case is, I think, specific to the plastics model (it might make sense to change that structure at some point, as this is in the common section now, e.g. by specifying in the config whether and how the initial guess of the saturation level and other parameters should be calculated). In the former steel model, we had different versions of this regression: one where the parameters were optimised like here, and one where we calculated them numerically. As I changed the inflow-driven model calculation, I still need to look at the numbers to make this decision, so this might change soon again.

Reviewer: Okay, I didn't realise this was a model-specific initial guess. If you were using both options (exponential and sigmoid) for the steel model (e.g. to compare results from different approaches), you would then have the same x0, right? Or is it also dependent on the fitting function? Maybe we could pass x0 as a parameter to the gdp_regression function, but I think I would keep hard-coded values somewhere (and not require the user to specify them in the config), maybe directly in the steel and plastics models, or by putting (or calling) this gdp_regression in the steel/plastics directories rather than common. For now, please just add a little comment to the code with your explanation above and I will think about what to do with it later!

Author: The initial guess x0 would be different for the different fitting functions, at least for the second parameter. In both fitting functions here, the first parameter describes the saturation level, whilst the second one defines some other variable that influences the curvature of the function, which will be very different depending on the form of the function. One could also imagine fitting functions that don't use a saturation level at all in the future, hence they should always be different. I will add the comment like you said, and as said, I also need to look at the numbers a little more myself - happy to change the location of the x0 values to whatever you propose!
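To make the option discussed above concrete (supplying the initial guess from the model-specific code while keeping `gdp_regression` in `common`), here is a minimal sketch. The `x0_getter` argument, the `predict` helpers, and deriving `n_historic`/`i_lh` from the input shape are illustrative assumptions, not part of the PR:

```python
import numpy as np
from scipy.optimize import least_squares


def gdp_regression(historic_stocks_pc, gdppc, prediction_out,
                   fitting_function_type='sigmoid', x0_getter=None):
    """Sketch only: x0_getter(i_region, i_good) lets the steel/plastics model
    supply its own initial guess; otherwise a per-function default is used."""
    shape_out = prediction_out.shape
    assert len(shape_out) == 3, "Prediction array must have 3 dimensions: Time, Region, Good"
    pure_prediction = np.zeros_like(prediction_out)
    n_historic = historic_stocks_pc.shape[0]  # assumed: historic years come first
    i_lh = n_historic - 1                     # index of the last historic year

    for i_region in range(shape_out[1]):
        for i_good in range(shape_out[2]):
            if fitting_function_type == 'sigmoid':
                def residuals(prms):
                    return (prms[0] / (1. + np.exp(prms[1] / gdppc[:n_historic, i_region]))
                            ) - historic_stocks_pc[:, i_region, i_good]
                def predict(prms):
                    return prms[0] / (1. + np.exp(prms[1] / gdppc[:, i_region]))
                default_x0 = np.array([2. * gdppc[i_lh, i_region],
                                       historic_stocks_pc[-1, i_region, i_good]])
            elif fitting_function_type == 'exponential':
                def residuals(prms):
                    return (prms[0] * (1 - np.exp(-prms[1] * gdppc[:n_historic, i_region]))
                            ) - historic_stocks_pc[:, i_region, i_good]
                def predict(prms):
                    return prms[0] * (1 - np.exp(-prms[1] * gdppc[:, i_region]))
                default_x0 = np.array([400., 1.])  # placeholder, model dependent
            else:
                raise RuntimeError(f"Unknown fitting function type {fitting_function_type}.")

            # Model-specific guess wins over the hard-coded default if one is supplied.
            x0 = default_x0 if x0_getter is None else x0_getter(i_region, i_good)
            prms_out = least_squares(residuals, x0=x0, gtol=1.e-12)
            assert prms_out.success
            pure_prediction[:, i_region, i_good] = predict(prms_out.x)
```

The steel model could then call it with, for example, `x0_getter=lambda i_region, i_good: np.array([400., 1.])`, keeping the hard-coded values out of the common module, which seems to be the direction the discussion leans towards.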
New file added by the PR (hunk header `@@ -0,0 +1,190 @@`, 190 added lines):

```python
import numpy as np
from numpy.linalg import inv

from sodym import MFASystem

from simson.common.inflow_driven_mfa import InflowDrivenHistoricMFA
from simson.common.common_cfg import CommonCfg


class InflowDrivenHistoricSteelMFASystem(InflowDrivenHistoricMFA):

    def compute(self):
        """
        Perform all computations for the MFA system.
        """
        self.compute_historic_flows()
        self.compute_historic_in_use_stock()
        self.check_mass_balance()

    def compute_historic_flows(self):
        prm = self.parameters
        flw = self.flows
        stk = self.stocks
        scp = self.scalar_parameters

        aux = {
            'net_intermediate_trade': self.get_new_array(dim_letters=('h','r','i')),
            'fabrication_by_sector': self.get_new_array(dim_letters=('h','r','g')),
            'fabrication_loss': self.get_new_array(dim_letters=('h','r','g')),
            'fabrication_error': self.get_new_array(dim_letters=('h','r'))
        }

        flw['sysenv => forming'][...] = prm['production_by_intermediate']
        flw['forming => ip_market'][...] = prm['production_by_intermediate'] * prm['forming_yield']
        flw['forming => sysenv'][...] = flw['sysenv => forming'] - flw['forming => ip_market']

        # Todo: add trade

        aux['net_intermediate_trade'][...] = flw['ip_trade => ip_market'] - flw['ip_market => ip_trade']
        flw['ip_market => fabrication'][...] = flw['forming => ip_market'] + aux['net_intermediate_trade']

        aux['fabrication_by_sector'][...] = self._calc_sector_flows(
            flw['ip_market => fabrication'],
            prm['good_to_intermediate_distribution'])

        aux['fabrication_error'] = flw['ip_market => fabrication'] - aux['fabrication_by_sector']

        flw['fabrication => use'][...] = aux['fabrication_by_sector'] * prm['fabrication_yield']
        aux['fabrication_loss'][...] = aux['fabrication_by_sector'] - flw['fabrication => use']
        flw['fabrication => sysenv'][...] = aux['fabrication_error'] + aux['fabrication_loss']
        # Todo: add intermediate trade

        return

    def _calc_sector_flows(self, intermediate_flow, gi_distribution):
        """
        Estimate the fabrication by in-use-good according to the inflow of intermediate products
        and the good to intermediate product distribution.
        """

        # The following calculation is based on
        # https://en.wikipedia.org/wiki/Overdetermined_system#Approximate_solutions
        # gi_values represents 'A', hence the variable at_a is A transposed times A
        # 'b' is the intermediate flow and x are the sector flows that we are trying to find out

        gi_values = gi_distribution.values.transpose()
        at_a = np.matmul(gi_values.transpose(), gi_values)
```
Review discussion on the variable names in `_calc_sector_flows`:

Reviewer: Can we have some more self-explanatory variable names here, please?

Author: Sure, definitely. I wasn't really sure how, though: gi_values I can definitely rename; at_a here means matrix A transposed times matrix A. Basically, I am solving for a parameter here according to a standard optimisation approximation Jakob proposed (to minimise Ax - b). I guess the options are to either ...

Reviewer: Okay, yes, it's difficult when there isn't really a physical meaning, and squashing all of that into line 74 is probably too much. I also think a comment is the best option.
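For reference, the same approximate solution x ≈ (AᵀA)⁻¹Aᵀb that the code comment describes can also be obtained with `np.linalg.lstsq`, which removes the hard-to-name intermediates (`at_a`, `inverse_at_a_times_at`) and the explicit matrix inverse. A sketch under the assumption that `gi_values` is the transposed distribution matrix A with shape (intermediate, good), as in the PR; the helper name and the reshaping are illustrative:

```python
import numpy as np


def calc_sector_flow_values(intermediate_flow_values, gi_values):
    """Least-squares estimate of the sector flows x from the intermediate flows b,
    minimising ||A x - b|| independently for every (historic year, region) pair.

    intermediate_flow_values: shape (h, r, i); gi_values: A with shape (i, g).
    """
    h, r, i = intermediate_flow_values.shape
    # Stack the (h, r) systems as columns so one lstsq call solves them all.
    b = intermediate_flow_values.reshape(h * r, i).T        # shape (i, h*r)
    x, *_ = np.linalg.lstsq(gi_values, b, rcond=None)       # shape (g, h*r)
    sector_flow_values = x.T.reshape(h, r, -1)              # back to (h, r, g)
    # don't allow negative sector flows
    return np.maximum(0, sector_flow_values)
```

Whether that reads better than the explicit normal equations plus the existing comment is exactly the judgement call made in the thread; it is only a drop-in idea, not something the PR does. The remainder of the new file follows.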
```python
        inverse_at_a = inv(at_a)
        inverse_at_a_times_at = np.matmul(inverse_at_a, gi_values.transpose())
        sector_flow_values = np.einsum('gi,hri->hrg', inverse_at_a_times_at, intermediate_flow.values)

        # don't allow negative sector flows
        sector_flow_values = np.maximum(0, sector_flow_values)

        sector_flows = self.get_new_array(dim_letters=('h','r','g'))
        sector_flows.values = sector_flow_values

        return sector_flows

    def compute_historic_in_use_stock(self):
        flw = self.flows
        stk = self.stocks
        stk['in_use'].inflow[...] = flw['fabrication => use'] + flw['indirect_trade => use'] - flw['use => indirect_trade']
        stk['in_use'].compute()

        return


class StockDrivenSteelMFASystem(MFASystem):

    def compute(self):
        """
        Perform all computations for the MFA system.
        """
        self.compute_flows()
        self.compute_other_stocks()
        self.check_mass_balance()

    def compute_flows(self):
        # abbreviations for better readability
        prm = self.parameters
        flw = self.flows
        stk = self.stocks
        scp = self.scalar_parameters

        # auxiliary arrays;
        # It is important to initialize them to define their dimensions. See the NamedDimArray documentation for details.
        # could also be single variables instead of dict, but this way the code looks more uniform
        aux = {
            'total_fabrication': self.get_new_array(dim_letters=('t', 'e', 'r', 'g', 's')),
            'production': self.get_new_array(dim_letters=('t', 'e', 'r', 'i', 's')),
            'forming_outflow': self.get_new_array(dim_letters=('t', 'e', 'r', 's')),
            'scrap_in_production': self.get_new_array(dim_letters=('t', 'e', 'r', 's')),
            'available_scrap': self.get_new_array(dim_letters=('t', 'e', 'r', 's')),
            'eaf_share_production': self.get_new_array(dim_letters=('t', 'e', 'r', 's')),
            'production_inflow': self.get_new_array(dim_letters=('t', 'e', 'r', 's')),
            'max_scrap_production': self.get_new_array(dim_letters=('t', 'e', 'r', 's')),
            'scrap_share_production': self.get_new_array(dim_letters=('t', 'e', 'r', 's')),
            'bof_production_inflow': self.get_new_array(dim_letters=('t', 'e', 'r', 's')),
        }

        # Slicing on the left-hand side of the assignment (foo[...] = bar) is used to assign only the values
        # of the flows, not the NamedDimArray object managing the dimensions.
        # This way, the dimensions of the right-hand side of the assignment can be automatically reduced
        # and re-ordered to the dimensions of the left-hand side.
        # For further details on the syntax, see the NamedDimArray documentation.

        # Pre-use

        flw['fabrication => use'][...] = stk['use'].inflow
        aux['total_fabrication'][...] = flw['fabrication => use'] / prm['fabrication_yield']
        flw['fabrication => fabrication_buffer'][...] = aux['total_fabrication'] - flw['fabrication => use']
        flw['ip_market => fabrication'][...] = aux['total_fabrication'] * prm['good_to_intermediate_distribution']
        flw['forming => ip_market'][...] = flw['ip_market => fabrication']
        aux['production'][...] = flw['forming => ip_market'] / prm['forming_yield']
        aux['forming_outflow'][...] = aux['production'] - flw['forming => ip_market']
        flw['forming => sysenv'][...] = aux['forming_outflow'] * scp['forming_losses']
        flw['forming => fabrication_buffer'][...] = aux['forming_outflow'] - flw['forming => sysenv']

        # Post-use

        flw['use => outflow_buffer'][...] = stk['use'].outflow
        flw['outflow_buffer => eol_market'][...] = flw['use => outflow_buffer'] * prm['recovery_rate']
        flw['outflow_buffer => obsolete'][...] = flw['use => outflow_buffer'] - flw['outflow_buffer => eol_market']
        flw['eol_market => recycling'][...] = flw['outflow_buffer => eol_market']
        flw['recycling => scrap_market'][...] = flw['eol_market => recycling']
        flw['fabrication_buffer => scrap_market'][...] = flw['forming => fabrication_buffer'] + flw['fabrication => fabrication_buffer']

        # PRODUCTION

        aux['production_inflow'][...] = aux['production'] / scp['production_yield']
        aux['max_scrap_production'][...] = aux['production_inflow'] * scp['max_scrap_share_base_model']
        aux['available_scrap'][...] = flw['recycling => scrap_market'] + flw['fabrication_buffer => scrap_market']
        aux['scrap_in_production'][...] = aux['available_scrap'].minimum(aux['max_scrap_production'])  # using NumPy minimum functionality
        flw['scrap_market => excess_scrap'][...] = aux['available_scrap'] - aux['scrap_in_production']
        aux['scrap_share_production']['Fe'][...] = aux['scrap_in_production']['Fe'] / aux['production_inflow']['Fe']
        aux['eaf_share_production'][...] = aux['scrap_share_production'] - scp['scrap_in_bof_rate']
        aux['eaf_share_production'][...] = aux['eaf_share_production'] / (1 - scp['scrap_in_bof_rate'])
        aux['eaf_share_production'][...] = aux['eaf_share_production'].minimum(1).maximum(0)
        flw['scrap_market => eaf_production'][...] = aux['production_inflow'] * aux['eaf_share_production']
        flw['scrap_market => bof_production'][...] = aux['scrap_in_production'] - flw['scrap_market => eaf_production']
        aux['bof_production_inflow'][...] = aux['production_inflow'] - flw['scrap_market => eaf_production']
        flw['sysenv => bof_production'][...] = aux['bof_production_inflow'] - flw['scrap_market => bof_production']
        flw['bof_production => forming'][...] = aux['bof_production_inflow'] * scp['production_yield']
        flw['bof_production => sysenv'][...] = aux['bof_production_inflow'] - flw['bof_production => forming']
        flw['eaf_production => forming'][...] = flw['scrap_market => eaf_production'] * scp['production_yield']
        flw['eaf_production => sysenv'][...] = flw['scrap_market => eaf_production'] - flw['eaf_production => forming']

        return

    def compute_other_stocks(self):
        stk = self.stocks
        flw = self.flows

        # in-use stock is already computed in compute_in_use_stock

        stk['obsolete'].inflow[...] = flw['outflow_buffer => obsolete']
        stk['obsolete'].compute()

        stk['excess_scrap'].inflow[...] = flw['scrap_market => excess_scrap']
        stk['excess_scrap'].compute()

        # TODO: Delay buffers?

        stk['outflow_buffer'].inflow[...] = flw['use => outflow_buffer']
        stk['outflow_buffer'].outflow[...] = flw['outflow_buffer => eol_market'] + flw['outflow_buffer => obsolete']
        stk['outflow_buffer'].compute()

        stk['fabrication_buffer'].inflow[...] = flw['forming => fabrication_buffer'] + flw['fabrication => fabrication_buffer']
        stk['fabrication_buffer'].outflow[...] = flw['fabrication_buffer => scrap_market']
        stk['fabrication_buffer'].compute()
```
Review discussion on the region/goods loops in `gdp_regression`:

Reviewer: Not sure whether this would work, but: if we do the above, could we take out the loop over regions and calculate using arrays? e.g. ...

Author: Hmm, good point. I actually just adopted this from what was here, which I am assuming stemmed from the plastics model (or my memory is getting quite bad and I coded this some time ago :) ). I can try it out; I guess it could even be possible to take out the loop over goods as well. We might need to reshape the initial parameters then, which might look messy - it depends a bit on SciPy. I'm going to look into it.

Reviewer: Nice. Yes, I know this was not actually in your PR, but the whole function is a bit messy, so I was thinking about how we can make it a bit cleaner. Reshaping parameters might be less readable than one loop (or maybe not, I'm not sure). It would likely be more efficient to reshape instead of looping, but I don't think we have an efficiency problem at the moment, so readability would be my priority!

Author: So I looked into it now, and SciPy requires x0 to be at most one-dimensional. That means that if we cancelled one or both loops, we would need to 'flatten' x0, so that (if we only kept the loop over regions) x0 would look something like this: [sat_level_good_0, sat_level_good_1, sat_level_good_2, sat_level_good_3, curvature_prm_good_0, curvature_prm_good_1, curvature_prm_good_2, curvature_prm_good_3]. I think that would severely affect readability, so following up on what you said, I would leave it as it is for now (unless we find another solution).
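For concreteness, here is a sketch of what the flattened-x0 variant described above could look like for the sigmoid case, with only the region loop kept. All names and shapes are illustrative assumptions, not code from the PR:

```python
import numpy as np
from scipy.optimize import least_squares


def fit_region_sigmoid(gdppc_region, historic_stocks_pc_region):
    """gdppc_region: shape (n_historic,); historic_stocks_pc_region: shape (n_historic, n_goods)."""
    n_historic, n_goods = historic_stocks_pc_region.shape

    def residuals(prms_flat):
        # First n_goods entries are saturation levels, the rest are curvature parameters.
        sat = prms_flat[:n_goods]
        curv = prms_flat[n_goods:]
        pred = sat / (1. + np.exp(curv / gdppc_region[:, None]))   # shape (n_historic, n_goods)
        return (pred - historic_stocks_pc_region).ravel()          # least_squares wants a 1-D residual

    x0 = np.concatenate([
        np.full(n_goods, 2. * gdppc_region[-1]),   # saturation-level guesses
        historic_stocks_pc_region[-1, :],          # second-parameter guesses, mirroring the PR's sigmoid x0
    ])
    result = least_squares(residuals, x0=x0, gtol=1.e-12)
    assert result.success
    return result.x[:n_goods], result.x[n_goods:]
```

Since the per-good residuals stay independent, this should not change the fitted parameters; it only trades the inner loop for the index bookkeeping that the thread flags as the readability cost, which supports the decision to keep the loops for now.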