Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generic monte carlo simulation #1296

Merged
merged 30 commits into from
Nov 29, 2023
Merged

Generic monte carlo simulation #1296

merged 30 commits into from
Nov 29, 2023

Conversation

sbenthall
Copy link
Contributor

@sbenthall sbenthall commented Jun 30, 2023

This PR is a start to an implementing a generic sim_one_period method for Monte Carlo approaches. . See #1295

It is a standalone function that takes as arguments:

  • an ordered mapping of variable names to callable functions to set their value.
    • the parameter names to these 'function equations' correspond to model variable names!
    • a special Control object can be used instead of a function here to show a variable needs a decision rule
  • a mapping from variable names to start-of-period values
  • a mapping from control variable names to decision rules

The transition function is then very simple.

Not yet implemented:

  • drawing for exogenous shocks
  • what else?
  • Tests for new functionality/models or Tests to reproduce the bug-fix in code.
  • Updated documentation of features that add new functionality.
  • Update CHANGELOG.md with major/minor changes.


###############################################################3

'''
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

whoops ignore the stuff below this line. I was going to develop more robust tests from this material later.

@codecov
Copy link

codecov bot commented Jun 30, 2023

Codecov Report

Attention: 11 lines in your changes are missing coverage. Please review.

Comparison is base (87918ed) 73.02% compared to head (a329473) 73.33%.
Report is 1 commits behind head on master.

Files Patch % Lines
HARK/models/perfect_foresight_normalized.py 0.00% 5 Missing ⚠️
HARK/simulation/monte_carlo.py 96.57% 5 Missing ⚠️
HARK/core.py 66.66% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1296      +/-   ##
==========================================
+ Coverage   73.02%   73.33%   +0.30%     
==========================================
  Files          79       83       +4     
  Lines       13377    13596     +219     
==========================================
+ Hits         9769     9971     +202     
- Misses       3608     3625      +17     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@sbenthall
Copy link
Contributor Author

Current draw_shocks implementation here doesn't support age-dependent draws. I think it can easily be extended to do so using this feature of TimeVaryingDiscreteDistribution:
https://github.com/econ-ark/HARK/blob/master/HARK/distribution.py#L1439-L1472

This feature is not yet used in the ConsIndShockModel get_shocks method. I suspect that's because nobody's bothered to do that yet. But I'm mentioning this here in case there's a principled reason.

https://github.com/econ-ark/HARK/blob/master/HARK/ConsumptionSaving/ConsIndShockModel.py#L2229-L2295

The broader issue is what assumptions we make about aging and mortality in the Monte Carlo simulator, and to what extent they are part of 'dynamics' or something else. My sense is that the AgentType class makes somewhat strong assumptions about these things in order to, for example, guarantee a constant size population.

I'd like for this PR to do as much as it can without making assumptions about this part of the framework.

@sbenthall
Copy link
Contributor Author

Thinking this through -- I think the next move on this is a MonteCarloSimulator class, which takes a model and a decision rule, and does the work of the AgentType simulation parts. This will build on #1292.

We've discussed this as the right direction. I suppose it will be smoother if it mimics AgentType's functionality as much as possible. The goal would be for it to replace the AgentType monte carlo simulator entirely, which would be a forcing function for AgentType models to include model dynamics in a logically separate way.

@sbenthall sbenthall changed the title [WIP] Generic monte carlo sim_one_period [WIP] Generic monte carlo simulation Jul 21, 2023
@sbenthall
Copy link
Contributor Author

sbenthall commented Jul 21, 2023

This PR now has a draft of an AgentTypeMonteCarloSimulator class.

HARK.core.AgentType is currently ~950 lines of code.
HARK.simulation.monte_carlo, in this PR, is currently ~550 lines of code.

The aim is to implement the basic current AgentType simulation functionality in HARK.simulation.monte_carlo, making some key architectural changes along the way.
AgentType code and models can later be refactored to take advantage of this generic functionality.

Some changes that are coming up:

  • Instead of depending on an AgentType model, the simulator takes:
    • parameters,
    • shocks (dictionary-like of distributions)
    • dynamics (transition equations)
    • decision rules (governing any control/action variables)
  • Drop 'shocks', 'states', 'controls', and 'post-states' distinction at the data structure level. Instead, use flat 'vars'.
  • draw from shocks and simulate dynamics using generic functions rather than model inheritance
  • I'm just tracking all the variables in history.

This design is feeling lighter weight.

A few things I haven't worked out yet are:

  • What to do with the rather heavy make_shock_history and read_shock functionality. I understand why it's there, but I wonder if there's a way to do this more cleanly with the new design.
  • AgentType currently depends on model inheritance for sim_death and sim_birth but these need to be abstracted somehow.

Other things I'm thinking about:

  • How to handle age-dependent and aggregate shock draws.

@sbenthall
Copy link
Contributor Author

Current choice is to have an initial input to the simulator that works just like the shocks input, but is only drawn from when an agent is born. I'm wondering if this is general enough to cover the existing models in HARK.

@sbenthall
Copy link
Contributor Author

@mnwhite I am thinking that we need a kind of modeling primitive to deal with termination/death.

I think the cleanest way to do this is to stipulate that there is a particular variable, Live (or Die, ... you'll get the idea....) If Live == 1, then the agent continues as normal. But when Live == 0, the agent dies.

Then, setting this state can be part of the dynamics, or it can be drawn as a shock -- the simulation architecture doesn't care how that variable is set. It just monitors it and, if it's 0, it initiates the death routine.

Does that make sense? Any alternate ideas?

@sbenthall
Copy link
Contributor Author

Having who_dies be a variable that can be due to either shocks or dynamics breaks the assumptions in make_shock_history, according to which who_dies can be age-dependent, but not endogenous.

However, this assumption that who_dies is not endogenous is not guaranteed by the framework, because sim_death can be overwritten by subclasses. This assumption is also not documented.

Given that we have control over the random seed, I would assume that this feature is rather rarely used. The make_shock_history and read_shocks code is long; I think I can streamline it in the Monte Carlo solver and get the same feel.

@sbenthall
Copy link
Contributor Author

@Mv77 has touched the read_shock_history bits, so is another good reviewer for this PR.

@mnwhite
Copy link
Contributor

mnwhite commented Jul 27, 2023 via email

@sbenthall
Copy link
Contributor Author

Ok, so after exchanging with Matt and Mateo about this, I've implemented mortality in what I think is a very clean and flexible way.

The MonteCarloSimulator expects agents to have a variable, live. If live is less than or equal to zero, they are considered dead and are reborn in the following period.

live can be set in a number of ways:

  • as an initial conditions of agents. (For most models, agents will start with live == 1 and this may never change.
  • as an exogenous shock, i.e.: 'live' : Bernoulli(p=0.98)
  • as a dynamic, i.e.: 'live' : lambda omega, health: health < omega

There is no more sim_death in this implementation because all model behavior is supposed to be passed in through the configuration objects (parameters, shocks, dynamics, decision rules, initial conditions).

I believe this is:

  • expressive enough to cover any of the models currently implemented in HARK
  • more expressive than those models, because no the cause of death can be exogenous

I don't think it does everything on @mnwhite 's simulation bucket list, which I think may be out of scope of this PR, but does some of it. See this comment: #1295 (comment)

I believe this PR is now ready for review. Requesting feedback from @mnwhite and @Mv77

@sbenthall sbenthall requested review from mnwhite and Mv77 July 27, 2023 14:20
@sbenthall sbenthall changed the title [WIP] Generic monte carlo simulation Generic monte carlo simulation Jul 27, 2023
@Mv77
Copy link
Contributor

Mv77 commented Jul 27, 2023

Cool, that sounds like an improvement. I'll check this (hopefully) soon.

@llorracc
Copy link
Collaborator

llorracc commented Jul 27, 2023 via email

@sbenthall
Copy link
Contributor Author

OK, there is now a notebook in this PR that directly compares the HARK 0.13 Perfect Foresight simulation with the results from using the HARK 0.13 Perfect Foresight solution with the generic Monte Carlo simulator I've been working on here.

The good news is that it executes in reasonable-looking way while using the Python PF model definition.
I did need to make one small change to the model compared to how it was originally merged in:
5ef3a19

This means that in the Python configured model, the update of income y is not lagged behind the update of permanent income p as I believe it was in the HARK version.

The bad news is that the two simulation outputs look different! I'm not sure if this is caused by the model change I just described, or a deeper problem.

Here's mean normalized market resources over time from H0.13 PF:

image

Here's mean normalized market resources over time from the current state of this PR:

image
Which is pretty weird.

Here is the notebook
https://github.com/econ-ark/HARK/blob/b95b0f4a20092651481413cf400203040d055137/examples/MonteCarlo/Generic%20Monte%20Carlo%20Perfect%20Foresight.ipynb

I can keep trying to fix this up.

But I think that at this point, the PR would benefit hugely from a review from @mnwhite .

@mnwhite
Copy link
Contributor

mnwhite commented Oct 9, 2023 via email

@sbenthall
Copy link
Contributor Author

Thanks @mnwhite
Yes, infinite horizon.
I believe newborns are being replaced with fresh draws of a from the initial log normal distribution.

@llorracc
Copy link
Collaborator

llorracc commented Oct 11, 2023 via email

@sbenthall
Copy link
Contributor Author

sbenthall commented Oct 11, 2023 via email

@sbenthall
Copy link
Contributor Author

sbenthall commented Oct 11, 2023 via email

@mnwhite
Copy link
Contributor

mnwhite commented Oct 11, 2023 via email

@mnwhite
Copy link
Contributor

mnwhite commented Oct 11, 2023 via email

@llorracc
Copy link
Collaborator

llorracc commented Oct 11, 2023 via email

@sbenthall
Copy link
Contributor Author

Hmmm.

With PerfForesightConsumerType with "aNrmInitStd": 0 and "LivPrb": [1.0], plotting the log(mNrm - min(mNrm)) with https://gist.github.com/sbenthall/0e2edc3e080d340a7f33729409c2d82b I get the following not very linear plot:

image

But I suppose there are good numerical explanations for this.

Turning off mortality and variance in starting wealth didn't fix the major ways the generic simulation is off.

Which makes me think that the dynamics are not captured in the new python configuration properly.

This is the HARK 0.13 transition function:
https://github.com/econ-ark/HARK/blob/master/HARK/ConsumptionSaving/ConsIndShockModel.py#L1848-L1864

This is the new python configuration for perfect foresight. It has some differences.
https://github.com/econ-ark/HARK/blob/b95b0f4a20092651481413cf400203040d055137/HARK/models/perfect_foresight.py

One key one being that I don't think m is normalized.

So, I'll work on a comparable model that is normalized and closer to the HARK 0.13 version.

…eneric monte carlo example shows exact match
@sbenthall
Copy link
Contributor Author

I'm pleased to report that using the new normalized perfect foresight configuration (which took just a few minutes to code up):
https://github.com/econ-ark/HARK/pull/1296/files#diff-0f9274be91788fd3a4aab1181e4568fb480634a988033633da6284b90daa3af6

and with no mortality and initial wealth variance, I am getting an exact match between HARK 0.13 and generic monte carlo normalized m.

image

Numerical tests show that there is indeed 0 difference.

Now, I believe this PR is ready for your review, @mnwhite

@sbenthall
Copy link
Contributor Author

As of the most recent commit, the example notebook in this PR uses the read_shocks functionality of the new simulator class to read in the shocks from the HARK 0.13 model. This allows a direct comparison of the simulations with stochastic mortality.

@sbenthall
Copy link
Contributor Author

sbenthall commented Oct 25, 2023

There is a small discrepancy between the results in this PR and the results in HARK 0.13
@mnwhite and I spoke about this on Monday and it has to do with the handling of mortality shocks.

In HARK 0.13, the simulation loop at time t works like this (in core.py and in ConsIndShock, where sim_death() is implemented):

  • get_mortality
    • sim_death: determine who lives and dies at t
  • sim_birth: reset these agent's post-state values, in vars_now based on the initialization parameters
  • back up vars_now to vars_prev.
  • get shocks, states, controls, and post-states for t
  • increment age
  • track history, based on current values stored in memory.
  • increment t

When tracking the who_dies value, which is not in HARK 0.13 either a shock, state, control, or post-state, what's happening is that the who_dies at t actually refers to who is reborn at t. The post-states at t-1 are adjusted retroactively. So the history will record the newborn's starting deceased's ending aNrm at t-1 and their rebirth event at t.

I believe that in conversation, @mnwhite has said that this is a mistaken design.
At the very least, it might be a bug in the history tracker. Arguably, for who_dies and only who_dies it should log that value to the t-1 spot.

I'm aiming to correct this mistake in the PR.

How should it work?

I think it should go like this: live is a variable like any other; it can be a shock, or an endogenous state.
If the agent has live <= 0 at time t, then they die at time t.
Their post-states are set at the values for the newborns, which then begin life at time t+1.

Is that right?

@sbenthall
Copy link
Contributor Author

What I'll do is patch HARK 0.13 so that 'who_dies' is recorded into history at t-1, so that I can get the results to line up.
This is kludgy, but will be replaced by the newer code in time so it's a temporary fix to show parity.

@sbenthall
Copy link
Contributor Author

The generic Monte Carlo perfect foresight example now matches the HARK 0.13 results exactly.

@sbenthall
Copy link
Contributor Author

Finally got it passing all the pre-commit checks. Phew.

@sbenthall
Copy link
Contributor Author

Polished up the documentation. All the checkboxes are checked. I think this is ready to merge.
Can you please give it a look @mnwhite ?

@sbenthall sbenthall merged commit 1ad4731 into econ-ark:master Nov 29, 2023
18 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants