Merge pull request #65 from mikhailsirenko/develop

Develop
mikhailsirenko · Apr 16, 2024 · 8c76e2b
2 parents 9eb29eb + 4d13738
commit 8c76e2b

Showing 35 changed files with 29,094 additions and 542 deletions.
diff --git a/.gitignore b/.gitignore
@@ -111,26 +111,17 @@ experiments/Saint Lucia
 # Input data preparation scripts
 /scripts
 
-# Temporary files
-/temp
-
-# WIP notebooks
-/wip
-
-# Internal notes
-NOTES.md
-
-# Internal TODOs
-TODO.md
-
 # Exclude Dominica-specific config
 /config/Dominica.yaml
 
 # Exclude Saint-Lucia specific config
 /config/SaintLucia.yaml
 
+# Exclude Nigeria specific config
+/config/Nigeria.yaml
+
 # Exclude reports
 /reports
 
 # Exclude experiment results
-/experiments
+/results
diff --git a/README.md b/README.md
@@ -1,9 +1,10 @@
 # What is unbreakable?
 Unbreakable is a stochastic simulation model for assessing the resilience of households to natural disasters. 
 
-## Introduction
+## Background
+Disasters of all kinds are becoming more frequent and more severe. While, 
 
-## Features
+## Model Overview
 
 ## Getting Started
 
@@ -19,22 +20,89 @@ Clone the repository from GitHub using the following:
 git clone https://github.com/mikhailsirenko/unbreakable
 ```
 
+### Repository Structure
+```
+unbreakable
+
+├── config                  <- Configuration files for the model
+
+
+```
+
 ### Usage
 You run the model from the command line with the default parameters using:
 ```bash
 python unbreakable/run.py
 ```
 
-### Examples
-
 ## Documentation
 Detailed model documentation, including function descriptions and usage examples, is available [here](https://mikhailsirenko.github.io/unbreakable/src.html).
 
-## Repository Structure
-```
-unbreakable
 
-```
+
+## How-to guide
+
+### Adding a New Case Study
+
+Imagine you would like to use the model for a new case study. What do you have to do?
+
+First, make sure you have two datasets: a household survey and asset damage data. Next, tune the parameters: constants and uncertainties. You may change the policies as well, as long as you follow the defined naming convention. The sections below cover each step in detail.
+
+#### Household survey
+A household survey is a dataset with information about households in the country, usually collected by the government or an international organization. The survey must be nationally representative (weighted): each surveyed household carries a weight equal to the number of households it represents.
+
+The survey must contain the following information:
+| No. | Variable description                        | Variable name  | Type of values | Potential ranges |
+|-----|---------------------------------------------|----------------|----------------|------------------|
+| 1   | Household id                                | hh_id          | Integer        | N/A              |
+| 2   | Household weight                            | hh_weight      | Float          | N/A              |
+| 3   | District id                                 | district_id    | Integer        | N/A              |
+| 4   | District name                               | district_name  | String         | N/A              |
+| 5   | Household size                              | hh_size        | Integer        | N/A              |
+| 6   | Household income (adult equivalent)         | ae_inc         | Float          | N/A              |
+| 7   | Household expenditure (adult equivalent)    | ae_exp         | Float          | N/A              |
+| 8   | Household savings (adult equivalent)        | ae_sav         | Float          | N/A              |
+| 9   | Whether the household rents or owns a house | rent_own       | String         | "Rent", "Own"    |
+| 10  | House price (if the household owns a house) | house_price    | Float          | N/A              |
+| 11  | Rent price (if the household rents a house) | rent_price     | Float          | N/A              |
+| 12  | Walls material                              | walls_material | String         | N/A              |
+| 13  | Roof material                               | roof_material  | String         | N/A              |
+
+
+
+#### Asset damage
+The asset  
+
+## Parameters
+### Constants
+
+The model has a few constants. Note that one could treat them as uncertainties. However, we decided to keep them as constants for the sake of simplicity. The constants are:
+
+1. Average productivity of capital (`average_productivity`)
+
+To update this constant, we use the Penn World Table. You can download it from the [Penn World Table website](https://www.rug.nl/ggdc/productivity/pwt/). To estimate the average productivity of capital, we average the ratio of output-side real GDP at current PPPs (`cgdpo`) to capital stock at current PPPs (`cn`) over the last five years. For example, for Dominica, these values are:
+
+| Year | Output-side real GDP at current PPPs (`cgdpo`) | Capital stock at current PPPs (`cn`) |
+|------|------------------------------------------------|--------------------------------------|
+| 2015 | 709.8839111                                    | 2500.742676                          |
+| 2016 | 771.4655762                                    | 2497.276855                          |
+| 2017 | 722.5056763                                    | 2560.524658                          |
+| 2018 | 704.8121948                                    | 2742.36377                           |
+| 2019 | 746.8662109                                    | 2821.069092                          |
+
+Thus, the average productivity of capital is 0.28.
+
+2. Consumption utility (`consump_util`)
+3. Discount rate (`discount_rate`)
+4. Income and expenditure growth (`income_and_expenditure_growth`)
+5. Savings rate (`savings_rate`)
+6. Poverty line (`poverty_line`)
+
+Besides these, you must set `country` and `districts`. The first names the country under study, and the second lists the districts in that country. These units may go by other names, such as neighborhoods or parishes; for simplicity, the model calls all of them districts.
+
+### Uncertainties
+
+### Policies
 
 ## Contributing
 

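The 0.28 figure quoted in the README's Constants section can be reproduced from the five Penn World Table rows for Dominica. The sketch below assumes the constant is the mean of the yearly `cgdpo`/`cn` ratios; the README does not say whether it averages the ratios or takes the ratio of the averages, and `average_productivity` is a hypothetical helper name, not a function from the repository.

```python
# Penn World Table values for Dominica, copied from the README table:
# year -> (cgdpo, cn)
pwt_dominica = {
    2015: (709.8839111, 2500.742676),
    2016: (771.4655762, 2497.276855),
    2017: (722.5056763, 2560.524658),
    2018: (704.8121948, 2742.36377),
    2019: (746.8662109, 2821.069092),
}


def average_productivity(rows: dict) -> float:
    """Mean of the yearly output-to-capital ratios (assumed definition)."""
    ratios = [cgdpo / cn for cgdpo, cn in rows.values()]
    return sum(ratios) / len(ratios)


avg_prod = average_productivity(pwt_dominica)
print(round(avg_prod, 2))  # 0.28
```

With these rows, the ratio of the five-year sums also rounds to 0.28, so the choice between the two definitions does not change the reported value here.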
diff --git a/experiments/__init__.py b/experiments/__init__.py
diff --git a/experiments/config_manager.py b/experiments/config_manager.py
@@ -0,0 +1,102 @@
+import yaml
+from pathlib import Path
+
+
+def load_config(country: str, return_period: int, disaster_type: str, is_conflict: bool = False) -> dict:
+    '''Load configuration for the specified case country.
+    
+    Args:
+        country (str): The country for which to load the configuration.
+        return_period (int): The return period for the disaster.
+        disaster_type (str): The type of disaster.
+        is_conflict (bool): Whether the country is in conflict.
+    
+    Returns:
+        dict: The configuration for the specified case country.
+    '''
+
+    config_path = Path(f"../config/{country}.yaml")
+
+    if not config_path.exists():
+        raise FileNotFoundError(
+            f"Config file for {country} not found at {config_path}")
+
+    with open(config_path, "r") as file:
+        config = yaml.safe_load(file)
+
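+    # Set the run-specific constants before validating, because
+    # check_config_parameters() requires return_period and disaster_type
+    # to be present and reads their values from these constants.
+    config['constants']['return_period'] = return_period
+    config['constants']['disaster_type'] = disaster_type
+    config['constants']['is_conflict'] = bool(is_conflict)
+
+    # Validate the complete configuration
+    check_config_parameters(config)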
+
+    return config
+
+def check_config_parameters(config: dict) -> None:
+    '''Check if the configuration parameters are valid.
+    
+    Args:
+        config (dict): The configuration to check.
+    
+    Returns:
+        None
+    
+    Raises:
+        ValueError: If the configuration parameters are not valid.
+    '''
+    return_periods = [10, 50, 100, 250, 500, 1000]
+    disaster_types = ['hurricane', 'flood']
+
+    if 'return_period' not in config['constants']:
+        raise ValueError("Return period not specified in configuration.")
+
+    if 'disaster_type' not in config['constants']:
+        raise ValueError("Disaster type not specified in configuration.")
+
+    if config['constants']['return_period'] not in return_periods:
+        raise ValueError(
+            f"Return period {config['constants']['return_period']} not in available return periods: {return_periods}")
+
+    if config['constants']['disaster_type'] not in disaster_types:
+        raise ValueError(
+            f"Disaster type {config['constants']['disaster_type']} not in available disaster types: {disaster_types}")
+
+    necessary_parameters = ['country', 'avg_prod', 'inc_exp_growth', 'cons_util', 'disc_rate', 'disaster_type', 'calc_exposure_params', 'identify_aff_params', 'add_inc_loss', 'pov_bias', 'lambda_incr', 'yrs_to_rec', 'rnd_inc_params', 'rnd_sav_params', 'rnd_rent_params', 'rnd_house_vuln_params', 'min_households', 'atol', 'save_households', 'save_consumption_recovery', 'regions', 'levers', 'uncertainties']
+    exposure_necessary_parameters = ['distr', 'high', 'low']
+    identify_aff_necessary_parameters = ['delta_pct', 'distr', 'high', 'low', 'num_masks']
+    rnd_inc_necessary_parameters = ['randomize', 'distr', 'delta']
+    rnd_sav_necessary_parameters = ['randomize', 'distr', 'avg', 'delta']
+    rnd_rent_necessary_parameters = ['randomize', 'distr', 'avg', 'delta']
+    rnd_house_vuln_necessary_parameters = ['randomize', 'distr', 'low', 'high', 'min_thresh', 'max_thresh']
+
+    for parameter in necessary_parameters:
+        if parameter not in config['constants']:
+            raise ValueError(f"Parameter {parameter} not found in configuration.")
+
+    for parameter in exposure_necessary_parameters:
+        if parameter not in config['constants']['calc_exposure_params']:
+            raise ValueError(f"Parameter {parameter} not found in calc_exposure_params.")
+
+    for parameter in identify_aff_necessary_parameters:
+        if parameter not in config['constants']['identify_aff_params']:
+            raise ValueError(f"Parameter {parameter} not found in identify_aff_params.")
+
+    for parameter in rnd_inc_necessary_parameters:
+        if parameter not in config['constants']['rnd_inc_params']:
+            raise ValueError(f"Parameter {parameter} not found in rnd_inc_params.")
+
+    for parameter in rnd_sav_necessary_parameters:
+        if parameter not in config['constants']['rnd_sav_params']:
+            raise ValueError(f"Parameter {parameter} not found in rnd_sav_params.")
+
+    for parameter in rnd_rent_necessary_parameters:
+        if parameter not in config['constants']['rnd_rent_params']:
+            raise ValueError(f"Parameter {parameter} not found in rnd_rent_params.")
+
+    for parameter in rnd_house_vuln_necessary_parameters:
+        if parameter not in config['constants']['rnd_house_vuln_params']:
+            raise ValueError(f"Parameter {parameter} not found in rnd_house_vuln_params.")
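The nested-key checks in `check_config_parameters` all follow the same pattern, so they could also be expressed as a lookup table. The following is only a sketch of that idea: `REQUIRED_NESTED` and `find_missing` are hypothetical names, only three of the parameter groups from the diff are shown, and the values in `example_constants` are made up for illustration.

```python
# Required keys per nested parameter group, copied from the lists in
# check_config_parameters (hypothetical table name; subset for brevity).
REQUIRED_NESTED = {
    'calc_exposure_params': ['distr', 'high', 'low'],
    'identify_aff_params': ['delta_pct', 'distr', 'high', 'low', 'num_masks'],
    'rnd_inc_params': ['randomize', 'distr', 'delta'],
}


def find_missing(constants: dict) -> list:
    """Return 'group.key' for every required key absent from constants."""
    missing = []
    for group, keys in REQUIRED_NESTED.items():
        present = constants.get(group, {})
        missing.extend(f"{group}.{key}" for key in keys if key not in present)
    return missing


# Illustrative constants with one key ('num_masks') deliberately left out.
example_constants = {
    'calc_exposure_params': {'distr': 'uniform', 'high': 1.0, 'low': 0.0},
    'identify_aff_params': {'delta_pct': 0.05, 'distr': 'uniform',
                            'high': 1.0, 'low': 0.0},
    'rnd_inc_params': {'randomize': True, 'distr': 'uniform', 'delta': 0.1},
}

missing = find_missing(example_constants)
print(missing)  # ['identify_aff_params.num_masks']
```

Collecting every missing key in one pass reports all problems at once, whereas the loops in the diff stop at the first missing parameter.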
diff --git a/experiments/experiment_runner.py b/experiments/experiment_runner.py
@@ -0,0 +1,44 @@
+from pathlib import Path
+from ema_workbench import perform_experiments, MultiprocessingEvaluator, save_results, Model
+
+
+def run_experiments(experimental_setup: dict) -> None:
+    '''Run experiments with the specified setup using EMA Workbench and save the results.
+    
+    Args:
+        experimental_setup (dict): A dictionary containing the setup for the experiments.
+
+    Returns:
+        None
+    '''
+    country = experimental_setup['country']
+    return_period = experimental_setup['return_period']
+    model = experimental_setup['model']
+    n_scenarios = experimental_setup['n_scenarios']
+    n_policies = experimental_setup['n_policies']
+    multiprocessing = experimental_setup['multiprocessing']
+    n_processes = experimental_setup['n_processes']
+
+    if multiprocessing:
+        with MultiprocessingEvaluator(model, n_processes=n_processes) as evaluator:
+            results = evaluator.perform_experiments(
+                scenarios=n_scenarios, policies=n_policies)
+    else:
+        results = perform_experiments(
+            models=model, scenarios=n_scenarios, policies=n_policies)
+
+    save_experiment_results(country, return_period, model,
+                            results, n_scenarios, n_policies)
+
+
+def save_experiment_results(country: str, return_period: int, model: Model, results: dict, n_scenarios: int, n_policies: int):
+    """Saves experiment results to a file, taking into account if there was a conflict."""
+    results_path = Path(f'../results/{country}')
+    results_path.mkdir(parents=True, exist_ok=True)
+
+    is_conflict = getattr(model.constants._data.get(
+        'is_conflict'), 'value', False)
+
+    conflict_str = ", conflict=True" if is_conflict else ""
+    filename = f"return_period={return_period}, scenarios={n_scenarios}, policies={n_policies}{conflict_str}.tar.gz"
+    save_results(results, results_path / filename)
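The naming scheme used by `save_experiment_results` can be checked in isolation. The sketch below reproduces only the f-string logic from the diff; `build_results_filename` is a hypothetical helper, not part of the repository, and the scenario and policy counts are illustrative.

```python
def build_results_filename(return_period: int, n_scenarios: int,
                           n_policies: int, is_conflict: bool = False) -> str:
    """Mirror the filename pattern from save_experiment_results."""
    conflict_str = ", conflict=True" if is_conflict else ""
    return (f"return_period={return_period}, scenarios={n_scenarios}, "
            f"policies={n_policies}{conflict_str}.tar.gz")


print(build_results_filename(100, 1000, 1))
# return_period=100, scenarios=1000, policies=1.tar.gz
print(build_results_filename(100, 1000, 1, is_conflict=True))
# return_period=100, scenarios=1000, policies=1, conflict=True.tar.gz
```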
diff --git a/experiments/model_setup.py b/experiments/model_setup.py
@@ -0,0 +1,48 @@
+from ema_workbench import Model
+from ema_workbench.em_framework.parameters import IntegerParameter, CategoricalParameter, Constant
+from ema_workbench.em_framework.outcomes import ArrayOutcome
+from unbreakable.model import model
+
+
+def setup_model(config: dict) -> Model:
+    """
+    Set up the EMA Workbench model based on the provided configuration.
+
+    Args:
+        config (dict): Configuration dictionary loaded from the YAML file.
+
+    Returns:
+        Model: Configured EMA Workbench model.
+    """
+    my_model = Model(name="model", function=model)
+
+    # Extract and set up uncertainties, constants, and levers from the config
+    # uncertainties = config.get("uncertainties", {})
+    constants = config.get("constants", {})
+    levers = config.get("levers", {})
+
+    # Define the seed as an uncertainty so that it varies across runs.
+    # A wide range makes it likely that each run gets a different seed.
+    seed_start = 0
+    seed_end = 1000000000
+
+    # Fix seed to ensure reproducibility
+    # NOTE: If running multiple instances of the model in parallel, the seed will be the same for all instances
+    # np.random.seed(42)
+
+    my_model.uncertainties = [IntegerParameter(
+        "random_seed", seed_start, seed_end)]
+
+    # Constants
+    my_model.constants = [Constant(key, value)
+                          for key, value in constants.items()]
+
+    # Levers
+    my_model.levers = [CategoricalParameter(
+        'current_policy', list(levers.values()))]
+
+    # Outcomes
+    my_model.outcomes = [ArrayOutcome(region)
+                         for region in constants.get('regions', [])]
+
+    return my_model
diff --git a/unbreakable/run.py → main.py b/unbreakable/run.py → main.py
@@ -1,16 +1,20 @@
-from unbreakable.model import load_config, setup_model, run_experiments
+from experiments.config_manager import load_config
+from experiments.model_setup import setup_model
+from experiments.experiment_runner import run_experiments
 from ema_workbench import ema_logging
 
 ema_logging.log_to_stderr(ema_logging.INFO)
 
-if __name__ == "__main__":
+
+def main():
     try:
-        country = 'Nigeria'
+        country = 'Dominica'
+        disaster_type = 'hurricane'
         return_period = 100
-        conflict = True
-
-        config = load_config(country, return_period, conflict)
-        model = setup_model(config, replicator=False)
+        is_conflict = False
+        config = load_config(country, return_period,
+                             disaster_type, is_conflict)
+        model = setup_model(config)
 
         experimental_setup = {
             'country': country,
@@ -26,3 +30,7 @@
 
     except Exception as e:
         print(f"An error occurred: {e}")
+
+
+if __name__ == "__main__":
+    main()