diff --git a/README.md b/README.md
index 363b8f6..b44a6ab 100644
--- a/README.md
+++ b/README.md
@@ -39,12 +39,35 @@ For each experiment, the exact config can also be found under `configs/` where t
 
 ## How to use
 
-Any experiment needs a config file, see e.g. `configs/test.json`.
+The main runner scripts are `run.py` (or `run.ipynb` if you prefer notebooks). Any experiment needs a config file, see e.g. `configs/test.json`.
+In general, the name of the config file serves as experiment ID; it is used later for storing the output, plotting, etc.
 
-* In the config you can specify at each key a list or a single entry. For every list entry, a cartesian product will be run.
-* The same is true for the hypeprparameters of each entry in the `opt` key of the config file.
-* Multiple runs can be done using the key `n_runs`. In each run the seed for shuffling the `DataLoader` changes.
-* The name of the config file serves as experiment ID, used later for running and storing the output.
+There are two ways of specifying a config for `run.py`.
+
+1) *dict-type* configs
+
+* Here, the config JSON is a dictionary where you can specify at each key a list or a single entry.
+* The same is true for the hyperparameters of each entry in the `opt` key of the config file.
+* A cartesian product of all list entries will be run (i.e. potentially many single training runs in sequence); see the sketch below.
+* Multiple repetitions can be done using the key `n_runs`. This will use different seeds for shuffling the `DataLoader`.
+
+2) *list-type* configs
+
+* The config JSON is a list, where each entry is a config for a **single training run**.
+* This format is mainly intended for launching multiple runs in parallel. You should first create a dict-type config, and then use the utilities for creating temporary list-type configs (see an example [here](configs/README.md)).
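+
+For illustration, a minimal dict-type config might look as follows. All key names and values here are only a hypothetical sketch; see `configs/test.json` for an actual, working example:
+
+```json
+{
+    "dataset": "mnist",
+    "model": "mlp",
+    "opt": [{"name": "sgd", "lr": [0.1, 0.01]}, {"name": "adam", "lr": 0.001}],
+    "n_runs": 2
+}
+```
+
+This sketch would expand to `(2 + 1) * 2 = 6` single training runs: three optimizer settings, each repeated with two `DataLoader` seeds.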
 
 You can run an experiment with `run.py` or with `run.ipynb`. A minimal example is:
 
@@ -73,4 +96,6 @@ For the entries in `history`, the following keys are important:
 * `train_loss`: loss function value over training set
 * `val_loss`: loss function value over validation set
 * `train_score`: score function (eg accuracy) over training set
-* `val_score`: score function (eg accuracy) over validation set
\ No newline at end of file
+* `val_score`: score function (eg accuracy) over validation set
+
+In [`stepback.utils.py`](stepback/utils.py) you can find several helper functions for merging or filtering output files.
\ No newline at end of file
diff --git a/configs/README.md b/configs/README.md
new file mode 100644
index 0000000..3d4ec81
--- /dev/null
+++ b/configs/README.md
@@ -0,0 +1,31 @@
+## Remarks on config management
+
+1) The simple option: Create a dict-type config (e.g. like [test.json](test.json)). The file name (in this example we use ``my_exp.json``) will serve as an identifier ``exp_id`` in the next steps. You can then run all entries of the config with one job.
+
+2) The more complicated, but versatile option (e.g. when one single run is expensive): You can split a dict-type config into subsets, which are then stored as temporary list-type configs. This also allows you to create only those configs which have not been run before.
+
+*Case a)* Assume we want to rerun everything. Choose a `job_name`, which will serve as the folder name for the temporary config files. Specify `splits` as the number of splits you wish (if not specified, it splits into lists of length one).
+
+```python
+from stepback.utils import split_config
+split_config(exp_id='my_exp', job_name=job_name, config_dir='configs/', splits=None, only_new=False)
+```
+
+
+*Case b)* Assume you have already run some settings and only want to run the new ones. The function will determine whether a specific setting has been run by looking into the output files in ``output_dir`` which belong to ``exp_id``. **This is an experimental feature and should be handled with caution!** You can run
+
+```python
+from stepback.utils import split_config
+split_config(exp_id='my_exp', job_name=job_name, config_dir='configs/', splits=None, only_new=True, output_dir='output/')
+```
+
+
+In both cases, this will create temporary list-type config files, stored in `configs/job_name/`, which can then be launched separately.
+The temporary config files will follow the name pattern
+
+```
+my_exp-00.json
+my_exp-01.json
+my_exp-02.json
+...
+```
diff --git a/output/README.md b/output/README.md
index a66d541..e82d08d 100644
--- a/output/README.md
+++ b/output/README.md
@@ -2,16 +2,21 @@
 
 We store the results of all experiments here.
 
-In the [plotting script](../show.py), for a given experiment ID `EXP_ID`, all output files in this folder are collected if their name is either
+**Important:** The [``Record``](../stepback/record.py) object - which is used for plotting, analyzing results, etc. - will collect output from multiple files for a given experiment ID `EXP_ID`. Specifically, it loads the output from all files in this folder if the file name is one of
 
 ```
 <EXP_ID>.json
 <EXP_ID>-1.json, <EXP_ID>-2.json, ...
 ```
 
-This has the following reason: it might be useful to split up config files even though they belong together. If we want to run parts of the same config in parallel, it should be safer to write to different output files. Hence, if desired, you can split your config into the same structure:
-
-```
-<EXP_ID>.json
-<EXP_ID>-1.json, <EXP_ID>-2.json, ...
-```
+We do this because it can be useful to split up the output of runs which actually *belong together* into different files, for example when parts of the same config are run in parallel.
+You can however also easily merge multiple output files (or all files in a subdirectory) with the utilities in [`stepback.utils.py`](../stepback/utils.py).
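+
+For example, merging the output files of two parallel jobs back into a single file could look like this. This is only a sketch based on the ``merge_output_files`` helper in [`stepback.utils.py`](../stepback/utils.py); check its signature in the source for the exact semantics and further options such as ``merged_dir``:
+
+```python
+from stepback.utils import merge_output_files
+
+# hypothetical file names: combines my_exp-1.json and my_exp-2.json into my_exp.json
+merge_output_files(exp_id_list=['my_exp-1', 'my_exp-2'], fname='my_exp', output_dir='output/')
+```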
\ No newline at end of file
diff --git a/run.py b/run.py
index 3ea007c..3b37cdf 100644
--- a/run.py
+++ b/run.py
@@ -6,9 +6,9 @@
 import argparse
 
 import torch
 
-from stepback.utils import prepare_config, create_exp_list
 from stepback.base import Base
 from stepback.log import Container
+from stepback.config import ConfigManager
 from stepback.defaults import DEFAULTS
 
@@ -64,12 +64,8 @@ def run_one(exp_id: str,
     """
 
     # load config
-    with open(config_dir + f'{exp_id}.json') as f:
-        exp_config = json.load(f)
-
-    # prepare list of configs (cartesian product)
-    exp_config = prepare_config(exp_config)
-    exp_list = create_exp_list(exp_config)
+    Conf = ConfigManager(exp_id=exp_id, config_dir=config_dir)
+    exp_list = Conf.create_config_list()
 
     print(f"Created {len(exp_list)} different configurations.")
 
diff --git a/stepback/config.py b/stepback/config.py
new file mode 100644
index 0000000..11c75d9
--- /dev/null
+++ b/stepback/config.py
@@ -0,0 +1,145 @@
+import copy
+import json
+import os
+import itertools
+
+from .defaults import DEFAULTS
+
+class ConfigManager:
+    """
+    For managing config files.
+
+    We distinguish two types of config files:
+    * dict-type, where each value can be a list. This will be converted into a cross-product of single-run configs.
+      You should always set up this type of config, and then create list-type configs only for temporary use.
+
+    * list-type. This is essentially a subset of the cross-product that comes from a dict-type config. Intended mainly for running many jobs in parallel.
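+
+    Example (this mirrors how ``run.py`` loads configs):
+
+        Conf = ConfigManager(exp_id='test', config_dir='configs/')
+        exp_list = Conf.create_config_list()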
+ """ + exp_config_copy = copy.deepcopy(exp_config) + + # Make sure each value is a list + for k, v in exp_config_copy.items(): + if not isinstance(exp_config_copy[k], list): + exp_config_copy[k] = [v] + + # Create the cartesian product + exp_list_raw = ( + dict(zip(exp_config_copy.keys(), v)) for v in itertools.product(*exp_config_copy.values()) + ) + + # Convert into a list + exp_list = [] + for exp_dict in exp_list_raw: + exp_list += [exp_dict] + + return exp_list \ No newline at end of file diff --git a/stepback/utils.py b/stepback/utils.py index d198abe..f393491 100644 --- a/stepback/utils.py +++ b/stepback/utils.py @@ -7,10 +7,13 @@ import itertools import copy import os +import warnings +import json from sklearn.linear_model import Ridge, LogisticRegression from .log import Container +from .config import ConfigManager #%% """ @@ -134,81 +137,91 @@ def merge_output_files(exp_id_list, fname, output_dir='output/', merged_dir=None merged.store() return - #%% """ -Utility functions for Experiments. +Utility functions for Config files. """ -def prepare_config(exp_config: dict) -> dict: - """ - Given an experiment config, we do the following preparations: - - * Convert n_runs to a list of run_id (integer values) - * Convert each element of opt to a list of opt configs. - """ - c = copy.deepcopy(exp_config) - - c['run_id'] = list(range(c['n_runs'])) - del c['n_runs'] - - - assert isinstance(c['opt'], list), f"The value of 'opt' needs to be a list, but is given as {c['opt']}." - - all_opt = list() - for this_opt in c['opt']: - - # make every value a list - for k in this_opt.keys(): - if not isinstance(this_opt[k], list): - this_opt[k] = [this_opt[k]] - - # cartesian product - all_opt += [dict(zip(this_opt.keys(), v)) for v in itertools.product(*this_opt.values())] - - c['opt'] = all_opt - - return c +def split_config(exp_id: str, job_name: str, config_dir: str, splits: int=None, only_new: bool=False, output_dir: str='output/'): + """Splits a dict-type config into parts. -def create_exp_list(exp_config: dict): - """ - This function was adapted from: https://github.com/haven-ai/haven-ai/blob/master/haven/haven_utils/exp_utils.py - - Creates a cartesian product of a experiment config. - - Each value of exp_config should be a single entry or a list. - For list values, every entry of the list defines a single realization. - Parameters ---------- - exp_config : dict - - Returns - ------- - exp_list: list - A list of configs, each defining a single run. + exp_id : str + The name of the dict-type config. + job_name : str + Folder name where the temporary config files will be stored. + config_dir : str + Directory where ``exp_id.json```is stored. Temporary configs will be created in this directory as well. + splits : int, optional + How many parts (of roughtly equal size) you want to split, by default None. + If not specified, then one single config per file. + only_new : bool, optional + Whether to only create configs which have not been run, by default False. + + Use this option with caution. We will look up all files that start with ``exp_id-`` (or are ``exp_id``) in the output directory specified. + Any config that can be found in those files will be disregarded. + output_dir : str, optional + Directory of output files, by default 'output'. + Only relevant if ``only_new=True``. 
""" - exp_config_copy = copy.deepcopy(exp_config) - # Make sure each value is a list - for k, v in exp_config_copy.items(): - if not isinstance(exp_config_copy[k], list): - exp_config_copy[k] = [v] + if os.path.exists(os.path.join(config_dir, job_name)): + warnings.warn("A folder with the same job__name already exists, files will be overwritten.") + else: + os.mkdir(os.path.join(config_dir, job_name)) + + # Load config_list + Conf = ConfigManager(exp_id=exp_id, config_dir=config_dir) + config_list = Conf.create_config_list() + assert Conf.dict_type, "For splitting a config, it should be of dict-type" + print(f"Initial config contains {len(config_list)} elements.") + + # If only new, load output files and keep only the ones that have not been run yet + # Check all output files which start with 'exp_id-' + if only_new: + print("Screening for existing runs...") + existing_files = get_output_filenames(exp_id, output_dir=output_dir) + existing_configs = list() + + for _e in existing_files: + print(f"Looking in output data from {output_dir+_e}") + C = Container(name=_e, output_dir=output_dir, as_json=True) + C.load() # load data + existing_configs += [copy.deepcopy(_d['config']) for _d in C.data] + del C + + to_remove = list() + for _conf in config_list: + + # Base adds empty kwargs if not specified; we do this here to ensure correct comparison + if 'dataset_kwargs' not in _conf.keys(): + _conf['dataset_kwargs'] = dict() - # Create the cartesian product - exp_list_raw = ( - dict(zip(exp_config_copy.keys(), v)) for v in itertools.product(*exp_config_copy.values()) - ) + if 'model_kwargs' not in _conf.keys(): + _conf['model_kwargs'] = dict() - # Convert into a list - exp_list = [] - for exp_dict in exp_list_raw: - exp_list += [exp_dict] + # Check if exists --> remove + if _conf in existing_configs: + to_remove.append(_conf) + else: + pass - return exp_list + config_list = [_conf for _conf in config_list if _conf not in to_remove] # remove all existing runs + print(f"Screening ended: {len(config_list)} elements remaining.") + # Split config_list evenly + if splits is None: + splits = len(config_list) + list_of_config_lists = [list(a) for a in np.array_split(config_list, splits)] + # store + for j, _this_list in enumerate(list_of_config_lists): + with open(os.path.join(config_dir, job_name, exp_id) + f'-{j:02d}.json', "w") as f: + json.dump(_this_list, f, indent=4, sort_keys=True) + + return #%% """