A Testing Framework for Decision-Optimization Model Learning Algorithms

IBM/doframework

DOFramework

doframework is a testing framework for decision-optimization model learning algorithms. Such algorithms learn part or all of a decision-optimization model from data and solve the model to produce a predicted optimal solution.

doframework randomly generates multiple optimization problems (f,O,D,x*) for your algorithm to learn and solve:

  • f is a continuous piece-wise linear function defined over a domain in d-dimensional space (d>1),
  • O is a feasibility region in dom(f) defined by linear constraints,
  • D = (X,y) is a dataset derived from f,
  • x* is the true optimum of f in O (minimum or maximum).

doframework feeds your algorithm constraints and data (O,D) and collects its predicted optimum. The algorithm's predicted optimal value can then be compared to the true optimal value f(x*). By comparing the two over multiple randomly generated optimization problems, doframework produces a prediction profile for your algorithm.
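As a rough illustration of the comparison described above, the sketch below computes a relative gap between the true optimal value f(x*) and the predicted optimal value for each generated problem. The function name `relative_gaps`, the normalization, and the sample values are assumptions made for illustration; this is not how doframework itself computes the prediction profile.

```python
from statistics import mean

def relative_gaps(true_vals, predicted_vals, eps=1e-12):
    """Relative gap |f(x*) - predicted| / max(|f(x*)|, eps), one per problem."""
    if len(true_vals) != len(predicted_vals):
        raise ValueError("expected one predicted value per true value")
    return [abs(t - p) / max(abs(t), eps)
            for t, p in zip(true_vals, predicted_vals)]

# Hypothetical optimal values for three generated problems.
true_vals = [2.0, 4.0, 5.0]
predicted_vals = [2.0, 3.0, 5.5]

gaps = relative_gaps(true_vals, predicted_vals)
print(gaps)        # per-problem relative gaps
print(mean(gaps))  # a single summary number across problems
```

A smaller gap means the predicted optimal value is closer to the true one; looking at the distribution of gaps over many problems gives a more informative picture than the mean alone.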

doframework integrates with your algorithm (written in Python).

Design

doframework was designed for optimal cloud distribution following an event-driven approach.

doframework was built on top of ray for cloud distribution and rayvens for event-driven management.

Requirements

doframework was written for Python version >= 3.8.0.

doframework can run either locally or remotely. For optimal performance, run it on a Kubernetes cluster. Cloud configuration is currently available for AWS and IBM Cloud OpenShift clusters.

The framework uses storage (local or S3) to interact with simulation products. Configuration is currently available for AWS or IBM Cloud Object Storage (COS).

Install

To run doframework locally, install with

$ pip install doframework

Configs

Storage specifications are provided in a configs.yaml. You'll find examples under ./configs/*.

The configs.yaml includes the list of source and target bucket names (under buckets). If necessary, S3 credentials are added under designated fields.

Here is the format of the configs.yaml either for local storage

local:
    buckets:
        inputs: '<inputs-folder>'
        inputs_dest: '<inputs-dest-folder>'
        objectives: '<objectives-folder>'
        objectives_dest: '<objectives-dest-folder>'
        data: '<data-folder>'
        data_dest: '<data-dest-folder>'
        solutions: '<solutions-folder>'

or S3

s3:
    buckets:
        inputs: '<inputs-bucket>'
        inputs_dest: '<inputs-dest-bucket>'
        objectives: '<objectives-bucket>'
        objectives_dest: '<objectives-dest-bucket>'
        data: '<data-bucket>'
        data_dest: '<data-dest-bucket>'
        solutions: '<solutions-bucket>'
    aws_secret_access_key: 'xxxx'
    aws_access_key_id: 'xxxx'
    endpoint_url: 'https://xxx.xxx.xxx'
    region: 'xx-xxxx'
    cloud_service_provider: 'aws'

Currently, two S3 providers are available under s3:cloud_service_provider: either aws or ibm. The endpoint_url is optional for AWS.

Bucket / folder names must be distinct.
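Since duplicate names are easy to introduce by copy-paste, it may help to check a configuration before starting an experiment. The sketch below assumes the configs.yaml content has already been parsed into a Python dictionary with the layout shown above; the helper `duplicate_buckets` and the folder names are hypothetical and not part of doframework.

```python
from collections import Counter

def duplicate_buckets(configs):
    """Return bucket / folder names that appear more than once."""
    # The top-level key is either 's3' or 'local', as in the formats above.
    storage = configs.get("s3") or configs.get("local") or {}
    counts = Counter(storage.get("buckets", {}).values())
    return sorted(name for name, n in counts.items() if n > 1)

# Hypothetical parsed configs.yaml content (folder names are made up).
configs = {"local": {"buckets": {
    "inputs": "inputs",
    "inputs_dest": "inputs-dest",
    "objectives": "objectives",
    "objectives_dest": "objectives-dest",
    "data": "data",
    "data_dest": "data",  # duplicate on purpose
    "solutions": "solutions",
}}}

print(duplicate_buckets(configs))
```

An empty list means all names are distinct.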

Inputs

input.json files provide the necessary metadata for the random generation of optimization problems.

doframework will run end to end, once input.json files are uploaded to <inputs-bucket> / <inputs-folder>.

The Jupyter notebook ./notebooks/inputs.ipynb allows you to automatically generate input files and upload them to <inputs-bucket>.

Here is an example of an input file (see the sample input_basic.json under ./inputs).

{
    "f": {
        "vertices": {
            "num": 7,
            "range": [[5.0,20.0],[0.0,10.0]]
        },
        "values": {
            "range": [0.0,5.0]
        }
    },
    "omega" : {
        "ratio": 0.8
    },
    "data" : {
        "N": 750,
        "noise": 0.01,
        "policy_num": 2,
        "scale": 0.4
    },
    "input_file_name": "input_basic.json"
}

f:vertices:num: number of vertices in the piece-wise linear graph of f.
f:vertices:range: f domain will be inside this range.
f:values:range: range of f values.
omega:ratio: vol(O) / vol(dom(f)) >= ratio.
data:N: number of data points to sample.
data:noise: response variable noise.
data:policy_num: number of centers in Gaussian mix distribution of data.
data:scale: max STD of Gaussian mix distribution of data (as a ratio of domain diameter).

It's a good idea to start experimenting on low-dimensional problems.
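For instance, a low-dimensional input file could be assembled programmatically along the following lines. This is a minimal sketch using only the fields documented above; the helper `make_input`, the file name, and the assumption that f:vertices:range holds one [min, max] pair per dimension (as the 2D example suggests) are illustrative and not taken from the doframework code.

```python
import json

def make_input(dim, num_vertices, file_name):
    """Build an input dictionary with the fields described above."""
    return {
        "f": {
            # Assumed: one [min, max] pair per dimension of the domain of f.
            "vertices": {"num": num_vertices,
                         "range": [[0.0, 10.0] for _ in range(dim)]},
            "values": {"range": [0.0, 5.0]},
        },
        "omega": {"ratio": 0.8},
        "data": {"N": 750, "noise": 0.01, "policy_num": 2, "scale": 0.4},
        "input_file_name": file_name,
    }

spec = make_input(dim=2, num_vertices=7, file_name="input_2d.json")
text = json.dumps(spec, indent=4)

# Round-tripping through the parser confirms the output is valid JSON.
assert json.loads(text) == spec
print(text)
```

Writing `text` to a file in <inputs-folder>, or uploading it to <inputs-bucket>, would then make it available to an experiment.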

User App Integration

Your algorithm will be integrated into doframework once it is decorated with doframework.resolve.

A doframework experiment runs with doframework.run(). The run() utility accepts the decorated model and an absolute path to the configs.yaml.

Here is an example of a user application module.py.

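import numpy as np
import doframework as dof

@dof.resolve
def alg(data: np.ndarray, constraints: np.ndarray, **kwargs):
    # Learn a model from data, then optimize it subject to constraints.
    ...
    return optimal_arg, optimal_val, regression_model

if __name__ == '__main__':

    # Further run() arguments are listed below.
    dof.run(alg, 'configs.yaml', objectives=5, datasets=3)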

doframework provides the following inputs to your algorithm:

data: 2D np.array with features X = data[ : , :-1] and response variable y = data[ : ,-1].
constraints: linear constraints as a 2D numpy array A. A data point x satisfies the constraints when the matrix-vector expression A[ : , :-1] @ x + A[ : ,-1] <= 0 holds in every row.

It feeds your algorithm additional inputs in kwargs:

lower_bound: lower bound per feature variable.
upper_bound: upper bound per feature variable.
init_value: optional initial value.
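To make the input conventions concrete, the sketch below splits a data array into features and response and tests points against a constraint array, following the formulas above. The helper names `split_data` and `is_feasible`, the numerical tolerance, and the toy arrays are assumptions for illustration. Picking the best feasible sample is only a naive baseline, not a method doframework prescribes.

```python
import numpy as np

def split_data(data):
    """Split the data array into features X and response y."""
    return data[:, :-1], data[:, -1]

def is_feasible(x, constraints, tol=1e-9):
    """True when A[:, :-1] @ x + A[:, -1] <= 0 holds in every row (up to tol)."""
    return bool(np.all(constraints[:, :-1] @ x + constraints[:, -1] <= tol))

# Hypothetical 2D example: the unit box 0 <= x1, x2 <= 1, one row per constraint.
constraints = np.array([
    [-1.0, 0.0, 0.0],   # -x1 <= 0
    [0.0, -1.0, 0.0],   # -x2 <= 0
    [1.0, 0.0, -1.0],   # x1 - 1 <= 0
    [0.0, 1.0, -1.0],   # x2 - 1 <= 0
])
# Each row is [x1, x2, y]; the last sample lies outside the box.
data = np.array([[0.2, 0.3, 1.5], [0.9, 0.1, 0.7], [1.4, 0.5, 0.1]])

X, y = split_data(data)
mask = np.array([is_feasible(x, constraints) for x in X])
best = X[mask][np.argmin(y[mask])]  # feasible sample with the smallest response
print(mask, best)
```

A real algorithm would fit a regression model to (X, y) and optimize it over the feasible region rather than restricting itself to observed samples.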

The run() utility accepts the arguments:

objectives: number of objective targets to generate per input file.
datasets: number of datasets to generate per objective target.
distribute: True to run distributively, False to run sequentially.
logger: True to see doframework logs, False otherwise.
after_idle_for: stop running when event stream is idle after this many seconds.
alg_num_cpus: number of CPUs to dedicate to your algorithm on each optimization task.
data_num_cpus: number of CPUs to dedicate to data generation (useful in high dimensions).

Algorithm Prediction Profile

Once you are done running a doframework experiment, run the notebook notebooks/profile.ipynb. It will fetch the relevant experiment products from the target buckets and produce the algorithm's prediction profile and prediction probabilities.

doframework produces three types of experiment product files:

  • objective.json: containing information on (f,O,x*)
  • data.csv: containing the dataset the algorithm accepts as input
  • solution.json: containing the algorithm's predicted optimum

See sample files under ./outputs.

Kubernetes Cluster

To run doframework on a K8S cluster, make sure you are on the cluster's local kubectl context. Log into your cluster, if necessary (applicable to OpenShift, see ./doc/openshift.md).

You can check your local kubectl context and change it if necessary with

$ kubectl config current-context
$ kubectl config get-contexts
$ kubectl config use-context cluster_name
>> Switched to context "cluster_name".

Now cd into your project's folder and run the setup bash script doframework-setup.sh. The setup script will generate the cluster configuration file doframework.yaml in your project's folder. The setup script requires the absolute path to your configs.yaml. Running the setup .sh script will establish the ray cluster.

$ cd <user_project_folder>
$ doframework-setup.sh --configs ~/path/to/configs.yaml

You have the option to adapt doframework.yaml to fit your application.

Use the flag --project-requirements to specify the absolute path to your requirements.txt file. The requirements will be installed on your cluster nodes with pip install -r requirements.txt.

Use the flag --project-dir to specify the absolute path to your project. It will be pip installed on your cluster nodes.

$ doframework-setup.sh --configs ~/path/to/configs.yaml --project-requirements <absolute_requirements_path> --project-dir <absolute_project_path>

Use the --skip flag to skip re-generating the doframework.yaml.

$ doframework-setup.sh --skip

Or, in case you are familiar with ray, run instead

$ ray up doframework.yaml --no-config-cache --yes

Upload input.json file(s) to your <inputs-bucket>. Now you can submit your application module.py to the cluster:

$ ray submit doframework.yaml module.py

Ray Cluster

To observe the ray dashboard, connect to http://localhost:8265 in your browser. See ./doc/openshift.md for OpenShift-specific instructions.

Some useful health-check commands:

  • Check the status of ray pods
$ kubectl get pods -n ray
  • Check the status of the ray head node
$ kubectl describe pod rayvens-cluster-head-xxxxx -n ray
  • Monitor autoscaling with
$ ray exec doframework.yaml 'tail -n 100 -f /tmp/ray/session_latest/logs/monitor*'
  • Connect to a terminal on the head node
$ ray attach doframework.yaml
$ ...
$ exit
  • Get a remote shell to the cluster manually (find the head node ID with kubectl describe)
$ kubectl -n ray exec -it rayvens-cluster-head-z97wc -- bash

After introducing manual changes to doframework.yaml, update with

$ ray up doframework.yaml --no-config-cache --yes

Shut down the ray cluster with

$ ray down -y doframework.yaml

Test

Run the setup bash script doframework-setup.sh with the --example flag to generate the test script doframework_example.py in your project folder.

$ cd <user_project_folder>
$ doframework-setup.sh --configs ~/path/to/configs.yaml --example

To run the test script locally, use

$ python doframework_example.py --configs ~/path/to/configs.yaml

To run the test script on your K8S cluster, use

$ ray submit doframework.yaml doframework_example.py --configs configs.yaml

[NOTE: we are using the path to the configs.yaml that was mounted on cluster nodes under $HOME.]

Make sure to upload input JSON files to <inputs-bucket> / <inputs-folder> once doframework_example.py is running.