Examples

What would documentation be without some examples? If anything here is unclear, please do open an issue.

Toy Example (n = 2)

When first testing the theory behind our methods, Julian came up with a small two-candidate example that excludes sex data. The variables are

$$ \Sigma = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix},\quad \boldsymbol{\bar{\mu}} = \begin{bmatrix} 1 \\ 2 \end{bmatrix},\quad \Omega = \begin{bmatrix} \frac{1}{9} & 0 \\ 0 & 4 \end{bmatrix}. $$

On examining the data, we see that candidate two has a higher average expected breeding value ($\bar\mu_1 < \bar\mu_2$), but it also comes with a far higher variance ($\omega_{11} < \omega_{22}$). Since the relationships are equivalent ($\sigma_{11} = \sigma_{22}$), we'd expect standard optimization to go all in on candidate two, but robust optimization to de-emphasise it in favour of the safer first candidate.
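For concreteness, the two problems being compared take roughly the following form (this is only a sketch; the sire/dam weighting constraints and the full derivation are on the modelling page):

$$ \max_{\mathbf{w}}\ \mathbf{w}^{T}\boldsymbol{\bar{\mu}} - \frac{\lambda}{2}\mathbf{w}^{T}\Sigma\mathbf{w} \quad\text{(standard)}, \qquad \max_{\mathbf{w}}\ \mathbf{w}^{T}\boldsymbol{\bar{\mu}} - \kappa\sqrt{\mathbf{w}^{T}\Omega\mathbf{w}} - \frac{\lambda}{2}\mathbf{w}^{T}\Sigma\mathbf{w} \quad\text{(robust)}, $$

where $\mathbf{w}$ is the vector of candidate contributions, $\lambda$ controls the penalty on co-ancestry, and $\kappa$ controls how cautious the robust problem is about the uncertainty in $\boldsymbol{\bar{\mu}}$.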

This toy can easily be extended with sex data and then solved using the RobustOCS module, which we import as below.

import robustocs

Problem Variables

To extend the problem to include sex data we reflect the original variables across both sires and dams, filling in the remainder with zeros.

$$ \Sigma = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix},\quad \boldsymbol{\bar{\mu}} = \begin{bmatrix} 1 \\ 1 \\ 2 \\ 2 \end{bmatrix},\quad \Omega = \begin{bmatrix} \frac{1}{9} & 0 & 0 & 0 \\ 0 & \frac{1}{9} & 0 & 0 \\ 0 & 0 & 4 & 0 \\ 0 & 0 & 0 & 4 \end{bmatrix},\quad \mathcal{S} = \lbrace 1, 3 \rbrace,\quad \mathcal{D} = \lbrace 2, 4 \rbrace. $$

While we could solve this problem by loading the data files (as in the repository's example.py), we can also define the variables directly in Python.

import numpy as np

sigma = np.eye(4)                      # relationship matrix, Sigma
mubar = np.array([1, 1, 2, 2])         # expected breeding values, mu-bar
omega = np.diagflat([1/9, 1/9, 4, 4])  # EBV variances, Omega
sires = [0, 2]                         # zero-indexed sire set S
dams  = [1, 3]                         # zero-indexed dam set D
n = 4                                  # number of candidates

Solving with Gurobi

In example.py we used HiGHS to solve the problem. RobustOCS also defines functions for solving the standard and robust genetic selection problems using Gurobi's Python interface; conveniently, Gurobi will solve small problems like this one without a license, which makes it easy to evaluate.

Below these are used to solve the above problem with parameters $\lambda = 0.5$ and $\kappa = 1$:

lam = 0.5
kap = 1

# computes the standard and robust genetic selection solutions
w_std, obj_std = robustocs.gurobi_standard_genetics(sigma, mubar, sires, dams, lam, n)
w_rbs, z_rbs, obj_rbs = robustocs.gurobi_robust_genetics(sigma, mubar, omega, sires, dams, lam, kap, n)

# print a comparison of the two solutions
robustocs.print_compare_solutions(w_std, w_rbs, obj_std, obj_rbs, z2=z_rbs, name1="w_std", name2="w_rbs")

This might produce the following output

i  w_std    w_rbs
1  0.00000  0.38200
2  0.00000  0.38200
3  0.50000  0.11800
4  0.50000  0.11800

w_std objective: 1.87500
w_rbs objective: 0.77684 (z = 0.37924)
Maximum change: 0.38200
Average change: 0.38200
Minimum change: 0.38200

which aligns with what we anticipated at the outset.
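As a quick sanity check (this snippet is not part of RobustOCS), we can recover the reported objective values directly from the printed weights using the objectives sketched above; agreement is only to the printed precision since the weights are rounded.

import numpy as np

sigma = np.eye(4)
mubar = np.array([1, 1, 2, 2])
omega = np.diagflat([1/9, 1/9, 4, 4])
lam, kap = 0.5, 1

# weights reported by the solvers above (rounded to five decimal places)
w_std = np.array([0, 0, 0.5, 0.5])
w_rbs = np.array([0.382, 0.382, 0.118, 0.118])

# standard objective: w'mu - (lam/2) w'Sigma w
print(w_std @ mubar - (lam/2) * (w_std @ sigma @ w_std))           # 1.875

# robust objective additionally subtracts kap*sqrt(w'Omega w)
z = np.sqrt(w_rbs @ omega @ w_rbs)
print(w_rbs @ mubar - kap*z - (lam/2) * (w_rbs @ sigma @ w_rbs))   # ~0.77684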

Checking

Finally, we can explore how close our $z\geq \sqrt{\mathbf{w}^{T}\Omega \mathbf{w}}$ constraint (see modelling) came to equality.

if not robustocs.check_uncertainty_constraint(z_rbs, w_rbs, omega, debug=True):
    raise ValueError

This produces the following output

     z: 0.37923871642022844
w'*Ω*w: 0.3792386953366983
  Diff: 2.1083530143961582e-08

showing the constraint is satisfied and effectively tight at the optimum, as we'd expect: any slack in $z$ only worsens the robust objective.

Small Example (n = 50)

Going slightly larger, we can use a small 50-candidate cohort to explore how the solutions from Gurobi and HiGHS differ. This time the matrix variables $\Sigma$, $\boldsymbol{\bar{\mu}}$, and $\Omega$ are stored in A50.txt, EBV50.txt, and S50.txt respectively.

These are loaded into Python using load_problem; note the use of the issparse parameter. Since we intend to use HiGHS this time, it is convenient to have the matrices ready in SciPy's sparse format.

import robustocs
import time

# key problem variables loaded from standard format txt files
sigma, mubar, omega, n, _, _, _ = robustocs.load_problem(
    "A50.txt",
    "EBV50.txt",
    "S50.txt",
    issparse=True
)

# NOTE our test data is structured so that candidates alternate
# between sires (which have even indices) and dams (which have
# odd indices), letting us use this trick for the index sets.
sires = range(0, n, 2)
dams = range(1, n, 2)

lam = 0.5
kap = 1

# Gurobi using conic programming
t0 = time.time()
w_grb, z_grb, obj_grb = robustocs.gurobi_robust_genetics(
    sigma, mubar, omega, sires, dams, lam, kap, n)
t1 = time.time()
print(f"Gurobi took {t1-t0:.5f} seconds (conic)")

# Gurobi using sequential quadratic programming
t0 = time.time()
w_grb, z_grb, obj_grb = robustocs.gurobi_robust_genetics_sqp(
    sigma, mubar, omega, sires, dams, lam, kap, n)
t1 = time.time()
print(f"Gurobi took {t1-t0:.5f} seconds (SQP)")

# HiGHS using sequential quadratic programming
t0 = time.time()
w_hig, z_hig, obj_hig = robustocs.highs_robust_genetics_sqp(
    sigma, mubar, omega, sires, dams, lam, kap, n)
t1 = time.time()
print(f"HiGHS took  {t1-t0:.5f} seconds (SQP)")

print("\nSQP Methods:")
robustocs.print_compare_solutions(
    w_grb, w_hig, obj_grb, obj_hig, z1=z_grb, z2=z_hig,
    name1="Gurobi", name2="HiGHS ", tol=1e-7
)

This produces the following output.

Gurobi took 0.10033 seconds (conic)
Gurobi took 0.11359 seconds (SQP)
HiGHS took  0.02440 seconds (SQP)

SQP Methods:
i   Gurobi    HiGHS 
02  0.07812  0.07812
03  0.12466  0.12470
08  0.08073  0.08069
10  0.03533  0.03530
15  0.10595  0.10595
16  0.06713  0.06712
25  0.03398  0.03394
32  0.04456  0.04455
33  0.09337  0.09337
36  0.01041  0.01043
37  0.14205  0.14204
38  0.07915  0.07918
42  0.08136  0.08136
48  0.02322  0.02324

Gurobi objective: 1.92338 (z = 0.15064)
HiGHS  objective: 1.92338 (z = 0.15065)
Maximum change: 0.00004
Average change: 0.00000
Minimum change: 0.00000

We see that HiGHS beats Gurobi regardless of which method the latter uses. We also see slight variation between the solutions produced by the two SQP implementations; this is to be expected given differences in the underlying solver methods.

Scaling to Realistic Examples

This poses the question:

HiGHS is beating Gurobi now, but will it still do so as the problem grows?

Gregor Gorjanc and Ivan Pocrnić at the Roslin Institute generated more realistic examples at much larger sizes. They provided datasets with 1,000 and 10,000¹ candidates to allow us to test the limits of these methods.

Looking at our four example cohorts of increasing size, the table below shows the time in seconds to solve OCS using standard quadratic optimization (Gurobi and HiGHS), robust conic optimization (Gurobi), and robust SQP (Gurobi and HiGHS).

| Size  | Gurobi (standard) | HiGHS (standard) | Gurobi (conic) | Gurobi (SQP) | HiGHS (SQP) |
|------:|------------------:|-----------------:|---------------:|-------------:|------------:|
| 4     | 2.95e-3           | 4.78e-4          | 4.71e-3        | 1.74e-2      | 5.98e-3     |
| 50    | 4.46e-3           | 1.02e-3          | 1.02e-2        | 5.52e-2      | 1.84e-2     |
| 1000  | 6.76e-1           | 2.04e-1          | 2.75e+0        | 2.64e+1      | 1.68e+0     |
| 10000 | 8.63e+1           | 2.58e+1          | DNF²           | 1.56e+3      | 1.06e+2     |

We see that HiGHS consistently outperforms Gurobi across both the standard and robust problems, and that the HiGHS SQP method in particular is the only robust method which remains tractable for large $n$. Thus, unless there are other factors at play, we recommend using it to solve these problems.


1: The RobustOCS repository doesn't contain the 10,000 candidate data files due to GitHub's storage limitations. Feel free to contact the authors for access.

2: Gurobi crashed without displaying an error message when attempting to solve using conic programming for the largest problem.