diff --git a/README.md b/README.md index d790ed7..3f36825 100644 --- a/README.md +++ b/README.md @@ -17,17 +17,20 @@ The package supports: * models/functions on GPU * batched evaluation and Jacobian computation -* ineqaulity and equality constraints, which can depend only on a subset of the optimization variable +* inequality and equality constraints, which can depend only on a subset of the optimization variable -Quick links: -1) How to specify functions? -2) How to call the solver? -3) Advances options and infos for the SQPGS solver +#### Table of contents +1. [Installation](#installation) +2. [Getting started](#getting-started) + - [Solver interface](#solver-interface) + - [The `ObjectiveOrConstraint` class](#the-objectiveorconstraint-class) + - [Functionalities](#functionalities) +3. [Examples](#examples) +4. [References](#references) - -**DISCLAIMER:** +#### Disclaimer 1) We have not (yet) extensively tested the solver on large-scale problems. 2) The implemented solver is designed for nonsmooth, nonconvex problems, and as such, can solve a very general problem class. If your problem has a specific structure (e.g. convexity), then you will almost certainly get better performance by using software/solvers that are specifically written for the respective problem type. As starting point, check out [`cvxpy`](https://www.cvxpy.org/). @@ -42,16 +45,12 @@ For an editable version of this package in your Python environment, run the comm python -m pip install --editable . ``` - -## Main Solver - -The main solver implemented in this package is called SQP-GS, and has been developed by Curtis and Overton in [1]. -The SQP-GS algorithm can solve problems with nonconvex and nonsmooth objective and constraints. For details, we refer to [our documentation](src/ncopt/sqpgs/README.md) and the original paper [1]. - - ## Getting started ### Solver interface + +The main solver implemented in this package is called SQP-GS, and has been developed by Curtis and Overton in [1]. 
See [the detailed documentation](src/ncopt/sqpgs/). + The SQP-GS solver can be called via ```python @@ -80,7 +79,7 @@ For example, a linear constraint function `Ax - b <= 0` can be implemented as fo g.model.bias.data = -b # pass b ``` -### More functionalities +### Functionalities Let's assume we have a `torch.nn.Module`, call it `model`, which we want to use as objective/constraint. For the solver, we can pass the function as @@ -94,7 +93,7 @@ f = ObjectiveOrConstraint(model) * **Input preparation**: Different constraints might only need a part of the optimization variable as input, or might require additional preparation such as reshaping from vector to image. (Note that the optimization variable is handled always as vector) For this, you can specify a callable `prepare_input` when initializing a `ObjectiveOrConstraint` object. Any reshaping or cropping etc. can be handled with this function. Please note that `prepare_input` should be compatible with batched forward passes. -### Example +## Examples A full example for solving a nonsmooth Rosenbrock function, constrained with a maximum function can be found [here](example_rosenbrock.py). This example is taken from Example 5.1 in [1]. The picture below shows the trajectory of the SQP-GS solver for different starting points. The final iterates are marked with the black plus while the analytical solution is marked with the golden star. We can see that the algorithm finds the minimizer consistently. diff --git a/src/ncopt/sqpgs/README.md b/src/ncopt/sqpgs/README.md index 2907bae..55c00d5 100644 --- a/src/ncopt/sqpgs/README.md +++ b/src/ncopt/sqpgs/README.md @@ -16,14 +16,14 @@ x = problem.solve() Below we briefly describe the SQP-GS algorithm. **For more details on the algorithm, we refer to the paper [1].** The iteration cost of the algorithm splits mainly into two steps: -1) Sample function value and gradient/Jacobian of each nonsmooth function at multiple points in a neighborhood of the current iterate. 
+1) Evaluate each function and compute its gradient/Jacobian at multiple points in a neighborhood of the current iterate. -2) SQP approximates the original problem in each iteration by a quadratic program (QP). We need to solve this QP to compute the update direction. +2) Approximate the original problem by a quadratic program (QP). Solve this QP to compute the update direction. -The technique in 1) is called Gradient Sampling (GS) and is a widely used, robust technique for handling nonsmooth objective or constraint functions. -As all functions are Pytorch modules in this package, step 1) amounts to batch evaluation and Jacobian computation. This can be done efficiently using the `autograd` functionalities of Pytorch. +The technique in 1. is called Gradient Sampling (GS) and is a widely used, robust technique for handling nonsmooth objective or constraint functions. +As all functions are PyTorch modules in this package, this amounts to **batch evaluation and Jacobian computation**. This can be done efficiently using the `autograd` functionalities of PyTorch. -For 2), we solve the QP with the package `osqp`. We also implement a general interface to `cvxpy`, which seems slightly slower due to overhead costs, but more flexible as the solver can be exchanged easily. +For 2., we solve the QP with the package `osqp`. We also implement a general interface to `cvxpy`, which seems slightly slower due to overhead costs, but more flexible as the solver can be exchanged easily. Further, the quadratic approximation of SQP naturally involves an approximation of the Hessian, which is done in L-BFGS style.
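For readers of this change, the sampling step described in the revised text can be illustrated with a minimal, dependency-free sketch. It is not the package's implementation: the package relies on PyTorch `autograd` for batched evaluation, whereas this sketch uses plain Python with a hand-written gradient. The test function `f`, the helpers `sample_neighborhood` and `evaluate_batch`, and the choice of a box-shaped neighborhood are all assumptions made for illustration only.

```python
import random


def f(x):
    """Hypothetical nonsmooth test function f(x) = |x[0]| + x[1]**2."""
    return abs(x[0]) + x[1] ** 2


def grad_f(x):
    """Gradient of f where it exists; at x[0] == 0 one subgradient is returned."""
    sign = 1.0 if x[0] >= 0 else -1.0
    return [sign, 2.0 * x[1]]


def sample_neighborhood(x, eps, num_points, rng):
    """Return x itself plus num_points draws from the box [x - eps, x + eps].

    Assumption: a box is used here for simplicity; other neighborhoods
    (for example a Euclidean ball) are equally possible.
    """
    points = [list(x)]
    for _ in range(num_points):
        points.append([xi + rng.uniform(-eps, eps) for xi in x])
    return points


def evaluate_batch(points):
    """Evaluate f and its gradient at every sampled point."""
    values = [f(p) for p in points]
    grads = [grad_f(p) for p in points]
    return values, grads


rng = random.Random(0)
x_k = [0.01, 1.0]  # current iterate, close to the kink at x[0] = 0
points = sample_neighborhood(x_k, eps=0.1, num_points=5, rng=rng)
values, grads = evaluate_batch(points)
```

Near the kink at `x[0] = 0`, the sampled gradients can differ in their first component (`1.0` or `-1.0`), and it is this collection of nearby gradients that a gradient-sampling method feeds into the QP subproblem.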