Python library for solving large-scale inverse problems

fpicetti/occamypy

occamypy

OccamyPy: an object-oriented optimization framework for small- and large-scale problems

We present an object-oriented optimization framework, based on the concepts of vectors and operators, that can be employed to solve small- and large-scale problems. Following this strategy, we implement different iterative optimization algorithms that work with architecture-independent vectors and operators, so that single-machine and cluster-based problems can be solved with a single codebase. We implement a Python library that follows this structure and exposes a user-friendly interface. We demonstrate its flexibility and scalability on multiple inverse problems, where convex and non-convex objective functions are optimized with different iterative algorithms.

Installation

The preferred way is through the Python Package Index:

pip install occamypy

To use CuPy-based vectors and operators, you should also install CuPy and cuSignal. They are not included in this installation because they depend on the target CUDA device and compiler.

As this library relies heavily on NumPy, we suggest installing OccamyPy in a conda environment created with:

conda env create -n MYENV -f env.yml
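Because the GPU dependencies are optional, it can be handy to check which backends are importable before choosing a vector engine. The snippet below is a minimal sketch using only the standard library; the import names `occamypy`, `cupy`, `cusignal` and `torch` are assumed to match the installed packages, and `available_backends` is a hypothetical helper, not part of OccamyPy.

```python
from importlib.util import find_spec

# Assumed import names of the optional backends mentioned in this README.
OPTIONAL = ("cupy", "cusignal", "torch")


def available_backends(names=OPTIONAL):
    """Map each package name to True if it can be imported in this environment."""
    return {name: find_spec(name) is not None for name in names}


if __name__ == "__main__":
    if find_spec("occamypy") is None:
        print("occamypy is not installed in this environment")
    for name, found in available_backends().items():
        print(f"{name:10s} {'found' if found else 'missing'}")
```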

History

This library was initially developed at the Stanford Exploration Project for solving large-scale seismic problems. Inspired by Equinor's PyLops, we publish this library as our contribution to the scientific community.

How it works

This framework allows for the definition of linear and non-linear mapping functions that operate on abstract vector objects that can be defined to use heterogeneous computational resources, from personal laptops to HPC environments.

  • vector class: this is the building block for handling data. It contains the required mathematical operations such as norm, scaling, dot-product, sum, and point-wise multiplication. These methods can be implemented using existing libraries (e.g., NumPy, CuPy, PyTorch) or user-defined ones (e.g., SEPLib). See the vector subpackage for details and implementations.

  • operator class: a mapping function between a domain vector and a range vector. It can be linear or non-linear. Linear operators require the definition of both the forward and adjoint functions; non-linear operators require the forward mapping and its Jacobian operator. See the operator subpackage for details and implementations.

  • problem class: it represents the objective function of an optimization problem. It is defined in terms of operators (e.g., modeling and regularization) and vectors (e.g., observed data and priors). It contains the methods for computing the objective function and its gradient, as our solvers are mainly gradient-based. See the problem subpackage for details and implementations.

  • solver class: it aims at finding the solution to a problem by employing methods defined within the vector, operator and problem classes. Additionally, it allows restarting an optimization method from an intermediate result written as serialized objects to persistent storage. We provide a number of linear and nonlinear solvers, along with some stepper algorithms. See the solver subpackage for details and implementations. Solvers come with a Logger object that we found helpful for saving large-scale inversions. Check it out in the tutorials!
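The four abstractions above can be illustrated with a toy, pure-Python sketch. It is not OccamyPy's API: `ListVector`, `MatrixOperator`, `dot_test`, `LeastSquaresProblem` and `steepest_descent` are hypothetical names chosen for illustration, and a plain Python list stands in for a real vector engine. The sketch shows a vector with the basic algebra, a linear operator with forward and adjoint, the dot-product test that checks their consistency, a least-squares objective with its gradient, and a gradient-based solver that uses only those methods.

```python
import math
import random


class ListVector:
    """Toy vector backed by a Python list (stand-in for a real engine)."""

    def __init__(self, values):
        self.values = [float(v) for v in values]

    def dot(self, other):
        return sum(a * b for a, b in zip(self.values, other.values))

    def norm(self):
        return math.sqrt(self.dot(self))

    def scale_add(self, other, alpha=1.0):
        """In-place update: self <- self + alpha * other."""
        self.values = [a + alpha * b for a, b in zip(self.values, other.values)]
        return self


class MatrixOperator:
    """Linear operator from a dense matrix: forward is A x, adjoint is A^T y."""

    def __init__(self, rows):
        self.rows = [[float(a) for a in row] for row in rows]

    def forward(self, model):
        return ListVector(
            sum(a * x for a, x in zip(row, model.values)) for row in self.rows
        )

    def adjoint(self, data):
        n_cols = len(self.rows[0])
        return ListVector(
            sum(row[j] * d for row, d in zip(self.rows, data.values))
            for j in range(n_cols)
        )


def dot_test(op, n_model, n_data, seed=0):
    """Relative mismatch between <A x, y> and <x, A^T y> for random x and y."""
    rng = random.Random(seed)
    x = ListVector(rng.uniform(-1, 1) for _ in range(n_model))
    y = ListVector(rng.uniform(-1, 1) for _ in range(n_data))
    lhs = op.forward(x).dot(y)
    rhs = x.dot(op.adjoint(y))
    return abs(lhs - rhs) / max(abs(lhs), abs(rhs), 1e-30)


class LeastSquaresProblem:
    """Objective 0.5 * ||A m - d||^2 with gradient A^T (A m - d)."""

    def __init__(self, op, data):
        self.op, self.data = op, data

    def residual(self, model):
        return self.op.forward(model).scale_add(self.data, -1.0)

    def objective(self, model):
        return 0.5 * self.residual(model).norm() ** 2

    def gradient(self, model):
        return self.op.adjoint(self.residual(model))


def steepest_descent(problem, model, niter=200, tol=1e-12):
    """Steepest descent with the exact step length of a quadratic objective."""
    for _ in range(niter):
        grad = problem.gradient(model)
        if grad.norm() < tol:
            break
        a_grad = problem.op.forward(grad)
        alpha = grad.dot(grad) / a_grad.dot(a_grad)
        model.scale_add(grad, -alpha)  # the model is updated in place
    return model


A = MatrixOperator([[2.0, 0.0], [1.0, 3.0], [0.0, 1.0]])
true_model = ListVector([1.0, -2.0])
data = A.forward(true_model)  # noise-free synthetic data
problem = LeastSquaresProblem(A, data)
solution = steepest_descent(problem, ListVector([0.0, 0.0]))
```

Since the solver only calls `dot`, `norm`, `scale_add`, `forward` and `adjoint`, swapping the list-backed vector for a GPU or distributed one would not require touching the solver, which is the design idea behind the class hierarchy described above.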

Features at a glance

| vector engines | operators   | problems                        | solvers                      |
|----------------|-------------|---------------------------------|------------------------------|
| numpy          | linear      | least squares                   | Conjugate Gradient           |
| cupy           | nonlinear   | symmetric least squares         | Steepest Descent             |
| torch          | distributed | L2-reg least squares            | LSQR                         |
|                |             | LASSO                           | symmetric Conjugate Gradient |
|                |             | generalized LASSO               | nonlinear Conjugate Gradient |
|                |             | nonlinear least squares         | L-BFGS                       |
|                |             | L2-reg nonlinear least squares  | L-BFGS-B                     |
|                |             | regularized Variable Projection | Truncated Newton             |
|                |             |                                 | Markov Chain Monte Carlo     |
|                |             |                                 | ISTA and Fast-ISTA           |
|                |             |                                 | ISTC (ISTA with cooling)     |
|                |             |                                 | Split-Bregman                |

Scalability

The main objective of the described framework and implemented library is to solve large-scale inverse problems. Any vector and operator can be split into blocks to be distributed to multiple nodes. This is achieved via custom Dask vector and operator classes. See the dask subpackage for details and implementations.
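The block-splitting idea can be sketched without Dask: a vector is cut into contiguous blocks, each block is reduced independently, and the partial results are summed. In the illustration below a standard-library `ThreadPoolExecutor` stands in for the cluster workers; `split_into_blocks`, `block_dot` and `block_norm` are hypothetical helpers, not the classes of the dask subpackage.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def split_into_blocks(values, n_blocks):
    """Split a list into n_blocks contiguous chunks of nearly equal size."""
    size, extra = divmod(len(values), n_blocks)
    blocks, start = [], 0
    for i in range(n_blocks):
        stop = start + size + (1 if i < extra else 0)
        blocks.append(values[start:stop])
        start = stop
    return blocks


def block_dot(blocks_x, blocks_y, executor):
    """Dot product computed as the sum of per-block partial dot products."""

    def partial(pair):
        block_x, block_y = pair
        return sum(a * b for a, b in zip(block_x, block_y))

    return sum(executor.map(partial, zip(blocks_x, blocks_y)))


def block_norm(blocks, executor):
    """Euclidean norm obtained from the block-wise dot product."""
    return math.sqrt(block_dot(blocks, blocks, executor))


x = [float(i) for i in range(10)]
y = [1.0] * 10
with ThreadPoolExecutor(max_workers=3) as pool:  # stand-in for cluster workers
    bx, by = split_into_blocks(x, 3), split_into_blocks(y, 3)
    dot_xy = block_dot(bx, by, pool)
    norm_x = block_norm(bx, pool)
```

Only one scalar per block travels back to the caller, which is why reductions such as norms and dot products remain cheap when the blocks live on different nodes.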

Tutorials

We provide some tutorials that demonstrate the flexibility of occamypy. Please refer to them as a good starting point for developing your own code. If you have a good application example, contact us! We will be happy to see OccamyPy in action.

Check out the tutorial we gave at SWUNG's Transform 2022!

Contributing

Follow these instructions and read the CONTRIBUTING file carefully before getting started.

We have a lot of ideas that might be helpful to scientists! We are currently working on:

  • wrapping linear operators for use with PyTorch optimizers: see the LS-RTM tutorial! This can be useful for combining neural networks and operators (e.g., deep priors and physical modeling).
  • using PyTorch's functorch library to compute the Jacobian-vector product of nonlinear operators: see the first step!
  • implementing computation-demanding operators natively in OccamyPy, so that they can be used on CPU/GPU and HPC clusters.

Authors

Citation

@article{biondi2021object,
  title = {An object-oriented optimization framework for large-scale inverse problems},
  author = {Ettore Biondi and Guillaume Barnier and Robert G. Clapp and Francesco Picetti and Stuart Farris},
  journal = {Computers \& Geosciences},
  volume = {154},
  pages = {104790},
  year = {2021},
  doi = {10.1016/j.cageo.2021.104790},
}

Publications using OccamyPy

  • E. Biondi, G. Barnier, R. G. Clapp, F. Picetti, and S. Farris. "Object-Oriented Optimization for Small- and Large-Scale Seismic Inversion Procedures", in European Association of Geoscientists and Engineers (EAGE) Workshop on High Performance Computing for Upstream, 2021.
  • E. Biondi, G. Barnier, R. G. Clapp, F. Picetti, and S. Farris. "Object-oriented optimization for large-scale seismic inversion of ocean-bottom-node pressure data", in International Conference on Parallel Computational Fluid Dynamics (ParCFD), 2021.

If you have one to add, reach out to us!