
## "Momentum" Acceleration in Gradient Descent

In traditional gradient descent, one iteratively solves for $x^*$, the minimizer of a function $f(x)$, by performing the update step

```math
x_{k+1} = x_{k} - \alpha \nabla f(x_{k}),
```

where $\alpha$ is a step size parameter.
In 1983, Nesterov's momentum method was introduced to accelerate gradient descent. In the modified algorithm, the update step becomes

```math
x_{k+1} = x_{k} + \beta (x_{k} - x_{k-1}) - \alpha \nabla f(x_{k} + \beta (x_{k} - x_{k-1})),
```

where $\beta$ is a momentum coefficient. Intuitively, the method mimics a ball gaining speed while rolling down a constantly-sloped hill.
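
To make the recursion concrete, below is a minimal, self-contained Julia sketch (not code from this package) of Nesterov momentum on a quadratic objective. The matrix `A`, vector `b`, step size `α`, momentum coefficient `β`, and iteration count are illustrative choices.

```julia
# A sketch of Nesterov momentum applied to gradient descent on the quadratic
# f(x) = 0.5 x'Ax - b'x, whose gradient is ∇f(x) = A x - b.
using LinearAlgebra

function nesterov_descent(A, b; α = 0.1, β = 0.9, iters = 200)
    ∇f(x) = A * x - b
    x_prev = zeros(length(b))         # x_{k-1}
    x = zeros(length(b))              # x_k
    for _ in 1:iters
        v = x + β * (x - x_prev)      # look-ahead point x_k + β (x_k - x_{k-1})
        x_prev, x = x, v - α * ∇f(v)  # gradient step taken at the look-ahead point
    end
    return x
end

A = [3.0 0.5; 0.5 1.0]                # symmetric positive definite
b = [1.0, -2.0]
@show nesterov_descent(A, b)          # ≈ A \ b, the minimizer
```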

## Implementation in Ensemble Kalman Inversion Algorithm

EKI can be understood as an approximation of gradient descent ([Kovachki and Stuart 2019](https://iopscience.iop.org/article/10.1088/1361-6420/ab1c3a)). This connection motivated the implementation of Nesterov-style momentum accelerators in the EKP package. For an overview of the unmodified algorithm, please see the documentation for Ensemble Kalman Inversion.

The traditional update step for EKI is as follows, with $j = 1, \dots, J$ indexing the ensemble member and $k$ the iteration number:

```math
\tag{2}
u_{k+1}^j = u_{k}^j + \Delta t C_{k}^{u\mathcal{G}} \left(\frac{1}{\Delta t}\Gamma + C^{\mathcal{G}\mathcal{G}}_k\right)^{-1} \left(y - \mathcal{G}(u_k^j)\right)
```
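
For readers who find the formula easier to parse as code, here is a minimal plain-Julia sketch of one such update step. This is not the EnsembleKalmanProcesses.jl implementation: the helper name `eki_update`, the $1/(J-1)$ sample-covariance normalization, and the linear forward map in the usage lines are assumptions made for illustration.

```julia
using LinearAlgebra, Statistics

# U is a d × J matrix whose columns are ensemble members u_k^j; G_of_U is the
# p × J matrix of forward-model evaluations G(u_k^j); y is the data vector,
# Γ the observation noise covariance, and Δt the step size.
function eki_update(U, G_of_U, y, Γ, Δt)
    J = size(U, 2)
    u_mean = mean(U, dims = 2)
    g_mean = mean(G_of_U, dims = 2)
    C_ug = (U .- u_mean) * (G_of_U .- g_mean)' / (J - 1)      # C_k^{uG}
    C_gg = (G_of_U .- g_mean) * (G_of_U .- g_mean)' / (J - 1) # C_k^{GG}
    K = Δt * C_ug / (Γ / Δt + C_gg)       # Δt C_k^{uG} (Γ/Δt + C_k^{GG})^{-1}
    return U + K * (y .- G_of_U)          # columns: u_{k+1}^j = u_k^j + K (y - G(u_k^j))
end

# Illustrative usage with a linear forward map G(u) = H u:
H = [1.0 2.0; 0.0 1.0; 1.0 0.0]
U = randn(2, 20)                          # 20 ensemble members in 2 dimensions
y = [1.0, 0.5, -0.3]
Γ = 0.1 * Matrix(I, 3, 3)
U_next = eki_update(U, H * U, y, Γ, 0.5)
```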

When using the ``NesterovAccelerator``, this update step is modified to include a term reminiscent of that in Nesterov's momentum method for gradient descent.

We first compute intermediate values:

```math
\tag{3}
v_k^j = u_k^j + \beta_k (u_k^j - u_{k-1}^j)
```
We then update the ensemble:

```math
\tag{4}
u_{k+1}^j = v_{k}^j + \Delta t C_{k}^{u\mathcal{G}} \left(\frac{1}{\Delta t}\Gamma + C^{\mathcal{G}\mathcal{G}}_k\right)^{-1} \left(y - \mathcal{G}(v_k^j)\right)
```
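
A corresponding sketch of the momentum-modified step, reusing the hypothetical `eki_update` helper from the sketch above (again illustrative code, not the package implementation; `forward_map` and the passed-in `β_k` are placeholders):

```julia
# V is the intermediate ensemble of equation (3); the usual EKI step (4) is then
# taken from V, with the forward model evaluated at the columns of V.
function accelerated_eki_update(U, U_prev, forward_map, y, Γ, Δt, β_k)
    V = U + β_k * (U - U_prev)                                  # v_k^j = u_k^j + β_k (u_k^j - u_{k-1}^j)
    G_of_V = reduce(hcat, (forward_map(v) for v in eachcol(V)))
    return eki_update(V, G_of_V, y, Γ, Δt)                      # equation (4), evaluated at v_k^j
end

# e.g., continuing the linear example above:
# U_next = accelerated_eki_update(U, U_prev, u -> H * u, y, Γ, 0.5, 0.9)
```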

The momentum coefficient $\beta_k$ is computed recursively as $\beta_k = \theta_k(\theta_{k-1}^{-1}-1)$ in the ``NesterovAccelerator``, following the derivation in [Su et al.](https://jmlr.org/papers/v17/15-084.html). Alternative accelerators are the ``FirstOrderNesterovAccelerator``, which uses $\beta_k = 1-3k^{-1}$, and the ``ConstantNesterovAccelerator``, which uses a user-specified constant coefficient (default $\beta_k = 0.9$). The recursive ``NesterovAccelerator`` coefficient has been found to be the most effective in most test cases.
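
As a small illustration, the two fully specified schedules above can be written directly in Julia; the clamping at zero for small $k$ is an assumption for illustration, and the $\theta_k$-based recursion of the default ``NesterovAccelerator`` is not reproduced here since $\theta_k$ is defined in Su et al. rather than on this page.

```julia
β_first_order(k) = max(1 - 3 / k, 0.0)   # FirstOrderNesterovAccelerator schedule
β_constant(k) = 0.9                      # ConstantNesterovAccelerator default

@show β_first_order.(1:6)                # 0.0, 0.0, 0.0, 0.25, 0.4, 0.5
```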