Skip to content

Commit

Permalink
Lecture 12: Polishing touches
Browse files Browse the repository at this point in the history
  • Loading branch information
sadda committed May 10, 2021
1 parent ccfb70f commit c07c8f8
Show file tree
Hide file tree
Showing 5 changed files with 550 additions and 401 deletions.
6 changes: 3 additions & 3 deletions docs/src/lecture_11/sparse.md
Original file line number Diff line number Diff line change
Expand Up @@ -21,10 +21,10 @@ Often, a regularization term is added. There are two possibilities. The [ridge r
Both approaches try to keep the norm of parameters ``w`` small to prevent overfitting. The first approach results in a simpler numerical method, while the second one induces sparsity. Before we start with both topics, we will briefly mention matrix decompositions which plays a crucial part in numerical computations.


## Theory of matrix eigendecomposition
## [Theory of matrix eigendecomposition](@id matrix-eigen)


Consider a square matrix ``A\in \mathbb R^{n\times n}`` with real-valued entries. We there exist ``\lambda\in\mathbb R`` and ``v\in\mathbb^n`` such that
Consider a square matrix ``A\in \mathbb R^{n\times n}`` with real-valued entries. We there exist ``\lambda\in\mathbb R`` and ``v\in\mathbb R^n`` such that

```math
Av = \lambda v,
Expand Down Expand Up @@ -85,7 +85,7 @@ Since this formula uses only matrix-vector multiplication and an inversion of a



## Theory of LASSO
## [Theory of LASSO](@id lasso)


Unlike ridge regression, LASSO does not have a closed-form solution. Since it is a structured convex problem, it can be solved the [ADMM algorithm](https://web.stanford.edu/~boyd/papers/pdf/admm_distr_stats.pdf). It is a primal-dual algorithm, which employs the primal original variable ``w``, the primal auxiliary variable ``z`` and the dual variable ``u`` with the iterative updates:
Expand Down
118 changes: 79 additions & 39 deletions docs/src/lecture_12/diff_eq.md
Original file line number Diff line number Diff line change
@@ -1,20 +1,29 @@
# Julia package

To solve differential equations, we will use package [DifferentialEquations](https://diffeq.sciml.ai/stable/).
To solve differential equations, we use the package [DifferentialEquations](https://diffeq.sciml.ai/stable/), which consider ODEs in the form

## Introduction

DifferentialEquations consider ODEs in the form
```math
\dot u(t) = f(t, u(t), p(t))
```
with the initial condition ``u(t_0)= u_0``. While ``u`` is the solution, ``p`` described external parameters.

We will start with a simple problem
with the initial condition ``u(t_0)= u_0``. While ``u`` is the solution, ``p`` describes external parameters.




## Introduction

We start with the following simple problem:

```math
\dot u(t) = 0.98u
\begin{aligned}
\dot u(t) &= 0.98u, \\
u(0) &= 1.
\end{aligned}
```
with initial condition ``u(0) = 1``. This has closed-form solution ``u(t) = e^{0.98t}``. To solve this ODE by ```DifferentialEquations```, we first need to create the problem ```prob``` by supplying function ``f``, the initial point ``u_0`` and the time interval ``[t_0,t_1]`` to the constructor ```ODEProblem```

This equation has the closed-form solution ``u(t) = e^{0.98t}``. To solve it by `DifferentialEquations`, we first need to create the problem `prob` by supplying the function ``f``, the initial point ``u_0`` and the time interval ``[t_0,t_1]`` to the constructor `ODEProblem`.

```@example intro
using DifferentialEquations
Expand All @@ -27,49 +36,65 @@ prob = ODEProblem(f, u0, tspan)
nothing # hide
```
We can solve the ODE by

Then we use the `solve` function to solve the equation.

```@example intro
sol = solve(prob)
```
The first line specifies that the solution was successful. We can automatically check whether the solution was successful by ```sol.retcode == :Success```. The second line specifies the interpolation method. Even though the solution was evaluated at only 5 points ```sol.t``` with values ```sol.u```, the interpolation

The first line says that the solution was successful, which can be automatically checked by accessing the field `sol.retcode`. The second line specifies the interpolation method, and the following lines the solution. Even though the solution was evaluated at only five points, the interpolation allows plotting a smooth function.

```@example intro
using Plots
plot(sol)
plot(sol; label="")
savefig("intro.svg") # hide
```

![](intro.svg)

The ```sol``` structure is heavily overloaded. It can be used to evaluate the solution ``u`` at any time
The `sol` structure can be used to evaluate the solution ``u``.

```@example intro
sol(0.8)
```

The next exercise shows how to specify the interpolation technique and compares the resutlts.






The following exercise shows how to specify the interpolation technique and compares the results.

```@raw html
<div class = "exercise-body">
<header class = "exercise-header">Exercise:</header><p>
```
When calling the ```solve``` function, we can specify the interpolation way. Solve the ODE with linear interpolation (```dense=false```) and the Runge-Kutta method of fourth order (```RK4()```). Plot the results and compare them with the default and original solutions.

When calling the `solve` function, we can specify the interpolation way. Solve the ODE with linear interpolation (`dense=false`) and the Runge-Kutta method of the fourth order (`RK4()`). Plot the results and compare them with the default and original solutions.

```@raw html
</p></div>
<details class = "solution-body">
<summary class = "solution-header">Solution:</summary><p>
```
To compues the additional solutions, we add the arguments as specified above

To compute the additional solutions, we add the arguments as specified above.

```@example intro
sol2 = solve(prob, dense=false)
sol3 = solve(prob, RK4())
nothing # hide
```
For plotting, we create a discretization ```ts``` of the time interval and then plot the four functions

We create a discretization ```ts``` of the time interval and then plot the four functions.

```@example intro
ts = collect(range(tspan[1], tspan[2], length=100))
ts = range(tspan...; length = 100)
plot(ts, t->exp(0.98*t), label="True solution", legend=:topleft)
plot!(ts, t->sol(t), label="Default")
Expand All @@ -84,23 +109,26 @@ savefig("Comparison.svg") # hide

![](Comparison.svg)

We see that all solutions are the same with the exception of the linear approximation.
We see that all solutions are the same except for the linear approximation.



## Lorenz system

[Lorenz system](https://en.wikipedia.org/wiki/Lorenz_system) is a prime example of the [butterfly effect](https://en.wikipedia.org/wiki/Butterfly_effect) in the chaos theory. There, a small changes in the initial conditions results in large changes after a long time. This effect was first described in 1961 during work on weather modelling.
The [Lorenz system](https://en.wikipedia.org/wiki/Lorenz_system) is a prime example of the [butterfly effect](https://en.wikipedia.org/wiki/Butterfly_effect) in the chaos theory. There, small changes in the initial conditions result in large changes in the solution. This effect was first described in 1961 during work on weather modelling.

The following equations describe the three-dimensional Lorenz system:

The three-dimensional Lorenz system is described by the set of equations
```math
\begin{aligned}
\frac{\partial x}{\partial t} &= \sigma (y - x), \\
\frac{\partial y}{\partial t} &= x (\rho - z) - y, \\
\frac{\partial z}{\partial t} &= x y - \beta z.
\end{aligned}
```
We define the right-hand side

We first define the right-hand side of the system.

```@example intro
function lorenz(u, p, t)
σ, ρ, β = p
Expand All @@ -112,7 +140,9 @@ end
nothing # hide
```
The parameters are saved in a tuple or array ```p```. Since the right-hand side of the Lorenz system is a vector, we need to return a vector as well. Now, we compute the solution in the same way as before.

The parameters are saved in a tuple or array `p`. Since the right-hand side of the Lorenz system is a vector, we need to return a vector as well. Now, we compute the solution in the same way as before.

```@example intro
u0 = [1.0; 0.0; 0.0]
p = [10; 28; 8/3]
Expand All @@ -124,7 +154,9 @@ sol = solve(prob)
nothing # hide
```
Using the same function to plot

We plot the solution:

```@example intro
plot(sol)
Expand All @@ -133,7 +165,8 @@ savefig("lorenz0.svg") # hide

![](lorenz0.svg)

results in two-dimensional graph of all coorinates. To plot 3D, we need to specify it
Since this is a two-dimensional graph of all coordinates, we need to specify that we want to plot a 3D graph.

```@example intro
plt1 = plot(sol, vars=(1,2,3), label="")
Expand All @@ -142,7 +175,8 @@ savefig("lorenz1.svg") # hide

![](lorenz1.svg)

We see again the power of interpolation. If we used linear interpolation (connected the points)
We see the power of interpolation again. If we used linear interpolation, which amounts to connecting the points, we would obtain a much coarse graph.

```@example intro
plot(sol, vars=(1,2,3), denseplot=false; label="")
Expand All @@ -151,34 +185,34 @@ savefig("lorenz2.svg") # hide

![](lorenz2.svg)

we would obtain a much coarse graph. This shows the strength of the ```DifferentialEquations``` package. With a small computational effort, it is able to compute a good solution. Note that the last plotting call is equivalent to
This graph shows the strength of the `DifferentialEquations` package. With a small computational effort, it can compute a good solution. Note that the last plotting call is equivalent to:

```julia
traj = hcat(sol.u...)
plot(traj[1,:], traj[2,:], traj[3,:]; label="")
```

In the introduction to this part, we mentioned the chaos theory. We will elaborate on this in the next exercise.
In the introduction, we mentioned chaos theory. We will elaborate on this in the following exercise.

```@raw html
<div class = "exercise-body">
<header class = "exercise-header">Exercise:</header><p>
```
Consider the same setting as above but perturb the first parameter of ```p``` by the smallest possible value (with respect to the machine precision). Then solve the Lorenz system again and compare results by plotting the two trajectories next to each other.

Use the `nextfloat` function to perturb the first parameter of `p` by the smallest possible value. Then solve the Lorenz system again and compare the results by plotting the two trajectories next to each other.

```@raw html
</p></div>
<details class = "solution-body">
<summary class = "solution-header">Solution:</summary><p>
```
The machine precision can be obtained by ```eps(T)```, where ```T``` is the desired type. However, when we add this to ```p[1]```, we obtain the same number

We start with the smallest possible perturbation of the initial value.

```@example intro
p[1] + eps(eltype(p)) == p[1]
p0 = (nextfloat(p[1]), p[2:end]...)
```
The reason is that ```p[1]``` has value 10 and the sum exceeds the allowed number of valid digits and it is truncated back to 10. We therefore cheat a bit and manually modify the number
```@example intro
p0 = (10.000000000000001,28,8/3)

nothing # hide
```
Then we plot the graphs as before
```@example intro
prob0 = ODEProblem(lorenz, u0, tspan, p0)
Expand All @@ -196,31 +230,37 @@ savefig("lorenz4.svg") # hide

![](lorenz4.svg)

The solutions look obviously different. Comparing the terminal states of both solutions
The solutions look different. Comparing the terminal states of both solutions

```@example intro
hcat(sol(tspan[2]), sol0(tspan[2]))
```

shows that they are different by a large margin. This raises a natural question.


```@raw html
<div class = "exercise-body">
<header class = "exercise-header">Exercise:</header><p>
```

Can we trust the solutions? Why?

```@raw html
</p></div>
<details class = "solution-body">
<summary class = "solution-header">Solution:</summary><p>
```

Unfortunately, we cannot. Numerical methods always introduce some errors by
- *Rounding errors* due to representing real numbers in machine precision.
- *Discretization errors* for continuous systems when the derivative is approximated by some kind of finite difference.
However, if the system itself is unstable in the way that an extremely small perturbation results in big differences in solutions, the numerical method even enhances these errors. The solution could be trusted on some small interval but not after it.
- *Discretization errors* for continuous systems when the finite difference method approximates the derivative.
However, if the system itself is unstable and an extremely small perturbation results in big differences in solutions, the numerical method even enhances these errors. The solution could be trusted on some small interval but not after it.

```@raw html
</p></details>
```

The next section will show a situation where we try to mitigate this possible effect by using mathematical formulas to compute the exact solution as long as possible. This delays the necessary discretization and may bring a better stability.
The following section shows a situation where we try to mitigate this possible effect by using mathematical formulas to compute the exact solution as long as possible. This aproach delays the necessary discretization and may bring better stability.


Loading

0 comments on commit c07c8f8

Please sign in to comment.