diff --git a/docs/src/lecture_11/sparse.md b/docs/src/lecture_11/sparse.md
index f42b1d54f..d935754ef 100644
--- a/docs/src/lecture_11/sparse.md
+++ b/docs/src/lecture_11/sparse.md
@@ -21,10 +21,10 @@ Often, a regularization term is added. There are two possibilities. The [ridge r
 
 Both approaches try to keep the norm of parameters ``w`` small to prevent overfitting. The first approach results in a simpler numerical method, while the second one induces sparsity. Before we start with both topics, we will briefly mention matrix decompositions which plays a crucial part in numerical computations.
 
-## Theory of matrix eigendecomposition
+## [Theory of matrix eigendecomposition](@id matrix-eigen)
 
-Consider a square matrix ``A\in \mathbb R^{n\times n}`` with real-valued entries. We there exist ``\lambda\in\mathbb R`` and ``v\in\mathbb^n`` such that
+Consider a square matrix ``A\in \mathbb R^{n\times n}`` with real-valued entries. Then there exist ``\lambda\in\mathbb R`` and ``v\in\mathbb R^n`` such that
 
 ```math
 Av = \lambda v,
@@ -85,7 +85,7 @@ Since this formula uses only matrix-vector multiplication and an inversion of a
 
-## Theory of LASSO
+## [Theory of LASSO](@id lasso)
 
 Unlike ridge regression, LASSO does not have a closed-form solution. Since it is a structured convex problem, it can be solved the [ADMM algorithm](https://web.stanford.edu/~boyd/papers/pdf/admm_distr_stats.pdf). It is a primal-dual algorithm, which employs the primal original variable ``w``, the primal auxiliary variable ``z`` and the dual variable ``u`` with the iterative updates:
diff --git a/docs/src/lecture_12/diff_eq.md b/docs/src/lecture_12/diff_eq.md
index ed870ee7b..214a306ec 100644
--- a/docs/src/lecture_12/diff_eq.md
+++ b/docs/src/lecture_12/diff_eq.md
@@ -1,20 +1,29 @@
 # Julia package
 
-To solve differential equations, we will use package [DifferentialEquations](https://diffeq.sciml.ai/stable/).
+To solve differential equations, we use the package [DifferentialEquations](https://diffeq.sciml.ai/stable/), which considers ODEs of the form
 
-## Introduction
-
-DifferentialEquations consider ODEs in the form
 ```math
 \dot u(t) = f(t, u(t), p(t))
 ```
-with the initial condition ``u(t_0)= u_0``. While ``u`` is the solution, ``p`` described external parameters.
-We will start with a simple problem
+with the initial condition ``u(t_0) = u_0``. While ``u`` is the solution, ``p`` describes external parameters.
+
+## Introduction
+
+We start with the following simple problem:
+
 ```math
-\dot u(t) = 0.98u
+\begin{aligned}
+\dot u(t) &= 0.98u, \\
+u(0) &= 1.
+\end{aligned}
 ```
-with initial condition ``u(0) = 1``. This has closed-form solution ``u(t) = e^{0.98t}``. To solve this ODE by ```DifferentialEquations```, we first need to create the problem ```prob``` by supplying function ``f``, the initial point ``u_0`` and the time interval ``[t_0,t_1]`` to the constructor ```ODEProblem```
+
+This equation has the closed-form solution ``u(t) = e^{0.98t}``. To solve it by `DifferentialEquations`, we first need to create the problem `prob` by supplying the function ``f``, the initial point ``u_0`` and the time interval ``[t_0,t_1]`` to the constructor `ODEProblem`.
+
 ```@example intro
 using DifferentialEquations
 
 f(u,p,t) = 0.98*u
 
 u0 = 1.0
 tspan = (0.0, 1.0)
 
 prob = ODEProblem(f, u0, tspan)
 
 nothing # hide
 ```
 
-We can solve the ODE by
+
+Then we use the `solve` function to solve the equation.
+
 ```@example intro
 sol = solve(prob)
 ```
 
-The first line specifies that the solution was successful. 
We can automatically check whether the solution was successful by ```sol.retcode == :Success```. The second line specifies the interpolation method. Even though the solution was evaluated at only 5 points ```sol.t``` with values ```sol.u```, the interpolation + +The first line says that the solution was successful, which can be automatically checked by accessing the field `sol.retcode`. The second line specifies the interpolation method, and the following lines the solution. Even though the solution was evaluated at only five points, the interpolation allows plotting a smooth function. ```@example intro using Plots -plot(sol) +plot(sol; label="") savefig("intro.svg") # hide ``` ![](intro.svg) -The ```sol``` structure is heavily overloaded. It can be used to evaluate the solution ``u`` at any time +The `sol` structure can be used to evaluate the solution ``u``. + ```@example intro sol(0.8) ``` -The next exercise shows how to specify the interpolation technique and compares the resutlts. + + + + + + +The following exercise shows how to specify the interpolation technique and compares the results. ```@raw html
Exercise:

``` -When calling the ```solve``` function, we can specify the interpolation way. Solve the ODE with linear interpolation (```dense=false```) and the Runge-Kutta method of fourth order (```RK4()```). Plot the results and compare them with the default and original solutions. + +When calling the `solve` function, we can specify the interpolation way. Solve the ODE with linear interpolation (`dense=false`) and the Runge-Kutta method of the fourth order (`RK4()`). Plot the results and compare them with the default and original solutions. + ```@raw html

Solution:

```
-To compues the additional solutions, we add the arguments as specified above
+
+To compute the additional solutions, we add the arguments as specified above.
+
 ```@example intro
 sol2 = solve(prob, dense=false)
 sol3 = solve(prob, RK4())
 
 nothing # hide
 ```
-For plotting, we create a discretization ```ts``` of the time interval and then plot the four functions
+
+We create a discretization ```ts``` of the time interval and then plot the four functions.
+
 ```@example intro
-ts = collect(range(tspan[1], tspan[2], length=100))
+ts = range(tspan...; length = 100)
 
 plot(ts, t->exp(0.98*t), label="True solution", legend=:topleft)
 plot!(ts, t->sol(t), label="Default")
@@ -84,15 +109,16 @@ savefig("Comparison.svg") # hide
 
 ![](Comparison.svg)
 
-We see that all solutions are the same with the exception of the linear approximation.
+We see that all solutions are the same except for the linear approximation.
 
 ## Lorenz system
 
-[Lorenz system](https://en.wikipedia.org/wiki/Lorenz_system) is a prime example of the [butterfly effect](https://en.wikipedia.org/wiki/Butterfly_effect) in the chaos theory. There, a small changes in the initial conditions results in large changes after a long time. This effect was first described in 1961 during work on weather modelling.
+The [Lorenz system](https://en.wikipedia.org/wiki/Lorenz_system) is a prime example of the [butterfly effect](https://en.wikipedia.org/wiki/Butterfly_effect) in chaos theory. There, small changes in the initial conditions result in large changes in the solution. This effect was first described in 1961 during work on weather modelling.
+
+The following equations describe the three-dimensional Lorenz system:
 
-The three-dimensional Lorenz system is described by the set of equations
 ```math
 \begin{aligned}
 \frac{\partial x}{\partial t} &= \sigma (y - x), \\
@@ -100,7 +126,9 @@ The three-dimensional Lorenz system is described by the set of equations
 \frac{\partial z}{\partial t} &= x y - \beta z.
 \end{aligned}
 ```
-We define the right-hand side
+
+We first define the right-hand side of the system.
+
 ```@example intro
 function lorenz(u, p, t)
     σ, ρ, β = p
@@ -112,7 +140,9 @@ end
 nothing # hide
 ```
-The parameters are saved in a tuple or array ```p```. Since the right-hand side of the Lorenz system is a vector, we need to return a vector as well. Now, we compute the solution in the same way as before.
+
+The parameters are saved in a tuple or array `p`. Since the right-hand side of the Lorenz system is a vector, we need to return a vector as well. Now, we compute the solution in the same way as before.
+
 ```@example intro
 u0 = [1.0; 0.0; 0.0]
 p = [10; 28; 8/3]
@@ -124,7 +154,9 @@ sol = solve(prob)
 nothing # hide
 ```
-Using the same function to plot
+
+We plot the solution:
+
 ```@example intro
 plot(sol)
 
@@ -133,7 +165,8 @@ savefig("lorenz0.svg") # hide
 
 ![](lorenz0.svg)
 
-results in two-dimensional graph of all coorinates. To plot 3D, we need to specify it
+Since this is a two-dimensional graph of all coordinates, we need to specify that we want to plot a 3D graph.
+
 ```@example intro
 plt1 = plot(sol, vars=(1,2,3), label="")
 
@@ -142,7 +175,8 @@ savefig("lorenz1.svg") # hide
 
 ![](lorenz1.svg)
 
-We see again the power of interpolation. If we used linear interpolation (connected the points)
+We see the power of interpolation again. If we used linear interpolation, which amounts to connecting the points, we would obtain a much coarser graph. 
+ ```@example intro plot(sol, vars=(1,2,3), denseplot=false; label="") @@ -151,34 +185,34 @@ savefig("lorenz2.svg") # hide ![](lorenz2.svg) -we would obtain a much coarse graph. This shows the strength of the ```DifferentialEquations``` package. With a small computational effort, it is able to compute a good solution. Note that the last plotting call is equivalent to +This graph shows the strength of the `DifferentialEquations` package. With a small computational effort, it can compute a good solution. Note that the last plotting call is equivalent to: + ```julia traj = hcat(sol.u...) plot(traj[1,:], traj[2,:], traj[3,:]; label="") ``` -In the introduction to this part, we mentioned the chaos theory. We will elaborate on this in the next exercise. +In the introduction, we mentioned chaos theory. We will elaborate on this in the following exercise. ```@raw html

Exercise:

``` -Consider the same setting as above but perturb the first parameter of ```p``` by the smallest possible value (with respect to the machine precision). Then solve the Lorenz system again and compare results by plotting the two trajectories next to each other. + +Use the `nextfloat` function to perturb the first parameter of `p` by the smallest possible value. Then solve the Lorenz system again and compare the results by plotting the two trajectories next to each other. + ```@raw html

Solution:

```
-The machine precision can be obtained by ```eps(T)```, where ```T``` is the desired type. However, when we add this to ```p[1]```, we obtain the same number
+
+We start with the smallest possible perturbation of the first parameter.
+
 ```@example intro
-p[1] + eps(eltype(p)) == p[1]
+p0 = (nextfloat(p[1]), p[2:end]...)
 ```
-The reason is that ```p[1]``` has value 10 and the sum exceeds the allowed number of valid digits and it is truncated back to 10. We therefore cheat a bit and manually modify the number
-```@example intro
-p0 = (10.000000000000001,28,8/3)
-nothing # hide
-```
 Then we plot the graphs as before
 ```@example intro
 prob0 = ODEProblem(lorenz, u0, tspan, p0)
@@ -196,10 +230,12 @@ savefig("lorenz4.svg") # hide
 
 ![](lorenz4.svg)
 
-The solutions look obviously different. Comparing the terminal states of both solutions
+The solutions look different. Comparing the terminal states of both solutions
+
 ```@example intro
 hcat(sol(tspan[2]), sol0(tspan[2]))
 ```
+
 shows that they are different by a large margin. This raises a natural question.
 
@@ -207,20 +243,24 @@ shows that they are different by a large margin. This raises a natural question.
 
 ```@raw html
 <div class = "admonition is-category-exercise">

Exercise:

``` + Can we trust the solutions? Why? + ```@raw html

Solution:

``` + Unfortunately, we cannot. Numerical methods always introduce some errors by - *Rounding errors* due to representing real numbers in machine precision. -- *Discretization errors* for continuous systems when the derivative is approximated by some kind of finite difference. -However, if the system itself is unstable in the way that an extremely small perturbation results in big differences in solutions, the numerical method even enhances these errors. The solution could be trusted on some small interval but not after it. +- *Discretization errors* for continuous systems when the finite difference method approximates the derivative. +However, if the system itself is unstable and an extremely small perturbation results in big differences in solutions, the numerical method even enhances these errors. The solution could be trusted on some small interval but not after it. + ```@raw html

```
-The next section will show a situation where we try to mitigate this possible effect by using mathematical formulas to compute the exact solution as long as possible. This delays the necessary discretization and may bring a better stability.
+The following section shows a situation where we try to mitigate this possible effect by using mathematical formulas to compute the exact solution as long as possible. This approach delays the necessary discretization and may bring better stability.
diff --git a/docs/src/lecture_12/ode.md b/docs/src/lecture_12/ode.md
index 0265fa292..ac3a40ff4 100644
--- a/docs/src/lecture_12/ode.md
+++ b/docs/src/lecture_12/ode.md
@@ -1,119 +1,104 @@
 # Wave equation
 
-[Wave equation](https://en.wikipedia.org/wiki/Wave_equation) is one of the most important differential equation. It models propagation of waves and has numerous applications in acoustics, electromagnetics or fluid dynamics.
+The [wave equation](https://en.wikipedia.org/wiki/Wave_equation) is one of the most important differential equations. It models wave propagation and has numerous applications in acoustics, electromagnetics or fluid dynamics.
 
 ## Statement
 
-We consider the simplest case of one-dimensional wave equation such as a string. The wave equation on ``t\in[0,T]`` has the form
+We consider the simplest case of a one-dimensional wave equation, such as a string. The wave equation on ``t\in[0,T]`` has the form
+
 ```math
 \frac{\partial^2 y(t,x)}{\partial t^2} = c^2 \frac{\partial^2 y(t,x)}{\partial x^2}.
 ```
+
 The function ``y:[0,T]\times [0,L]\to\mathbb R`` describes the displacement of the string. To obtain a complete formulation, we need to add boundary (in space) and initial (in time) conditions. Assuming that the string is fixed on its edges, the boundary conditions
+
 ```math
 y(\cdot,0) = y_0, \quad y(\cdot,L) = y_L
 ```
+
 are time-independent. The initial conditions are prescribed by functions
+
 ```math
 \begin{aligned}
 y(0,\cdot) &= f(\cdot), \\
 \frac{\partial y(0,\cdot)}{\partial t} &= g(\cdot),
 \end{aligned}
 ```
+
 which vary in space. For consistency, we need ``f(0)=y_0`` and ``f(L)=y_L``.
+
 
 ## Solving the wave equation
 
-The following few exercises show how to solve the wave equation via the [finite differences](@ref comp-grad) technique.
+The following few exercises show how to solve the wave equation via the [finite differences](@ref comp-grad) technique. It discretizes both time and space into equidistant grids. For a function ``h`` and a discretization stepsize ``\Delta x``, the approximation of the first derivative reads
+
+```math
+\frac{\partial}{\partial x}{h(x)} \approx \frac{h(x+\Delta x) - h(x)}{\Delta x}.
+```
+
+A similar formula for the second derivative reads
+
+```math
+\frac{\partial^2}{\partial x^2}{h(x)} \approx \frac{h(x+\Delta x) - 2h(x) + h(x-\Delta x)}{\Delta x^2}.
+```
+
+The following exercise derives the mathematical formulas needed for solving the wave equation.
+
 ```@raw html
 <div class = "admonition is-category-exercise">
 <header class = "admonition-header">
Exercise:

``` -Design a numerical method to solve the one-dimensional wave equation on ``[0,T]\times [0,L]`` by applying finite differences in time and space. Derive the formulas. + +Consider equidistant discretizations with stepsizes ``\Delta t`` and ``\Delta x``. Derive mathematical formulas for solving the one-dimensional wave equation on ``[0,T]\times [0,L]`` by applying finite differences in time and space. Do not write any code. + +**Hint**: Start with the initial time and compute the solution after each time step. Use the condition on ``f`` at the first time step, the condition on ``g`` at the second time step and the wave equation at further steps. + ```@raw html

Solution:

``` -To discretize, we need to choose stepsizes ``\Delta t`` and ``\Delta x``. For simplicity, we assume that the discretization is uniform (the length of the interval ``T`` can be divided by the time step ``\Delta t`` and similarly for ``L`` and ``\Delta x``). -The initial conditions prescribe the value +The wave equation needs to satisfy the boundary conditions + ```math -y(0,x) = f(x) +y(t,0) = f(0),\qquad y(t,L) = f(L) \qquad\text{ for all }t\in\{0,\Delta t,2\Delta t,\dots,T\} ``` -for all ``x\in\{0,\Delta x,2\Delta x,\dots,L\}``. For the values at ``\Delta t``, we approximate the initial condition for the derivative by the finite difference and get -```math -y(\Delta t, x) = y(0, x) + \Delta t g(x). -``` -Since the finite difference approximation of the first derivative is + +and the initial conditions + ```math -\frac{\partial y(t,x)}{\partial t} \approx \frac{y(t + \Delta t,x) - y(t,x)}{\Delta t}, +y(0,x) = f(x) \qquad\text{ for all }x\in\{\Delta x,2\Delta x,\dots,L-\Delta x\}. ``` -the finite difference approximation of the second derivative can be obtained in the same way by + +We exclude ``x\in \{0,L\}`` from the last equation because the boundary conditions already prescribe these values. + +Now we start increasing time. For the values at ``\Delta t``, we approximate the initial condition for the derivative by the finite difference and get + ```math -\begin{aligned} -\frac{\partial^2 y(t,x)}{\partial t^2} &= \frac{1}{\Delta t}\left(\frac{y(t + \Delta t,x) - y(t,x)}{\Delta t} - \frac{y(t,x) - y(t-\Delta t,x)}{\Delta t}\right) \\ -&= \frac{y(t+\Delta t,x) - 2y(t,x) + y(t-\Delta t,x)}{\Delta t^2}. -\end{aligned} +y(\Delta t, x) = y(0, x) + \Delta t g(x). ``` -Doing the same discretization for ``x`` and plugging the result into the wave equation yields + +At further times, we use the finite difference approximation of the second derivative to arrive at + ```math \frac{y(t+\Delta t,x) - 2y(t,x) + y(t-\Delta t,x)}{\Delta t^2} = c^2 \frac{y(t,x+\Delta x) - 2y(t,x) + y(t,x-\Delta x)}{\Delta x^2}. ``` -The computation now is the same as for a normal ODE. We know the initial state ``y(\cdot,0)``, then we compute ``y(\cdot,\Delta t)``, then ``y(\cdot, 2\Delta t)`` and so on. These states can be computed by rearranging the previous formula to + +Since we already know the values at ``t`` and ``t - \Delta t``, we rearrange the previous formula to obtain the values at the next time. This yields the final formula: + ```math y(t + \Delta t,x) = \frac{c^2\Delta t^2}{\Delta x^2} \Big(y(t,x + \Delta x) - 2y(t,x) + y(t,x - \Delta x)\Big) + 2y(t,x) - y(t - \Delta t,x). ``` -This formula can be applied for ``t=2\Delta t, 3\Delta t, \dots,T``. + ```@raw html

``` -Before writing any code, it may be a good idea to decide on its structure. The following exercise aims to do so. -```@raw html -
-
Exercise:

-``` -Write function declaration (function name, inputs and outputs) which will be needed to solve the wave equation and to plot the solution. Do not write any code. -```@raw html -

-
-Solution:

-``` -Before we can solve the wave equation, we need to perform discretization of time and space. We write the ```discretize``` function, whose inputs are the limits ```xlims```. We use optional keyword arguments, which may specify the stepsize or the number of discretized points. The output will be the discretization ```xs```. Even though we denote all inputs and outputs with ``x``, this function will be used for time as well. -```julia -function discretize(xlims; kwargs...) - ... - return xs -end -``` -The simplest way to work with objects with lots of parameters, is to create a structure to save these parameters. We therefore create ```struct Wave``` with fields ```f```, ```g``` and ```c```. We do not use the boundary values ``y_0`` and ``y_L`` as they can be computed from ``f``. -```julia -struct Wave - f - g - c -end -``` -The wave equation is solved in ```solve_wave(ts, xs, wave::Wave)```. Its inputs are the time discretization ```ts```, the space discretization ```xs``` and the wave parameters stored in ```wave```. It returns a matrix ```y``` with dimensions equals the number of time and space points. -```julia -function solve_wave(ts, xs, wave::Wave) - ... - return y -end -``` -Finally, to plot, we define ```plot_wave``` function, where the inputs are the computed wave equation ```y``` and a name file name ```file_name``` to save the animation to. The optional arguments can be the number of frames per second ```fps``` and any additional arguments used for plotting. -```julia -function plot_wave(y, file_name; fps=60, kwargs...) - ... - return nothing -end -``` -```@raw html -

-``` @@ -121,9 +106,12 @@ end -The most difficult part is done. We have done the thinking and finished the code structure. Now we just need to do manual labour and fill the empty functions with code. -```@setup wave + + +The most challenging part is done: We have finished the discretization scheme. Now we need to code it. We will employ a structure storing the wave equation parameters. + +```@example wave struct Wave f g @@ -131,82 +119,102 @@ struct Wave end ``` +The first exercise solves the wave equation. + ```@raw html
Exercise:

``` -Write the functions from the previous exercise. Do not forget to include any pacakges needed. + +Write the function `solve_wave(T, L, wave::Wave; n_t=100, n_x=100)` that solves the wave equation. + +**Hint**: Follow the procedure from the previous exercise. Discretize time and space, initialize the solution, add the boundary conditions, add the initial conditions and finally, iterate over time. + ```@raw html

Solution:

``` -The discretization will use the ```range``` function. We pass to it the keyword arguments, which will usually be either the length of the sequence ```length``` or the discretization step ```step```. To obtain an array, we use the ```collect``` function. If the step is specified, the last point of ```xs``` may be different from ```xlims[2]```. In such a case, we add it and throw a warning. -```@example wave -using Plots -using LinearAlgebra -function discretize(xlims; kwargs...) - xs = range(xlims[1], xlims[2]; kwargs...) |> collect - if xs[end] != xlims[2] - @warn "Discretization not equidistant." - push!(xs, xlims[2]) - end - return xs -end +We first discretize both time and space by the `range` function. Then we initialize the matrix `y`. We decide that the first dimension corresponds to time and the second one to space. We set the boundary conditions and fill `y[:,1]` with `wave.f(0)` and `y[:,end]` with `wave.f(L)`. Since the wave at the initial moment equals to ``f``, we set `y[1,2:end-1] = wave.f.(xs[2:end-1])`. Since the condition at ``t=\Delta t`` amount to -nothing # hide +```math +y(\Delta t, x) = y(0, x) + \Delta t g(x), ``` -To solve the wave equation, we first perform a check that both discretizations are uniform. The better way would be to write a function which admits a non-equidistant discretization but we did not derive the formulas for it. If ```ts``` is equidistant, then ```diff(ts)``` should be a vector of constants and therefore ```diff(diff(ts))``` should be a vector of zeros. +we write `y[2,2:end-1] = y[1,2:end-1] + Δt*wave.g.(xs[2:end-1])`. We must not forget to exclude the boundary points because the string position is attached there. For the remaining times, we use the formula -Then we initialize ```y``` with zeros and set the initial condition ```y[1,:]``` via ```wave.f``` and the boundary conditions ```y[:,1]``` and ```y[:,end]``` which are fixed and, therefore, the same as ```y[1,1]``` and ```y[1,end]```, respectively. We recall that the formulas for computing the solution are ```math -\begin{aligned} -y(\Delta t, x) &= y(0, x) + \Delta t g(x), \\ -y(t + \Delta t,x) &= \frac{c^2\Delta t^2}{\Delta x^2} \Big(y(t,x + \Delta x) - 2y(t,x) + y(t,x - \Delta x)\Big) + 2y(t,x) - y(t - \Delta t,x). -\end{aligned} +y(t + \Delta t,x) = \frac{c^2\Delta t^2}{\Delta x^2} \Big(y(t,x + \Delta x) - 2y(t,x) + y(t,x - \Delta x)\Big) + 2y(t,x) - y(t - \Delta t,x). ``` -Since the boundary conditions are prescribed, we set the first condition to ```y[2,2:end-1]``` and the other to ```y[i+1,2:end-1]```. -```@example wave -function solve_wave(ts, xs, wave::Wave) - norm(diff(diff(ts))) <= 1e-10 || error("Time discretization must be equidistant.") - norm(diff(diff(xs))) <= 1e-10 || error("Space discretization must be equidistant.") - n_t = length(ts) - n_x = length(xs) +This gives rise to the following function. 
+
+```@example wave
+function solve_wave(T, L, wave::Wave; n_t=100, n_x=100)
+    ts = range(0, T; length=n_t)
+    xs = range(0, L; length=n_x)
+    Δt = ts[2] - ts[1]
+    Δx = xs[2] - xs[1]
+    y = zeros(n_t, n_x)
 
-    y = zeros(n_t, n_x)
-    y[1,:] = wave.f.(xs)
-    y[:,1] .= y[1,1]
-    y[:,end] .= y[1,end]
-    y[2,2:end-1] = y[1,2:end-1] + (ts[2]-ts[1])*wave.g.(xs[2:end-1])
-
-    s = wave.c^2 * (ts[2]-ts[1])^2 / (xs[2]-xs[1])^2
-    for i in 2:n_t-1
-        y[i+1,2:end-1] .= s*(y[i,3:end]+y[i,1:end-2]) + 2*(1-s)*y[i,2:end-1] - y[i-1,2:end-1]
+    # boundary conditions
+    y[:,1] .= wave.f(0)
+    y[:,end] .= wave.f(L)
+
+    # initial conditions
+    y[1,2:end-1] = wave.f.(xs[2:end-1])
+    y[2,2:end-1] = y[1,2:end-1] + Δt*wave.g.(xs[2:end-1])
+
+    # solution for t = 2Δt, 3Δt, ..., T; the wave speed is stored in wave.c
+    for t in 2:n_t-1, x in 2:n_x-1
+        ∂y_xx = (y[t, x+1] - 2*y[t, x] + y[t, x-1])/Δx^2
+        y[t+1, x] = wave.c^2 * Δt^2 * ∂y_xx + 2*y[t, x] - y[t-1, x]
     end
+
     return y
 end
 
 nothing # hide
```
+
```@raw html
</div></details>

+``` + + + + + + + +The best visualization of the wave equation is via animation. Each frame will be a plot of a row of `y`. We use the keyword arguments `kwargs`, where we store additional arguments for plotting. We run the for loop over all rows, create the animation via the `@animate` macro and save it into `anim`. To save the animation to the hard drive, we use the `gif` function. + + ```@example wave -function plot_wave(y, file_name; fps=60, kwargs...) - anim = @animate for y_row in eachrow(y) - plot(y_row; kwargs...) +using Plots + +function plot_wave(y, file_name; fps = 60, kwargs...) + anim = @animate for (i, y_row) in enumerate(eachrow(y)) + plot( + y_row; + title = "t = $(i-1)Δt", + xlabel = "x", + ylabel = "y(t, x)", + legend = false, + linewidth = 2, + kwargs... + ) end - gif(anim, file_name, fps=fps) + gif(anim, file_name; fps, show_msg = false) + return nothing end nothing # hide ``` -```@raw html -

-``` + @@ -218,48 +226,47 @@ Now we can finally plot the solution.
Exercise:

``` + Solve the wave equation for ``L=\frac32\pi``, ``T=240``, ``c=0.02`` and the initial conditions + ```math \begin{aligned} -f(x) &= 2e^{-(x-\frac L2)^2} + \frac{y_L}{L}x, \\ +f(x) &= 2e^{-(x-\frac L2)^2} + \frac{x}{L}, \\ g(x) &= 0. \end{aligned} ``` -Do the time discretization with stepsize ``\Delta t=1`` and the space discretization with ``N_x=101`` and ``N_x=7`` steps (plot two graphs, each for one ``N_x``). + +Use time discretization with stepsize ``\Delta t=1`` and the space discretization with number of points ``n_x=101`` and ``n_x=7`` steps. Plot two graphs. + ```@raw html

Solution:

``` + First, we assign the parameters + ```@example wave -f(x,L,y_L) = 2*exp(-(x-L/2)^2) + y_L*x/L +f(x,L) = 2*exp(-(x-L/2)^2) + x/L g(x) = 0 L = 1.5*pi T = 240 c = 0.02 -y_L = 1 nothing # hide ``` -Since we want to do the same task for two different ``N_x``, we write a function ```run_wave```, which performs the discretizations, creates the ```Wave``` structure, solves the wave equation and finally plots the wave. We use different keywords for the ```discretize``` function as we have stepsize for the temporal discretization and number of steps for the spatial discretization. -```@example wave -function run_wave(L, T, Δ_t::Float64, N_x::Int64, file_name; kwargs...) - ts = discretize((0,T); step=Δ_t) - xs = discretize((0,L); length=N_x) - wave = Wave(x -> f(x,L,y_L), g, c) - y = solve_wave(ts, xs, wave) - plot_wave(y, file_name; kwargs...) -end +Now we create the `wave` structure, compute the solution and plot it for with different values of ``n_x``. -nothing # hide -``` -Now we call this function with different values of ``N_x``. All keyword arguments are passed to the ```plot``` function inside ```plot_wave```. ```@example wave -run_wave(L, T, 1., 101, "wave1.gif"; ylims=(-2,3), label="") -run_wave(L, T, 1., 7, "wave2.gif"; ylims=(-2,3), label="") +wave = Wave(x -> f(x,L), g, c) + +y1 = solve_wave(T, L, wave; n_t=241, n_x=101) +plot_wave(y1, "wave1.gif"; ylims=(-2,3), label="") + +y2 = solve_wave(T, L, wave; n_t=241, n_x=7) +plot_wave(y2, "wave2.gif"; ylims=(-2,3), label="") nothing # hide ``` @@ -271,6 +278,4 @@ nothing # hide ![](wave2.gif) -If you see these two waves in different phases (positions), please refresh the page (the animations have already run for some time and the error accumulated). - -After the potential reload, the waves should start from the same location and move at the same speed. This is an important property of any physical system: it is consistent. If we use a different discretization, their behaviour should be *roughly* similar. Of course, the finer spatial discretization results in smoother lines but both waves have similar shapes and move at similar speeds. If we see that one moves two times faster, there is a mistake in the code. +If there are two waves in different phases (positions), please refresh the page. The waves should start from the same location and move at the same speed. This is an important property of any physical system: it is consistent. If we use a different discretization, their behaviour should be *roughly* similar. Of course, a finer spatial discretization results in smoother lines, but both waves have similar shapes and move at similar speeds. If we see that one moves significantly faster, there is a mistake in the code. diff --git a/docs/src/lecture_12/optimal_control.md b/docs/src/lecture_12/optimal_control.md index c453eed45..fab2b2c7f 100644 --- a/docs/src/lecture_12/optimal_control.md +++ b/docs/src/lecture_12/optimal_control.md @@ -26,64 +26,44 @@ end # Optimal control -This section considers the optimal control, which combines ordinary differential equations with optimization. It was extensively studied many decades ago, when it was used to steer rockets in space. +Optimal control combines ordinary differential equations with optimization. It was extensively studied many decades ago when it was used to steer rockets in space. + + + ## Permanent magnet synchronous motors -We will consider the problem of optimal steering of a PMSM (permanent magnet synchronous motor), which appear in electrical drives. 
The motor can be described via a linear equation +We will consider optimal steering of a PMSM (permanent magnet synchronous motor), which appears in electrical drives. It can be described via the linear equation: + ```math \dot x(t) = Ax(t) + q(t) + Bu(t), ``` -where ``x(t)`` is the state, ``q(t)`` is the bias and ``u(t)`` is the control term. More specifically, we have -```math -A = -\begin{pmatrix} \frac{R_1}{L_1} & 0 \\ 0 & \frac{R_2}{L_2} \end{pmatrix} - \omega \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}, \qquad B = \begin{pmatrix} 1 & 0 \\ 0 & 1\end{pmatrix}, \qquad q(t) = \begin{pmatrix} \frac{R_1}{L_1}\psi_{\rm pm} \\ 0 \end{pmatrix}, -``` -where ``R`` is the resistance, ``L`` the inductance, ``\psi`` the flux and ``\omega`` the rotor speed. The state ``x(t)`` are the currents in the ``dq`` reference frame and the control ``u(t)`` is the provided voltage. -For simplicity we assume that the ratio of resistances and inductances is the same and that the bias is constant: -```math -\rho := \frac{R_1}{L_1} = \frac{R_2}{L_2},\qquad q:=q(t) -``` -The goal is to apply such voltage so that the system reaches the desired position ``x_{\rm tar}`` from an initial position ``x_0`` in minimal possible time. With maximal possible allowed voltage ``U_{\rm max}`` this amounts to solving + +where ``x(t)`` is the state, ``q(t)`` is the bias and ``u(t)`` is the control term. In the simplest case, we have + ```math -\begin{aligned} -\text{minimize}\qquad &\tau \\ -\text{subject to}\qquad &\dot x(t) = Ax(t) + q + u(t), \qquad t\in[0,\tau], \\ -&||u(t)||\le U_{\rm max},\qquad t\in[0,\tau], \\ -&x(0) = x_0,\ x(\tau)=x_{\rm tar}. -\end{aligned} +A = -\begin{pmatrix} \rho & 0 \\ 0 & \rho \end{pmatrix} - \omega \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}, \qquad B = \begin{pmatrix} 1 & 0 \\ 0 & 1\end{pmatrix}, \qquad q(t) = \begin{pmatrix} \rho\psi_{\rm pm} \\ 0 \end{pmatrix}, ``` -Discretizing the problem and solving it by means of non-linear programming would result in a large number of variables (their number would also be unknown due to the minimal time ``\tau``) and is not feasible. Instead, we analyze the problem and try to simplify it. +where ``\rho=\frac RL`` is the ratio of resistance and inductance, ``\psi`` the flux and ``\omega`` the rotor speed. Vector ``q`` does not depend on time. The state ``x(t)`` are the currents in the ``dq``-reference frame, and the control ``u(t)`` is the provided voltage. -## Computing trajectories +From the theoretical part, we know that the trajectory of the ODE equals to -From the theoretical part, we know that the optimal solution of the ODE equals to ```math \begin{aligned} x(t) &= e^{At}\left(x_0 + \int_0^t e^{-As}(q+u(s))ds\right) \\ &= e^{At}\left(x_0 + A^{-1}(I-e^{-At})q + \int_0^t e^{-As}u(s)ds\right). \end{aligned} ``` -This term contains the matrix exponential ``e^{At}``. To compute it, we may run ```exp(A)```. It is important to realize that matrix exponential is different from elementwise exponential ```exp.(A)``` (try it). We set up the parameters -```@example oc -ω = 2 -ρ = 0.01 -A = -ρ*[1 0; 0 1] -ω*[0 -1; 1 0] +This term contains the matrix exponential ``e^{At}``. 
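+For a first numerical impression, the eigenvalues can be checked with the `eigen` function from `LinearAlgebra` (a minimal sketch; the values ``\rho=0.1`` and ``\omega=2`` are the ones used later in this section, and the helper name `A_show` is only illustrative):
+
+```@example oc
+using LinearAlgebra
+
+A_show = -0.1*[1 0; 0 1] - 2*[0 -1; 1 0]   # A for ρ = 0.1 and ω = 2
+eigen(A_show).values                       # the eigenvalues -ρ ± iω
+```
+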
It can be shown that the [eigendecomposition](@ref matrix-eigen) of ``A=Q\Lambda Q^\top`` is -nothing # hide -``` -By computing the eigenvalues and eigenvectors -```@example oc -using LinearAlgebra - -λ, V = eigen(A) -``` -we can deduce that eigendecomposition ```math A = \frac 12\begin{pmatrix} i & -i \\ 1 & 1 \end{pmatrix} \begin{pmatrix} -\rho - i\omega & 0\\ 0 & -\rho+i\omega \end{pmatrix} \begin{pmatrix} i & 1 \\ -i & 1 \end{pmatrix}. ``` -We have divided the expression by ``2`` because all eigenvectors should have unit norm. Then the matrix exponential is + +We have divided the expression by ``2`` because the eigenvectors have a unit norm. Then it is simple to compute the matrix exponential. + ```math \begin{aligned} e^{At} &= \frac 12\begin{pmatrix} i & -i \\ 1 & 1 \end{pmatrix} \begin{pmatrix} e^{-\rho t - i\omega t} & 0\\ 0 & e^{-\rho t+i\omega t} \end{pmatrix} \begin{pmatrix} i & 1 \\ -i & 1 \end{pmatrix} \\ @@ -92,134 +72,182 @@ e^{At} &= \frac 12\begin{pmatrix} i & -i \\ 1 & 1 \end{pmatrix} \begin{pmatrix} ``` -```@raw html -

-
Exercise:

-``` -Verify that the matrix exponential is computed correctly and that the it is different from elementwise exponential. -```@raw html -

-
-Solution:

+ + + + +## Computing trajectories with no control + + +We start with no control term, therefore ``u(t)=0``. Then the trajectory simplifies to: + +```math +x(t) = e^{At}\left(x_0 + A^{-1}(I-e^{-At})q \right). ``` -A simple way to verify is to fix some ``t`` and evaluate all the expressions derived above + +Similarly to the wave equation, this system has multiple parameters. To prevent accidentally changing them, we save them in a structure. + ```@example oc -t = 5 -exp0 = exp.(t*A) -exp1 = exp(t*A) -exp2 = V*diagm(exp.(λ*t))*V' -exp3 = exp(-ρ*t)*[cos(ω*t) sin(ω*t); -sin(ω*t) cos(ω*t)] +struct PMSM{T<:Real} + ρ::T + ω::T + A::Matrix{T} + invA::Matrix{T} + + function PMSM(ρ, ω) + A = -ρ*[1 0; 0 1] -ω*[0 -1; 1 0] + return new{eltype(A)}(ρ, ω, A, inv(A)) + end +end nothing # hide ``` -Now, ```exp1```, ```exp2``` and ```exp3``` must be identical and differ from ```exp0```. Since there are rounding errors for different methods, the matrices will not be exactly identical and we need to check whether they norm is almost zero. + +Besides ``\rho``, ``\omega`` and ``A``, we also store the inverse matrix ``A^{-1}`` so that we do not have to recompute it. We now write the `expA` function, which computes the matrix exponential ``e^{At}``. + ```@example oc -norm(exp1 - exp0) >= 1e-10 || error("Matrices are wrong") -norm(exp1 - exp2) <= 1e-10 || error("Matrices are wrong") -norm(exp1 - exp3) <= 1e-10 || error("Matrices are wrong") +function expA(p::PMSM, t) + ρ, ω = p.ρ, p.ω + return exp(-ρ*t)*[cos(ω*t) sin(ω*t); -sin(ω*t) cos(ω*t)] +end nothing # hide ``` -Since the computation resulted in no error (note the opposite sign for ```exp0```), our computation seems to be correct. -```@raw html -

+ +For the rest of this section, we will work with the following parameter setting. + +```@example oc +ρ = 0.1 +ω = 2 +x0 = [0;-0.5] +q = [1;0] + +ps = PMSM(ρ, ω) + +nothing # hide ``` -Similarly to the wave equation, this system has many parameters. To keep track of them, and to prevent accidently changing them in a script, we should save them in a structure. We will create this structure so that it can also compute the matrix exponential and other useful functions. + + + + + +The first exercise checks that we computed the matrix exponential correctly. + ```@raw html
Exercise:

``` -We define the structure -```@example oc -struct Params - ρ - ω - A - invA - expA - expAT - n -end -nothing # hide -``` -which besides ``\rho``, ``\omega`` and ``A`` also stores the inverse matrix ``A^{-1}``, the matrix exponential functions ``t\mapsto e^{At}``, ``t\mapsto e^{A^\top t}`` and the size of ``A``. +Verify that the matrix exponential is computed correctly and that it is different from the elementwise exponential. + +**Hint**: The matrix exponential can also be computed directly by the `exp` function from the `LinearAlgebra` package. -Write a constructor (function) ```Params(ρ, ω)```, which creates this object. ```@raw html

Solution:

``` -The inverse matrix can be obtained by ```inv(A)```. The rest is obtained as the formulas above. for the transposition of the exponential, we need to convert it back to ```Matrix```, otherwise we could have problems later. + +A simple way to verify is to fix some ``t`` and evaluate the expressions above. + ```@example oc -function Params(ρ, ω) - A = -ρ*[1 0; 0 1] -ω*[0 -1; 1 0] - invA = inv(A) - expA(t) = exp(-ρ*t)*[cos(ω*t) sin(ω*t); -sin(ω*t) cos(ω*t)] - expAT(t) = Matrix(expA(t)') - n = size(A,1) - return Params(ρ, ω, A, invA, expA, expAT, n) -end +using LinearAlgebra + +t = 5 +exp0 = exp.(t*ps.A) +exp1 = exp(t*ps.A) +exp2 = expA(ps, t) + +nothing # hide +``` + +While `exp1` and `exp2` must be identical, they must differ from ```exp0```. Since there are rounding errors for different methods, the matrices will not be identical, and we need to check whether their norm is almost zero. + +```@example oc +norm(exp1 - exp0) >= 1e-10 || error("Matrices are wrong") +norm(exp1 - exp2) <= 1e-10 || error("Matrices are wrong") nothing # hide ``` + +Since the computation resulted in no error (note the opposite sign for ```exp0```), our computation seems to be correct. + ```@raw html

``` -For the rest of this section, we will work with the following parameter setting -```@example oc -ρ = 0.1 -ω = 2 -x0 = [0;-0.5] -q = [1;0] -ps = Params(ρ, ω) -nothing # hide -``` + + + + + + + + Now we can finally plot the trajectories of the electric motor. ```@raw html
Exercise:

``` -Consider the case of time interval ``[0,10]``. The other parameters are specified directly above this exercise. -Compute the trajectory ``x(t)`` for no control (``u(t)=0``) using finite differences with ``\Delta t=0.01``. Then compute the exact solution using the formulas derived above. Plot the trajectories as a plot (no animations). +Write two function `trajectory_fin_diff` and `trajectory_exact` which compute the trajectory. The first one should use the finite difference method to discretize the time, while the second one should use the closed-form formula. + +Plot both trajectories on time interval ``[0,10]`` with time discretization step ``\Delta t=0.01``. Since ``x(t)`` is a two-dimensional vector, plot each component on one axis. + ```@raw html

Solution:

``` -We store the solution obtained by finite differences in ```xs1``` and the true solution in ```xs2```. We initialize both arrays and add ```x0``` at the first time instant. Then we use the discretization formula. All the parameters connected with ``A`` are retrieved from the ```ps``` structure. -```@example oc -using Plots -Δt = 0.01 -ts = 0:Δt:10 +Both functions create an empty structure for the solution and then iterate over time. Since finite differences compute the solution at the next time, the loop is one iteration shorter. We compute the iteration based on the formulas derived above. The exact method does not need values at the previous point, which implies that numerical errors do not accumulate due to discretization errors. + +```@example oc +function trajectory_fin_diff(p::PMSM, x0, ts, q) + xs = zeros(length(x0), length(ts)) + xs[:, 1] = x0 -xs1 = zeros(2, length(ts)) -xs1[:,1] = x0 -xs2 = zeros(2, length(ts)) -xs2[:,1] = x0 + for i in 1:length(ts)-1 + xs[:, i+1] = xs[:, i] + (ts[i+1]-ts[i])*(p.A * xs[:, i] + q) + end + return xs +end -eye(n) = Diagonal(ones(n)) +function trajectory_exact(p::PMSM, x0, ts, q) + xs = zeros(length(x0), length(ts)) -for i in 1:length(ts)-1 - xs1[:,i+1] = xs1[:,i] + Δt*(ps.A*xs1[:,i] + q) - xs2[:,i+1] = ps.expA(ts[i])*(x0 + ps.invA*(eye(ps.n)-ps.expA(-ts[i]))*q) + for (i, t) in enumerate(ts) + xs[:, i] = expA(p, t)*(x0 + p.invA * (I - expA(p, -t))*q) + end + return xs end +nothing # hide +``` + +For plotting, we create the time discretization, compute both trajectories and then plot them. + + +```@example oc +using Plots + +ts = 0:0.01:10 + +xs1 = trajectory_fin_diff(ps, x0, ts, q) +xs2 = trajectory_exact(ps, x0, ts, q) + plot(xs1[1,:], xs1[2,:], label="Finite differences") plot!(xs2[1,:], xs2[2,:], label="True value") savefig("Comparison1.svg") # hide ``` + ```@raw html

``` @@ -228,81 +256,112 @@ savefig("Comparison1.svg") # hide The trajectories are different. Something is wrong. However, when we use the time discretization ``\Delta t=0.0001``, the solutions are suddenly equal. -```@setup oc -Δt_aux = Δt -ts_aux = ts - -Δt = 0.0001 -ts = 0:Δt:10 - -xs1 = zeros(2, length(ts)) -xs1[:,1] = x0 -xs2 = zeros(2, length(ts)) -xs2[:,1] = x0 +```@example oc +ts = 0:0.0001:10 -for i in 1:length(ts)-1 - xs1[:,i+1] = xs1[:,i] + Δt*(ps.A*xs1[:,i] + q) - xs2[:,i+1] = ps.expA(ts[i])*(x0 + ps.invA*(eye(ps.n)-ps.expA(-ts[i]))*q) -end +xs1 = trajectory_fin_diff(ps, x0, ts, q) +xs2 = trajectory_exact(ps, x0, ts, q) plot(xs1[1,:], xs1[2,:], label="Finite differences") plot!(xs2[1,:], xs2[2,:], label="True value") savefig("Comparison2.svg") # hide - -Δt = Δt_aux -ts = ts_aux ``` ![](Comparison2.svg) -Can you guess why this happened? The problem is that the finite difference method performs a first order approximation of the non-linear function ``x(t)``. But since the trajectory always "bends leftwards", the finite differences follow this bending with a delay. The error accummulates over time and is quite large at the end. +Can you guess why this happened? The problem is that the finite difference method performs a first-order approximation of the non-linear function ``x(t)``. Since the trajectory always "bends leftwards", the finite differences follow this bending with a delay. The error accumulates over time and is quite large at the end. + + + + + + + ## Solving the optimal control problem -So far, we did not consider any control. This part shows how the optimal control can be computed. Using a rather complicated theory, it can be shown that for any terminal state ``x_{\rm tar}``, there is some ``p_0`` such that the optimal control has form + + + +The goal is to apply such voltage so that the system reaches the desired position ``x_{\rm tar}`` from an initial position ``x_0`` in minimal possible time. With maximal possible allowed voltage ``U_{\rm max}`` this amounts to solving +```math +\begin{aligned} +\text{minimize}\qquad &\tau \\ +\text{subject to}\qquad &\dot x(t) = Ax(t) + q + u(t), \qquad t\in[0,\tau], \\ +&||u(t)||\le U_{\rm max},\qquad t\in[0,\tau], \\ +&x(0) = x_0,\ x(\tau)=x_{\rm tar}. +\end{aligned} +``` + +Discretizing the problem and solving it using non-linear programming would result in many variables (their number would also be unknown due to the minimal time ``\tau``) and is not feasible. Instead, we analyze the problem and try to simplify it. It can be shown that for any terminal state ``x_{\rm tar}``, there is some ``p_0`` such that the optimal control ``u(t)`` has form: + ```math \begin{aligned} p(t) &= e^{-A^\top t}p_0, \\ u(t) &= U_{\rm max}\frac{p(t)}{||p(t)||}. \end{aligned} ``` -The next remark hints at the derivation of these formulas. It can be safely skipped. + + + + + + ```@raw html -
-
Connection with optimization

+

+
BONUS: Connection with optimization

``` -Optimal control forms the Hamiltonian (similar to the [Langrangian](@ref lagrangian)) + +This part hints at the derivation of the previous result and the connection to constrained optimization. Optimal control forms the Hamiltonian (similar to the [Langrangian](@ref lagrangian)) + ```math H = \tau + p(t)^\top (Ax(t) + q + u(t)) ``` + Since the constraint is time-dependent, the adjoint variable ([multiplier](@ref lagrangian)) ``p(t)`` must also depend on time. Differentiating the Hamiltonian with respect to the ``x(t)`` and setting the derivative to ``-\dot p(t)`` (instead of zero as in nonlinear optimization) results in + ```math -\dot p(t) = A^\top p(t), ``` + which has the solution + ```math p(t) = e^{-A^\top t}p_0. ``` + This is the first condition written above. The second condition can be obtained by maximizing the Hamiltonian with respect to ``u`` and arguing that the constraint ``||u(t)||=U_{\rm max}`` will always be satisfied (this goes beyond the content of this lecture). + ```@raw html

``` -It is not difficult to show that + + + + + + +It is not difficult to show + ```math e^{-At}a^{-A^\top t} = e^{2\rho t}I. ``` + We intend to compute the trajectory. The most difficult part is the integral from ``e^{-As}u(s)``. Since + ```math \begin{aligned} e^{-As}u(s) &= U_{\rm max}\frac{e^{-As}e^{-A^\top s}p_0}{||e^{-A^\top s}p_0||} = U_{\rm max}\frac{e^{-As}e^{-A^\top s}p_0}{\sqrt{p_0^\top e^{-As}e^{-A^\top s}p_0}} = U_{\rm max}\frac{e^{2\rho s}p_0}{\sqrt{p_0^\top e^{2\rho s}I p_0}} = U_{\rm max}e^{\rho s}\frac{p_0}{||p_0||}, \end{aligned} ``` + the trajectory equals to + ```math \begin{aligned} x(t) &= e^{At}\left(x_0 + A^{-1}(I-e^{-At})q + \int_0^t e^{-As}u(s)ds\right) \\ @@ -311,116 +370,153 @@ x(t) &= e^{At}\left(x_0 + A^{-1}(I-e^{-At})q + \int_0^t e^{-As}u(s)ds\right) \\ \end{aligned} ``` -This allows us to plot the optimal trajectories. +When we know ``p_0``, we can use this formula to compute the optimal trajectory with control. To compute ``p_0``, we rearrange the previous equation to -```@raw html -
-
Exercise:

+```math +\frac{U_{\rm max}}{\rho}(e^{\rho t}-1) \frac{p_0}{||p_0||} = e^{-At}x(t) - x_0 - A^{-1}(I-e^{-At})q. ``` -Write functions ```x(t, ???)``` and ```trajectory(ts, ???)``` which compute the optimal solution ``x(t)`` and the trajectory ``x(t)_{t\in {\rm ts}}`` (saved into a matrix). -The optimal trajectory depends on the normed vector ``p_0``. All such vectors form a unit circle in ``\mathbb R^2``. Therefore, they can be parameterized by an angle ``\alpha\in[0,2\pi]``. Fix ``U_{\rm max}=0.1`` and time interval ``[0,10]`` with time step ``\Delta t=0.01``. Then plot eight possible optimal trajectories (each would correspond to a different target ``x_{\rm tar}``) with uniformly distributed ``\alpha``. -```@raw html -

-
-Solution:

+We take the norm of both sides to arrive at + +```math +\frac{U_{\rm max}}{\rho}(e^{\rho t}-1) = ||e^{-tA}x(t) - x_0 - A^{-1}(I-e^{-At})q||. ``` -For functions ```x```, we need to rewrite the previous formula into code. For ```trajectory```, we call ```x``` for ```t in ts```. Since ```x``` returns a two-dimensional vector, we need to cat the results to a matrix. -```@example oc -function x(t, ps, x0, U_max, p0, q) - return ps.expA(t)*(x0 + ps.invA*(eye(ps.n)-ps.expA(-t))*q + U_max/ρ*(exp(ρ*t)-1)*p0) -end -trajectory(ts, ps, x0, U_max, p0, q) = hcat([x(t, ps, x0, U_max, p0, q) for t in ts]...) +Since this relation needs to hold for all ``t\in[0,\tau]``, we set ``t=\tau`` and use the target relation ``x(\tau)=x_{\rm tar}``: -nothing # hide +```math +\frac{U_{\rm max}}{\rho}(e^{\rho \tau}-1) = ||e^{-\tau A}x_{\rm tar} - x_0 - A^{-1}(I-e^{-A\tau})q||. ``` -For plotting, we initialize the variables -```@example oc -U_max = 0.1 -Δt = 0.01 -ts = 0:Δt:10 +Since this is one equation for one variable, we can compute the optimal time ``\tau`` from it. + + -nothing # hide -``` -then create an empty plot. We make a uniform discretization of ``[0,2\pi]`` and for each ``\alpha`` from this interval, we compute ``p_0``, the trajectory and finally plot the result. Since we plot in a loop, we need to ```display``` the plot. -```@example oc -pa = plot() -for α = 0:π/4:2*π - p0 = [sin(α); cos(α)] - traj = trajectory(ts, ps, x0, U_max, p0, q) - plot!(traj[1,:], traj[2,:], label="") -end -display(pa) -savefig("Trajectories.svg") # hide -``` -```@raw html -

-``` -![](Trajectories.svg) -Rearranging the previous equation results in -```math -\frac{U_{\rm max}}{\rho}(e^{\rho t}-1) \frac{p_0}{||p_0||} = e^{-tA}x(t) - x_0 - A^{-1}(I-e^{-At})q. -``` -We take the norm of both sides to arrive at -```math -\frac{U_{\rm max}}{\rho}(e^{\rho t}-1) = ||e^{-tA}x(t) - x_0 - A^{-1}(I-e^{-At})q||. -``` -Since this realization needs to hold true for all ``t\in[0,\tau]``, we set ``t=\tau`` and use the target relation ``x(\tau)=x_{\rm tar}`` -```math -\frac{U_{\rm max}}{\rho}(e^{\rho \tau}-1) = ||e^{-\tau A}x_{\rm tar} - x_0 - A^{-1}(I-e^{-A\tau})q|| -``` -All data besides ``\tau`` are known. Therefore, if we solve it, we obtain the optimal time ``\tau``. This is done in the next exercise. ```@raw html
Exercise:

```
-Solve the optimal control problem for ``x_{\rm tar}= (0.25, -0.5)``. Plot the optimal trajectory to this point.
+
+Compute the optimal time for ``x_{\rm tar}= (0.25, -0.5)`` with the maximum voltage ``U_{\rm max} = 0.1``.
+
+**Hint**: To solve the equation above for ``t``, use the [bisection method](@ref l7-exercises).
+
```@raw html
</div></div>
<details class = "solution-body">

Solution:

```
+
+To solve the equation above, we need to find a zero point of
+
 ```math
-f(t) = ||e^{-At}x_{\rm tar} - x0 - A^{-1}(I-e^{-At})q|| - \frac{U_{\rm max}}{\rho}(e^{\rho t}-1)
+f(t) = ||e^{-At}x_{\rm tar} - x_0 - A^{-1}(I-e^{-At})q|| - \frac{U_{\rm max}}{\rho}(e^{\rho t}-1)
 ```
-The graph of the function (plot it) shows that it has a single zero point (for this parameter setting). It can be found by evaluating it at many points at selecting the point with the value closest to zero. A more formal approach is to use the [bisection method](@ref l7-exercises) written earlier.
+
+The graph of the function (plot it) shows a single zero point (for this parameter setting). It can be found by evaluating it at many points and selecting the point with the value closest to zero. A more formal approach is to use the bisection method.
+
 ```@example oc
-x_tar = [0.25;-0.5]
+U_max = 0.1
+x_t = [0.25;-0.5]
 
-f(t) = norm(ps.expA(-t)*x_tar - x0 - ps.invA*(eye(ps.n)-ps.expA(-t))*q) - U_max/ps.ρ*(exp(ps.ρ*t)-1)
+f(t) = norm(expA(ps, -t)*x_t - x0 - ps.invA*(I-expA(ps, -t))*q) - U_max/ps.ρ*(exp(ps.ρ*t)-1)
 τ = bisection(f, minimum(ts), maximum(ts))
 
 nothing # hide
 ```
 
```@raw html

+``` + + + + + +To compute the optimal control and optimal trajectory, we rewrite one of the formulas derived above. + +```@example oc +function trajectory_control(p::PMSM, x0, ts, q, U_max, p0) + xs = zeros(length(x0), length(ts)) + + for (i, t) in enumerate(ts) + eAt = expA(p, t) + emAt = expA(p, -t) + xs[:, i] = eAt*(x0 + p.invA * (I - emAt)*q + U_max/ρ*(exp(ρ*t) - 1)*p0) + end + return xs +end nothing # hide ``` -Then we plot the trajectory and the target point. + +To compute the optimal trajectory, we compute ``p_0`` by the formula derived above. Then we plot the trajectory and the target point. + + ```@example oc +p0 = ps.ρ/(U_max*(exp(ps.ρ*τ)-1))*(expA(ps, -τ)*x_t - x0 - ps.invA*(I-expA(ps, -τ))*q) +p0 /= norm(p0) + +ts = range(0, τ; length=100) + +traj = trajectory_control(ps, x0, ts, q, U_max, p0) + plot(traj[1,:], traj[2,:], label="Optimal trajectory") -scatter!([x_tar[1]], [x_tar[2]], label="Target point") +scatter!([x0[1]], [x0[2]], label="Starting point") +scatter!([x_t[1]], [x_t[2]], label="Target point") savefig("Optimal.svg") # hide ``` + +![](Optimal.svg) + +We confirm that the optimal trajectory leads from the starting to the target point. + + + + + + + + + + + + + + +```@raw html +
+
BONUS: Plotting all optimal trajectories

+``` + +The optimal trajectory depends on the normed vector ``p_0``. All such vectors form a unit circle in ``\mathbb R^2``. Therefore, they can be parameterized by an angle ``\alpha\in[0,2\pi]``. We plot eight possible optimal trajectories, each corresponding to a different target ``x_{\rm tar}``, with uniformly distributed ``\alpha``. Since we plot in a loop, we need to initialize an empty plot and then `display` it. + +```@example oc +ts = 0:0.01:10 + +plt = plot() +for α = 0:π/4:2*π + trj = trajectory_control(ps, x0, ts, q, U_max, [sin(α); cos(α)]) + plot!(plt, trj[1,:], trj[2,:], label="") +end +display(plt) + +savefig("Trajectories.svg") # hide +``` + +![](Trajectories.svg) + + ```@raw html

``` -![](Optimal.svg) + diff --git a/docs/src/lecture_12/theory.md b/docs/src/lecture_12/theory.md index c5ea09d0c..a793e9b05 100644 --- a/docs/src/lecture_12/theory.md +++ b/docs/src/lecture_12/theory.md @@ -1,35 +1,38 @@ -```@setup optim -using Plots -``` - # Differential equations -Differential equations describe many natural phenomena. The Newton's law of motion +Differential equations describe many natural phenomena. Newton's law of motion + ```math F = \frac{\partial}{\partial t}mv ``` -is a simple but extremely important example of an ordinary differential equation. Besides applications in physics and engineering, differential equations appear in almost any (smoothly) evolving system. Examples include economics (Black-Scholes formula) or biology (population growth). There are whole fields dedicated to solving a single equation such as the wave or heat equations. The basic distinction is: + +is a simple but important example of an ordinary differential equation. Besides applications in physics and engineering, differential equations appear in almost any (smoothly) evolving system. Examples include economics (Black-Scholes formula) or biology (population growth). There are whole fields dedicated to solving a single equation, such as the wave or heat equations. The basic distinction is: - *Ordinary differential equations* (ODEs) depend only on time. -- *Partial differential equations* (PDEs) depend both on time and space. The spacial variable is usually 1D, 2D or 3D. -There are many extensions; let us name systems of differential equations, stochastic differential equations, differential algebraic equations (system of differential and non-differential equations) and others. +- *Partial differential equations* (PDEs) depend on space and may depend on time. The spatial variable is usually 1D, 2D or 3D. +There are many extensions; let us name systems of differential equations, stochastic differential equations or differential algebraic equations (system of differential and non-differential equations). ## Ordinary differential equations -Oridnary differential equations take form +Ordinary differential equations take the form + ```math \dot y(t) = f(t, y(t)) ``` -on some interval ``t\in [0,T]``. To obtain a correct definition, the initial value ``y(0)=y_0`` needs to be provided. A solution is a (sufficiently smooth) function ``y(t)`` such that the above formula is satisfied (almost) everywhere on ``[0,T]``. The existence and uniqueness is ensured by mild conditions. + +on some interval ``t\in [0,T]``. To obtain a correct definition, the initial value ``y(0)=y_0`` needs to be provided. A solution is a (sufficiently smooth) function ``y(t)`` such that the above formula is satisfied (almost) everywhere on ``[0,T]``. Mild conditions ensure its existence and uniqueness. ```@raw html
Picard–Lindelöf theorem

``` + Suppose ``f`` is uniformly Lipschitz continuous in ``y`` (the Lipschitz constant is independent of ``t``) and continuous in ``t``. Then for some value ``\varepsilon > 0``, there exists a unique solution ``y(t)`` to the initial value problem on the interval ``[t_0-\varepsilon, t_0+\varepsilon]``. + ```@raw html

``` -However, it may happen than even simple equations do not have a unique solution. + +However, it may happen that even simple equations do not have a unique solution. ```@raw html
@@ -42,14 +45,14 @@ The uniqueness of solution is not guaranteed even for simple equations. Equation y(0) &= 0 \end{aligned} ``` -has at least two solution: ``y_1(t)=0`` and ``y_2(t)=(\frac t3)^3``. This is possible because the right-hand side of the ODE has an infinite derivative at zero and the Lipschitz continuity assumption of the Picard–Lindelöf theorem is not satisfied. +has at least two solution: ``y_1(t)=0`` and ``y_2(t)=(\frac t3)^3``. This is possible because the right-hand side of the ODE has an infinite derivative at zero, and the Lipschitz continuity assumption of the Picard–Lindelöf theorem is not satisfied. ```@raw html

``` -The theory of partial differential equations is complicated because they employ a special definition of derivative (weak derivative) and the solution is defined on special spaces (Sobolev spaces). It may happen that a function has (weak) derivative but it is not continuous. +The theory of partial differential equations is complicated because they employ a special definition of derivative (weak derivative), and the solution is defined on special spaces (Sobolev spaces). -## Linear ODE +## Linear ordinary differential equations Linear ordinary equations ```math @@ -61,7 +64,7 @@ e^{A} = \sum_{k=0}^{\infty} \frac{A^k}{k!}, ``` where ``A^k`` is the standard multiplication of matrices. This is a generalization from scalars to matrices and has similar properties. For example, the derivative of the matrix exponential is the same object due to ```math -\frac{\partial}{\partial A}e^{A} = \sum_{k=0}^{\infty} A\frac{A^{k-1}}{k!} = \sum_{k=0}^{\infty} \frac{A^k}{k!} = e^{A}. +\frac{\partial}{\partial A}e^{A} = \sum_{k=1}^{\infty} \frac{kA^{k-1}}{k!} = \sum_{k=1}^{\infty} \frac{A^{k-1}}{(k-1)!} = e^{A}. ``` Then the solution of the linear equation above equals to ```math @@ -73,30 +76,35 @@ Indeed, the derivative of the previous term equals to ``` because ``e^{At}e^{-At}`` is the identity matrix (similarly to ``e^xe^{-x}=1``). The matrix exponential can be computed using matrix decompositions. -## Matrix decompositions -There are many matrix decompositions. They are closely related to finding eigenvalues and eigenvectors. We mention only two basic decompositions. +```@raw html +
+
Matrix decompositions in solving linear ODEs

+``` -#### Cholesky decomposition +The [lecture on statistics](@ref matrix-eigen) showed the eigendecomposition for square matrices ``A\in\mathbb R^{n\times n}``. The eigendecomposition -For a real positive semidefinite matrix ``A``, its Cholesky decomposition always exists and reads ```math -A = LL^\top, +A = Q\Lambda Q^{-1} ``` -where ``L`` is a real lower triangular matrix (zeros above the diagonal). This matrix is easily invertible and the decomposition is used in iterative algorithms where the matrix inversion needs to be computed many times. -#### Eigenvalue decomposition +exists whenever ``A`` has ``n`` eigenvalues, which may even be even complex. Then the matrix exponential equals to -For a square matrix ``A`` of size ``n\times n``, we assume that there are ``n`` linearly independent eigenvectors (matrices which do not satisfy this are called defective). Then the eigendecomposition equals to ```math -A = Q\Lambda Q^{-1}, +e^{At} = e^{Q(\Lambda t) Q^{-1}} = Qe^{\Lambda t} Q^{-1}. ``` -where ``\Lambda`` is a diagonal matrix with eigenvalues on the diagonal and the columns of ``Q`` are orthonormal eigenvectors. This allows us to compute the matrix exponential + +Because ``\Lambda`` is a diagonal matrix, its exponential equals to a diagonal matrix with entries + ```math -e^A = e^{Q\Lambda Q^{-1}} = Qe^\Lambda Q^{-1} +(e^{\Lambda t})_{ij} = \begin{cases} e^{\Lambda_{ii} t}&\text{if }i=j, \\ 0&\text{otherwise.}\end{cases} ``` -as ``e^\Lambda`` is a diagonal matrix with entries -```math -(e^\Lambda)_{ij} = \begin{cases} e^{\Lambda_{ii}}&\text{if }i=j \\ 0&\text{otherwise}\end{cases}. + +Since we need to compute the matrix exponential multiple times, similarly to [LASSO](@ref lasso), using the eigendecomposition saves computational times. + +```@raw html +

``` -This allows to compute the closed-form solution of the linear ODEs. + + +
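+
+To illustrate the last point, the identity ``e^{At} = Qe^{\Lambda t}Q^{-1}`` can be verified numerically. The following is a minimal sketch, assuming only the standard `LinearAlgebra` package; the matrix ``A`` below is arbitrary.
+
+```julia
+using LinearAlgebra
+
+A = [0.0 1.0; -2.0 -3.0]    # an illustrative matrix with eigenvalues -1 and -2
+t = 0.5
+
+λ, Q = eigen(A)             # eigendecomposition A = Q*Diagonal(λ)*inv(Q)
+
+E_direct = exp(A*t)                    # matrix exponential computed directly
+E_eigen = Q * Diagonal(exp.(λ*t)) / Q  # Q * e^{Λt} * Q⁻¹
+
+norm(E_direct - E_eigen)    # almost zero
+```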