# erik-dali/Scientific-Computing

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

24 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

## Intro

This is a collection of my MATLAB code related to scientific computing and probability.

## 1. Sampling Probability Distributions in MATLAB

For the first part I define a couple of functions to draw random samples from probability distributions and plot their histograms. The main function is called `hst`, and the function that draws the samples is `sampler`.

The function `hst` creates a histogram for a given probability distribution by generating a large number of i.i.d. samples using a sampler function called `sampler`. It returns a set of parameters that we are interested in, and it can also plot confidence intervals for the bins along with the pdf itself. The signature is as follows:

```matlab
[bins, frq, fx, cip, mu, var_dist, var_mean] = hst(@sampler, bins, samples, xmin, xmax, graph)
```

where on the right-hand side the function takes as input:

- `@sampler` is the function handle of the sampler for a pdf
- `bins` is the number of bins for the histogram
- `samples` is the number of samples to generate for the histogram
- `xmin` and `xmax` are the lower and upper bounds of the histogram interval
- `graph` takes `y` or `n` to determine whether the histogram should be plotted,

and on the left-hand side the function returns:

- `bins` is the vector of bins
- `frq` is the vector of frequencies for each bin
- `fx` is the vector of empirical (numerical) probabilities, i.e. the probability of each bin
- `cip` is the vector of confidence intervals for each bin's probability
- `mu` is the empirical mean
- `var_dist` is the empirical variance
- `var_mean` is the variance of the mean estimator

Note that the last bin will contain all observations larger than the upper bound of the interval.
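
The full implementation lives in the repository; purely as an illustration, a minimal `hst`-style function could look like the sketch below. The bin layout, the normal-approximation confidence intervals, and the plotting details are my assumptions (the pdf overlay shown in the figures below is omitted), and the inputs are renamed `nbins`/`nsamples` so they don't clash with the `bins` output.

```matlab
function [bins, frq, fx, cip, mu, var_dist, var_mean] = hst(sampler, nbins, nsamples, xmin, xmax, graph)
%HST Histogram of i.i.d. samples drawn via SAMPLER on [xmin, xmax].
%   Sketch only: the bin layout, the normal-approximation confidence
%   intervals, and the plotting details are assumptions.
    x = sampler(nsamples);                           % draw i.i.d. samples
    x(x > xmax) = xmax;                              % last bin absorbs overflow
    edges = linspace(xmin, xmax, nbins + 1);         % uniform bin edges
    bins  = (edges(1:end-1) + edges(2:end)) / 2;     % bin centers
    frq   = histcounts(x, edges);                    % counts per bin
    fx    = frq / nsamples;                          % empirical bin probabilities
    cip   = 1.96 * sqrt(fx .* (1 - fx) / nsamples);  % ~95% CI half-widths
    mu       = mean(x);                              % empirical mean
    var_dist = var(x);                               % empirical variance
    var_mean = var_dist / nsamples;                  % variance of the mean estimator
    if strcmp(graph, 'y')
        w = edges(2) - edges(1);                     % bin width
        bar(bins, fx / w, 1); hold on;               % density-scaled histogram
        errorbar(bins, fx / w, cip / w, 'k.');       % confidence intervals per bin
        hold off;
    end
end
```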

### Testing the histogram function `hst`

I use the `hst` function to plot the histogram of f_E(x) = exp(−x) on the interval [0, 10] using 50 bins and 2,500 samples. The green curve and the right y-axis show the graph of f_E.

*Figure: histogram of the exponential distribution.*
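
This plot can be reproduced with an inverse-transform sampler: the exponential CDF F(x) = 1 − exp(−x) inverts in closed form, so a uniform draw u maps to −log(1 − u). A minimal sketch (the vectorized `sampler(n)` signature is an assumption):

```matlab
function x = sampler(n)
%SAMPLER Draw n i.i.d. samples with pdf f_E(x) = exp(-x), x >= 0,
%   by inverse-transform sampling: if U ~ Uniform(0,1), then
%   X = -log(1 - U) has CDF F(x) = 1 - exp(-x).
    u = rand(n, 1);      % uniform draws on (0,1)
    x = -log(1 - u);     % invert the exponential CDF
end
```

With the `hst` sketch above, the figure would then correspond to a call like `hst(@sampler, 50, 2500, 0, 10, 'y');`.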

Let X_1 and X_2 be two i.i.d. random variables with pdf f(x) = exp(−x). Then their sum Y = X_1 + X_2 is also a random variable, and the pdf of Y is the convolution of the pdfs of X_1 and X_2. Therefore we can define a new function to sample from Y. Here's how I derived the pdf for Y using convolution:

*Figure: derivation of the pdf of the sum of two i.i.d. random variables via convolution.*
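
Reconstructing that computation: both densities vanish for negative arguments, so the convolution integral truncates to [0, y] and

$$
f_Y(y) = \int_{-\infty}^{\infty} f(x)\, f(y - x)\, dx
       = \int_{0}^{y} e^{-x}\, e^{-(y - x)}\, dx
       = e^{-y} \int_{0}^{y} dx
       = y\, e^{-y}, \qquad y \ge 0.
$$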

### Histogram for the new pdf

Using the new sampler derived above, 100 bins, and N = 10^5 samples on the interval [0, 10], we get the following:

*Figure: histogram of the sum of two i.i.d. random variables.*
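
Instead of inverting the CDF of f_Y, the sampler for Y can simply add two independent exponential draws, since that sum has exactly the density y·e^{−y} derived above. A minimal sketch (the name `sampler2` is hypothetical):

```matlab
function y = sampler2(n)
%SAMPLER2 Draw n i.i.d. samples of Y = X_1 + X_2 with X_1, X_2
%   independent Exp(1), so that Y has pdf f_Y(y) = y * exp(-y).
    y = -log(1 - rand(n, 1)) - log(1 - rand(n, 1));  % sum of two exponential draws
end
```

The figure then corresponds to a call like `hst(@sampler2, 100, 1e5, 0, 10, 'y');`.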

## 2. Nonlinear Least-Squares Fitting

Here I fit randomly generated data to a nonlinear function with parameters, y = f(x; **c**). The least-squares error is the objective function to be minimized. Although this is an optimization problem, it can also be thought of as an overdetermined system of nonlinear equations for the parameter vector **c**.

*Equation: definition of the model function f(x; **c**).*

where `f` is an exponentially damped sinusoidal curve with the unknown parameter vector `c`. Here `c` is a vector of four scalars:

- Amplitude `c_1`
- Decay `c_2`
- Period `c_3`
- Phase `c_4`

I generate synthetic data of 100 points randomly and uniformly distributed in 0 <= x <= 10. Then I plot the true function and the perturbed data, where c = (1, 1/2, 2, 0).

*Figure: the true function and the perturbed data.*
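
Since the exact formula lives in the image above, the data-generation sketch below assumes a model consistent with the parameter names, f(x; c) = c_1 e^{−c_2 x} sin(2πx/c_3 + c_4), together with a small additive Gaussian noise level; both the formula and the noise level are illustrative assumptions, not necessarily the repository's exact choices:

```matlab
% Assumed model: amplitude, decay, period, phase (may differ from the original)
f = @(x, c) c(1) * exp(-c(2) * x) .* sin(2*pi*x / c(3) + c(4));

c_true = [1; 0.5; 2; 0];                  % true parameters c = (1, 1/2, 2, 0)
x = 10 * rand(100, 1);                    % 100 points uniform on [0, 10]
noise = 1e-2;                             % assumed noise level (illustrative)
y = f(x, c_true) + noise * randn(100, 1); % perturbed data
```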

I implement the Gauss-Newton method with the initial guess c0 = (0.95, 0.45, 1.95, 0.05) and solve the normal equations:

$$
\left(J^\top J\right)\delta c = J^\top r, \qquad c \leftarrow c + \delta c,
$$

where $J$ is the Jacobian of `f` with respect to `c` and $r$ is the vector of residuals,

and set the stopping criterion to 20 iterations. We find the partial derivatives of `f` by hand and compute the Jacobian matrix at each iteration. After 5 iterations the estimates converge to the actual values with full machine accuracy.
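
A minimal Gauss-Newton loop for the assumed model above; the Jacobian columns are the hand-computed partials ∂f/∂c_i of that assumed form, so they may differ from the repository's:

```matlab
c = [0.95; 0.45; 1.95; 0.05];        % initial guess c0
for k = 1:20                         % stopping criterion: 20 iterations
    th = 2*pi*x / c(3) + c(4);       % phase argument of the sinusoid
    ex = exp(-c(2) * x);             % damping factor
    r  = y - c(1) * ex .* sin(th);   % residual vector
    J  = [ ex .* sin(th), ...                             % df/dc1 (amplitude)
          -c(1) * x .* ex .* sin(th), ...                 % df/dc2 (decay)
          -c(1) * (2*pi*x / c(3)^2) .* ex .* cos(th), ... % df/dc3 (period)
           c(1) * ex .* cos(th) ];                        % df/dc4 (phase)
    c  = c + (J' * J) \ (J' * r);    % solve the normal equations for the step
end
```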

If we instead set the initial guess to c = (1, 1, 1, 1), the method diverges: after 8 iterations the program fails because the matrix is no longer positive definite. Trying other initial guesses, we find that the method frequently diverges when the initial guess is not close to the true parameters.

- Theoretically, convergence is only quadratic when the initial guess is close enough to the actual values.
- If we set the perturbation ε = 0 (i.e. noise-free data), we get full machine accuracy after 6 iterations.
- The Gauss-Newton method is derived from Newton's method for optimization and can solve nonlinear least-squares problems. It has the advantage of not requiring second derivatives, which can be difficult to compute (source: [Wikipedia](https://en.wikipedia.org/wiki/Gauss-Newton_algorithm)).

With the initial guess c = (1, 1, 1, 1), the Gauss-Newton method fails to converge. Switching to the Levenberg-Marquardt method: experimentally, if we set the damping parameter λ1 = 5.11e−3, then after 20 iterations the results converge to full machine accuracy.
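
For reference, Levenberg-Marquardt damps the Gauss-Newton normal equations, which is where λ1 enters:

$$
\left(J^\top J + \lambda_1 I\right)\delta c = J^\top r.
$$

In the Gauss-Newton sketch above this is a one-line change, `c = c + (J'*J + lambda1*eye(4)) \ (J'*r);`, with `lambda1` the damping parameter.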

*Figure: Levenberg-Marquardt algorithm.*

For larger values of λ1, convergence becomes faster; however, beyond a certain threshold, increasing λ1 does not improve the speed of convergence. It is more helpful to start with a better initial guess than to rely on increasing the magnitude of λ1.
