4. Fitting

Overview

Spinmob provides a nonlinear fitter based on scipy.optimize.leastsq(). It is capable of fitting arbitrary curves to an arbitrary number of data sets (simutameously), showing the data, curves, and residuals as you go. It also allows you to change the fit range, guess parameters, which parameters are fixed, and more on the fly.

Basic Workflow (using IPython console)

1. Import spinmob

In []: import spinmob as s

2. Create a fitter object and look at it

First let's make a fitter object:

In []: my_fitter = s.data.fitter()

Let's inspect the object by typing its name:

In []: my_fitter
Out[]:

SETTINGS
  autoplot        True
  coarsen         [1]
  fpoints         [1000]
  plot_bg         [False]
  plot_ey         [True]
  plot_fit        [True]
  plot_guess      [True]
  plot_guess_zoom [False]
  scale_eydata    [1.0]
  style_bg        [{'color': 'k', 'marker': '', 'ls': '-'}]
  style_data      [{'color': 'b', 'marker': '+', 'ls': ''}]
  style_fit       [{'color': 'r', 'marker': '', 'ls': '-'}]
  style_guess     [{'color': '0.25', 'marker': '', 'ls': '-'}]
  subtract_bg     [True]
  xlabel          [None]
  xmax            [None]
  xmin            [None]
  xscale          ['linear']
  ylabel          [None]
  ymax            [None]
  ymin            [None]
  yscale          ['linear']

Look at all those settings! You can change any of them using the set() function, e.g., my_fitter.set(ymin=32, plot_guess=False) and retrieve their values with dictionary-like indexing, e.g., my_fitter['ymin']. These settings will affect how the data is processed and visualized.

Let's now specify a function we will use to fit some data:

In []: my_fitter.set_functions(f='a*x+cos(b*x)+c', p='a,b=2,c')

Here, the argument f is the function (in this case a string), and p is a comma-delimited list of fit parameters (also a string), with optional guess values. You can always set guess values later using set() as above, e.g., my_fitter.set(c=12.2). For f you can specify a single function, as shown above, or list of functions for simultaneously fitting multiple data sets. You can furthermore specify constant parameters and background functions (see below) if you like.

Function strings are exposed to all of numpy (e.g. cos and sqrt), and you can include your own dictionary of globals via the g argument. This is especially useful if you want to define a more complex python function with, e.g., for loops or something fancy fancy inside.

All settings, constants, and parameter guesses can be set with either the f.set() or f() function. For example, f(a=3.2, xmin=1, xmax=4).

3. Give the fitter some data.

In []: my_fitter.set_data(xdata=[1,2,3,4,5], ydata=[1,2,1,4,3], eydata=0.7)

All three arguments can be numpy arrays (Perhaps databox columns? See 2. Data Handling) or lists of numpy arrays (with length matching the number of functions!). Using the default settings, the command above should pop up a plot of the data (blue) and guess (gray):

The way (or whether) these things are plotted can be changed using f.set() as described above. Note for multiple data sets, if you only supply one xdata set, it will assume the same xdata for each ydata set.

4. Fit.

In []: my_fitter.fit()

This will perform the least-squares minimization, updating the plots and residuals:

You can see the results of the fit on the plot or by again inspecting the f object as we did in step 1. The output of scipy.optimize.leastsq() is stored in f.results. It is a list, the first element is an array of fit parameters, and the second element is the covariance matrix. See the leastsq documentation for additional info.

Note that if you do not specify the y error bars (eydata) the fitter will make a completely arbitrary guess based on the max and min values of the supplied data. As a result, the reduced chi^2 will be meaningless, the parameter errors / correlations will be totally wrong, and your eyes will eventually cease to smile when your mouth does. You should always estimate and justify your errors, undergrads, because I am grading your reports. However, if you are not my student and you just want a dirty estimate of the error in the fit parameters, you can cheat a bit using f.autoscale_eydata_and_fit() a few times until your reduced chi^2 is 1. Use at your own risk: this basically amounts to estimating the error by sampling the data (which contains the science you are trying to fit!) itself, and assumes that your curve perfectly describes the data and that the errors are totally random, Gaussian, and uncorrelated. Good luck explaining this on a final report, much less how all of your reduced chi^2 values happen to be exactly 1.

Background Functions

You can also specify "background functions" by setting the parameter bg just as you do for f. For example:

In []: f = s.data.fitter(f='a/(1.0+(x-x0)**2/w**2) + b + c*x',
                  p='a=1, x0=1, w=1, b=0, c=0',
                  bg='b+c*x')
In []: f.set_data([1,2,3,4,5,6,7], [1,2,3.5,2,1,0,-1], 0.3)
In []: f.fit()

will produce this:

Notice both the guess and fit functions now display the specified background function, for reference. The background function is purely for visualization and has no effect on the fit. Specifying bg essentially amounts to specifying what part of the fit function is due to background.

If you like, you can specify that the background function be subtracted from the data prior to plotting by setting f(subtract_bg=False). The result is then:

Notice the fit background was subtracted from everything, including the guess (which is why the guess is skewed despite the guess c being 0). Prior to the fit, however, the guess background will be subtracted from everything.

Advanced Function Fitting

Sometimes it is difficult to include the fit function model into a string. Instead of inputing a string to the fitter, you can also input an arbitrarily complicated function. For example, suppose we have the function:

def loud_cos(x, a, b, c):
    """
    Loud cosine with an offset, 'a*x*cos(b*x)+c'. 
    """

    # Make some noise (or do crazy calculations)
    for n in range(10): print("Yay!")
    
    # Return some cosine
    return a * x * np.cos(b*x) + c

You can then set the f parameter of the fitter to this function directly:

f = s.data.fitter(f=loud_cos, p='a,b,c')

Note this technique requires that the function takes the "x" data as the first parameter, and the fit parameters as the rest, and that the user-specified parameter list match the order of the parameter list in the function. Alternatively (my preferred method), you can use a globally defined function within the usual function string, e.g.:

f = s.data.fitter(f='d+y(x,a,b,c)**2', p='d,b,c,a', g=dict(y=loud_cos))

In this case, you must also specify a "globals" dictionary (the g keyword argument) so the fitter knows what these objects actually are. Here I used the function y in the string, and specified that y is loud_cos so the fitter will know what to do with y when it sees it. Note you can actually send arbitrary python objects to the fitter string with the g argument.

Multiple Data Sets

You can also specify multiple data sets and a function to fit each using the same set of parameters. You accomplish this by sending a list of functions and a list of data sets, for example:

f = s.data.fitter(['a*sin(x)+b', 'a+b*x+c**2'], p='a,b,c')
f.set_data([[1,2,3,4,5], [1,2,3,4,5]], [[1,3,1,4,1], [1,5,1,5,1]], 1.2)

In this case, a figure for each data set should pop up, and you can then proceed as normal. The default arguments for s.data.fitter() and my_fitter.set_data() have another an example of this.

Handy Functions

These are some of the more commonly used functions when fitting (assumes your fitter object is called f).

f.set() or f() - allows you to change one or more parameters or settings
f.ginput() - click the specified figure to get x,y values (great for guessing!)
f.trim() - keeps only the currently visible data for fitting (use the zoom button on the graph!)
f.zoom() - zooms out
f.fix() - fix one or more of the parameters
s.tweaks.ubertidy() - auto-formats figure for Inkscape (use the save button on the graph and save in 'svg' format)
f(a=11, b=2, plot_guess=False) - change some guess values and settings (disabling the guess plot)
f(coarsen=3, subtract_bg=[True, False]) - coarsen the data (by averaging every 4 data points) prior to fitting (```coarsen=0''' means "no coarsening"), and also subtract the background function from only the first data set.

Note when changing a setting, specifying a list allows you to change settings for each plot, and specifying just a value will change the setting for all plots.

Most of the settings are self-explanatory (just try them!), but my favorites are

coarsen - how much to bin/average the data prior to fitting (0 means "no coarsening")
xmin, xmax - x-range over which we should actually fit the data

Enjoy!

Up next: 5. Settings, Dialogs, Printing

Provide feedback

Saved searches

Use saved searches to filter your results more quickly