documentation

USING MFIT

This documentation is fairly brief at the moment. We hope to include further
information, including real- and Fourier-space mathematical descriptions of the
model components, in a future release.


Running mfit/clfit/fitgui
=========================

There are three interfaces to the model-fitting code. 'mfit' should be run in a
terminal window, and prompts for input as it goes along. 'clfit' (for
command-line fit) gets all required parameters from the command line and does
not prompt for further input. Both can make plots to any specified PGPLOT
device. 'fitgui' is a graphical front-end to clfit.

The command-line arguments accepted by clfit can be listed by typing 'clfit'
without any arguments. The prompts in mfit and the labelling of GUI elements in
fitgui should be self-explanatory. If not, a little trial and error may help!


Plotting
========

mfit plots the uv coverage immediately after reading the data. It also plots
squared visibility, triple product amplitude, and triple product phase against
the (longest) projected basline, first for the initial model, then for the
best-fit model.  Flagged points are shown in red. If all points of a particular
data type are flagged, or there is no data of that type, the plot is skipped.

After each plot, the program asks 'enter x-axis range for replot ([return] to
skip)'. Enter two numbers followed by RETURN to plot the same data over a
narrower x-axis range, or just RETURN to continue.

clfit will generate at most one plot for each invocation. Use the '-p'
argument to specify the plot type, or the '-z' argument to specify the plot
type and x-axis range. Valid plot types are:

uv          uv coverage for squared-visibility data
vis2        Squared visibility against projected baseline
t3amp       Triple product amplitude against longest projected baseline in
            triangle
t3phi       Triple product phase against longest projected baseline in triangle
post N      One-dimensional cut through the minimum of the negative log
            posterior, along the axis of variable parameter N (i.e. not a
            marginalised posterior)
post2d M N  Two-dimensional slice through the minimum of the negative log
            posterior, along the axes of variable parameters M and N
vis2-wl     Squared visibility against wavelength (for data with multiple
            spectral channels)
t3amp-wl    Triple product amplitude against wavelength
t3phi-wl    Triple product phase against wavelength
vis2-mjd    Squared visibility against Modified Julian Day
t3amp-mjd   Triple product amplitude against Modified Julian Day
t3phi-mjd   Triple product phase against Modified Julian Day
vis2-st     Squared visibility against Greenwich Mean Sidereal Time
t3amp-st    Triple product amplitude against Greenwich Mean Sidereal Time
t3phi-st    Triple product phase against Greenwich Mean Sidereal Time
vis2-pa     Squared visibility against baseline position angle

fitgui will similarly make up to one plot each time 'Go' is clicked. The plot
types are as listed above.


Data files
==========

mfit and clfit (and hence fitgui) will read the following types of data file:

*fits	    OIFITS format - see
	    http://www.mrao.cam.ac.uk/~jsy1001/exchange/
	    Only the squared visibility and triple product data types are
	    used.

*calib      MSC wbCalib and nbCalib formats
            The code for reading these formats has not been tested extensively.

[The file formats below have been used for COAST data, but won't be of much
 interest to the community. Details are in the COAST Data Reduction Manual]

.vis	    Old analyse format (no errors on data points), visibility
            amplitudes only

.nvis	    New analyse format (errors on data points), visibility
            amplitudes only

.mapdat	    Mapping data format. Can contain (squared) visibility amplitudes
	    and/or triple products

Both .vis and .nvis files contain *signed* visibility amplitudes. The models
fit by analyse are all centro-symmetric, hence have purely real visibilities,
so you are supposed to assign appropriate signs (ideally using contemporaneous
closure phase data) to sqrt(V^2) values when constructing a vis or nvis file.

mfit ignores these signs and works with squared visibilities. You should put
any closure phase measurements in explicitly (you will need to create a mapdat
or OI-FITS file for this).

Model visibilities are coherently averaged over a top-hat bandpass with the
bandwidth from the data file (or supplied by the user), to take into account
bandwidth smearing.


Model file format
=================

The program reads through the model file until it comes across the keyword
'source' at the begining of a line. It then reads the text following this
keyword as the source/model name.

The program then reads in the various components of the model. It runs through
the model file until it comes across the keyword "component" at the begining of
the line. It then reads the data for that particular component: N.B. to
disregard a component simply replace "component" with "!component" etc - it
will then be ignored as the keyword marking a component info block.

Component data must follow this prescription: keyword is present on the start
of the line followed by the values associated with it.

Maximum number of components is set to 10.  Maximum order of taylor and
gauss-hermite LD types is also set to 10.

By default, models are wavelength-independent.

Keyword            Parameter information
-------            ---------------------

name     	   A one word name for the component itself, e.g. 'main' or 
		   'hot_spot' or something.

shape_type         One word specifying the shape of the component. Can use 
		   either 'ellipse' 'disc' or 'point' obviously corresponding
		   to elliptical, circular or unresolved point objects.

ld_type            One word specifying limb darkening type of the component, 
	           'uniform', 'taylor', 'hestroffer', 'square-root', 
	           'gauss-hermite' or 'gaussian'. Hankel transform functions 
		   all as defined in the maths work. if 'point' is specified 
	           as the shape_type then nothing should be put here.

		   Alternatively a filename in angle brackets <> may be given
		   (only for one model component). This should be a text file
		   specifying a limb-darkening profile numerically, i.e. r and
		   I(r), perhaps as a function of wavelength. Acceptable
		   formats for the file are described under 'Numerical
		   limb-darkening' below.

ld_order           A single integer (1-10) specifying the order of the
		   expansion in the 'taylor' and 'gauss-hermite' LD cases.
                   No value should be put here in other cases.

position           Two values follow describing the position of the component
		   with respect to the chosen phase reference point.
	           First number is the polar radius r of the component in mas.
                   Second number is the polar angle theta in degrees (measured
                   N>E)

position_prior     Two values giving the prior widths for the position data
		   above (prior assumed gaussian of standard deviation = prior
		   width)

flux               One value giving the flux B of the source.
                   Visibilties are always normalised so that unit visibility is
                   obtained on zero baselines therefore flux is a relative 
                   quantitity and has no meaning for a single component model.

flux_prior         One value giving prior width on the flux

shape_param        Up to three values giving shape info.
                   if 'point' is the shape type nothing should be put here.
                   if 'disc' is the shape type only one number should be put,
                   the diameter, a, of the disc in mas. ("diameter" is defined 
                   slightly differently depending on the limb darkening model,
                   e.g. it is FWHM for gaussian - see the maths stuff).
                   if 'ellipse' is the shape type all three numbers are needed
                   giving the major axis, a, in mas, the orientation, phi, of 
		   the major axis (measured N>E) in degrees 
		   and the eccentricity factor epsilon.

shape_param_prior  A prior for each of the numbers given above

ld_param           Up to twenty limb darkening parameters may be put here.
                   For 'uniform' and 'gaussian' LD types nothing should
                   be here. 
		   For 'hestroffer' only one paramter, the hestroffer 
		   parameter "alpha" should be put. 
		   For 'square-root' two parameters, the "alpha" and "beta"
		   coefficients of (1-mu) and (1-root(mu)) should be put.
                   'gauss-hermite' and 'taylor' cases require the same number
                   of parameters as specified in the ld_order field above.

ld_param_prior     Prior widths associated with each number supplied above


Numerical limb-darkening
========================

A filename in angle brackets <> may be given as the 'ld_type', for at most
one component of the model. The file should conform to one of the following
formats. The format corresponding to the filename extension (see below) will be
assumed.

As a side-effect of fitting the model to the data, fake visibility data for a
star with the initial diameter from the model file is written to a file
'fort.20' (this is a hidden feature!). The output is a text file with two
columns: visibility, and baseline in mega-lambda.

.clv file - Wavelength-independent limb-darkening:
--------------------------------------------------

This should be a two-column text file specifying r and I(r). The best-fit
diameter will correspond to the r=1 point. r should increase monotonically
through the file, and the first and last lines should have r=0 and I=0
respectively. Comment lines beginning with '#' and blank lines will be ignored.

Any other extension - Wavelength-dependent limb-darkening:
----------------------------------------------------------

This text file format specifies I(mu, lambda), where mu=sqrt(1-r**2). An
example of the top part of a file in this format follows (beware long lines):

# No. of mu values, followed by values themselves
18
0.000  0.010  0.025  0.050  0.075  0.100  0.125  0.150  0.200  0.250  0.300  0.400  0.500  0.600  0.700  0.800  0.900  1.000
# Each line consists of temperature(K) logg wavelength(nm) I(0) ... I(1.00)
4000    1.0     401     0.005680  0.006544  0.007839  0.009456  0.010797  0.012004  0.013138  0.014233  0.016391  0.018591  0.020900  0.025952  0.031787  0.038499  0.046143  0.054775  0.064395  0.075003
4000    1.0     403     0.005695  0.006484  0.007667  0.009172  0.010437  0.011581  0.012664  0.013717  0.015802  0.017946  0.020194  0.025215  0.031029  0.037786  0.045538  0.054343  0.064201  0.075128
...

Again, comment lines beginning with '#' and blank lines will be ignored. Any
number of lines giving I(mu) for a particular wavelength may be present, but
wavelength should increase monotonically through the file. The temperature and
logg values are ignored by mfit.

The wavelength-dependent format is derived from that used for the Kurucz models
supplied to Chris Haniff by Bill Tango in 2001 (I think).


Model Files: Advanced Features
=============================
:TODO: document the following advanced features:

* Support for wavelength-dependent models (specified model parameters
  can take different values for different wavebands)

* Support for rotating model components

* '#relto' option for model component positions - specifies position
  of model component w.r.t. position of another component


Illegal minimisations
=====================

Fitting is performed by attempting to minimise the negative logarithm of
posterior probability. Parameters with non-zero priors are considered to be
free parameters in the fitting process. The fitting routine will trigger an
error on the following "illegal" types of minimisation:

1. Attempting to fit a one component model with position angle theta and/or
   position radius r as free parameter(s)
2. Attempting to fit any components that have non-free position radius r 
   that is set to zero but have position angle theta as a free parameter 
   (if r is zero but free to be varied then this is okay)
3. Attempting to fit a one component model with flux B as a free parameter
4. Attempting to fit any elliptical components that have non-free eccentricity 
   set to unity but have orientation angle phi free to vary
5. Any model with no free parameters(!)
6. Attempting to fit non-centrosymmetric models to vis/nvis file types


Minimisation method
===================

The program attempts to minimise the unnormalised negative log posterior
probability (equal to the sum of the negative log prior and negative log
likelihood) of the model given the data. The data consist of any number of
squared-visibilities, triple product amplitudes, and triple product phases,
with their (assumed Gaussian) one-sigma errors.

The minimiser used is the routine PDA_UNCMND from the Starlink PDA library
(http://star-www.rl.ac.uk/star/docs/sun194.htx/sun194.html). This works better
than the simplex routine from Numerical Recipes that was used in the first
version of mfit.

From the book, "Numerical Methods and Software" by D. Kahaner, C. Moler,
S. Nash Prentice Hall, 1988:

     This routine uses a quasi-Newton algorithm with line search to
     minimize the function represented by the subroutine FCN [the
     negative log posterior].  At each iteration, the nonlinear
     function is approximated by a quadratic function derived from a
     Taylor series.  The quadratic function is minimized to obtain a
     search direction, and an approximate minimum of the nonlinear
     function along the search direction is found using a line search.
     The algorithm computes an approximation to the second derivative
     matrix of the nonlinear function using quasi-Newton techniques.

More information is available in the reference:

R.B. Schnabel, J.E. Koontz, and BE.E. Weiss, A modular system of algorithms for
unconstrained minimization, Report CU-CS-240-82, Comp. Sci. Dept., Univ. of
Colorado at Boulder, 1982.


Posterior, Goodness-of-fit
==========================

At the solution position the (unnormalised) negative log posterior is given
(i.e. the sum of the negative log prior and negative log likelihood without
the subtraction of the negative log evidence). The chi squared value is also
reported:

     chi sqrd = sum { ( (data-model)/(error in data) )^2 }

The number of degrees of freedom (vis + triple data points - no of free
parameters) is reported along with the chi sqrd value divided by this. Note
that closure phases that are linear combinations of other measured closure
phases are incorrectly counted as independent data points.


Errors
======

The hessian matrix Hes[ij] = (d^2/(dxi*dxj))(-log posterior) is computed at the
solution point. Hessian elements are estimated using a numerical
differentiation method:

1. for on-diagonal elements H[ii] = (d^2/dxi^2)(-log posterior) the element
   is estimated to be:
   H[ii] = { P(x+dx) + P(x-dx) - 2P(x) }/dx^2
   where x = x[i] dx = delta.x[i]

2. for off-diagonal elements H[ij] = (d^2/(dxi.dxj))(-log posterior) the
   element is estimated to be:
   H[ij] = {P(x+dx,y+dy)+P(x-dx,y-dy)-P(x+dx,y-dy)-P(x-dx,y+dy)}/{4.dx.dy}
   where x=x[i] y=x[j]

The covariance matrix Cov[ij] is the inverse of the Hes matrix. Some problems
appear to arise where diagonal elements of the matrix, corresponding to the
variances in the parameters, are negative. This could possibly point to poorly
chosen priors or a poor fit/irregularities in the negative log posterior
surface. Where such negative values occur the program warns the user.

The correlation matrix Cor[ij] = Cov[ij] / sqrt { Var[i].Var[j] } is also
computed. n.b. Var[i] is defined as Cov[ii]. Problems arise in the case of
negative elements on the Cov diagonals i.e. negative Var[i]'s - in this case
the correlation matrix will be incomplete with mathematically illegal elements
set to zero.

The "hessian-based error" of a parameter is the square root of the variance in
the quantity.

Newer mfit releases also quote a second set of error bars for the best-fit
parameters. These assume the error bars on the data should be scaled (by a
common factor for squared visibilities, triple amplitudes, and closure
phases) such that the reduced chi-squared at the solution point is unity.
This second set of estimates will be more conservative than the first in
the (usual) case of chi-squared > 1.


Model selection
===============

An approximation to the negative log evidence is also reported:

-log(posterior) - N/2 log(2pi) + 1/2 log(det(H))

where N is the number of model parameters varied. This expression is only valid
if the posterior pdf is unimodal, and the peak in the posterior can be
approximated by a multivariate Gaussian. Possible differences in prior widths
between models to be compared have been neglected.


Known deficiencies
==================

Ignores complex visibilities from OI-FITS file.

If closure phase data are supplied, these can result in discontinuities in the
posterior. The numerical calculation of the Hessian matrix can sometimes
erroneously cross such discontinuities. If you suspect this is happening, it is
advisable to plot cuts through the posterior.

Closure phases that are linear combinations of other measured closure
phases are (perhaps) incorrectly counted as independent data points when
evaluating the number of degrees of freedom.


$Id: documentation,v 1.15 2009/11/03 17:26:26 jsy1001 Exp $

Local variables:
fill-column: 79
End: