Add lower bounds for populations' splits, fix documentation about it
noscode committed Apr 19, 2024
1 parent 6c61623 commit 0ffff5d
Showing 10 changed files with 239 additions and 70 deletions.
40 changes: 34 additions & 6 deletions docs/source/user_manual/installation.rst
@@ -17,18 +17,18 @@ GADMA requires the following dependencies (`requirements/minimal.txt`):
* Python3
* NumPy (>= 1.2.0)
* Scipy (>= 0.6.0)
* ruamel.yaml
* ruamel.yaml (<0.18.0)
* ``dadi`` (>= 1.7.0)
* ``moments`` (>= 1.0.0)
* ``momi``
* ``moments.LD`` (manual installation of ``moments`` with `--ld-extension` flag)
* ``moments.LD`` (installed alongside ``moments``)
* nlopt (for ``dadi``)
* Cython (for ``moments``)
* mpmath (for ``moments``)

To draw demographic models one should also install the following packages (`requirements/minimal.txt`):

* matplotlib (>= 0.98.1)
* matplotlib (>= 0.98.1, <3.5)
* Pillow (>= 4.2.1) - optional
* ``moments`` (>= 1.0.0)

@@ -51,11 +51,10 @@ To run Bayesian optimization `smac` of specified version is requered (`requireme
``momi`` package sometimes is not installed correctly for Windows and MacOS. If ``momi`` is not available please install it manually following the installation instructions in `momi's manual <https://momi2.readthedocs.io/en/latest/installation.html#>`_.

.. note::
``momentsLD`` - the extension of ``moments``, should be installed manually following the installation instructions in `moments's manual <https://moments.readthedocs.io/en/latest/installation.html#>`_.
``momentsLD``, the extension of ``moments``, is installed together with ``moments``.

Getting help for engine installation
------------------------------------

If you have trouble installing an engine, please first check the table below to see whether that engine can be installed on your system. You are always welcome to `open an issue <https://github.com/ctlab/GADMA/issues#>`_ on GitHub to get help.

GADMA has automatic tests on GitHub for engines on different systems (Linux, Windows, MacOS). The following table indicates (according to our tests) whether an engine can be installed on the specified system:
@@ -88,7 +87,7 @@ GADMA has automatic tests on GitHUb for engines on different systems (Linux, Win
Installing the latest release
------------------------------

The latest release of GADMA is easily installed via ``pip`` or ``conda`` (``bioconda``):
The latest release of GADMA can be easily installed via ``pip`` or ``conda`` (``bioconda``):

.. code-block:: console
@@ -114,6 +113,35 @@ or
$ conda install -c bioconda moments
Troubleshooting
---------------

If you experience problems with dependencies, we recommend creating an empty `conda environment <https://conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html#>`_:

.. code-block:: console
$ conda create -n gadma_env python=3.10
$ conda activate gadma_env
It is possible to install the versions that are used for testing by downloading the file `minimal.txt` from `here <https://github.com/ctlab/GADMA/blob/master/requirements/minimal.txt#>`_ and installing the requirements with:

.. code-block:: console
$ pip install -r minimal.txt
$ pip install gadma
For **MacOS with an M-series processor** we suggest the following recipe (credit to `@Enricobazzi <https://github.com/ctlab/GADMA/issues/82>`_):

.. code-block:: console
$ pip install git+https://github.com/MomentsLD/moments.git
$ conda install -c conda-forge dadi
$ conda install -c conda-forge scikit-allel
$ pip install gadma
$ pip uninstall ruamel.yaml
$ pip install "ruamel.yaml<0.18.0"
$ pip uninstall matplotlib
$ pip install "matplotlib<3.5"
Manual installation
-----------------------------
7 changes: 5 additions & 2 deletions docs/source/user_manual/set_model/set_model.rst
@@ -23,7 +23,7 @@ Types of demographic models

GADMA could infer two base types of demographic models:

* `Demographic model with structire <set_model_struct.rst>`__ (up to 3 populations). It is a more flexible model type as dynamics of population size change could be inferred and it has a lot of options for parameters.
* `Demographic model with structure <set_model_struct.rst>`__ (up to 3 populations). It is a more flexible model type as dynamics of population size change could be inferred and it has a lot of options for parameters.

.. admonition:: Related options

@@ -42,8 +42,11 @@ GADMA could infer two base types of demographic models:
* ``Inbreeding`` infers inbreeding coefficients (only for ``dadi`` engine).
* ``Selection`` infers selection coefficients.
* ``Ancestral size as parameter`` disables multinomial approach of ``dadi`` and ``moments`` engines when ancestral size is inferred implicitly.
* ``Lower bound of first split`` limits lower bound of the most ancient split.
* ``Upper bound of first split`` limits upper bound of the most ancient split.
* ``Upper bound of second split`` limits upper bounds of next to the most ancient split.
* ``Lower bound of second split`` limits lower bound of the next to the most ancient split.
* ``Upper bound of second split`` limits upper bound of the next to the most ancient split.


* `Custom demographic model <set_model_custom.rst>`__. It is a usual user-specified model like in ``dadi``, ``moments`` and other tools for demographic inference. Using such a model will give more control over parameters and could be used for inference of more than 3 populations but is less flexible.

22 changes: 17 additions & 5 deletions docs/source/user_manual/set_model/set_model_struct.rst
@@ -200,16 +200,28 @@ Split could be set in two ways:
# param file
Split fractions: True # for 1) point
Upper bound of split
_____________________
Upper and lower bounds of splits
________________________________

To limit time of some split one should specify an option in the parameter file. Splits are numbered from the most ancient. So split 1 is split that occurred with the ancient population and split 2 is the next division of the second population (exist only for three populations). There are two appropriate options: ``Upper bound of first split`` and ``Upper bound of second split``.
It is possible to limit the time of split events in the demographic model with structure. To do so, specify one or more options in the parameter file that set lower and upper bounds of split events. Splits are numbered from the most ancient, so split 1 is the split event that occurred in the ancient population and split 2 is the subsequent division of the second population (it exists only for three populations). There are four options corresponding to split times: ``Lower bound of first split``, ``Upper bound of first split``, ``Lower bound of second split`` and ``Upper bound of second split``.

One should translate time from years into genetic units, therefore divide it by ``2 * T_g``, where ``T_g`` is time (in years) for one generation. For example, one wants to limit the last split to 2000 years. Time for one generation is estimated as 24 years, then one should specify in the parameter file:
Bounds should be specified in GENERATIONS. To translate time from years to generations, divide it by ``T_g``, where ``T_g`` is the time (in years) for one generation. For example, assume we want the last split to be between 1000 and 2000 years ago, and the time for one generation is estimated to be 24 years. The bounds are then 1000 / 24 ≈ 41.666 and 2000 / 24 ≈ 83.333 generations, so we construct the following parameter file:

.. code-block:: none
# param_file
...
Upper bound of second split : 41.666
Lower bound of second split : 41.666
Upper bound of second split : 83.333
...
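As a hedged illustration of the conversion above, the following Python sketch divides a time in years by the generation time ``T_g``. The helper name ``years_to_generations`` is hypothetical and is not part of GADMA; it only reproduces the arithmetic of the example.

```python
def years_to_generations(years, generation_time_years):
    """Convert a time in years to generations (years / T_g).

    Hypothetical helper for illustration only; not a GADMA function.
    """
    if generation_time_years <= 0:
        raise ValueError("generation time must be positive")
    return years / generation_time_years


# Values from the example: split between 1000 and 2000 years, T_g = 24 years.
lower = years_to_generations(1000, 24)
upper = years_to_generations(2000, 24)

# Print the two lines as they would appear in a parameter file.
print(f"Lower bound of second split : {lower:.3f}")
print(f"Upper bound of second split : {upper:.3f}")
```

Standard formatting rounds the last digit, so the lower bound may print as 41.667 where the example above writes the truncated value 41.666.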
Any subset of these four options may be set, as long as the bounds are consistent with each other. It is possible to set only one bound, or a lower bound for one split and an upper bound for another:
.. code-block:: none
# param_file
...
# In that particular case upper bound for the second split exists automatically
Upper bound of first split : 30
Lower bound of second split : 10
...
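The text above does not spell out which combinations of bounds are consistent. One plausible reading, stated here as an assumption, is that no bound is negative, that each lower bound does not exceed the upper bound of the same split, and that the lower bound of split 2 does not exceed the upper bound of split 1, because split 2 is more recent than split 1. The Python sketch below checks only these assumed conditions. ``check_split_bounds`` is a hypothetical helper and does not reflect how GADMA validates its parameter file.

```python
def check_split_bounds(lower_1=None, upper_1=None, lower_2=None, upper_2=None):
    """Return a list of detected inconsistencies (empty if none were found).

    Hypothetical helper, not part of GADMA. All bounds are in generations;
    None means the bound is not set. The rules are assumptions of this sketch.
    """
    problems = []
    bounds = {
        "Lower bound of first split": lower_1,
        "Upper bound of first split": upper_1,
        "Lower bound of second split": lower_2,
        "Upper bound of second split": upper_2,
    }
    # Assumed rule: split times cannot be negative.
    for name, value in bounds.items():
        if value is not None and value < 0:
            problems.append(f"{name} is negative")
    # Assumed rule: lower bound must not exceed upper bound of the same split.
    if lower_1 is not None and upper_1 is not None and lower_1 > upper_1:
        problems.append("first split: lower bound exceeds upper bound")
    if lower_2 is not None and upper_2 is not None and lower_2 > upper_2:
        problems.append("second split: lower bound exceeds upper bound")
    # Assumed rule: split 2 is more recent than split 1, so its lower bound
    # must leave room below the upper bound of split 1.
    if lower_2 is not None and upper_1 is not None and lower_2 > upper_1:
        problems.append("second split lower bound exceeds first split upper bound")
    return problems


# The combination from the parameter file above.
print(check_split_bounds(upper_1=30, lower_2=10))
# A combination that violates the assumed ordering of the splits.
print(check_split_bounds(upper_1=30, lower_2=40))
```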
17 changes: 13 additions & 4 deletions example_params
@@ -12,7 +12,7 @@
# If it is resumed from other directory and output directory
# isn't set, GADMA will add '_resumed' for previous output
# directory.
Output directory: my_example_run
Output directory: my_example_run_2


#!!!
@@ -224,22 +224,31 @@ Inbreeding: False
# Default: False
Ancestral size as parameter: False

# It is possible to limit the time of splits.
# It is possible to limit the time of splits by specifying bounds.
# Split 1 is the most ancient split.
# !Note that time is in genetic units (2 * time for 1 generation):
# !Note that time is in generations:
# e.g. we want to limit by 150 kya, time for one generation is
# 25 years, then bound will be 150000 / (2*25) = 3000.
# 25 years, then bound will be 150000 / 25 = 6000.
#
# Lower bound for split 1 (in case of 2 or 3 populations).
# Default: None
Lower bound of first split: Null
#
# Upper bound for split 1 (in case of 2 or 3 populations).
# Default: None
Upper bound of first split: Null

# Lower bound for split 2 (in case of 3 populations).
# Default: None
Lower bound of second split: Null
#
# Upper bound for split 2 (in case of 3 populations).
# Default: None
Upper bound of second split: Null




#!!!
# Local optimization.
#
15 changes: 12 additions & 3 deletions gadma/cli/params_template
@@ -224,22 +224,31 @@ Inbreeding :
# Default: False
Ancestral size as parameter :

# It is possible to limit the time of splits.
# It is possible to limit the time of splits by specifying bounds.
# Split 1 is the most ancient split.
# !Note that time is in genetic units (2 * time for 1 generation):
# !Note that time is in generations:
# e.g. we want to limit by 150 kya, time for one generation is
# 25 years, then bound will be 150000 / (2*25) = 3000.
# 25 years, then bound will be 150000 / 25 = 6000.
#
# Lower bound for split 1 (in case of 2 or 3 populations).
# Default: None
Lower bound of first split :
#
# Upper bound for split 1 (in case of 2 or 3 populations).
# Default: None
Upper bound of first split :

# Lower bound for split 2 (in case of 3 populations).
# Default: None
Lower bound of second split :
#
# Upper bound for split 2 (in case of 3 populations).
# Default: None
Upper bound of second split :




#!!!
# Local optimization.
#
Expand Down
3 changes: 3 additions & 0 deletions gadma/cli/settings.py
@@ -66,7 +66,10 @@

# Time bounds
upper_bound_of_first_split = None
lower_bound_of_first_split = None
upper_bound_of_second_split = None
lower_bound_of_second_split = None


# Glocal optimizer
global_optimizer = "Genetic_algorithm"