Distance tree
The executables are in the directory $TT/phylogeny/.
Optimize an existing distance tree or create a distance tree.
Main parameters:
- -input_tree tree: tree file which has been produced by the -output_tree parameter;
- dissimilarity data:
  - -data data: data file in Data Master format (see Data Master format), or an incremental distance tree directory ending with /;
  - -dissim_attr dissim_attr: dissimilarity attribute in the data file;
- dissimilarity transformations:
  - -dissim_coeff c: dissimilarities are multiplied by c;
  - -dissim_power p: dissimilarities are raised to the power of p;
- dissimilarity variance:
  - -variance { lin | sqr | pow | exp | linExp };
  - -variance_power p: non-negative power for -variance pow;
  - -variance_dissim: flag indicating that variance function is applied to dissimilarities rather than to tree distances;
  - -variance_min m: minimum dissimilarity variance to be added to the computed dissimilarity variance;
- deletion of objects:
  - -delete obj_list: list of objects to delete from the tree;
  - -keep obj_list: list of objects to keep in the tree and delete all the other objects;
- optimization:
  - -optimize: flag indicating that tree must be optimized;
  - -subgraph_iter_max i: maximum number of iterations of subgraph optimizations;
  - -skip_len: flag indicating that arc length optimization should be skipped;
  - -reinsert: flag indicating the usage of optimization by reinsertion;
- fitness outliers:
  - -delete_criterion_outliers criterion_outlier_list: output file to save the list of criterion outliers;
  - -criterion_outlier_num_max n: maximum length of criterion_outlier_list;
  - -delete_deformation_outliers deformation_outlier_list: output file to save the list of deformation outliers;
  - -deformation_outlier_num_max n: maximum length of deformation_outlier_list;
- hybrid outliers:
  - -hybridness_min hybridness_min: minimum hybridness of hybrid triangles;
  - -dissim_boundary b: point of discontinuity in the dissimilarity distribution. Hybrid triangles are not identified for dissimilarities close to this value;
  - -delete_hybrids hybrid_triangles: output file with hybrid triangles;
- -reroot_at obj1:obj2: make the middle of the arc of the least common ancestor of the objects named obj1 and obj2 the root of the tree;
- -output_tree tree: create a tree file in internal format;
- -threads n: use n processor threads.
Create a tree using the Data Master file $TT/phylogeny/data/Saccharomyces.dm:
$TT/phylogeny/makeDistTree -threads 3 -data $TT/phylogeny/data/Saccharomyces \
-variance linExp -optimize -subgraph_iter_max 2 \
-hybridness_min 1.2 -delete_hybrids Saccharomyces.hybrid -dissim_boundary 0.675 \
-output_tree Saccharomyces.tree
Remove from the tree in.tree all objects that are not in the file list:
$TT/phylogeny/makeDistTree -input_tree in.tree -keep list -output_tree out.tree
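A list for -keep can be derived from the objects already in the tree. The sketch below is only an illustration: it assumes the list file holds one object name per line, and the object names and the file names all.list, drop.list and keep.list are made up. In real use all.list would come from a tool such as tree2obj.sh. The makeDistTree call uses only the flags shown above and runs only if the executable is present.

```shell
#!/bin/sh
# Hypothetical sample data: all objects in the tree, one name per line
# (assumed format; in practice this would be produced from the tree).
printf '%s\n' objA objB objC objD > all.list
# Hypothetical objects to drop.
printf '%s\n' objB objD > drop.list

# Keep every name in all.list that does not exactly match a line of drop.list.
# -F: fixed strings, -x: whole-line match, -v: invert, -f: read patterns from file.
grep -F -x -v -f drop.list all.list > keep.list

# Run makeDistTree only if it is installed.
if [ -x "${TT:-}/phylogeny/makeDistTree" ]; then
  "$TT/phylogeny/makeDistTree" -input_tree in.tree -keep keep.list -output_tree out.tree
fi
```

Whole-line fixed-string matching avoids accidentally dropping objects whose names merely contain a dropped name as a substring.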
Find genogroups in a tree given a distance threshold.
Main parameters:
- input_tree: Input tree file;
- genogroup_dist: Max. distance between objects of the same genogroup;
- -genogroup_table table: Output file with lines: <object> <genogroup leader>;
- -genogroups genogroups: Output file with the names of the interior nodes which are genogroup roots;
- -genogroup_under_genogroup table: Output file with lines: <node1 LCA name> <node2 LCA name>, where nodes belong to different genogroups, but node1 is a child of node2.
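The table written by -genogroup_table lends itself to simple post-processing. A minimal sketch, assuming the two columns are separated by whitespace and that names contain no spaces; the file names genogroup.tab and genogroup.sizes and the sample contents are made up for illustration:

```shell
#!/bin/sh
# Hypothetical genogroup table, one line per object: <object> <genogroup leader>.
printf '%s\t%s\n' s1 s1  s2 s1  s3 s3  s4 s1  s5 s3 > genogroup.tab

# Count the objects in each genogroup (keyed by leader),
# then sort by size descending, ties broken by leader name.
awk '{ n[$2]++ } END { for (g in n) print g "\t" n[g] }' genogroup.tab |
  sort -t "$(printf '\t')" -k2,2nr -k1,1 > genogroup.sizes

cat genogroup.sizes
```

Each output line holds a genogroup leader and the number of objects assigned to it.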
Print the list of objects of a distance tree.
Parameter: Input distance tree made by makeDistTree.
Optimize an existing tree using a subset of dissimilarities, with a changed dissimilarity variance:
$TT/phylogeny/tree2obj.sh Saccharomyces.tree > Saccharomyces.list
$TT/dm/dm2subset $TT/phylogeny/data/Saccharomyces Saccharomyces.list > subset.dm
$TT/phylogeny/makeDistTree -threads 3 -input_tree Saccharomyces.tree -data subset \
-variance pow -variance_power 3 -optimize -subgraph_iter_max 2
Extract the list of hybrid objects from the file hybrid_triangles made by makeDistTree and print it.
Parameter: file hybrid_triangles.
Main parameters:
- Input tree file
- -name_match name_match: File with lines: <name_old> <tab> <name_new>, to replace leaf names;
- -decimals decimals: Number of decimals in arc lengths, default = 6;
- -format { newick | itree (makeDistTree output) | ASNT (textual ASN.1) }: default = newick;
- -ext_name: Extended leaf names for newick;
- -order: Order subtrees by the number of leaves descending.
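A -name_match file can be written with printf so that the separator is a literal tab. The sketch below uses made-up leaf names and the hypothetical file names names.tsv and renamed.nw; the printDistTree call uses only flags listed above and runs only if the executable is installed.

```shell
#!/bin/sh
# Hypothetical old/new leaf names, one pair per line: <name_old> <tab> <name_new>.
printf '%s\t%s\n' \
  GCF_0001 Escherichia_coli \
  GCF_0002 Salmonella_enterica > names.tsv

# Sanity check: every line must have exactly two tab-separated fields.
awk -F '\t' 'NF != 2 { bad = 1 } END { exit bad + 0 }' names.tsv || {
  echo "names.tsv: malformed line" >&2
  exit 1
}

# Rename leaves while printing the tree, if printDistTree is available.
if [ -x "${TT:-}/phylogeny/printDistTree" ]; then
  "$TT/phylogeny/printDistTree" Enterobacteriaceae.tree -name_match names.tsv > renamed.nw
fi
```

Checking the field count up front catches lines where a space was typed instead of a tab.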
Convert a tree from an internal format to Newick adding normalized object criterion to each leaf:
$TT/phylogeny/printDistTree -data $TT/phylogeny/data/Enterobacteriaceae -dissim_attr Conservation \
-variance linExp Enterobacteriaceae.tree \
-order -decimals 4 -ext_name > Enterobacteriaceae.nw
Convert a tree from an internal format to Newick without adding normalized object criterion to each leaf:
$TT/phylogeny/printDistTree Enterobacteriaceae.tree -order -decimals 4 \
> Enterobacteriaceae.nw
Convert a newick tree to the makeDistTree tree format.
Parameter: Input newick tree.
PAUP* version used: Portable version 4.0b10 for Unix
$TT/phylogeny/attr2_2paup $TT/phylogeny/data/Saccharomyces cons map > Saccharomyces.nex
$ paup Saccharomyces.nex
paup> Set criterion=distance;
paup> dset objective=lsfit power=2;
paup> hsearch
...
Elapsed Taxa Rearr. -- Number of trees -- Best
time added tried saved left-to-swap tree(s)
--------------------------------------------------------------
0:01:00 - 247 1 1 3148.9391
...
1:00:07 - 14984 1 1 1334.1309
^C
$TT/phylogeny/makeDistTree -data $TT/phylogeny/data/Saccharomyces -variance sqr \
-variance_dissim -optimize
Takes 2 min.
Abs. criterion = 6.4861e+02.