Hello, I calculated gCF and sCF values following the instructions in the iqtree2 documentation (http://www.iqtree.org/doc/Concordance-Factor#site-concordance-factor-scf). As far as I understand, sCF values should not be below 30? However, I got sCF values below 30 for several of the species I analysed. What could be the reason for this?
Replies: 1 comment
@josh3397, sCF (and sCFl) values can take any value from 0% to 100%.

In a well-behaved dataset, one wouldn't expect the sCF (or sCFl) to go below 33%. By well-behaved I mean that the focal branch is the true branch in the species tree, different methods (like parsimony and likelihood) agree, the dataset is (very) informative (e.g. thousands of decisive sites), and it is free of too much homoplasy and other confounding factors. However, if any of these things are not true, then the site concordance factors can be much lower than 33%.

A simple example: if the focal branch disagrees with all of the decisive sites, then the sCF and sCFl will both be 0%.

Another simple example: perhaps the true sCF is 60%, but you have very few decisive sites (e.g. sCFN = 5), and the point estimate of the sCF comes out at 20%. In this case the point estimate would have very wide confidence intervals. In other words, double check the sample sizes for all point estimates (and see below for calculating confidence intervals).

A lot more information on the intricacies of these measures is in this preprint: https://ecoevorxiv.org/repository/view/6484/

And here's a way to calculate confidence intervals on CFs in general (will be in the real docs soon): https://github.com/iqtree/iqtree2/wiki/Estimating-gene,-site,-and-quartet-concordance-vectors

That page shows how to generate concordance vectors (defined in the preprint above).

Feel free to post your input and output files, with the specific branch(es) you are interested in, and I can try to give you more specific feedback.
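To make the small-sample example above concrete, here is a rough sketch in Python. It is not the procedure from the linked wiki page, and it is not part of IQ-TREE. It assumes the decisive sites can be treated as independent binomial draws and that sCFN is a whole number, both of which are simplifications. The function name `wilson_interval` is made up for this illustration.

```python
import math


def wilson_interval(concordant, n_decisive, z=1.96):
    """Approximate 95% Wilson score interval for a proportion.

    concordant: number of decisive sites supporting the focal branch
    n_decisive: total number of decisive sites (sCFN in the example above)
    Assumes sites are independent binomial draws (a simplification).
    """
    if n_decisive <= 0:
        raise ValueError("need at least one decisive site")
    p_hat = concordant / n_decisive
    denom = 1 + z**2 / n_decisive
    centre = (p_hat + z**2 / (2 * n_decisive)) / denom
    half_width = (
        z
        * math.sqrt(
            p_hat * (1 - p_hat) / n_decisive + z**2 / (4 * n_decisive**2)
        )
        / denom
    )
    return max(0.0, centre - half_width), min(1.0, centre + half_width)


# The example from the reply: 1 concordant site out of 5 decisive sites (sCF = 20%)
low, high = wilson_interval(1, 5)
print(f"sCF = 20.0%, approx. 95% interval: {100 * low:.1f}% to {100 * high:.1f}%")
```

With 1 concordant site out of 5, the interval runs from roughly 4% to 62%, so a point estimate of 20% is still compatible with a true sCF of around 60%. With thousands of decisive sites the same calculation gives a much narrower interval.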