Commit

Add Kendall's tau to Edge Correlation
lczech committed Feb 13, 2024
1 parent 6bfffc2 commit 6b1adfd
Showing 4 changed files with 31 additions and 6 deletions.
2 changes: 1 addition & 1 deletion CMakeLists.txt
@@ -138,7 +138,7 @@ include( "${CMAKE_CURRENT_LIST_DIR}/tools/cmake/DownloadDependency.cmake" )
# These are replaced by tools/cmake/update_dependencies.sh to the hashes that are currently checked out.
# Thus, do not replace the hashes manually!
SET( CLI11_COMMIT_HASH "13becaddb657eacd090537719a669d66d393b8b2" ) #CLI11_COMMIT_HASH#
- SET( genesis_COMMIT_HASH "6b08cb22be0af409c3de9daac1416a3f42359a3d" ) #genesis_COMMIT_HASH#
+ SET( genesis_COMMIT_HASH "2eca98c651e61fefc56e6a43549de13a415a3059" ) #genesis_COMMIT_HASH#
SET( sparsepp_COMMIT_HASH "6bfe3b4bdb364993e612d6bb729d680cf4c77649" ) #sparsepp_COMMIT_HASH#

# Call the github download function, which takes four arguments:
4 changes: 2 additions & 2 deletions doc/md/correlation.md
@@ -2,7 +2,7 @@

The command takes a set of `jplace` files (called samples), as well as a table containing metadata features for each sample. It then calculates and visualizes the Edge Correlation with the metadata features per edge of the reference tree. The files need to have the same reference tree.

- Edge Correlation is explained and evaluated in detail in our article (in preparation). The following figure and its caption are an example adapted from this article:
+ Edge Correlation is explained and evaluated in detail in our article ([doi:10.1371/journal.pone.0217050](https://doi.org/10.1371/journal.pone.0217050)). The following figure and its caption are an example adapted from this article:

<br>

@@ -46,7 +46,7 @@ Controls whether to use masses or imbalances. By default, trees using both of th

### Correlation Method (`--method`)

- Controls which method of correlation is used for the visualization. By default, Pearsons and Spearmans are used, that is, trees for each of them are created.
+ Controls which method of correlation is used for the visualization. We offer Pearson's `r`, Spearman's `rho`, and Kendall's `tau` (in the tau-b variant) correlation coefficients. By default, trees for all of them are created.

### Normalization (`--mass-norm`)

29 changes: 27 additions & 2 deletions src/commands/analyze/correlation.cpp
@@ -73,7 +73,8 @@ struct CorrelationVariant
enum CorrelationMethod
{
kPearson,
- kSpearman
+ kSpearman,
+ kKendall
};

CorrelationVariant( std::string const& n, EdgeValues m, CorrelationMethod d )
@@ -130,7 +131,7 @@ void setup_correlation( CLI::App& app )
true
)->group( "Settings" )
->transform(
- CLI::IsMember({ "all", "pearson", "spearman" }, CLI::ignore_case )
+ CLI::IsMember({ "all", "pearson", "spearman", "kendall" }, CLI::ignore_case )
);

// Color. We allow max, but not min, as this is always 0.
@@ -178,6 +179,11 @@ std::vector<CorrelationVariant> get_variants( CorrelationOptions const& options
"masses_spearman", CorrelationVariant::kMasses, CorrelationVariant::kSpearman
});
}
+ if(( options.method == "all" ) || ( options.method == "kendall" )) {
+ variants.push_back({
+ "masses_kendall", CorrelationVariant::kMasses, CorrelationVariant::kKendall
+ });
+ }
}
if(( options.edge_values == "both" ) || ( options.edge_values == "imbalances" )) {
if(( options.method == "all" ) || ( options.method == "pearson" )) {
@@ -190,6 +196,11 @@ std::vector<CorrelationVariant> get_variants( CorrelationOptions const& options
"imbalances_spearman", CorrelationVariant::kImbalances, CorrelationVariant::kSpearman
});
}
+ if(( options.method == "all" ) || ( options.method == "kendall" )) {
+ variants.push_back({
+ "imbalances_kendall", CorrelationVariant::kImbalances, CorrelationVariant::kKendall
+ });
+ }
}

return variants;
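
As a reading aid for the two hunks above, the following is a minimal, standalone sketch of which output variants result from a given combination of settings. It is not the code of this commit: the function `variant_names_sketch` is hypothetical, and the names `masses_pearson` and `imbalances_pearson` as well as the setting value `masses` are assumed by analogy, as those lines are not visible in the diff.

```cpp
#include <string>
#include <vector>

// Hypothetical helper, for illustration only: list the variant names that
// a given pair of settings would produce, in the order used in the diff.
std::vector<std::string> variant_names_sketch(
    std::string const& edge_values, // assumed values: "both", "masses", "imbalances"
    std::string const& method       // "all", "pearson", "spearman", "kendall"
) {
    std::vector<std::string> const edge_kinds = { "masses", "imbalances" };
    std::vector<std::string> const methods    = { "pearson", "spearman", "kendall" };

    std::vector<std::string> names;
    for( auto const& kind : edge_kinds ) {
        // Skip edge value kinds that were not requested.
        if( edge_values != "both" && edge_values != kind ) {
            continue;
        }
        for( auto const& meth : methods ) {
            // Skip correlation methods that were not requested.
            if( method != "all" && method != meth ) {
                continue;
            }
            names.push_back( kind + "_" + meth );
        }
    }
    return names;
}
```

Under these assumptions, the default settings (`both` and `all`) now yield six variants instead of four, while `imbalances` with `kendall` yields only `imbalances_kendall`.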
@@ -323,6 +334,13 @@ void run_with_matrix(
corrname = "Spearman";
break;
}
+ case CorrelationVariant::kKendall: {
+ corrname = "Kendall";
+ break;
+ }
+ default: {
+ throw std::runtime_error( "Internal Error: Invalid correlation variant." );
+ }
}
LOG_MSG1 << "Writing " << corrname << " correlation with meta-data column "
<< meta_col.name() << ".";
@@ -348,6 +366,13 @@ void run_with_matrix(
);
break;
}
+ case CorrelationVariant::kKendall: {
+ corr_vec[e] = kendalls_tau_correlation_coefficient(
+ meta_dbl.begin(), meta_dbl.end(),
+ edge_values.col( e ).begin(), edge_values.col( e ).end()
+ );
+ break;
+ }
default: {
throw std::runtime_error( "Internal Error: Invalid correlation variant." );
}
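
For readers unfamiliar with the tau-b variant mentioned in the documentation change, here is a minimal sketch of how such a coefficient can be computed. It is a naive O(n^2) illustration and not the genesis implementation behind `kendalls_tau_correlation_coefficient`; the function name `kendall_tau_b_sketch`, the vector-based interface, and the choice to return NaN for a zero denominator are assumptions made for this example.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <stdexcept>
#include <vector>

// Hypothetical, naive Kendall's tau-b. Illustrative only.
// tau_b = (C - D) / sqrt( (C + D + Tx) * (C + D + Ty) ), where C and D count
// concordant and discordant pairs, and Tx (Ty) counts pairs tied only in x (y).
double kendall_tau_b_sketch( std::vector<double> const& x, std::vector<double> const& y )
{
    if( x.size() != y.size() ) {
        throw std::invalid_argument( "Input vectors need to have the same size." );
    }

    long long concordant = 0;
    long long discordant = 0;
    long long ties_x = 0;
    long long ties_y = 0;
    for( std::size_t i = 0; i < x.size(); ++i ) {
        for( std::size_t j = i + 1; j < x.size(); ++j ) {
            double const dx = x[i] - x[j];
            double const dy = y[i] - y[j];
            if( dx == 0.0 && dy == 0.0 ) {
                continue; // Tied in both variables: contributes to no count.
            } else if( dx == 0.0 ) {
                ++ties_x;
            } else if( dy == 0.0 ) {
                ++ties_y;
            } else if(( dx > 0.0 ) == ( dy > 0.0 )) {
                ++concordant;
            } else {
                ++discordant;
            }
        }
    }

    double const cd = static_cast<double>( concordant + discordant );
    double const denom = std::sqrt(
        ( cd + static_cast<double>( ties_x )) * ( cd + static_cast<double>( ties_y ))
    );
    if( denom == 0.0 ) {
        // Assumption for this sketch: undefined result is reported as NaN.
        return std::numeric_limits<double>::quiet_NaN();
    }
    return static_cast<double>( concordant - discordant ) / denom;
}
```

The tie correction in the denominator is what distinguishes tau-b from tau-a, which divides by the total number of pairs and hence cannot reach 1 or -1 when ties are present.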