Skip to content

ouzounis/cgg-toolkit

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

9 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

cgg-toolkit

CGG toolkit version 1.0.1 Citation: Vasileiou D, Karapiperis C, Baltsavia I, Chasapi A, Ahrén D, Janssen PJ, et al. (2023) CGG toolkit: Software components for computational genomics. PLoS Comput Biol 19(11): e1011498. https://doi.org/10.1371/journal.pcbi.1011498

Excerpt from abstract:

We re-launch a range of established software components for computational genomics, as legacy version 1.0.1, suitable for sequence matching, masking, searching, clustering and visualization for protein family discovery, annotation and functional characterization on a genome scale.

These applications are made available online as open source and include MagicMatch, GeneCAST, support scripts for CoGenT-like sequence collections, GeneRAGE and DifFuse, supported by centrally administered bioinformatics infrastructure funding.

The toolkit may also be conceived as a flexible genome comparison software pipeline that supports research in this domain.

modified from Figure 2 legend:

pre-processing:

  • MagicMatch: facilitates cross-indexing of entries at the sequence level or simply identification.
  • GeneCAST: masks compositional bias in protein sequences and prepares query for sensitive searches.
  • cogent_utils: creates a uniformly named sequence set.

post-processing:

(following a BLAST search) -

  • clustt_utils: launches Tribe-MCL and generates protein families, and input for network visualization.
  • GeneRAGE: clusters protein sequence similarity tables (pairs-list) for protein family (and domain) detection.
  • DifFuse: similarly to GeneRAGE, detects gene fusion events within a single genome.

About

CGG toolkit version 1.0.1

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • C 75.8%
  • C++ 20.1%
  • Shell 3.3%
  • Makefile 0.8%