page update
jke000 committed Oct 15, 2024
1 parent 2be35b4 commit a25f613
Showing 2 changed files with 39 additions and 28 deletions.
53 changes: 31 additions & 22 deletions notes/20241001_FI.md
@@ -3,11 +3,11 @@
Fragment ion indexing was first introduced by [MSFragger in 2017](https://pubmed.ncbi.nlm.nih.gov/28394336/) and this
strategy has since been adopted in search tools like [MetaMorpheus](https://pubmed.ncbi.nlm.nih.gov/29578715/)
and [Sage](https://pubmed.ncbi.nlm.nih.gov/37819886/). And yes, you are encouraged
to go use MSFragger, MetaMorpheus, Sage and all of the other great search tools out
there.
to go use MSFragger, MetaMorpheus, Sage and all of the other great peptide identification
tools out there.

Fragment ion indexing (abbreviated as "FI" going forward) is supported in Comet as of
[version 2024.02 rev. 0](https://uwpr.github.io/Comet/releases/release_202402.html).
Fragment ion indexing (abbreviated as "FI" or "Comet-FI" going forward) is supported
in Comet as of [version 2024.02 rev. 0](https://uwpr.github.io/Comet/releases/release_202402.html).
Given this is the first Comet release with FI functionality, we expect to improve on
features, performance, and functionality going forward.

@@ -66,17 +66,24 @@ can be avoided for all subsequent files being searched.

### Current limitations and known issues with Comet-FI:
- MSFragger's database slicing has not yet been implemented so you must have
enough RAM to store the entire FI in memory. Note that for real-time
search application, database slicing is not feasible.
enough RAM to store the entire FI in memory. Note that for the real-time
search application for intelligent instrument control, database slicing is
not feasible.
- Protein n-term and c-term variable modifications are not supported in this initial FI release.
This functionality is expected to be added soon. This means that variable
modifications are limited to residues and peptide termini.
- Only [variable_mod01 through variable_mod05](https://uwpr.github.io/Comet/parameters/parameters_202402/variable_modXX.html) are supported with FI.
This is a limit imposed to restrict the FI to a reasonable size.
- For each variable_modXX, a maximum of 5 modified residues will be considered in a peptide. This
might further be limited by the total allowed number of modified residues
in a peptide controlled by the [max_variable_mods_in_peptide](https://uwpr.github.io/Comet/parameters/parameters_202402/max_variable_mods_in_peptide.html) parameter.
- Comet's internal decoy search via the
[decoy_search](https://uwpr.github.io/Comet/parameters/parameters_202402/decoy_search.html)
parameter is not supported. For FDR analysis, you should supply Comet a FASTA
file containing target and decoy entries.
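
Since the internal decoy search is unavailable with FI, the combined target and decoy FASTA has to be prepared before the search. The following is a minimal sketch of one way to do that, assuming reversed-sequence decoys (the benchmarks in this note use "forward + reverse" databases). The `DECOY_` prefix and all function names are illustrative assumptions, not part of Comet; use whatever prefix your downstream FDR tool expects.

```python
from typing import Iterable, Iterator, List, Tuple

# Assumption: an illustrative decoy prefix; match it to your FDR tool's setting.
DECOY_PREFIX = "DECOY_"


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def with_reversed_decoys(entries, prefix: str = DECOY_PREFIX) -> List[Tuple[str, str]]:
    """Return all target entries followed by one reversed decoy per target."""
    targets = list(entries)
    decoys = [(prefix + header, seq[::-1]) for header, seq in targets]
    return targets + decoys


def to_fasta(entries, width: int = 60) -> str:
    """Serialize (header, sequence) pairs, wrapping sequences at `width`."""
    out = []
    for header, seq in entries:
        out.append(">" + header)
        out.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(out) + "\n"
```

A typical use would be to read the target file with `read_fasta(open(path))`, pass the result through `with_reversed_decoys`, and write the output of `to_fasta` to a new file that is then given to Comet as the search database.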

### Fragment ion index specific search parameters

- [fragindex_min_fragmentmass](/Comet/parameters/parameters_202402/fragindex_min_fragmentmass.html)
- [fragindex_max_fragmentmass](/Comet/parameters/parameters_202402/fragindex_max_fragmentmass.html)
- [fragindex_min_ions_report](/Comet/parameters/parameters_202402/fragindex_min_ions_report.html)
@@ -102,21 +109,23 @@ user who wants to analyze MHC peptides requiring non-specific enzyme constraint
make sure you have a 128GB box before attempting this analysis with this version of Comet.

The following searches were run using 8 cores of an AMD Epyc 7443P processor with
256GB RAM running on Ubuntu Linux version 22.04. Search times and memory use are
noted:

- Yeast forward + reverse (XXXX sequence entries), tryptic, 1 allowed
missed cleavage, variable mods 16M, peptide length 5 to 50 uses XX GB of RAM
and completes in XXX.
- Human forward + reverse (1XX,XXX sequence entries), tryptic, 1 allowed
missed cleavage, variable mods 16M, peptide length 5 to 50 uses XX GB of RAM
and completes in XXX.
256GB RAM running on Ubuntu Linux version 22.04. Up to two of each specified variable
modification are allowed in a peptide. The query file is a two-hour Orbitrap Lumos
run with MS/MS spectra acquired in the Orbitrap. The peptide length range is 7 to 50
and the digest mass range is 600.0 to 5000.0.

- Yeast forward + reverse (12,488 sequence entries), tryptic, 1 allowed
missed cleavage, variable mods 16M and 80STY uses 5.6 GB of RAM and
completes in 31 seconds.
- Human forward + reverse (193,864 sequence entries), tryptic, 1 allowed
missed cleavage, variable mods 16M uses 5.2 GB of RAM and completes in
31 seconds.
- Human forward + reverse (1XX,XXX sequence entries), tryptic, 1 allowed
missed cleavage, variable mods 16M, 80STY, peptide length 5 to 50 uses XX GB of RAM
and completes in XXX.
- Human forward + reverse, no enzyme constraint, no variable mods,
peptide length range 7 to 15 uses XX GB of RAM
and completes in XXX.
missed cleavage, variable mods 16M, 80STY uses 11.3 GB of RAM and
completes in 68 seconds. Corresponding standard Comet search
took 4 minutes and 10 seconds.
- Human forward + reverse sequences, no enzyme constraint, 16M variable mod,
peptide length range 7 to 15 uses XX GB of RAM
and completes in XXX.
peptide length range 7 to 15 uses 49 GB of RAM and completes in 5 minutes
and 30 seconds. However, just creating the plain peptide .idx file takes
over 100 GB RAM and 12 minutes. The corresponding standard Comet search
took 20 minutes and 5 seconds.
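
As a quick check on the two comparisons above that report a standard Comet time, the speedups can be worked out directly from the quoted run times. This is only arithmetic on the reported numbers taken at face value (it does not account for the separate .idx creation time mentioned for the no-enzyme search); the helper names are illustrative.

```python
def seconds(minutes: int = 0, secs: int = 0) -> int:
    """Convert a minutes/seconds run time to total seconds."""
    return 60 * minutes + secs


def speedup(standard_s: float, fi_s: float) -> float:
    """Ratio of standard search time to fragment-index search time."""
    return standard_s / fi_s


# Human, tryptic, 16M + 80STY: 68 s with FI vs 4 min 10 s standard.
human_tryptic = speedup(seconds(4, 10), seconds(0, 68))

# Human, no enzyme, 16M: 5 min 30 s with FI vs 20 min 5 s standard.
human_no_enzyme = speedup(seconds(20, 5), seconds(5, 30))

print(f"tryptic:   {human_tryptic:.1f}x")
print(f"no enzyme: {human_no_enzyme:.1f}x")
```

Both ratios come out at roughly 3.7-fold for these two runs.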
14 changes: 8 additions & 6 deletions releases/release_202402.md
@@ -8,11 +8,12 @@ Download release [here](https://github.com/UWPR/Comet/releases).

- Add fragment ion indexing support.
While fragment ion indexing code was present in the 2024.01 rev. 0 release,
  this is the first release to officially support fragment ion indexing, which
is a method that was originally implemented by [MSFragger](https://www.nature.com/articles/nmeth.4256).
  this is the first Comet release to officially support fragment ion indexing,
  which is a method that was originally implemented by
[MSFragger](https://www.nature.com/articles/nmeth.4256).
In Comet's implementation, the fragment ion index is applied as a
candidate peptide filter prior to performing full cross-correlation analysis.
[Please see this note](https://uwpr.github.io/Comet/notes/20241001_FI.html)
[Please see this page](https://uwpr.github.io/Comet/notes/20241001_FI.html)
for more details on Comet's fragment ion index.
Thanks to V. Sharma for implementing the modifications permutation code and
to E. Bergstrom, C. McGann, and D. Schweppe for driving the development and testing.
@@ -25,9 +26,10 @@ The following are new search parameters specific to this feature.
- [fragindex_skipreadprecursors](https://uwpr.github.io/Comet/parameters/parameters_202402/fragindex_skipreadprecursors.html)

- Allow variable modifications to apply to a subset of proteins.
For example, one can now apply mono-, di-, and tri-methylation
as variable modifications to only histone proteins and not all
proteins in the human database. This functionality is controlled by the
For example, one can now limit mono-, di-, and tri-methylation
as variable modifications to only histone proteins and not have
to apply those modifications on all proteins in the human database.
This functionality is controlled by the
[protein_modlist_file](https://uwpr.github.io/Comet/parameters/parameters_202402/protein_modlist_file.html)
parameter. Note there will be issues for post processing analysis, such
as FDR, when applying this feature. Thanks to C. McGann for the feature request.
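
To illustrate the idea from the fragment ion indexing entry above, where the index acts as a candidate peptide filter ahead of full cross-correlation scoring, here is a toy sketch. It is not Comet's data structure or code: the bin width, the function names, and the `min_matched` threshold are all hypothetical and chosen only to show the general technique of looking up observed peaks in a fragment-mass-to-peptide map and keeping peptides with enough matches.

```python
from collections import defaultdict
from typing import Dict, Iterable, List

BIN_WIDTH = 0.02  # hypothetical fragment mass bin width in Da


def mass_bin(mass: float) -> int:
    """Map a fragment mass to an integer bin."""
    return int(round(mass / BIN_WIDTH))


def build_index(peptide_fragments: Dict[str, List[float]]) -> Dict[int, List[str]]:
    """Map each fragment mass bin to the peptides that produce a fragment there."""
    index: Dict[int, List[str]] = defaultdict(list)
    for peptide, masses in peptide_fragments.items():
        for b in {mass_bin(m) for m in masses}:
            index[b].append(peptide)
    return index


def candidates(index: Dict[int, List[str]], peaks: Iterable[float],
               min_matched: int) -> List[str]:
    """Return peptides matching at least `min_matched` observed peak bins."""
    counts: Dict[str, int] = defaultdict(int)
    for b in {mass_bin(p) for p in peaks}:
        for peptide in index.get(b, ()):
            counts[peptide] += 1
    return sorted(p for p, n in counts.items() if n >= min_matched)
```

In a scheme like this, only the peptides returned by `candidates` would go on to the more expensive full scoring step, which is where the time savings come from.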
