Update to FBMN MSDIAL/Progenesis/Metaboscape converter #868

Open
wants to merge 14 commits into
base: master
4 changes: 2 additions & 2 deletions feature-based-molecular-networking/Makefile
Original file line number Diff line number Diff line change
@@ -3,8 +3,8 @@ include ../Makefile.deploytemplate

WORKFLOW_NAME=feature-based-molecular-networking
TOOL_FOLDER_NAME=feature-based-molecular-networking
WORKFLOW_VERSION=release_31
WORKFLOW_DESCRIPTION='Feature-Based Molecular Networking (FBMN) is a computational method that bridges popular mass spectrometry data processing tools for LC-MS/MS and molecular networking analysis on GNPS. The supported tools are: MZmine2, MZmine3, OpenMS, MS-DIAL, MetaboScape, XCMS, Progenesis QI, and the mzTab-M format. FBMN facilitates the detection of isomers that are separated by chromatographic or ion mobility separation, and provides accurate ion abundances for statistical analysis. Note that FBMN requires processing the mass spectrometry data with a feature detection and alignment tool. For rapid/qualitative analysis, we recommend using classical molecular networking that accepts unprocessed mass spectrometry files. See the FBMN documentation at https://ccms-ucsd.github.io/GNPSDocumentation/featurebasedmolecularnetworking and refer to the "Method and Citation for Manuscripts" on the results page for citations.<br>\
WORKFLOW_VERSION=release_32
WORKFLOW_DESCRIPTION='Feature-Based Molecular Networking (FBMN) is a computational method that bridges popular mass spectrometry data processing tools for LC-MS/MS and molecular networking analysis on GNPS. The supported tools are: MZmine2, MZmine3, OpenMS, MS-DIAL, MetaboScape, XCMS, Progenesis QI, SIRIUS, and the mzTab-M format. FBMN facilitates the detection of isomers that are separated by chromatographic or ion mobility separation, and provides accurate ion abundances for statistical analysis. Note that FBMN requires processing the mass spectrometry data with a feature detection and alignment tool. For rapid/qualitative analysis, we recommend using classical molecular networking that accepts unprocessed mass spectrometry files. See the FBMN documentation at https://ccms-ucsd.github.io/GNPSDocumentation/featurebasedmolecularnetworking and refer to the "Method and Citation for Manuscripts" on the results page for citations.<br>\
<hr> \
<h4> Citation </h4>\
<a href='https://www.nature.com/articles/s41592-020-0933-6'>Nothias LF, Petras D, Schmid R, Dührkop K, Rainer J, Sarvepalli A, Protsyuk I, Ernst M, Tsugawa H, Fleischauer M, Aicheler F, Aksenov AA, Alka O, Allard PM, Barsch A, Cachet X, Caraballo-Rodriguez AM, Da Silva RR, Dang T, Garg N, Gauglitz JM, Gurevich A, Isaac G, Jarmusch AK, Kameník Z, Kang KB, Kessler N, Koester I, Korf A, Le Gouellec A, Ludwig M, Martin H C, McCall LI, McSayles J, Meyer SW, Mohimani H, Morsy M, Moyne O, Neumann S, Neuweger H, Nguyen NH, Nothias-Esposito M, Paolini J, Phelan VV, Pluskal T, Quinn RA, Rogers S, Shrestha B, Tripathi A, van der Hooft JJJ, Vargas F, Weldon KC, Witting M, Yang H, Zhang Z, Zubeil F, Kohlbacher O, Böcker S, Alexandrov T, Bandeira N, Wang M, Dorrestein PC. Feature-based molecular networking in the GNPS analysis environment. Nat Methods. 2020 Sep;17(9):905-908. doi: 10.1038/s41592-020-0933-6. Epub 2020 Aug 24. PMID: 32839597; PMCID: PMC7885687.</a>"'
56 changes: 48 additions & 8 deletions feature-based-molecular-networking/README.md
@@ -27,7 +27,7 @@ The MGF output should contain the "SCANS" header, and it must correspond to the

### MS-DIAL

The feature quantification table (.TXT file, tab-separated) should include upper 3 rows that are ignored and headers starting in row 4 with the following column headers:
The feature quantification table (.TXT file, tab-separated) should begin with 3 or 4 rows that are ignored, followed by a header row (row 4 or 5) containing the following column headers. The converter uses the "Class" column to detect the samples:

1. Alignment ID
2. Average Mz
@@ -39,18 +39,48 @@ Additionally, it is assumed there are additional columns where the per sample qu

The MGF output should contain the "SCANS" header, and it must correspond to the identifier of the "row ID". It has to be unique, and can be non sequential.
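
Because the MS-DIAL header row can sit in row 4 or row 5, one way to handle both layouts is to search for the row whose first cell is "Alignment ID" instead of hard-coding the number of rows to skip. The following is a minimal sketch of that idea, not the converter's actual code: the function names `find_msdial_header` and `read_msdial_table` are hypothetical, and the sketch assumes the preamble rows never start with "Alignment ID". It does not cover the "Class"-based sample detection.

```python
import csv
import io


def find_msdial_header(rows, first_header="Alignment ID"):
    """Return the index of the first row whose first cell is `first_header`."""
    for index, row in enumerate(rows):
        if row and row[0].strip() == first_header:
            return index
    raise ValueError(f"no row starting with {first_header!r} found")


def read_msdial_table(text):
    """Parse a tab-separated MS-DIAL export into a list of dicts, one per feature.

    Rows above the header row (3 or 4 of them) are ignored.
    """
    rows = list(csv.reader(io.StringIO(text), delimiter="\t"))
    header_index = find_msdial_header(rows)
    header = rows[header_index]
    return [dict(zip(header, row)) for row in rows[header_index + 1:] if row]
```

With this approach the same call works whether the export has three or four preamble rows, and a file without the expected header fails with an explicit error.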

### Metaboscape
### MetaboScape

#### For MetaboScape 5.0

The feature quantification table (.CSV file, comma-separated) should include columns with the following headers:

1. FEATURE_ID
2. RT
3. PEPMASS
4. MaxIntensity
1. SHARED_NAME
2. FEATURE_ID
3. RT
4. PEPMASS
5. CCS (optional, only tims/PASEF data)
6. SIGMA_SCORE
7. NAME_METABOSCAPE
8. MOLECULAR_FORMULA
9. ADDUCT
10. KEGG
11. CAS
12. MaxIntensity
13. {GroupName}_MeanIntensity (0-n times, dependent on the groups defined in MetaboScape)
14. Sample Intensities

Sample headers do not include the file format extension ".d" (DDA) or ".tdf" (PASEF). The columns "FEATURE_ID", "RT", "PEPMASS", and "MaxIntensity" are mandatory.
Important: In the metadata table, the filename MUST NOT include either of these extension suffixes.

#### Earlier versions of MetaboScape (<5.0)

For ion mobility data, it must include a "CCS" column.
The feature quantification table (.CSV file, comma-separated) should include columns with the following headers:

All sample headers are not including the file format extension ".d" (DDA) or ".tdf" (PASEF)
1. SHARED_NAME
2. FEATURE_ID
3. RT
4. PEPMASS
5. NAME
6. MOLECULAR_FORMULA
7. ADDUCT
8. KEGG
9. CAS
10. {GroupName}_MeanIntensity (0-n times, dependent on the groups defined in MetaboScape)
11. Sample Intensities

Sample headers include the file format extension ".d". The columns "FEATURE_ID", "RT", "PEPMASS", and "CAS" are mandatory.
Important: In the metadata table, the filename MUST HAVE the ".d" extension suffix.
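
The two MetaboScape layouts differ in their mandatory columns and in whether filenames carry an extension, so a small pre-flight check can catch mismatches before a job is submitted. The sketch below only encodes the rules stated in this section. The version labels `"5.0"` and `"pre-5.0"` and both function names are hypothetical and are not part of the converter.

```python
# Mandatory columns per MetaboScape version, as listed in this README.
# The version keys are illustrative labels, not values used by the converter.
MANDATORY_COLUMNS = {
    "5.0": ("FEATURE_ID", "RT", "PEPMASS", "MaxIntensity"),
    "pre-5.0": ("FEATURE_ID", "RT", "PEPMASS", "CAS"),
}


def missing_mandatory_columns(header, version):
    """Return the mandatory column names absent from `header`, in README order."""
    present = set(header)
    return [name for name in MANDATORY_COLUMNS[version] if name not in present]


def bad_metadata_filenames(filenames, version):
    """Return metadata filenames that break the extension rule for `version`.

    MetaboScape 5.0: filenames must not end in ".d" or ".tdf".
    Earlier versions: filenames must end in ".d".
    """
    if version == "5.0":
        return [name for name in filenames if name.endswith((".d", ".tdf"))]
    return [name for name in filenames if not name.endswith(".d")]
```

Both functions return an empty list when the input satisfies the rules, so a caller can treat any non-empty result as a validation failure.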

### Progenesis QI

@@ -91,3 +121,13 @@ The feature quantification table (.TXT format, tab-separated) should include a h
Following these headers are the samples.

The MGF output should contain the "SCANS" header, and it must correspond to the identifier of the "row ID". It has to be unique, and can be non sequential.

### SIRIUS

The feature quantification table (.CSV file, comma separated) should have three columns named:

1. row ID
2. row m/z
3. row retention time

The native sample headers from SIRIUS don't include the "Peak area" suffix, so the converter adds that suffix for internal processing.
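
As an illustration of the suffix handling described above, the sketch below appends a suffix to every column that is not one of the three named columns. It is a guess at the general shape of the step, not the converter's implementation: the function name `add_peak_area_suffix` is hypothetical, and the exact suffix string (a leading space followed by "Peak area") is an assumption.

```python
# The three columns named in this README; everything else is treated as a sample.
CORE_COLUMNS = ("row ID", "row m/z", "row retention time")

# Assumption: the suffix is separated from the sample name by a single space.
SUFFIX = " Peak area"


def add_peak_area_suffix(header):
    """Return a copy of `header` with SUFFIX appended to each sample column.

    Core columns and columns that already carry the suffix are left unchanged.
    """
    return [
        name if name in CORE_COLUMNS or name.endswith(SUFFIX) else name + SUFFIX
        for name in header
    ]
```

Skipping columns that already end in the suffix keeps the function safe to apply twice to the same header.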
@@ -167,6 +167,7 @@
<option value="OPENMS" label="OpenMS"/>
<option value="OPTIMUS" label="Optimus"/>
<option value="MSDIAL" label="MS-DIAL"/>
<option value="MSDIAL5" label="MS-DIAL5"/>
<option value="METABOSCAPE" label="MetaboScape"/>
<option value="XCMS3" label="XCMS"/>
<option value="PROGENESIS" label="Progenesis QI"/>