Produce Raw Results

Scripts to produce the raw results that can be used to create plots and tables. For both datasets (CASMI 2016 and EA (Massbank)) there are three scripts:

| Script | MS2 Base Scorer | Note | Reference in the Paper |
| --- | --- | --- | --- |
| eval__MetFrag22 | MetFrag 2.2 | Evaluation of metabolite identification performance | Section 4.3.1, Table 3 |
| eval__TFG | Our, Bach et al. (2018) | Evaluation of metabolite identification performance | Section 4.3.1, Table 3 |
| | | Inspect parameters of our framework, e.g. margin type or number of spanning trees | Section 4.2.* |
| | | Inspect hyper-parameter estimation | Section 4.2.2 |
| eval__TGF__missing_MS2 | Our | Evaluation of score integration framework for missing tandem mass spectra (MS2) | Section 4.4 |

Re-run Experiments

Here, we describe how the experiments can be re-run, using EA (Massbank) as an example. Assuming you have installed the nmsmsrt_scorer package and, if needed, activated the virtual environment, you can run the evaluation script as follows:

python EA_Massbank/ \
      --mode=EVALUATION_MODE \
      --D_value_grid 0.001 0.005 0.01 0.05 0.1 0.15 0.25 0.35 0.5 \
      --make_order_prob=EDGE_POTENTIAL_FUNCTION \
      --order_prob_k_grid platt \
      --margin_type=MARGIN_TYPE \
      --ion_mode=IONIZATION_MODE \
      --max_n_ms2=NUMBER_OF_MS2_FOR_TEST \
      --database_fn=SCORE_DB_FN
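
When several configurations have to be run, it can be convenient to assemble the command line programmatically. The following is a minimal sketch, not part of the package: the helper `build_eval_command` and the script path `EA_Massbank/EVAL_SCRIPT.py` are hypothetical placeholders, and only the option names are taken from the command above.

```python
import shlex


def build_eval_command(script_path, options):
    """Assemble the argument list for one evaluation run.

    List or tuple values are expanded into space-separated arguments
    (as for --D_value_grid); all other values are passed as --name=value.
    """
    cmd = ["python", script_path]
    for name, value in options.items():
        if isinstance(value, (list, tuple)):
            cmd.append(f"--{name}")
            cmd.extend(str(v) for v in value)
        else:
            cmd.append(f"--{name}={value}")
    return cmd


cmd = build_eval_command(
    "EA_Massbank/EVAL_SCRIPT.py",  # hypothetical placeholder, use your eval__ script
    {
        "mode": "debug_application",
        "D_value_grid": [0.01, 0.1, 0.25],
        "make_order_prob": "sigmoid",
        "order_prob_k_grid": ["platt"],
        "margin_type": "max",
        "ion_mode": "negative",
        "max_n_ms2": 65,
        "database_fn": "SCORE_DB_FN",  # placeholder for the path to your score DB
    },
)

# Print the command in a form that can be pasted into a shell.
print(shlex.join(cmd))
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` once the placeholders have been replaced.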

Detailed description of selected parameters

A description of all parameters can be found in the __main__ section of the "eval__" script files. Selected parameters are explained here:

--mode

| EVALUATION_MODE [1, 2] | Description |
| --- | --- |
| application | Evaluate the performance on the test sets in the application setting. |
| development | Performance evaluation of the training and test sets for each hyper-parameter grid value. |
| missing_ms2 | Performance evaluation for the missing MS2 experiment. |

--D_value_grid

Grid used to search for the best retention order weight (see Sections 2.2.4 and 3.4). We use [0.001, 0.005, 0.01, 0.05, 0.1, 0.15, 0.25, 0.35, 0.5] in our experiments.

--order_prob_k_grid

Grid used to search for the best sigmoid slope parameter k when using EDGE_POTENTIAL_FUNCTION=sigmoid or EDGE_POTENTIAL_FUNCTION=hinge_sigmoid (see Sections 2.2.3 and 3.4). For the Hinge-Sigmoid we use the grid [0.25, 0.5, 0.75, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0, 5.5, 6.0, 7.0, 10.0]. Our experiments show that for the Sigmoid we can use Platt's method ("platt") to determine the optimal value for k (see Section 4.2.2).
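
To build some intuition for the slope parameter k, the sketch below evaluates a plain logistic function for a few grid values. It assumes that the Sigmoid edge potential has the form 1 / (1 + exp(-k * x)) for some score difference x; the function `sigmoid_with_slope` is illustrative only, and the exact definition used by the package is the one given in the paper (Section 2.2.3). The Hinge-Sigmoid is not covered here.

```python
import math


def sigmoid_with_slope(x, k):
    """Logistic function with slope k, evaluated in a numerically stable way.

    Assumed form: 1 / (1 + exp(-k * x)). This is an illustration and not
    necessarily the exact edge potential implemented in the package.
    """
    z = k * x
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)  # z < 0, so exp(z) cannot overflow
    return e / (1.0 + e)


# Larger k pushes the output towards 0 or 1 for the same score difference.
for k in [0.25, 1.0, 5.0, 10.0]:
    print(f"k={k}: {sigmoid_with_slope(0.5, k):.4f}")
```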

--ion_mode and --max_n_ms2

These two parameters control which ionization mode is evaluated (negative or positive) and how many MS-features are used to calculate the test accuracy. The following settings are available (see Section 3.1):

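| Dataset | IONIZATION_MODE | NUMBER_OF_MS2_FOR_TEST |
| --- | --- | --- |
| CASMI (2016) | positive | 75 |
| CASMI (2016) | negative | 50 |
| EA (Massbank) | positive | 100 |
| EA (Massbank) | negative | 65 |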
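
When scripting several runs, the settings above can be kept in a small lookup table so that --ion_mode and --max_n_ms2 always stay consistent. This is a sketch only: the dataset keys "CASMI" and "EA" and the helper `max_n_ms2_for` are illustrative names, not part of the package.

```python
# (dataset, ion_mode) -> value for --max_n_ms2, copied from the settings above.
MAX_N_MS2 = {
    ("CASMI", "positive"): 75,
    ("CASMI", "negative"): 50,
    ("EA", "positive"): 100,
    ("EA", "negative"): 65,
}


def max_n_ms2_for(dataset, ion_mode):
    """Return the --max_n_ms2 value for a dataset and ionization mode."""
    try:
        return MAX_N_MS2[(dataset, ion_mode)]
    except KeyError:
        raise ValueError(
            f"No setting for dataset={dataset!r}, ion_mode={ion_mode!r}"
        ) from None


print(max_n_ms2_for("EA", "negative"))
```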

--n_samples

Number of random (training, test)-set samples. In our experiments we use NUMBER_OF_RANDOM_TEST_TRAINING_SETS=50 (CASMI, EA (negative)) and NUMBER_OF_RANDOM_TEST_TRAINING_SETS=100 (EA (positive)).

--database_fn

Path to the SQLite DB.


Path to the output directory storing the raw results. The output directory will contain sub-directories that separate the results obtained with different parameter settings.

Example: EA (Massbank) positive, Results for Table 3

The command below can be used to reproduce the EA (Massbank) positive results in Table 3. Note that, to speed up the calculations, it uses a reduced D-value grid, only 4 random spanning trees and only 3 samples. To get the exact results reported in the paper, you need to set:

--mode application
--D_value_grid 0.001 0.005 0.01 0.05 0.1 0.15 0.25 0.35 0.5
--n_random_trees 32
--n_samples 100

However, running the simplified setting lets you verify that the scripts run in your configuration.

python EA_Massbank/ \
      --mode=debug_application \
      --D_value_grid 0.01 0.1 0.25 \
      --make_order_prob=sigmoid \
      --order_prob_k_grid platt \
      --margin_type=max \
      --n_random_trees=4 \
      --n_samples=3 \
      --ion_mode=negative \
      --max_n_ms2=65 \
      --database_fn=!!!_YOUR_SCORE_DB_FN_!!!

The results will be stored in:

      └── debug_application
          └── tree_method=random__n_trees=4__make_order_prob=sigmoid__param_selection_measure=topk_auc__norm_scores=none__mtype=max
              └── ion_mode=negative__participant=MetFrag_2.4.5__8afe4a14__max_n_cand=inf__sort_candidates_by_ms2_score=0
                  └── trainset=MEOH_AND_CASMI_JOINT__keep_test=0__est=ranksvm__mol_rep=substructure_count
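
The directory names encode the parameter settings as `key=value` pairs joined by double underscores. The sketch below shows one way to turn such a name back into a dictionary. The function `parse_setting_dirname` is a hypothetical helper, not part of the package, and it assumes that a token without `=` (such as `8afe4a14` above) belongs to the preceding value.

```python
def parse_setting_dirname(name):
    """Split a 'key=value__key=value' directory name into a dict of strings."""
    settings = {}
    last_key = None
    for token in name.split("__"):
        key, sep, value = token.partition("=")
        if sep:
            settings[key] = value
            last_key = key
        elif last_key is not None:
            # Token without '=': assume it continues the previous value.
            settings[last_key] += "__" + token
        else:
            raise ValueError(f"Cannot parse directory name: {name!r}")
    return settings


dirname = (
    "ion_mode=negative__participant=MetFrag_2.4.5__8afe4a14"
    "__max_n_cand=inf__sort_candidates_by_ms2_score=0"
)
for key, value in parse_setting_dirname(dirname).items():
    print(key, "->", value)
```

All values are returned as strings; converting them (e.g. `n_trees` to an integer) is left to the caller.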

You will find:

| File | Description |
| --- | --- |
| measures.csv | Training set performance measures for each (D, k) grid value to select the best parameter |
| opt_params.csv | Selected (D, k) for each sample |
| topk_casmi__max_n_ms2=VALUE__sample_id=VALUE.pkl.gz | Top-k accuracies for the baseline (Only MS) and after the score integration (MS + RT) |

You can load the results using the load_results function. To reproduce figures and tables of the paper, please take a look here.
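
To take a quick look at a single `.pkl.gz` file without going through load_results, a generic loader along the following lines may be sufficient. This is a sketch under the assumption that the files are gzip-compressed pickles, as the file extension suggests; the helper `load_pickle_gz` is not part of the package, and the structure of the stored object is not described here, so inspect it after loading.

```python
import gzip
import pickle


def load_pickle_gz(path):
    """Load a gzip-compressed pickle file and return the stored object.

    Only unpickle files you have produced yourself, because unpickling
    can execute arbitrary code.
    """
    with gzip.open(path, "rb") as handle:
        return pickle.load(handle)
```

For example, `obj = load_pickle_gz(path_to_topk_file)` followed by `print(type(obj))` shows what kind of object a result file contains.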