1 How to use TraceGroomer

johaGL edited this page Aug 2, 2024 · 15 revisions

How to use TraceGroomer in the command line

Running TraceGroomer takes only a few seconds. Below we explain how to proceed.

Please prepare your files as explained in the section 2 Input files.

Once your input files are ready, you must organize them in the following folder structure:

MyProject
├── data
│   └── dataset1_data
│       ├── metadata_1.csv  # <- experimental setup information
│       └── TRACER_IsoCor_out_example.tsv   # <- the quantification(s) file (one single file)
└── groom_files
    └── dataset1
        ├── config-1-groom.yml  # <- the basic configuration in .yml
        └── ...   #  <-  optional additional files for normalisation, etc

This structure is recommended to easily re-use the data folder for DIMet.
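The layout above can be created with a couple of commands. The following is a minimal sketch, not part of TraceGroomer: `PROJECT_DIR` and `DATASET` are illustrative variable names, and `MyProject` and `dataset1` are the placeholder names from the tree above. Replace them with your own.

```shell
# Create the recommended folder skeleton.
# Assumption: the project is created in the current directory unless PROJECT_DIR is set.
PROJECT_DIR="${PROJECT_DIR:-MyProject}"
DATASET="dataset1"

mkdir -p "$PROJECT_DIR/data/${DATASET}_data"
mkdir -p "$PROJECT_DIR/groom_files/$DATASET"

# Then copy your own files into place, for example:
#   cp metadata_1.csv TRACER_IsoCor_out_example.tsv "$PROJECT_DIR/data/${DATASET}_data/"
#   cp config-1-groom.yml "$PROJECT_DIR/groom_files/$DATASET/"

# List the directories that were created.
find "$PROJECT_DIR" -type d | sort
```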

The basic command is:

python3 -m tracegroomer -lm $MY_MEASUREMENTS_FILE_PATH -tf $MY_TYPE_OF_INPUT -cf $MY_BASIC_YML_CONFIG_PATH

Compulsory command parameters:

  • -lm: the absolute path to the file that contains the labeled metabolomics measurements.
  • -tf: the type of input; must be one of IsoCor_out_tsv, rule_tsv, VIBMEC_xlsx, or generic_xlsx.
  • -cf: the path to the basic configuration .yml file (its content is explained in 2 Input files).
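Because a mistyped -tf value or a wrong path is an easy mistake, it can help to check the three parameters before launching the tool. The sketch below is only an illustration: the helper `is_valid_type` is hypothetical (it is not provided by TraceGroomer), and the three paths are placeholders that assume the project lives in your home directory.

```shell
# Hypothetical helper: succeed only for the four -tf values listed above.
is_valid_type() {
  case "$1" in
    IsoCor_out_tsv|rule_tsv|VIBMEC_xlsx|generic_xlsx) return 0 ;;
    *) return 1 ;;
  esac
}

# Placeholders: replace with your own absolute paths and input type.
MY_MEASUREMENTS_FILE_PATH="$HOME/MyProject/data/dataset1_data/TRACER_IsoCor_out_example.tsv"
MY_TYPE_OF_INPUT="IsoCor_out_tsv"
MY_BASIC_YML_CONFIG_PATH="$HOME/MyProject/groom_files/dataset1/config-1-groom.yml"

if ! is_valid_type "$MY_TYPE_OF_INPUT"; then
  echo "Unsupported -tf value: $MY_TYPE_OF_INPUT" >&2
elif [ ! -f "$MY_MEASUREMENTS_FILE_PATH" ] || [ ! -f "$MY_BASIC_YML_CONFIG_PATH" ]; then
  echo "Measurements or config file not found; check the paths" >&2
else
  # Only the three compulsory parameters described above are used.
  python3 -m tracegroomer \
    -lm "$MY_MEASUREMENTS_FILE_PATH" \
    -tf "$MY_TYPE_OF_INPUT" \
    -cf "$MY_BASIC_YML_CONFIG_PATH"
fi
```

If either file is missing, the script prints a message and does not call TraceGroomer.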

If this is your first time using TraceGroomer, we recommend running a test with the provided examples. Re-use the folder organization and the configuration .yml as a template, and adapt the command to your data. See the section below.

Running a test with the provided examples

To perform a test using the examples we provide, please download and uncompress our examples from Zenodo. The structure of the folder is:

examples_TraceGroomer
├── data
│   ├── example-isocor_data
│   │   ├── metadata_1.csv
│   │   └── TRACER_IsoCor_out_example.tsv
│   ├── example-ruletsv_data
│   │   ├── sampleMetadata_2.tsv
│   │   ├── TRACER_dataMatrix.tsv
│   │   └── variableMetadata.tsv
│   ├── example-sheet_data
│   │   ├── metadata_3.csv
│   │   └── TRACER_generic_sheet.xlsx
│   └── example-vib_data
│       ├── metadata_4.csv
│       └── TRACER_metabo_4.xlsx
└── groom_files
    ├── example-isocor
    │   ├── amount_material_weightorcells.csv
    │   └── config-1-groom.yml
    ├── example-ruletsv
    │   ├── amount_material_weightorcells.csv
    │   └── config-2-groom.yml
    ├── example-sheet
    │   └── config-3-groom.yml
    └── example-vib
        ├── config-4-groom.yml
        ├── nbcells-or-amountOfMaterial.csv
        └── reject_list.csv


These examples correspond to:

  1. example-isocor_data: IsoCor generated .tsv file
  2. example-ruletsv_data: set of .tsv files following the rule that the dataMatrix file is always accompanied by a variableMetadata file
  3. example-sheet_data: "generic" type of xlsx file
  4. example-vib_data: VIB MEC xlsx file

Run the script with the provided examples

The following commands assume that the downloaded examples are in your home directory. If they are located elsewhere, adjust the absolute paths in the .yml files and in the bash commands accordingly.

  • For IsoCor data:
python3 -m tracegroomer \
  -lm ~/examples_TraceGroomer/data/example-isocor_data/TRACER_IsoCor_out_example.tsv \
  -tf IsoCor_out_tsv \
  -cf ~/examples_TraceGroomer/groom_files/example-isocor/config-1-groom.yml
  • or, for the "rule" case:
python3 -m tracegroomer \
  -lm ~/examples_TraceGroomer/data/example-ruletsv_data/TRACER_dataMatrix.tsv \
  -tf rule_tsv \
  -cf ~/examples_TraceGroomer/groom_files/example-ruletsv/config-2-groom.yml
  • or, for the generic xlsx case:
python3 -m tracegroomer \
  -lm ~/examples_TraceGroomer/data/example-sheet_data/TRACER_generic_sheet.xlsx \
  -tf generic_xlsx \
  -cf ~/examples_TraceGroomer/groom_files/example-sheet/config-3-groom.yml
  • or, for VIB MEC data:
python3 -m tracegroomer \
  -lm ~/examples_TraceGroomer/data/example-vib_data/TRACER_metabo_4.xlsx \
  -tf VIBMEC_xlsx \
  -cf ~/examples_TraceGroomer/groom_files/example-vib/config-4-groom.yml
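To try all four examples in one go, the commands above can be generated in a loop. This is a sketch under assumptions: `EXAMPLES_DIR`, `EXAMPLE_TABLE`, `DRY_RUN` and `run_examples` are illustrative names, not TraceGroomer features, and the file names are the ones from the examples folder shown earlier. By default the commands are only printed so you can inspect them first.

```shell
# Assumption: the examples were uncompressed in the home directory.
EXAMPLES_DIR="${EXAMPLES_DIR:-$HOME/examples_TraceGroomer}"

# One line per example: dataset name, measurements file, -tf value, config file.
EXAMPLE_TABLE="isocor TRACER_IsoCor_out_example.tsv IsoCor_out_tsv config-1-groom.yml
ruletsv TRACER_dataMatrix.tsv rule_tsv config-2-groom.yml
sheet TRACER_generic_sheet.xlsx generic_xlsx config-3-groom.yml
vib TRACER_metabo_4.xlsx VIBMEC_xlsx config-4-groom.yml"

# Print the command for each example; with DRY_RUN=0 the commands are executed instead.
run_examples() {
  echo "$EXAMPLE_TABLE" | while read -r name file input_type config; do
    set -- python3 -m tracegroomer \
      -lm "$EXAMPLES_DIR/data/example-${name}_data/$file" \
      -tf "$input_type" \
      -cf "$EXAMPLES_DIR/groom_files/example-$name/$config"
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "$*"
    else
      # Keep the command from reading the remaining table lines on stdin.
      "$@" < /dev/null || echo "example-$name failed" >&2
    fi
  done
}

run_examples
```

Remember that the absolute paths inside the .yml files must also match your working directory before running with `DRY_RUN=0`.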

The output

The output files are saved in the folder that you specified in the config .yml file (groom_out_path field). We recommend saving the output in the data/[my_dataset] location. Four output files are generated if the absolute isotopologues are provided; otherwise three files are generated.

These output files in data/ are generated rapidly, so you can pass them as input to DIMet within seconds!

The format of these output files is tab-delimited .csv.
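Since the files carry a .csv extension but are tab-delimited, tools that guess the separator from the extension need to be told to use tabs. As a quick sanity check, the sketch below counts the tab-separated columns in a file's header line. The helper `count_columns` is hypothetical, and the sample file is made up for illustration; it does not reproduce the real TraceGroomer output.

```shell
# Hypothetical helper: count the tab-separated columns in the first line of a file.
count_columns() {
  head -n 1 "$1" | awk -F'\t' '{ print NF }'
}

# Made-up sample standing in for a real output file (name and content are illustrative only).
SAMPLE="$(mktemp)"
printf 'ID\tsample_1\tsample_2\nmetabolite_A\t0.5\t0.7\n' > "$SAMPLE"

count_columns "$SAMPLE"   # prints 3
rm -f "$SAMPLE"
```

A result of 1 on a real output file would suggest the file is not tab-delimited, or that it has a single column.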