1 How to use TraceGroomer
Running TraceGroomer takes only a few seconds. Below we explain how to proceed.
Please prepare your files as explained in section 2 Input files.
Once your input files are ready, you must organize them in the following folder structure:
MyProject
├── data
│   ├── dataset1_data
│   │   ├── metadata_1.csv  # <- experimental setup information
│   │   └── TRACER_IsoCor_out_example.tsv  # <- the quantification(s) file (one single file)
└── groom_files
    └── dataset1
        ├── config-1-groom.yml  # <- the basic configuration in .yml
        └── ...  # <- optional additional files for normalisation, etc
This structure is recommended so that the data folder can easily be re-used for DIMet.
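The folder layout above can also be created programmatically. The following is only a minimal sketch using the Python standard library; the function name `make_project_skeleton` and its parameters are hypothetical and not part of TraceGroomer. It creates the two folders only, so the metadata, quantification and .yml files still have to be copied in by hand.

```python
from __future__ import annotations

from pathlib import Path


def make_project_skeleton(root: Path, dataset: str = "dataset1") -> list[Path]:
    """Create the recommended folders and return them (no files are created)."""
    folders = [
        # the metadata and the quantification file go here
        root / "data" / f"{dataset}_data",
        # the config .yml and optional extra files go here
        root / "groom_files" / dataset,
    ]
    for folder in folders:
        folder.mkdir(parents=True, exist_ok=True)
    return folders
```

For example, `make_project_skeleton(Path("MyProject"))` would create `MyProject/data/dataset1_data` and `MyProject/groom_files/dataset1` in the current working directory.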
The basic command is:
python3 -m tracegroomer -lm $MY_MEASUREMENTS_FILE_PATH -tf $MY_TYPE_OF_INPUT -cf $MY_BASIC_YML_CONFIG_PATH
Compulsory command parameters:
- -lm: the file that contains the labeled metabolomics measurements, given as an absolute path.
- -tf: the type of input; must be one of: IsoCor_out_tsv, rule_tsv, VIBMEC_xlsx, or generic_xlsx.
- -cf: the path to the basic configuration .yml file (its content is explained in 2 Input files).
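When TraceGroomer is called from a script, the three compulsory parameters can be assembled and checked before the command is launched. The sketch below is an illustration only: `build_groom_command` and `ALLOWED_TYPES` are hypothetical names, and the only facts taken from this document are the three flags and the four accepted `-tf` values.

```python
from pathlib import Path

# The four values accepted by -tf, as listed above.
ALLOWED_TYPES = ("IsoCor_out_tsv", "rule_tsv", "VIBMEC_xlsx", "generic_xlsx")


def build_groom_command(measurements, input_type, config):
    """Return the basic command as an argument list (nothing is executed)."""
    if input_type not in ALLOWED_TYPES:
        raise ValueError(
            f"-tf must be one of {', '.join(ALLOWED_TYPES)}; got {input_type!r}"
        )
    # -lm is expected as an absolute path, so expand '~' and resolve it.
    lm = Path(measurements).expanduser().resolve()
    cf = Path(config).expanduser().resolve()
    return [
        "python3", "-m", "tracegroomer",
        "-lm", str(lm),
        "-tf", input_type,
        "-cf", str(cf),
    ]
```

The returned list can be passed to `subprocess.run(command, check=True)`, which avoids shell quoting problems when a path contains spaces.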
If this is the first time you use TraceGroomer, we recommend running a test with the provided examples. Re-use the folder organization and the configuration .yml as a template, and adapt the command to your data. See the section below.
To perform a test using the examples we provide, please download and uncompress our examples from Zenodo. The structure of the folder is:
examples_TraceGroomer
├── data
│   ├── example-isocor_data
│   │   ├── metadata_1.csv
│   │   └── TRACER_IsoCor_out_example.tsv
│   ├── example-ruletsv_data
│   │   ├── sampleMetadata_2.tsv
│   │   ├── TRACER_dataMatrix.tsv
│   │   └── variableMetadata.tsv
│   ├── example-sheet_data
│   │   ├── metadata_3.csv
│   │   └── TRACER_generic_sheet.xlsx
│   └── example-vib_data
│       ├── metadata_4.csv
│       └── TRACER_metabo_4.xlsx
└── groom_files
    ├── example-isocor
    │   ├── amount_material_weightorcells.csv
    │   └── config-1-groom.yml
    ├── example-ruletsv
    │   ├── amount_material_weightorcells.csv
    │   └── config-2-groom.yml
    ├── example-sheet
    │   └── config-3-groom.yml
    └── example-vib
        ├── config-4-groom.yml
        ├── nbcells-or-amountOfMaterial.csv
        └── reject_list.csv
These examples correspond to:
- example-isocor_data: IsoCor generated .tsv file
- example-ruletsv_data: set of .tsv files following the "three files" rule, that is, sampleMetadata, variableMetadata and dataMatrix
- example-sheet_data: "generic" type of xlsx file
- example-vib_data: VIB MEC xlsx file
The following commands assume that the downloaded examples are in the 'home' directory. Please modify, according to your working directory, the absolute paths in the .yml files and in the bash commands.
- For IsoCor data:
python3 -m tracegroomer \
-lm ~/examples_TraceGroomer/data/example-isocor_data/TRACER_IsoCor_out_example.tsv \
-tf IsoCor_out_tsv \
-cf ~/examples_TraceGroomer/groom_files/example-isocor/config-1-groom.yml
- or, for the "three files" rule:
python3 -m tracegroomer \
-lm ~/examples_TraceGroomer/data/example-ruletsv_data/TRACER_dataMatrix.tsv \
-tf rule_tsv \
-cf ~/examples_TraceGroomer/groom_files/example-ruletsv/config-2-groom.yml
- or, for the generic xlsx case:
python3 -m tracegroomer \
-lm ~/examples_TraceGroomer/data/example-sheet_data/TRACER_generic_sheet.xlsx \
-tf generic_xlsx \
-cf ~/examples_TraceGroomer/groom_files/example-sheet/config-3-groom.yml
- or, for VIB MEC data:
python3 -m tracegroomer \
-lm ~/examples_TraceGroomer/data/example-vib_data/TRACER_metabo_4.xlsx \
-tf VIBMEC_xlsx \
-cf ~/examples_TraceGroomer/groom_files/example-vib/config-4-groom.yml
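If the examples were uncompressed somewhere other than the home directory, the four commands above can be regenerated for that location instead of being edited one by one. This is a sketch under assumptions: the names `EXAMPLES`, `example_commands` and `to_shell_line` are hypothetical, and the file names are copied from the folder listing above. It only builds the command lines; the absolute paths inside the .yml files still have to be adapted separately.

```python
import shlex
from pathlib import Path

# example name -> (measurements file, -tf value, config file), from the listing above
EXAMPLES = {
    "example-isocor": ("TRACER_IsoCor_out_example.tsv", "IsoCor_out_tsv", "config-1-groom.yml"),
    "example-ruletsv": ("TRACER_dataMatrix.tsv", "rule_tsv", "config-2-groom.yml"),
    "example-sheet": ("TRACER_generic_sheet.xlsx", "generic_xlsx", "config-3-groom.yml"),
    "example-vib": ("TRACER_metabo_4.xlsx", "VIBMEC_xlsx", "config-4-groom.yml"),
}


def example_commands(root):
    """Map each example name to its command, as an argument list."""
    root = Path(root).expanduser()
    commands = {}
    for name, (measurements, input_type, config) in EXAMPLES.items():
        commands[name] = [
            "python3", "-m", "tracegroomer",
            "-lm", str(root / "data" / f"{name}_data" / measurements),
            "-tf", input_type,
            "-cf", str(root / "groom_files" / name / config),
        ]
    return commands


def to_shell_line(command):
    """Quote an argument list so it can be pasted into a shell."""
    return shlex.join(command)
```

Printing `to_shell_line(cmd)` for each value of `example_commands("~/examples_TraceGroomer")` yields one copy-pasteable line per example.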
The output files are saved in the folder that you specified in the config .yml file (groom_out_path field). The data/[my_dataset] location is recommended for saving the output.
A total of 4 output files are generated if the absolute isotopologues are provided; otherwise 3 files are generated.
This way, you can simply copy the entire data/ content into the folder structure that you want to run with DIMet.
The format of these output files is tab-delimited .csv.
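Because the output files are tab-delimited even though they carry the .csv extension, a reader that defaults to commas will not split the columns correctly. The sketch below shows one way to load such a file with the Python standard library; `read_tab_delimited` is a hypothetical helper, and no assumption is made here about the names or the column layout of the output files.

```python
import csv
from pathlib import Path


def read_tab_delimited(path):
    """Return all rows of a tab-delimited file as lists of strings."""
    with Path(path).open(newline="", encoding="utf-8") as handle:
        # delimiter="\t" is the important part: the extension is .csv,
        # but the fields are separated by tabs, not commas.
        return list(csv.reader(handle, delimiter="\t"))
```

With pandas, the equivalent call is `pandas.read_csv(path, sep="\t")`.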