The MARS-CLI tool is a powerful interface for submitting metadata and associated files to various biological repository services like ENA, BioSamples, and MetaboLights. This command-line tool is useful for managing and validating metadata submissions in a ISA-JSON, as well as for automating aspects of repository submissions.
This installation procedure describes a typical Linux installation. This application can perfectly work on Windows and MacOS but some of the steps might be different.
Installing the mars-cli from source:
cd mars-cli # Assuming you are in the root folder of this project
pip install .
If you want to install the optional testing dependencies as well, useful when contributing to the project:
pip install .[test]
If you want to overwrite the settings.ini
file when reinstalling, you need to set the environmental variable OVERWRITE_SETTINGS
to True
:
OVERWRITE_SETTINGS=True pip install .[test]
Installing the MARS-cli, will by default create a .mars
directory in the home directory to store settings and log files.
If you wish to create the .mars
directory in another place, you must specify the MARS_SETTINGS_DIR
variable and set it to the desired path:
export MARS_SETTINGS_DIR=<path/to/parent_folder/containing/.mars>
If you want to make it permanent, you can run to following commands in the terminal.
Note: replace .bashrc
by the config file of your shell.
echo '# Add MARS setting directory to PATH' >> $HOME/.bashrc
echo 'export MARS_SETTINGS_DIR=<path/to/parent_folder/containing/.mars>' >> $HOME/.bashrc
Once installed, the CLI application will be available from the terminal.
Installing this application will also generate a settings.ini
file in $HOME/.mars/
.
[logging]
log_level = ERROR
log_file = /my/logging/directory/.mars/app.log
log_max_size = 1024
log_max_files = 5
To configure MARS for submissions, modify the configuration file settings.ini
located at ~/.mars/settings.ini
. Ensure the following content is set:
[webin]
development-url = https://wwwdev.ebi.ac.uk/ena/dev/submit/webin/auth
development-token-url = https://wwwdev.ebi.ac.uk/ena/dev/submit/webin/auth/token
production-url = https://www.ebi.ac.uk/ena/submit/webin/auth
production-token-url = https://www.ebi.ac.uk/ena/submit/webin/auth/token
[ena]
development-url = http://localhost:8042/isaena
development-submission-url = http://localhost:8042/isaena/submit
development-data-submission-url = webin2.ebi.ac.uk
production-url = https://www.ebi.ac.uk/ena/submit/webin-v2/
production-submission-url = https://www.ebi.ac.uk/ena/submit/drop-box/submit/?auth=ENA
production-data-submission-url = webin2.ebi.ac.uk
[biosamples]
development-url = http://localhost:8032/isabiosamples
development-submission-url = http://localhost:8032/isabiosamples/submit
production-url = https://www.ebi.ac.uk/biosamples/samples/
production-submission-url = https://www.ebi.ac.uk/biosamples/samples/
The MARS-CLI will automatically log events to a .log
file.
log_level: The verbosity of logging can be set to three different levels
- CRITICAL: Only critical messages will be logged. Not recommended!
- ERROR: Errors and critical messages will be logged.
- WARNING: Warnings, errors and critical messages will be logged.
- INFO: All events are logged.
- DEBUG: For debugging purpose only. Not recommended as it might log more sensitive information! The default setting is ERROR. So only errors are logged!
log_file: The path to the log file. By default this will be in $HOME/.mars/app.log
.
log_max_size: The maximum size in kB for the log file. By default the maximum size is set to 1024 kB or 1 MB.
log_max_files: The maximum number of old log files to keep. By default, this is set to 5
Each of the target repositories have a set of settings:
- development-url: URL to the development server when performing a health-check
- development-submission-url: URL to the development server when performing a submission
- production-url: URL to the production server when performing a health-check
- production-submission-url: URL to the production server when performing a submissionW
If you wish to use a different location for the `.mars' folder:
export MARS_SETTINGS_DIR=<path/to/parent_folder/containing/.mars>
mars-cli [options] <command> ARGUMENT
The mars-cli's help text can be found from the command line as such:
mars-cli --help
Output:
➜ mars-cli --help
Usage: mars-cli [OPTIONS] COMMAND [ARGS]...
Options:
-d, --development Boolean indicating the usage of the development
environment of the target repositories. If not present,
the production instances will be used.
--help Show this message and exit.
Commands:
health-check Check the health of the target repositories.
set-password Store a password in the keyring.
submit Start a submission to the target repositories.
validate-isa-json Validate the ISA JSON file.
or for a specific command:
mars-cli submit --help
Output:
➜ mars-cli submit --help
Usage: mars-cli submit [OPTIONS] ISA_JSON_FILE
Start a submission to the target repositories.
Options:
--output TEXT
--investigation-is-root BOOLEAN
Boolean indicating if the investigation is
the root of the ISA JSON. Set this to True
if the ISA-JSON does not contain a
'investigation' field.
--submit-to-metabolights BOOLEAN
Submit to Metabolights.
--data-files FILENAME Path of files to upload
--file-transfer TEXT provide the name of a file transfer
solution, like ftp or aspera
--submit-to-ena BOOLEAN Submit to ENA.
--submit-to-biosamples BOOLEAN Submit to BioSamples.
--credentials-file FILENAME Name of a credentials file
--username-credentials TEXT Username from the keyring
--credential-service-name TEXT service name from the keyring
--help Show this message and exit.
By default, the mars-CLI will try to submit the ISA-JSON's metadata towards the repositories' production servers. Passing the development flag will run it in development mode and substitute the production servers with the development servers.
You can check whether the supported repositories are healthy, prior to submission, by doing a health-check.
mars-cli health-check
Output:
➜ mars-cli health-check
############# Welcome to the MARS CLI. #############
Running in Production environment
Checking the health of the target repositories.
Service 'webin' healthy and ready to use!
Service 'ena' healthy and ready to use!
Service 'biosamples' healthy and ready to use!
This CLI application comes with functionality to interact with your device's keychain backend in order to fetch the necessary credentials.
You can add a password to keychain:
mars-cli set-password set-password [OPTIONS] USERNAME
Store a password in the keyring.
Options:
--service-name TEXT You are advised to include service name to match the
credentials to. If empty, it defaults to "mars-
cli_{DATESTAMP}"
--password TEXT The password to store. Note: You are required to
confirm the password.
--help Show this message and exit.
--submit-to-biosamples
: By default set to True
. Will try submit ISA-JSON metadata towards Biosamples. Setting it to False
will skip sending the ISA-JSON's metadata to Biosamples.
Note: Following command line will avoid submission to Biosamples repository:
mars-cli submit --submit-to-biosamples False my-credentials my-isa-json.json
--submit-to-ena
: By default set to True
. Will try submit ISA-JSON metadata towards ENA. Setting it to False
will skip sending the ISA-JSON's metadata to ENA.
Note: Following command line will avoid submission to ENA repository:
mars-cli submit --submit-to-ena False my-credentials my-isa-json.json
--file-transfer
: Provide the name of a file transfer solution, like ftp or aspera
--data-file
: Paths of files to upload.
Note: Following command line will submit isa-file and data-file using FTP solution to Biosamples and ENA:
mars-cli submit --submit-to-metabolights False --file-transfer ftp --data-files ../data/file_to_upload.fastq.gz my-credentials my-isa-json.json
Status: 🚧 To Be Developed
--submit-to-metabolights
: By default set to True
. Will try to submit ISA-JSON metadata towards Metabolights.
Setting it to False
will skip sending the ISA-JSON's metadata to Metabolights.
Following command line will avoid submission to metabolights repository:
mars-cli submit --submit-to-metabolights False my-credentials my-isa-json.json
--investigation-is-root
: By default this flag is set to false, meaning the ISA-JSON should have the investigation
key at the root level. In case the root level IS the investigation (investigation
level is omitted), you need set
the flag --investigation-is-root
to True
in order to validate the ISA-JSON.
mars-cli submit --investigation-is-root True my-credentials my-isa-json.json
--output
: By default "output_{datetime.now()}", the name of the isa final output.
mars-cli submit --output final_isa my-credentials my-isa-json.json
Status: 🚧 To Be Developed
This feature is planned but not yet implemented. Further details will be provided as development progresses.
You can perform a syntactic validation of the ISA-JSON, without submitting to the target repositories.
Note: This does not take repository-side validation into account, nor guarantees successful submission.
JSONata is a JSON query and transformation tool that will be used it this project to perform additional validation of the ISA-JSON and, in some cases, automatically patch inconsistencies.
This feature is implemented as a set of additional validation rules a user can customize according to the submission needs.
mars-cli validate-isa-json --investigation-is-root True ../test-data/biosamples-input-isa.json
--investigation-is-root
: By default this flag is set to false, meaning the ISA-JSON should have the investigation
key at the root level. In case the root level IS the investigation (investigation
level is omitted), you need set
the flag --investigation-is-root
to True
in order to validate the ISA-JSON.
mars-cli validate-isa-json my-isa-investigation.json
Status: 🚧 To Be Developed
This part is designed to interact with the BioSamples database, allowing operations like fetching, updating, and extending biosample records. The script takes in a dictionary of BioSamples' accessions and their associated external references, and expands the former with the latter.
To summarize, the steps of the code are:
- Takes the BioSamples' submitter credentials and a set of BioSamples accessions and their associated external references
- Validates inputs
- For each BioSamples' accession, it downloads its JSON record from BioSamples
- Extend the BioSamples' JSON with the
externalReferences
of the input file - Submit the extended JSON to BioSamples to replace the existing one
After configuring the settings.ini
file, you can run the MARS CLI tool to submit the isa-json:
python mars_cli.py --development submit --submit-to-metabolights False --submit-to-ena False --credential-service-name <biosamples> --username-credentials <username> ../test-data/biosamples-input-isa.json
- Replace
<biosamples>
with the appropriate service name. - Replace
<username>
with your BioSamples username. - Adjust the submission file path (
../test-data/biosamples-input-isa.json
) as needed.
Aternatively, you can also use a credentials file to authenticate to the services. An example can be found here: https://github.com/elixir-europe/MARS/blob/main/mars-cli/tests/test_credentials_example.json
Run the MARS CLI tool to submit the isa-json using credentials file:
python mars_cli.py --development submit --submit-to-metabolights False --submit-to-ena False --credentials-file <path_to_your_credentials_file.json> ../test-data/biosamples-input-isa.json
python mars_cli.py --credential-service-name biosamples --username-credentials <username> --file-transfer ftp --data-files ../data/ENA_data.R1.fastq.gz --submit-to-metabolights False --output final-isa ../data/biosamples-input-isa.json
Work in progress, currently data transfer is not supported yet.
python mars_cli.py --credential-service-name metabolights --username-credentials <username> --file-transfer ftp --data-files ../data/ISA-BH2024-ALL/metpro-analysis.txt <additional *.mzml data files> --submit-to-biosamples False --submit-to-ena False --output final-isa ../data/metabolights-input-isa.json
To set up and run the MARS tool locally using Docker, follow these steps