DPimport is a command line tool for importing files into DPdash using a
simple glob expression.
Install with pip:
pip install git+https://github.com/AMP-SCZ/dpimport.git
DPimport requires a configuration file in YAML format, passed as a command
line argument with -c|--config, for establishing a MongoDB database
connection. You will find an example configuration file in the examples
directory within this repository.
The main command line tool is import.py. You can use this tool to import any
DPdash-compatible CSV files or metadata files using the direct path to a file
or a glob expression (use single quotes to avoid shell expansion):
import.py -c config.yml '/PHOENIX/GENERAL/STUDY_A/SUB_001/DATA_TYPE/processed/*.csv'
import.py -c config.yml '/PHOENIX/GENERAL/STUDY_A/SUB_001/DATA_TYPE/processed/*.csv' -n 8
Here, -n 8 imports 8 files in parallel. The default is -n 1.
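To illustrate what processing several files at once looks like, here is a minimal Python sketch. It is not DPimport's implementation: `count_rows` is a hypothetical stand-in for the per-file import step, `process_all` is an illustrative helper, and the thread pool is an assumption made for the example. The `workers` argument plays the role of -n.

```python
import glob
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def count_rows(path):
    """Hypothetical stand-in for importing one file: count its data rows."""
    with open(path) as handle:
        # Subtract one for the header line.
        return path, sum(1 for _ in handle) - 1


def process_all(paths, workers=1):
    """Process every path, with at most `workers` files in flight at once."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(count_rows, paths))


with tempfile.TemporaryDirectory() as root:
    # Create four small CSV files with 1, 2, 3 and 4 data rows.
    for i in range(4):
        with open(os.path.join(root, f"file{i}.csv"), "w") as handle:
            handle.write("day,value\n" + "1,0\n" * (i + 1))

    paths = sorted(glob.glob(os.path.join(root, "*.csv")))
    results = process_all(paths, workers=8)
    row_counts = [results[p] for p in paths]

print(row_counts)
```

With `workers=1` the files are handled one after another, which mirrors the default of -n 1.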
You may also now use the ** recursive glob expression, for example:
import.py -c config.yml '/PHOENIX/**/*.csv'
or
import.py -c config.yml '/PHOENIX/GENERAL/**/processed/*.csv'
and so on.
Details about the pattern /**/
directory/*/*.csv matches only directory/[subdirectory]/[filename].csv. With a
recursive glob pattern, directory/**/*.csv will additionally match:
- directory/[filename].csv (no subdirectory)
- directory/[subdirectory1]/[subdirectory2]/[filename].csv (sub-subdirectory)
and so on, for as many levels deep as exist in the directory tree.
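The difference between the two patterns can be checked on a throwaway directory tree. The sketch below assumes the matching follows Python's standard glob module with recursive=True; the file names a.csv, b.csv and c.csv and the helpers `build_tree` and `match` are made up for the example.

```python
import glob
import os
import tempfile


def build_tree(root):
    """Create example CSV files at three different depths under root."""
    rel_paths = [
        "directory/a.csv",
        "directory/sub1/b.csv",
        "directory/sub1/sub2/c.csv",
    ]
    for rel in rel_paths:
        path = os.path.join(root, rel)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "w") as handle:
            handle.write("day,value\n1,0\n")


def match(root, pattern):
    """Return sorted matches for pattern, relative to root."""
    hits = glob.glob(os.path.join(root, pattern), recursive=True)
    return sorted(os.path.relpath(h, root).replace(os.sep, "/") for h in hits)


with tempfile.TemporaryDirectory() as root:
    build_tree(root)
    single_level = match(root, "directory/*/*.csv")
    recursive = match(root, "directory/**/*.csv")

print(single_level)  # ['directory/sub1/b.csv']
print(recursive)     # all three files, at every depth
```

The single-level pattern finds only the file exactly one subdirectory down, while the recursive pattern also picks up the file directly under directory/ and the one two levels deep.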
This tool requires MongoDB to be running and accessible with the credentials you
supply in the config.yml file. For tips on MongoDB as it is used in DPdash and
DPimport, see the DPdash wiki.
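Before running an import, it can help to confirm that something is listening on the MongoDB port. The following is a rough sketch using only the Python standard library. The helper name `mongo_reachable` is hypothetical, and the host and port defaults are assumptions (27017 is MongoDB's default port); substitute the values from your own config.yml. It only tests that a TCP connection can be opened and does not verify credentials or SSL settings.

```python
import socket


def mongo_reachable(host="localhost", port=27017, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


print("MongoDB port reachable:", mongo_reachable())
```

A result of False usually means MongoDB is not running or that a firewall or wrong hostname is in the way. A result of True does not guarantee that the import will authenticate successfully.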