lochness REDCap to DPdash

Python scripts to convert REDCap survey JSON files produced by lochness into DPdash/DPimport-ready CSVs

Requirements
Installation
Usage
- Input
- Output
Additional notes

Requirements

Python 3

Installation

No installation required, just clone the git repo or download the code.

git clone https://github.com/AMP-SCZ/lochness-redcap-to-dpdash.git

Usage

convert.py [-h] -d DICT -o OUTDIR [-v] expr

positional arguments:
  expr                  double-quoted path to a single JSON file, or
                        glob matching several JSON files

optional arguments:
  -h, --help            show this help message and exit
  -d DICT, --dict DICT  path to the data dictionary CSV file from REDCap
  -o OUTDIR, --outdir OUTDIR
                        output directory for all DPdash formatted CSVs
  -v, --verbose

Example usage:

python convert.py \
--dict /dict.csv \
--outdir /RED-PHOENIX/GENERAL \
"/RED-PHOENIX/GENERAL/**/*.json"

Where /RED-PHOENIX/GENERAL/**/*.json matches all JSON files to be parsed.

You may also use a single * glob expression, such as /RED-PHOENIX/GENERAL/STUDY_ID/raw/surveys/SUBJECT_ID/*.json, or a path to a single file.

Details about the pattern /**/

directory/*/*.json matches only directory/[subdirectory]/[filename].json. With a recursive glob pattern, directory/**/*.json will additionally match:

directory/[filename].json (no subdirectory)
directory/[subdirectory1]/[subdirectory2]/[filename].json (sub-subdirectory)

and so on, for as many levels deep as exist in the directory tree.

Input

The input JSON files must be formatted as SUBJECT_ID.STUDY_ID.json, for example, HK06989.FAKE_HK.json.

The input data dictionary CSV must have columns called Variable / Field Name and Form Name.

Output

The output is a directory or set of directories containing CSV files, with the following structure:

├── STUDY_ID
│   └── processed
│       └── surveys
│           ├── SUBJECT_ID
│           │   ├── STUDY_ID-SUBJECT_ID-assessment-day1to121.csv
│           │   ├── STUDY_ID-SUBJECT_ID-assessment-day122to165.csv

Additional notes

De-identification

This script does not do any de-identification or anonymization of data. These tasks should be done before this script is used.

Date variables

Dates for assessments are ascertained according to specific variable names. In particular, a date variable must either contain the phrase interview_date, or be one of chrcrit_date or chrcbc_testdate. Since we expect REDCap forms for the AMP-SCZ study to follow this convention, they have been hard-coded.

If different variables must be added to correspond to assessment dates, they can be added to the list date_vars in lib/parse_redcap.py.

Output dates

Assessments will not be processed into output if their date is already covered in existing *day{X}to{Y}.csv files for that particular assessment. This is to prevent overwriting existing data.

If the date of the assessment is not in the existing range, output CSVs will be created with the range starting from the end of the previous day range (if existing) and ending with the date of the assessment. The existing files will not be modified.

For example, in the above directory tree, assessment was conducted for this subject on 121^st and 165^th days since consent. So the first output file starts at day 1 and the second output file starts at day 122. Moreover, rows 1-120 and 122-164 (corresponding to days) do not contain any data.

Name		Name	Last commit message	Last commit date
Latest commit History 18 Commits
lib		lib
.gitignore		.gitignore
README.md		README.md
convert.py		convert.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

lochness REDCap to DPdash

Requirements

Installation

Usage

Input

Output

Additional notes

De-identification

Date variables

Output dates

About

Releases

Packages

Contributors 2

Languages

AMP-SCZ/lochness-redcap-to-dpdash

Folders and files

Latest commit

History

Repository files navigation

lochness REDCap to DPdash

Requirements

Installation

Usage

Input

Output

Additional notes

De-identification

Date variables

Output dates

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Languages

Packages