generated from sensein/python-package-template
add BIDS Conversion pipeline #29
Merged
38 commits
228c945
add bids code
aparnabg 739d352
add log _file.py
aparnabg 1564f16
add .sh
aparnabg c124e0c
add bids code
aparnabg 725305a
add bids code
aparnabg cb634ce
Add participants.tsv population file
manaalm ea61c29
Create README.md
manaalm 74bb9f4
Added configuration file for BIDS conversion
bb918b9
Cleaned src folder
59b66e3
Final script for BIDS conversion added
27cb826
Added poetry dependencies
47227b1
Modified test for new BIDS convertor script
acf2e66
updated README with BIDS-conversion pipeline
b1d262c
final cleaning and merge after script execution
348973b
changed number of jobs in submission file
e57bcae
Added logs to .gitignore
84f2e35
fixed last shell scripts
d6ac5b8
Update jobs/run_bids_convertor.sh
lucie271 770e600
Update jobs/merge_cleanup.sh
lucie271 316f227
Update src/tests/test_BIDS_convertor.py
lucie271 491a431
Update jobs/merge_cleanup.sh
lucie271 f8ed634
untrack poetry.lock and add logs folder
b50ba76
BIDS_convertor.py in sailsprep
e16f84b
Fixed little warnings from PR
04bad65
Fixed warnings in BIDS_convertor.py from PR
50a80b9
Cleaned /logs handling
0fce931
Changed source video to raw folder
c8699e4
Update src/tests/test_BIDS_convertor.py
1e04269
fixed issues of execution
9e5c76d
added documentation
d80bb46
Merge branch 'main' into BIDS-conversion
lucie271 21003fd
fixed scripts for unit tests
5bad1f5
Merge branch 'main' into BIDS-conversion
85f3384
Merge branch 'BIDS-conversion' of https://github.com/sensein/sailspre…
333f5f4
updated unit test
7768f7e
Added unit tests
5117e74
Change number of array
78e611e
Fixed error mypy
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,17 @@ | ||
| # Video Processing Configuration | ||
|
|
||
| # Input data | ||
| annotation_file: /orcd/data/satra/002/datasets/SAILS/data4analysis/Video Rating Data/SAILS_RATINGS_ALL_DEDUPLICATED_NotForFinalAnalyses_2025.10.csv | ||
| video_root: /orcd/data/satra/002/datasets/SAILS/Phase_III_Videos/Videos_from_external | ||
| asd_status: /orcd/data/satra/002/datasets/SAILS/data4analysis/ASD_Status.xlsx | ||
|
|
||
| # Output data | ||
| output_dir: /orcd/scratch/bcs/001/sensein/sails/BIDS_data | ||
|
|
||
| # Video processing parameters | ||
| target_resolution: 1280x720 | ||
| target_framerate: 30 | ||
|
|
||
| # Derived directory names (optional — can be built dynamically) | ||
| final_bids_root: final_bids-dataset | ||
| derivatives_subdir: derivatives/preprocessed |
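
A minimal sketch of how the values in this configuration might be consumed once loaded into a dictionary. The helper names `parse_resolution` and `derived_dirs` are hypothetical, and the layout (derivatives nested inside the final BIDS root) is an assumption taken from the "can be built dynamically" comment, not something this diff confirms.

```python
from pathlib import Path


def parse_resolution(value: str) -> tuple[int, int]:
    """Split a 'WIDTHxHEIGHT' string such as '1280x720' into two integers."""
    width, height = value.lower().split("x")
    return int(width), int(height)


def derived_dirs(output_dir: str, final_bids_root: str, derivatives_subdir: str) -> tuple[Path, Path]:
    """Build the final BIDS root and its derivatives folder.

    Assumption: the derivatives folder lives inside the final BIDS root.
    """
    bids_root = Path(output_dir) / final_bids_root
    return bids_root, bids_root / derivatives_subdir


# Values copied from the configuration file above.
config = {
    "output_dir": "/orcd/scratch/bcs/001/sensein/sails/BIDS_data",
    "target_resolution": "1280x720",
    "final_bids_root": "final_bids-dataset",
    "derivatives_subdir": "derivatives/preprocessed",
}

width, height = parse_resolution(config["target_resolution"])
bids_root, derivatives = derived_dirs(
    config["output_dir"], config["final_bids_root"], config["derivatives_subdir"]
)
print(width, height)
print(derivatives)
```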
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,52 @@ | ||
| ## BIDS Format | ||
|
|
||
| For reproducibility, organization, and practicality, sailsprep converts its raw data into the BIDS (Brain Imaging Data Structure) format. | ||
| BIDS is a community-driven standard for organizing, naming, and describing neuroimaging and related data (e.g., EEG, fMRI, MEG, behavioral, and physiological data). | ||
|
|
||
| During the BIDS conversion pipeline, the raw home videos are standardized, denoised, and reformatted. | ||
| Relevant metadata and annotations necessary for downstream analysis are also extracted at this stage. | ||
|
|
||
| ## Structure | ||
|
|
||
| The final BIDS dataset follows the structure below: | ||
| ```graphql | ||
| ├── sub-ID1 # Contains raw videos in BIDS format | ||
| │ ├── ses-01 # Videos between 12 and 16 months | ||
| │ │ └── beh # Behavioral data | ||
| │ │ ├── sub-ID1_ses-01_task-A_run-01_beh.mp4 # Standardized raw video | ||
| │ │ ├── sub-ID1_ses-01_task-A_run-01_beh.tsv # Manual annotations | ||
| │ │ └── sub-ID1_ses-01_task-A_run-01_beh.json # Info on standardization | ||
| │ └── ses-02 # Videos between 34 and 38 months | ||
| │ └── beh | ||
| ├── derivatives | ||
| │ └── preprocessed # Contains stabilized, denoised, standardized videos | ||
| │ ├── sub-ID1 | ||
| │ │ ├── ses-01 | ||
| │ │ │ └── beh | ||
| │ │ │ ├── sub-ID1_ses-01_task-A_run-01_audio.json # Audio extraction info | ||
| │ │ │ ├── sub-ID1_ses-01_task-A_run-01_audio.wav # Extracted audio | ||
| │ │ │ ├── sub-ID1_ses-01_task-A_run-01_desc-processed.json # Video preprocessing info | ||
| │ │ │ └── sub-ID1_ses-01_task-A_run-01_desc-processed_beh.mp4 # Preprocessed video | ||
| │ │ └── ses-02 | ||
| │ └── sub-ID2 | ||
| ├── README.md # Explains dataset structure and content | ||
| ├── participants.tsv # Participant information (e.g., ASD status) | ||
| ├── participants.json # Metadata for participants.tsv | ||
| └── dataset_description.json # BIDS dataset description (name, version, etc.) | ||
| ``` | ||
| ## Execution | ||
|
|
||
| To verify that FFmpeg is installed (see [README.md](../README.md)) and is at least version 6.0, run: | ||
|
|
||
| ``` | ||
| ffmpeg -version | ||
| ``` | ||
|
|
||
| You’ll need to submit the conversion job on Engaging using sbatch. | ||
| Make sure you are in the root directory of the repository. | ||
|
|
||
| We provide SLURM submission scripts for convenience. Run the following commands (with the miniforge module deactivated so that the correct FFmpeg version is used): | ||
| ``` | ||
| jid=$(sbatch --parsable jobs/run_bids_convertor.sh) | ||
| sbatch --dependency=afterok:$jid jobs/merge_cleanup.sh | ||
| ``` |
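
The file names in the Structure tree above follow a fixed entity order (`sub`, `ses`, `task`, `run`, optional `desc`, then the suffix). The sketch below shows one way such names could be assembled; `bids_filename` is a hypothetical helper for illustration and is not taken from `sailsprep`.

```python
from typing import Optional


def bids_filename(
    sub: str,
    ses: int,
    task: str,
    run: int,
    suffix: str,
    ext: str,
    desc: Optional[str] = None,
) -> str:
    """Join BIDS entities in the order used by the tree above."""
    entities = [
        f"sub-{sub}",
        f"ses-{ses:02d}",  # zero-padded, e.g. ses-01
        f"task-{task}",
        f"run-{run:02d}",  # zero-padded, e.g. run-01
    ]
    if desc is not None:
        entities.append(f"desc-{desc}")
    return "_".join(entities + [suffix]) + ext


raw_name = bids_filename("ID1", 1, "A", 1, "beh", ".mp4")
processed_name = bids_filename("ID1", 1, "A", 1, "beh", ".mp4", desc="processed")
print(raw_name)
print(processed_name)
```

The two names printed correspond to the standardized raw video and the preprocessed video entries in the tree.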
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,69 @@ | ||
| #!/bin/bash | ||
| #SBATCH --job-name=merge_cleanup | ||
| #SBATCH --output=logs/merge_cleanup_%j.out | ||
| #SBATCH --error=logs/merge_cleanup_%j.err | ||
| #SBATCH --time=01:00:00 | ||
| #SBATCH --mem=2G | ||
|
|
||
| # Clean up old logs before running | ||
| echo "Cleaning up old logs..." | ||
| if [ -d logs ]; then | ||
| find logs -mindepth 1 ! -name ".gitkeep" \ | ||
| ! -name "merge_cleanup_${SLURM_JOB_ID}.out" \ | ||
| ! -name "merge_cleanup_${SLURM_JOB_ID}.err" -delete | ||
| fi | ||
|
|
||
| OUTPUT_DIR=$(poetry run python -c "import yaml; print(yaml.safe_load(open('configs/config_bids_convertor.yaml'))['output_dir'])") | ||
| MERGED_DIR="$OUTPUT_DIR" | ||
|
|
||
| mkdir -p "$MERGED_DIR" | ||
|
|
||
| echo "Merging logs from numbered folders under $OUTPUT_DIR" | ||
| echo "Started at $(date)" | ||
|
|
||
| merged_processed="$MERGED_DIR/all_processed.json" | ||
| merged_failed="$MERGED_DIR/all_failed.json" | ||
|
|
||
| # Initialize the merged logs as empty JSON lists (overwrites any existing files) | ||
| echo "[]" > "$merged_processed" | ||
| echo "[]" > "$merged_failed" | ||
|
|
||
| # Load jq (if not already available) | ||
| module load jq 2>/dev/null || true | ||
|
|
||
| for folder in "$OUTPUT_DIR"/*/; do | ||
| foldername=$(basename "$folder") | ||
|
|
||
| if [[ "$foldername" =~ ^[0-9]+$ ]]; then | ||
| echo "Merging from folder: $foldername" | ||
| if [[ -f "$folder/processing_log.json" ]]; then | ||
| tmpfile=$(mktemp) | ||
| jq -s 'add' "$merged_processed" "$folder/processing_log.json" > "$tmpfile" && mv "$tmpfile" "$merged_processed" | ||
| fi | ||
| if [[ -f "$folder/not_processed.json" ]]; then | ||
| tmpfile=$(mktemp) | ||
| jq -s 'add' "$merged_failed" "$folder/not_processed.json" > "$tmpfile" && mv "$tmpfile" "$merged_failed" | ||
| fi | ||
|
||
| fi | ||
| done | ||
|
|
||
| echo "Merged logs saved in: $MERGED_DIR" | ||
| echo "Now cleaning up numbered folders..." | ||
|
|
||
| # Delete only folders with numeric names (avoid final_bids-dataset) | ||
| for folder in "$OUTPUT_DIR"/*/; do | ||
| foldername=$(basename "$folder") | ||
| if [[ "$foldername" =~ ^[0-9]+$ ]]; then | ||
| echo "Deleting temporary folder: $foldername" | ||
| rm -rf "$folder" | ||
| else | ||
| echo "Skipping non-numeric folder: $foldername" | ||
| fi | ||
| done | ||
|
|
||
| echo "Cleanup complete at $(date)" | ||
|
|
||
| # --- Run final Python merge --- | ||
| echo "Running final Python merge and participant file creation..." | ||
| poetry run python -c "from sailsprep.BIDS_convertor import merge_subjects, create_participants_file; merge_subjects(); create_participants_file()" | ||
| echo "Final BIDS merge and participant file creation complete ✅" | ||
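
For readers less familiar with `jq -s 'add'`, the merge loop in this script can be sketched in Python as follows. This assumes each per-task log is a JSON list, which is what concatenation with `add` implies; `merge_logs` is a hypothetical name, and the temporary directory only stands in for `output_dir`.

```python
import json
import re
import tempfile
from pathlib import Path


def merge_logs(output_dir: Path, log_name: str) -> list:
    """Concatenate the JSON lists found in numerically named subfolders."""
    merged: list = []
    for folder in sorted(output_dir.iterdir()):
        # Same filter as the shell regex ^[0-9]+$: skip final_bids-dataset etc.
        if not (folder.is_dir() and re.fullmatch(r"[0-9]+", folder.name)):
            continue
        log_file = folder / log_name
        if log_file.is_file():
            merged.extend(json.loads(log_file.read_text()))
    return merged


# Small demonstration with made-up entries.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for name, entries in [("0", ["a.mp4"]), ("1", ["b.mp4", "c.mp4"])]:
        (root / name).mkdir()
        (root / name / "processing_log.json").write_text(json.dumps(entries))
    (root / "final_bids-dataset").mkdir()  # non-numeric folder, ignored
    merged = merge_logs(root, "processing_log.json")

print(merged)
```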
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,41 @@ | ||
| #!/bin/bash | ||
| #SBATCH --job-name=bids_processing | ||
| #SBATCH --partition=mit_normal | ||
| #SBATCH --array=0-18 | ||
| #SBATCH --output=logs/bids_%A_%a.out | ||
| #SBATCH --error=logs/bids_%A_%a.err | ||
| #SBATCH --mem=5G | ||
| #SBATCH --time=10:00:00 | ||
| #SBATCH --cpus-per-task=5 | ||
|
|
||
| mkdir -p logs | ||
|
|
||
| # --- Determine project root robustly --- | ||
| if [ -n "$SLURM_SUBMIT_DIR" ]; then | ||
| cd "$SLURM_SUBMIT_DIR" || { echo "❌ Cannot cd to SLURM_SUBMIT_DIR=$SLURM_SUBMIT_DIR"; exit 1; } | ||
| else | ||
| SCRIPT_DIR="$( cd -- "$( dirname -- "${BASH_SOURCE[0]}" )" &> /dev/null && pwd )" | ||
| cd "$SCRIPT_DIR/.." || { echo "❌ Cannot cd to project root"; exit 1; } | ||
| fi | ||
|
|
||
| echo "Running from project root: $(pwd)" | ||
| export PYTHONUNBUFFERED=1 | ||
|
|
||
| ffmpeg -version || echo "⚠️ FFmpeg not available" | ||
|
|
||
| # --- Poetry setup --- | ||
| if ! poetry env info --path &> /dev/null; then | ||
| echo "Creating Poetry environment..." | ||
| poetry install || { echo "❌ Poetry install failed"; exit 1; } | ||
| fi | ||
|
|
||
| ENV_PATH=$(poetry env info --path) | ||
| source "$ENV_PATH/bin/activate" || { echo "❌ Failed to activate Poetry environment"; exit 1; } | ||
|
|
||
| echo "Using Python from: $(which python)" | ||
| echo "Task ID: ${SLURM_ARRAY_TASK_ID}" | ||
| echo "Starting BIDS conversion at $(date)" | ||
|
|
||
| python -m sailsprep.BIDS_convertor "$SLURM_ARRAY_TASK_ID" "$SLURM_ARRAY_TASK_MAX" | ||
|
|
||
| echo "Finished at $(date)" |
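
The script passes `SLURM_ARRAY_TASK_ID` and `SLURM_ARRAY_TASK_MAX` to the Python module. How `sailsprep.BIDS_convertor` splits the work is not shown in this diff; the sketch below is one common approach, under the assumption that each task takes every n-th item. Note that `SLURM_ARRAY_TASK_MAX` is the highest index, so `--array=0-18` yields 19 tasks. The `shard` helper is hypothetical.

```python
def shard(items: list, task_id: int, task_max: int) -> list:
    """Return the slice of `items` handled by one array task.

    task_max is the highest array index (SLURM_ARRAY_TASK_MAX), so a
    zero-based array has task_max + 1 tasks.
    """
    n_tasks = task_max + 1
    return items[task_id::n_tasks]


videos = list(range(40))  # stand-in for a list of video paths
first = shard(videos, 0, 18)
last = shard(videos, 18, 18)
print(first)
print(last)
```

With this scheme every item is assigned to exactly one task, and task sizes differ by at most one.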
Empty file.