Example datasets for bep036 #465
This is great, thanks! I'm guessing you left it in draft state because of pheno001 and pheno002, right?
I think we should remove the age_at_visit column/field from all phenotype/measurement tools and instead provide a root-level sessions file with that field. Should we maybe take that a step further and RECOMMEND or say it's OPTIONAL to add age to the sessions file?
I like that idea. It's redundant information that can be aggregated to sessions level, and can be a recommendation in the BEP.
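The aggregation discussed here could look roughly like the sketch below. It is only an illustration under stated assumptions: the input column names `participant_id`, `session_id`, and `age_at_visit`, the output column `age`, the column order, and the helpers `build_sessions_rows` and `to_tsv` are all hypothetical and not prescribed by the BEP or this PR.

```python
import csv
import io


def build_sessions_rows(tool_rows):
    """Collapse per-tool rows into one row per (participant, session) carrying age."""
    sessions = {}
    for row in tool_rows:
        key = (row["participant_id"], row["session_id"])
        age = row["age_at_visit"]
        # Assumption: every tool reports the same age within one session.
        # Raise on disagreement instead of silently picking one value.
        if key in sessions and sessions[key] != age:
            raise ValueError(f"conflicting ages for {key}")
        sessions[key] = age
    return [
        {"participant_id": p, "session_id": s, "age": a}
        for (p, s), a in sorted(sessions.items())
    ]


def to_tsv(rows):
    """Serialize the aggregated rows as tab-separated text."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf,
        fieldnames=["participant_id", "session_id", "age"],
        delimiter="\t",
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# Made-up rows, as if read from two different phenotype tool files.
tool_rows = [
    {"participant_id": "sub-01", "session_id": "ses-01", "age_at_visit": "30"},
    {"participant_id": "sub-01", "session_id": "ses-01", "age_at_visit": "30"},
    {"participant_id": "sub-01", "session_id": "ses-02", "age_at_visit": "31"},
]
print(to_tsv(build_sessions_rows(tool_rows)))
```

The duplicate `ses-01` rows collapse into one, which is the redundancy the comment above refers to.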
It's in Draft state because I haven't prepared |
Co-authored-by: Eric Earl <eric.earl@nih.gov>
Got a question from @dominikwelke -- Could this PR include an example showing how to represent multiple runs from one participant-session? @ericearl mentioned today this is easily done by adding a |
- All participants.tsv files have been simplified.
- pheno004 has become instead an example of some imaging-only, some phenotype-only, and some with both data
I hijacked the not yet created |
Please set the
Please also add pheno004 to be skipped on legacy and stable: bids-examples/.github/workflows/validate_datasets.yml Lines 98 to 101 in e52f77f
bids-examples/.github/workflows/validate_datasets.yml Lines 103 to 106 in e52f77f
@effigies Is that comment just above here a note for me? I'm confused by most of it and don't feel safe editing those files as-is. If you need me to take care of that, can I sit with you, Ross, or Nell to figure it out or have it explained to me enough to be able to do the work?
Okay, I did what I asked. It looks like there are issues in the schema that need to be addressed, but also there are unrelated issues in pheno001-003: https://github.com/bids-standard/bids-examples/actions/runs/13188395001/job/36815880378?pr=465
This is super-helpful @effigies, thank you! I'm bringing the errors out of the logs here for us (@Arshitha @SamGuay @surchs):
@ericearl I fixed some of the bids validation errors but I'm not sure how to fix the following:
~/Desktop/Projects/bep036/bids-examples -> master
(datasci) Thu Apr 3 18:44:02 2025 ❯ bids-validator-deno pheno002 --ignoreWarnings
[ERROR] TSV_COLUMN_ORDER_INCORRECT Some TSV columns are in the incorrect order
session_id
/sessions.tsv - Column 0 (starting from 0) found at index 1.
Please visit https://neurostars.org/search?q=TSV_COLUMN_ORDER_INCORRECT for existing conversations about this issue.
[ERROR] TSV_INDEX_VALUE_NOT_UNIQUE An index column(s) was specified for the tsv file and not all of the values for it are unique.
/sessions.tsv - Row: 4, Value: 01
/sessions.tsv - Row: 5, Value: 02
Please visit https://neurostars.org/search?q=TSV_INDEX_VALUE_NOT_UNIQUE for existing conversations about this issue.
Summary: 18 Files, 41.2 MB; 2 - Subjects 2 - Sessions
Available Tasks:
Available Modalities: MRI
If you have any questions, please post on https://neurostars.org/tags/bids.
I checked the TSV files and they are valid TSV files with no apparent issues in the "column order", which is one of the errors. @effigies could this be related to the validator issues you mentioned earlier?
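One way to see what the two errors describe is to run the same kind of checks by hand. The sketch below is an illustration, not the validator's actual logic: it assumes `session_id` is expected as the first column and is treated as the index column, the helper `check_sessions_tsv` is hypothetical, and the sample rows are made up rather than copied from pheno002.

```python
import csv
import io


def check_sessions_tsv(text, index_column="session_id"):
    """Report a misplaced index column and repeated index values."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    header = next(reader)
    problems = []
    position = header.index(index_column)
    if position != 0:
        problems.append(
            f"{index_column} expected at column 0, found at index {position}"
        )
    seen = set()
    for row in reader:
        value = row[position]
        if value in seen:
            problems.append(f"duplicate {index_column} value: {value}")
        seen.add(value)
    return problems


# Made-up root-level sessions file: participant_id first, and the same
# session labels repeated for each participant.
sample = (
    "participant_id\tsession_id\n"
    "sub-01\t01\n"
    "sub-01\t02\n"
    "sub-02\t01\n"
    "sub-02\t02\n"
)
problems = check_sessions_tsv(sample)
for problem in problems:
    print(problem)
```

If the real file is laid out like this sample, it can be a well-formed TSV and still trip both checks, because session labels repeat across participants in a root-level file. Whether that is what the validator is doing here is a guess.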
@ericearl - For |
"Description": "Age of the participant.",
"Units": "years"
},
"sex": {
I'm conflicted on whether sex at birth should go in the participants.tsv or the demographics.tsv. I know OpenNeuro crawls participants.tsv files better right now to improve its search functionality, which may be justification enough to move this. Anyway, this is a point of discussion and I am open to other ideas.
@ericearl Based on our slack discussion, I think we can leave the demographics info in the demographics file with participants.tsv being a list of unique participant IDs?
The NIfTIs should be 0 Byte empty files.
@ericearl The validator was complaining about this, so I replaced the empty files with actual NIfTIs (from OpenNeuro) in the other examples and this new example.
The NIfTIs should be 0 Byte empty files.
See the above comment.
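For the 0-byte placeholder convention requested in the review comments above, a minimal sketch of how the files could be emptied in bulk follows. The helper name `empty_nifti_files` and the example path are hypothetical. It overwrites files in place, so it should only be run on a copy or a clean checkout, and it does not address the validator complaint mentioned in the reply.

```python
from pathlib import Path


def empty_nifti_files(dataset_root):
    """Truncate every .nii / .nii.gz file under dataset_root to 0 bytes."""
    touched = []
    for path in sorted(Path(dataset_root).rglob("*.nii*")):
        # Only touch real NIfTI files; sidecar JSON and TSV files are left alone.
        if path.is_file() and path.name.endswith((".nii", ".nii.gz")):
            path.write_bytes(b"")
            touched.append(path)
    return touched


# Example call (hypothetical path):
# empty_nifti_files("pheno004")
```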
@christinerogers |
Added pheno001 and pheno002 example datasets inspired by ds004215 on OpenNeuro but significantly modified to keep them simple and easy to convey the various use cases proposed in BEP036.
Use cases covered (and to be added to this PR):
- pheno001 - Single session with both phenotype and imaging data
- pheno002 - Two sessions with one imaging data only session
- pheno003 - Two sessions with one phenotype data only session
- pheno004 - Two sets of sessions. One set of sessions (e.g. screening, baseline, followup, etc) for phenotype data and another set of sessions (e.g. 01, 02, etc) for imaging data.
Still in draft state but would appreciate any and all feedback.
Pinging co-contributors: @ericearl @SamGuay @surchs
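The use cases listed in this description differ in which sessions carry imaging data, phenotype data, or both. A small sketch of that distinction follows. The function `classify_sessions` and the participant and session labels are hypothetical and only mirror the pheno002 description (two sessions, one of them imaging only).

```python
def classify_sessions(imaging, phenotype):
    """Label each (participant, session) pair by the kinds of data it has."""
    labels = {}
    for key in sorted(imaging | phenotype):
        if key in imaging and key in phenotype:
            labels[key] = "both"
        elif key in imaging:
            labels[key] = "imaging-only"
        else:
            labels[key] = "phenotype-only"
    return labels


# Made-up labels shaped like the pheno002 use case.
imaging = {("sub-01", "ses-01"), ("sub-01", "ses-02")}
phenotype = {("sub-01", "ses-01")}
labels = classify_sessions(imaging, phenotype)
for key, label in labels.items():
    print(key, label)
```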