Inquiry regarding data splits and evaluation

Hello,

Thank you for your impressive work on BiomedParse. I am currently working on text-guided image segmentation and would like to include your model as a baseline for comparison.

However, while looking into the data configuration to ensure a fair comparison, I noticed a couple of things regarding the dataset splits for the pre-trained models (both v1 and v2):

1. Random vs. Official Splits: It appears that the models were trained using a random 80/20 split rather than the official splits provided by the original dataset benchmarks.

2. Potential Leakage in ACDC (v2): For the ACDC dataset in BiomedParse v2, the random split appears to be performed at the image/volume level rather than the patient level. Since each patient in ACDC has two 3D MRI scans (End-Diastolic and End-Systolic), the current random split can place the same patient's data in both the training and testing sets (e.g., ED in train, ES in test).
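To make the second point concrete, here is a minimal sketch of what I mean by a patient-level split. It is not taken from the BiomedParse code. The function name `patient_level_split`, the scan IDs of the form `patient001_ED`, and the 80/20 ratio applied to patients are all assumptions for illustration only.

```python
import random
from collections import defaultdict


def patient_level_split(scan_ids, patient_of, test_fraction=0.2, seed=0):
    """Split scans so that all scans of one patient land on the same side.

    scan_ids: iterable of scan identifiers (hypothetical naming scheme).
    patient_of: function mapping a scan identifier to its patient identifier.
    """
    # Group scans by patient so the split is drawn over patients, not scans.
    groups = defaultdict(list)
    for scan in scan_ids:
        groups[patient_of(scan)].append(scan)

    # Shuffle patients deterministically for a reproducible split.
    patients = sorted(groups)
    random.Random(seed).shuffle(patients)

    n_test = max(1, round(len(patients) * test_fraction))
    test_patients = set(patients[:n_test])

    train = [s for p in patients if p not in test_patients for s in groups[p]]
    test = [s for p in patients if p in test_patients for s in groups[p]]
    return train, test


# Hypothetical example: 10 patients, each with an ED and an ES volume.
scans = [f"patient{i:03d}_{phase}" for i in range(1, 11) for phase in ("ED", "ES")]
train, test = patient_level_split(scans, lambda s: s.split("_")[0])

# No patient should appear on both sides of the split.
train_patients = {s.split("_")[0] for s in train}
test_patients = {s.split("_")[0] for s in test}
print(sorted(train_patients & test_patients))
```

With a split like this, the ED and ES volumes of a given patient always stay together, which removes the train/test overlap described above.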

This overlap and the non-standard splits make it quite challenging to benchmark fairly against other methods that follow the official protocols.

Do you have any suggestions on how to best compare our work with yours under these circumstances? Alternatively, if you have any checkpoints trained on the official splits (or patient-level splits for ACDC), that would be incredibly helpful.

Inquiry regarding data splits and evaluation #112

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Inquiry regarding data splits and evaluation #112

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions