Open
Description
Hello. As pointed out previously in another issue, most of the datasets for the classification tasks appear to be already scaled, with the exception of GS-LGG and GS-GBM. This can cause problems in machine learning applications: the scaling parameters should be fitted on the training data only, not on the entire dataset, to prevent data leakage. I believe this is also why I could not reproduce the scores presented in your paper; I get extremely high scores even with Logistic Regression.

In addition, the Readme has no description for the GBM cancer, and there are some inconsistencies between the baseline models presented in the paper and what is available in the repo.
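To make the leakage concern concrete, here is a minimal sketch of what I mean by scaling on the training data only. It uses only the Python standard library and hypothetical toy data; the helper names `fit_scaler` and `apply_scaler`, the 80/20 split, and the random features are my own assumptions for illustration and are not taken from the repo or the paper.

```python
import random
import statistics


def fit_scaler(rows):
    """Compute per-feature mean and standard deviation from the given rows only."""
    columns = list(zip(*rows))
    means = [statistics.fmean(col) for col in columns]
    # Fall back to 1.0 for constant features to avoid division by zero.
    stds = [statistics.pstdev(col) or 1.0 for col in columns]
    return means, stds


def apply_scaler(rows, means, stds):
    """Standardize rows using statistics that were computed elsewhere."""
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]


# Hypothetical toy data: 100 samples with 3 features each.
rng = random.Random(0)
X = [[rng.gauss(5.0, 2.0) for _ in range(3)] for _ in range(100)]

# Split first, so the test samples never influence the scaling statistics.
indices = list(range(len(X)))
rng.shuffle(indices)
train_idx, test_idx = indices[:80], indices[80:]
X_train = [X[i] for i in train_idx]
X_test = [X[i] for i in test_idx]

# Fit on the training split only, then reuse the same statistics for the test split.
means, stds = fit_scaler(X_train)
X_train_scaled = apply_scaler(X_train, means, stds)
X_test_scaled = apply_scaler(X_test, means, stds)
```

If the datasets were distributed unscaled, users could apply this kind of split-then-scale procedure themselves. In scikit-learn, the usual way to get the same behaviour is to put `StandardScaler` inside a `Pipeline`, so the scaler is refitted on the training portion of each fold.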