DataSources.md

File metadata and controls

41 lines (19 loc) · 1.55 KB

MSCOCO

  1. Download and extract the COCO 2014 Train images and 2014 Val images from here.

  2. Download the Karpathy split for COCO from here.

  3. Run notebooks/preprocess_mscoco.ipynb, updating paths at the top of the notebook.

  4. Update the PATHS variable at the top of libs/datasets/utils.py.
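Before running the preprocessing notebook, it can help to check that the downloaded split file looks as expected. The sketch below is not taken from the repository; it assumes the commonly distributed Karpathy JSON layout, in which a top-level `images` list holds entries with `split` and `filename` keys. The function names `load_karpathy` and `group_by_split` are hypothetical.

```python
import json
from collections import defaultdict
from pathlib import Path


def load_karpathy(path):
    """Read a Karpathy split JSON file from disk and return the parsed dict."""
    with Path(path).open() as f:
        return json.load(f)


def group_by_split(karpathy):
    """Map each split name to the list of image filenames assigned to it.

    Assumes each entry in karpathy["images"] has "split" and "filename" keys.
    """
    groups = defaultdict(list)
    for image in karpathy["images"]:
        groups[image["split"]].append(image["filename"])
    return dict(groups)
```

Calling `group_by_split(load_karpathy(...))` on the downloaded file and printing the length of each list gives a quick view of how many images fall into each split.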

Flickr30k

  1. Download the Flickr30k images from here.

  2. Download the Karpathy split for Flickr30k from here.

  3. Run notebooks/preprocess_flickr30k.ipynb, updating paths at the top of the notebook.

  4. Update the PATHS variable at the top of libs/datasets/utils.py.

MMIMDB

  1. Download the MMIMDB dataset from here.

  2. Run notebooks/preprocess_mmimdb.ipynb, updating paths at the top of the notebook.

  3. Update the PATHS variable at the top of libs/datasets/utils.py.

MIMIC-CXR

  1. Obtain access to the MIMIC-CXR-JPG Database on PhysioNet and download the dataset.

  2. Download and unzip the mimic-cxr-reports.zip file from this repository.

  3. Run notebooks/preprocess_mimiccxr.ipynb, updating paths at the top of the notebook.

  4. Update the PATHS variable at the top of libs/datasets/utils.py.
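Each dataset section ends by updating the PATHS variable in libs/datasets/utils.py. The actual structure of that variable is not shown here, so the sketch below assumes, purely for illustration, that it is a dictionary mapping a dataset name to a directory. The keys, the placeholder directories, and the helper `missing_paths` are all hypothetical. It shows one way to catch a mistyped path before training starts.

```python
from pathlib import Path

# Hypothetical layout: the real PATHS variable in libs/datasets/utils.py
# may use different keys or a different structure.
PATHS = {
    "mscoco": "/path/to/mscoco",
    "flickr30k": "/path/to/flickr30k",
    "mmimdb": "/path/to/mmimdb",
    "mimiccxr": "/path/to/mimiccxr",
}


def missing_paths(paths):
    """Return the entries of `paths` whose location does not exist on disk."""
    return {name: p for name, p in paths.items() if not Path(p).exists()}


if __name__ == "__main__":
    for name, p in missing_paths(PATHS).items():
        print(f"{name}: {p} does not exist")
```

Running the script after editing the paths prints one line per entry that still points at a non-existent location.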