Skip to content

Repository for RadRevise, a dataset for instruction-based radiology report editing

License

Notifications You must be signed in to change notification settings

rajpurkarlab/RadRevise

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

7 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

RadRevise: A Benchmark Dataset for Instruction-Based Radiology Report Editing

Repository referenced in the paper with the same title, providing code for generating the dataset and using the dataset to benchmark existing models.

Table of Contents

Requirements

The RadRevise dataset will be available on PhysioNet through an open credential process.

Usage

Instruction data generation

Using GPT-4 to generate instructions and modified reports based on specified types of instructions and clinical topics. Note that the results will differ from RadRevise both due to GPT generated responses and the additional human review and annotation process that RadRevise has undergone.

cd generation
python generate.py

Model evaluation

The code can be used directly to evaluate any text-generation models hosted on Hugging Face.

  1. Download the RadRevise dataset.
  2. Navigate to the evaluation directory.
  3. Run the following command to evaluate a single model:
python eval_model $MODEL_ID [$DATA_PATH] [$BATCH_SIZE] [$OUTPUT_FILE]
  • $MODEL_ID: the Hugging Face model id
  • $DATA_PATH: path to RadRevise dataset (default: ../data/RadRevise_v0.csv)
  • $BATCH_SIZE: the inference batch size (default: 32)
  • $OUTPUT_FILE: the name of the evaluation output (default: output/result.csv)
  1. Alternatively, modify and execute the run.sh script to evaluate one or more models.

License

This repository is made publicly available under the MIT License.

About

Repository for RadRevise, a dataset for instruction-based radiology report editing

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published