The official PyTorch implementation of the models used in the DialGen paper. Please refer to our paper for details.
DIALGEN: Collaborative Human-LM Generated Dialogues for Improved Understanding of Human-Human Conversations.
Bo-Ru Lu1*, Nikita Haduong1*, Chia-Hsuan Lee1, Zeqiu Wu1, Hao Cheng2, Paul Koester3, Jean Utke3, Tao Yu4, Noah A. Smith1,5 and Mari Ostendorf1. *Equal Contribution
1University of Washington 2Microsoft Research 3Allstate 4University of Hong Kong 5Allen Institute for AI
[project] [data] [model] [paper] [interface]
This code is written with PyTorch >= 1.13 and HuggingFace Transformers >= 4.27.3. If you use the source code in this repository in your work, please cite the following paper:
@misc{lu2023dialgen,
title={DIALGEN: Collaborative Human-LM Generated Dialogues for Improved Understanding of Human-Human Conversations},
author={Bo-Ru Lu and Nikita Haduong and Chia-Hsuan Lee and Zeqiu Wu and Hao Cheng and Paul Koester and Jean Utke and Tao Yu and Noah A. Smith and Mari Ostendorf},
year={2023},
eprint={2307.07047},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
| Method | | | | | | |
|---|---|---|---|---|---|---|
| IC-DST | 71.3 | 71.9 | 68.5 | 68.4 | 68.2 | 68.1 |
| T5 | 76.8 | 78.4 | 74.9 | 73.7 | 74.1 | 73.9 |
| T5-SC | 78.2 | 79.3 | 76.4 | 76.6 | 76.9 | 74.2 |
| T5-SC | 78.5 | 78.7 | 76.2 | 76.0 | 76.2 | 75.0 |
The released data uses name substitution. All reported values are medians over 5 different random seeds.
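
As a quick illustration of that aggregation (the numbers below are hypothetical, not values from the table), the median is computed per metric across the 5 seeds:

```python
# Illustrative only: aggregate one metric across 5 random seeds.
from statistics import median

seed_scores = [76.1, 76.4, 76.9, 77.0, 76.6]  # hypothetical per-seed scores
print(median(seed_scores))  # 76.6
```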
- Download T5 and LongT5 models.

  ```bash
  # T5-base
  python src/download_model.py --model_name t5-base --output_dir ./pretrained_models

  # Long T5
  python src/download_model.py --model_name google/long-t5-tglobal-base --output_dir ./pretrained_models
  ```
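
For context, `src/download_model.py` is expected to just cache a HuggingFace checkpoint locally. A minimal sketch under that assumption (the actual script may differ):

```python
# Hypothetical sketch of src/download_model.py: download a HuggingFace
# checkpoint and save it under <output_dir>/<model_name>.
import argparse
import os

from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

parser = argparse.ArgumentParser()
parser.add_argument("--model_name", required=True)            # e.g. t5-base
parser.add_argument("--output_dir", default="./pretrained_models")
args = parser.parse_args()

save_dir = os.path.join(args.output_dir, args.model_name)
os.makedirs(save_dir, exist_ok=True)

# Fetch (or reuse the local HF cache) and write tokenizer + model files locally.
AutoTokenizer.from_pretrained(args.model_name).save_pretrained(save_dir)
AutoModelForSeq2SeqLM.from_pretrained(args.model_name).save_pretrained(save_dir)
```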
Raw data is here: dialgen_data/v1.0.
- T5 TLB

  ```bash
  bash scripts/setup.sh tlb t5 ./dialgen_data/v1.0/tlb
  ```
- LongT5 DST

  ```bash
  bash scripts/setup.sh dst longt5 ./dialgen_data/v1.0/dst
  ```
- T5 SC. We use the most recent 18 turns to create the previous state here (see the sketch after this step).

  ```bash
  bash scripts/setup.sh sc t5 ./dialgen_data/v1.0/state_change/18_turns
  ```
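
As a rough illustration of that 18-turn window (not the repo's actual preprocessing, which is handled by `scripts/setup.sh`), the truncation amounts to keeping only the most recent turns when building the previous-state context:

```python
# Hypothetical illustration of the 18-turn window for the T5-SC previous state.
from typing import List

MAX_TURNS = 18  # matches the 18_turns data directory above

def previous_state_context(turns: List[str], max_turns: int = MAX_TURNS) -> str:
    """Join only the most recent `max_turns` dialogue turns."""
    return " ".join(turns[-max_turns:])

# Example: a 25-turn dialogue keeps turns 8-25 only.
dialogue = [f"turn {i}" for i in range(1, 26)]
print(previous_state_context(dialogue))
```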
- T5 TLB

  ```bash
  # We use the test file in longt5-dst as reference.
  # The test file is a softlink; it points to the original file.
  bash scripts/run_t5-tlb.sh ./data/t5-tlb 42 ./pretrained_models/t5-base ./data/longt5-dst
  ```
- LongT5 DST

  ```bash
  bash scripts/run_t5-tlb.sh ./data/longt5-dst 42 ./pretrained_models/google/long-t5-tglobal-base ./data/longt5-dst
  ```
- T5 SC

  ```bash
  # Limited by the max number of input tokens, we use the last 18 turns to create
  # the previous state.
  # Training time is about 2 hours.
  bash scripts/run_t5-sc.sh ./data/t5-sc 42 ./pretrained_models/t5-base ./data/t5-tlb ./data/longt5-dst 18
  ```
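
After a run finishes, the fine-tuned checkpoint can be loaded like any HuggingFace seq2seq model. A minimal inference sketch, assuming the run scripts write a standard HuggingFace checkpoint directory (the path below is a placeholder) and that inputs are serialized the same way as the training data produced by `scripts/setup.sh`:

```python
# Hypothetical inference sketch for a fine-tuned T5/LongT5 checkpoint.
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

CKPT_DIR = "./path/to/finetuned_checkpoint"  # placeholder: the run script's output dir

tokenizer = AutoTokenizer.from_pretrained(CKPT_DIR)
model = AutoModelForSeq2SeqLM.from_pretrained(CKPT_DIR)
model.eval()

# Input must follow the same serialization as the prepared training data
# (dialogue turns, plus the previous state for the SC variant).
input_text = "agent: Was anyone injured? user: No, everyone is fine."
inputs = tokenizer(input_text, return_tensors="pt", truncation=True)

with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=128)

print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```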