DialGenModel

The official PyTorch implementation of the models used in the DialGen paper. Please refer to our paper for details.

DIALGEN: Collaborative Human-LM Generated Dialogues for Improved Understanding of Human-Human Conversations.

Bo-Ru Lu¹*, Nikita Haduong¹*, Chia-Hsuan Lee¹, Zeqiu Wu¹, Hao Cheng², Paul Koester³, Jean Utke³, Tao Yu⁴, Noah A. Smith¹,⁵ and Mari Ostendorf¹. *Equal Contribution

¹University of Washington ²Microsoft Research ³Allstate ⁴University of Hong Kong ⁵Allen Institute for AI

[project] [data] [model] [paper] [interface]

This code has been written using PyTorch >= 1.13 and HuggingFace Transformers >= 4.27.3. If you use the source code in this repository in your work, please cite the following paper:

@misc{lu2023dialgen,
      title={DIALGEN: Collaborative Human-LM Generated Dialogues for Improved Understanding of Human-Human Conversations},
      author={Bo-Ru Lu and Nikita Haduong and Chia-Hsuan Lee and Zeqiu Wu and Hao Cheng and Paul Koester and Jean Utke and Tao Yu and Noah A. Smith and Mari Ostendorf},
      year={2023},
      eprint={2307.07047},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

Experiment Results

| Method | $CB_{avg}$ | $CB_1$ | $CB_2$ | $CB_3$ | $CB_4$ | $TLB$ |
| --- | --- | --- | --- | --- | --- | --- |
| IC-DST | 71.3 | 71.9 | 68.5 | 68.4 | 68.2 | 68.1 |
| T5 | 76.8 | 78.4 | 74.9 | 73.7 | 74.1 | 73.9 |
| T5-SC | 78.2 | 79.3 | 76.4 | 76.6 | 76.9 | 74.2 |
| T5-SC $\S$ | 78.5 | 78.7 | 76.2 | 76.0 | 76.2 | 75.0 |

The released data comes with name substitution. All reported values are medians obtained from 5 different random seeds. $\S$: the T5-SC model on data with name substitution.

Environment Setup
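
  • Install dependencies, e.g. pip install "torch>=1.13" "transformers>=4.27.3" (an assumption based on the version requirements above; check the repository for the exact pinned requirements).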

  • Download T5 and LongT5 models.

    # T5-base
    python src/download_model.py --model_name t5-base --output_dir ./pretrained_models
    # Long T5
    python src/download_model.py --model_name google/long-t5-tglobal-base --output_dir ./pretrained_models
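
For reference, a minimal sketch of what a download script like src/download_model.py might do, assuming the HuggingFace transformers API (an illustration, not the repository's actual code):

    # Hypothetical sketch; the real logic lives in src/download_model.py.
    import argparse
    import os

    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    def main():
        parser = argparse.ArgumentParser()
        parser.add_argument("--model_name", required=True)
        parser.add_argument("--output_dir", required=True)
        args = parser.parse_args()

        # Save under e.g. ./pretrained_models/t5-base or
        # ./pretrained_models/google/long-t5-tglobal-base.
        save_dir = os.path.join(args.output_dir, args.model_name)
        AutoTokenizer.from_pretrained(args.model_name).save_pretrained(save_dir)
        AutoModelForSeq2SeqLM.from_pretrained(args.model_name).save_pretrained(save_dir)

    if __name__ == "__main__":
        main()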

Data and Experiment Setup

Raw data is here: dialgen_data/v1.0.

  • T5 TLB

    bash scripts/setup.sh tlb t5 ./dialgen_data/v1.0/tlb
  • LongT5 DST

    bash scripts/setup.sh dst longt5 ./dialgen_data/v1.0/dst
  • T5 SC. We use the most recent 18 turns to create the previous state (see the sketch after this list).

    bash scripts/setup.sh sc t5 ./dialgen_data/v1.0/state_change/18_turns
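
The 18-turn window can be pictured as follows. This is a minimal sketch, assuming each turn carries (slot, value) annotations; the real preprocessing and data format come from scripts/setup.sh and dialgen_data/v1.0, not this snippet:

    # Hypothetical illustration of building a previous state from the most
    # recent turns; not the repository's actual preprocessing code.
    def build_previous_state(turns, window=18):
        """Accumulate (slot, value) pairs over the last `window` turns to
        respect the encoder's input-length limit."""
        state = {}
        for turn in turns[-window:]:
            for slot, value in turn.get("labels", []):
                state[slot] = value  # later turns overwrite earlier values
        return state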

Run the scripts for training and evaluation.

  • T5 TLB

    # We use the test file in longt5-dst as the reference.
    # The test file is a softlink that points to the original file.
    bash scripts/run_t5-tlb.sh ./data/t5-tlb 42 ./pretrained_models/t5-base ./data/longt5-dst
  • LongT5 DST

    bash scripts/run_longt5-dst.sh ./data/longt5-dst 42 ./pretrained_models/google/long-t5-tglobal-base ./data/longt5-dst
  • T5 SC

    # Because of the limit on the number of input tokens, we use the last
    # 18 turns to create the previous state.
    # Training takes about 2 hours.
    bash scripts/run_t5-sc.sh ./data/t5-sc 42 ./pretrained_models/t5-base ./data/t5-tlb ./data/longt5-dst 18
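
Since T5-SC predicts state changes rather than full states (the setup above uses the state_change data), evaluation involves applying predicted changes to a running state. Below is a minimal sketch of one plausible update rule, assuming changes are expressed as add/remove operations over (slot, value) pairs; the actual operations are defined in the paper and the repository's scripts:

    # Hypothetical update rule; the real state-change semantics are
    # implemented by the repository's scripts, not this sketch.
    def apply_state_changes(state, changes):
        """`state` maps slot -> set of values; `changes` is a list of
        (op, slot, value) tuples with op in {"add", "remove"} (assumed)."""
        for op, slot, value in changes:
            if op == "add":
                state.setdefault(slot, set()).add(value)
            elif op == "remove":
                state.get(slot, set()).discard(value)
        return state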
