BoK: Introducing Bag-of-Keywords Loss for Interpretable Dialogue Response Generation

Official code repository for the paper "BoK: Introducing Bag-of-Keywords Loss for Interpretable Dialogue Response Generation".

Install dependencies

Python 3.11 or later.

❱❱❱ pip install -r requirements.txt

Set up Perl. Download meteor-1.5.tar.gz and unzip it inside the 3rdparty directory.
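The prerequisites above can be checked before training or evaluation. The following is a minimal sketch, not part of the repository: the function name check_environment is hypothetical, and it assumes the archive unpacks to 3rdparty/meteor-1.5, which may differ on your machine.

```python
import shutil
import sys
from pathlib import Path


def check_environment(repo_root=".", min_python=(3, 11)):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if sys.version_info < min_python:
        problems.append(f"Python {min_python[0]}.{min_python[1]} or later is required")
    if shutil.which("perl") is None:
        problems.append("perl was not found on PATH")
    # Assumption: meteor-1.5.tar.gz unpacks to 3rdparty/meteor-1.5.
    meteor_dir = Path(repo_root) / "3rdparty" / "meteor-1.5"
    if not meteor_dir.is_dir():
        problems.append(f"METEOR directory not found: {meteor_dir}")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("WARNING:", problem)
```

Run it from the repository root; no output means the three checks passed.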

Download Datasets

Download the following datasets.

DailyDialog: Download link http://yanran.li/files/ijcnlp_dailydialog.zip
PersonaChat: Download data using ParlAI (https://parl.ai/docs/tasks.html#persona-chat)

Set the dataset paths correctly in the following files: DialoGPT/create_data.py and T5/create_data.py
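To sanity-check a DailyDialog path before running create_data.py, it can help to parse a few lines by hand. The sketch below relies only on the published DailyDialog text format, in which each line is one dialogue and utterances are separated by __eou__. The helper names are hypothetical, and the way create_data.py actually builds its examples may differ.

```python
EOU = "__eou__"  # utterance separator used in the DailyDialog text files


def split_dialogue(line):
    """Split one line of a DailyDialog text file into a list of utterances."""
    return [u.strip() for u in line.split(EOU) if u.strip()]


def context_response_pairs(utterances, max_context=None):
    """Yield (context, response) pairs: every utterance after the first
    is treated as a response to the utterances that precede it."""
    for i in range(1, len(utterances)):
        context = utterances[:i]
        if max_context is not None:
            context = context[-max_context:]  # keep only the most recent turns
        yield context, utterances[i]


# Hypothetical example line in the DailyDialog format.
line = "Hello . __eou__ Hi , how are you ? __eou__ Fine , thanks . __eou__"
utterances = split_dialogue(line)
pairs = list(context_response_pairs(utterances))
```

Here pairs holds two examples: the first utterance as context for the second, and the first two utterances as context for the third.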

Train

Train GPT2 and T5 by running the train.py script provided in the respective directories. In the commands below, -dt=dd/pc means passing either dd (DailyDialog) or pc (PersonaChat).

With BoK loss

❱❱❱ python train.py -path=<model_dir> -src_file=train.py -dt=dd/pc -key

With BoW loss

❱❱❱ python train.py -path=<model_dir> -src_file=train.py -dt=dd/pc -key -all

Basic Model

❱❱❱ python train.py -path=<model_dir> -src_file=train.py -dt=dd/pc
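The three training modes differ in the auxiliary target added to the usual language-modeling loss. The sketch below illustrates the general idea under stated assumptions: a bag-style loss is taken to be the mean negative log-likelihood of a set of target tokens under a single, order-free distribution, with BoK using only the keywords of the next response and BoW using all of its tokens. The function name bag_loss, the toy values, and the mean normalization are illustrative; the loss in train.py may be weighted, summed, or computed differently.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def bag_loss(logits, target_ids):
    """Mean negative log-likelihood of target_ids under one order-free distribution."""
    probs = softmax(logits)
    return -sum(math.log(probs[t]) for t in target_ids) / len(target_ids)


# Toy vocabulary of 5 tokens; the logits stand in for a projection of a
# single hidden state (hypothetical values).
logits = [2.0, 0.5, 0.1, 1.5, -1.0]
response_ids = [0, 1, 3, 2]  # all tokens of the next response (BoW-style target)
keyword_ids = [0, 3]         # keywords only (BoK-style target)

bow = bag_loss(logits, response_ids)
bok = bag_loss(logits, keyword_ids)
```

In this toy case the keyword-only target yields the lower loss, because the distribution already places most of its mass on the two keywords.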

Inference

Generate dialogues for DailyDialog and PersonaChat test data.

For the base model (without BoK/BoW loss)

❱❱❱ python generate.py -path=<model_dir> -dt=dd/pc

For models using BoK/BoW loss (response generation only)

❱❱❱ python generate.py -path=<model_dir> -dt=dd/pc -key

For models using BoK/BoW loss (response generation + top-k token prediction)

❱❱❱ python generate_predict.py -path=<model_dir> -dt=dd/pc -key
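Top-k token prediction can be pictured as reading off the k most probable tokens from the distribution that the BoK/BoW head produces. The following is a minimal sketch with a hypothetical vocabulary and scores; the function name top_k_tokens is illustrative, and generate_predict.py may select and report tokens differently.

```python
import heapq
import math


def top_k_tokens(logits, vocab, k=3):
    """Return the k highest-probability (token, probability) pairs, highest first."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Indices of the k largest probabilities, in descending order.
    best = heapq.nlargest(k, range(len(probs)), key=probs.__getitem__)
    return [(vocab[i], probs[i]) for i in best]


vocab = ["the", "movie", "tonight", "yes", "ticket"]  # hypothetical vocabulary
logits = [0.2, 2.1, 1.7, -0.5, 1.9]                   # hypothetical scores
predicted = top_k_tokens(logits, vocab, k=3)
```

The predicted tokens can then be compared with the generated response to see which keywords the model expected to produce.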

Evaluation

Word-overlapping based metrics (BLEU, NIST, METEOR, Diversity, Entropy)

Postprocess the generated and reference files. This step is required only for the DailyDialog dataset. Download the multi-reference test data for DailyDialog from this link.

❱❱❱ python post_process_dailydialog.py -in=<file_name>

Compute the metrics.

❱❱❱ python compute_metrics.py -in=<result_path> -hyp=<hyp_file>

Note: <result_path> is the directory that contains the <hyp_file> and the <ref_file>.
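For intuition about two of the metrics listed above, the sketch below follows the commonly used definitions: Diversity (distinct-n) is the number of unique n-grams divided by the total number of n-grams across all generated responses, and Entropy is the Shannon entropy of the n-gram frequency distribution. It assumes whitespace tokenization and natural logarithms; compute_metrics.py may tokenize or normalize differently, so its numbers need not match.

```python
import math
from collections import Counter


def ngram_counts(responses, n):
    """Count n-grams over a list of whitespace-tokenized responses."""
    counts = Counter()
    for response in responses:
        tokens = response.split()
        counts.update(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return counts


def distinct_n(responses, n):
    """Unique n-grams divided by total n-grams (0.0 if there are none)."""
    counts = ngram_counts(responses, n)
    total = sum(counts.values())
    return len(counts) / total if total else 0.0


def entropy_n(responses, n):
    """Shannon entropy (in nats) of the n-gram frequency distribution."""
    counts = ngram_counts(responses, n)
    total = sum(counts.values())
    return -sum(c / total * math.log(c / total) for c in counts.values()) if total else 0.0


responses = ["i am fine", "i am here"]  # hypothetical generated responses
d1 = distinct_n(responses, 1)
d2 = distinct_n(responses, 2)
e1 = entropy_n(responses, 1)
```

Higher values of both metrics indicate more varied output; a model that repeats the same response scores low on both.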

Dial-M Evaluation

Follow the Dial-M repo to train (or download) the Dial-M model.

Run evaluation script.

❱❱❱ python eval_dialm.py -path=<output_dialm> -dt=dd/pc -out=<out_dir> -out=<model_dir> -lbl=<output_label>

Note: -out is the path of the trained model and -lbl is the label that was used when generating the output with the generate.py script.

USL-H Evaluation

Follow USL-H to compute the metrics.
