In proceedings of EMNLP 2019.
For convenience, create a symlink: cd e2e-coref && ln -s ../wiki ./wiki
For pre-training the coreference resolution system, OntoNotes 5.0 is required. [Download] [Create splits]
Data for training the reward models and fine-tuning the coreference resolver (place in <PROJECT_HOME>/data):
- 2M triples for RE-Text [Download]
- 12M triples for RE-KG [Download]
- 60k triples for RE-Joint [Download]
- Development data [Download]
- 10k Wikipedia summaries for fine-tuning [Download]
Note: If you want to make these files from scratch, follow the instructions in the triples folder.
- Best performing reward model (RE-Distill) [Download]
- Best performing coreference resolver (Coref-Distill) [Download]
Unzip Coref-Distill into the e2e-coref/logs folder and run GPU=x python evaluate.py final
- Download PyTorch-BigGraph embeddings (~40 GB, place in <PROJECT_HOME>/embeddings) [Download]
- Run wiki/embs.py to create an index of the embeddings (you need to do this only once)
- Run reward module training with cd wiki/reward && python train.py <dataset-name>
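Before starting training, it can help to confirm that the directories named in the instructions above are in place. The following is a minimal sketch and not part of the repository: the function name missing_dirs is hypothetical, and it assumes that the repository root is <PROJECT_HOME>, so that data, embeddings, wiki and e2e-coref all sit directly under it.

```python
from pathlib import Path

# Directories named in the instructions above, relative to <PROJECT_HOME>.
# Assumption: the repository root is <PROJECT_HOME>.
EXPECTED_DIRS = ["data", "embeddings", "wiki", "e2e-coref"]


def missing_dirs(project_home, expected=EXPECTED_DIRS):
    """Return the expected directories that do not exist under project_home."""
    root = Path(project_home)
    return [name for name in expected if not (root / name).is_dir()]


if __name__ == "__main__":
    import sys

    # Use the first command-line argument as <PROJECT_HOME>, else the current directory.
    home = sys.argv[1] if len(sys.argv) > 1 else "."
    for name in missing_dirs(home):
        print(f"missing: {Path(home) / name}")
```

The check only looks at directories, because the names of the downloaded files are not listed here.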
- Follow e2e-coref/README.md to set up the environment, create ELMo embeddings, etc.
- Run coreference pre-training with cd e2e-coref && GPU=x python train.py <experiment>
- Start the sling server with python wiki/reward/sling_server.py
- Change SLING_IP in wiki/reward/reward.py to the IP of the sling server
- Run coreference fine-tuning with cd e2e-coref && GPU=x python finetune.py <experiment> (see e2e-coref/experiments.conf for the different configurations)
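The e2e-coref commands above all follow the pattern GPU=x python <script> <experiment>, run from inside e2e-coref. If you prefer to launch them from a Python script, one possible sketch is shown below. The helper names coref_command and run_in_coref_dir are hypothetical and not part of the repository; the sketch only wraps the commands listed above and assumes it is run from <PROJECT_HOME>.

```python
import os
import subprocess


def coref_command(script, experiment, gpu):
    """Build (argv, env) mirroring `GPU=x python <script> <experiment>`."""
    # Copy the current environment and set GPU, as the shell prefix GPU=x would.
    env = dict(os.environ, GPU=str(gpu))
    return ["python", script, experiment], env


def run_in_coref_dir(script, experiment, gpu, coref_dir="e2e-coref"):
    """Run the command from inside coref_dir and raise if it exits non-zero."""
    argv, env = coref_command(script, experiment, gpu)
    subprocess.run(argv, cwd=coref_dir, env=env, check=True)
```

For example, run_in_coref_dir("train.py", "<experiment>", 0) followed by run_in_coref_dir("finetune.py", "<experiment>", 0) would run pre-training and then fine-tuning on GPU 0. The sketch does not start the sling server or edit SLING_IP; those steps still need to be done by hand before fine-tuning.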
- wiki/reward/combine_models.py can be used to distill the various reward models
- e2e-coref/save_weights.py can be used to save the weights of the fine-tuned coreference models so that they can be combined by setting the distill flag in the configuration file
@inproceedings{aralikatte-etal-2019-rewarding,
title = "Rewarding Coreference Resolvers for Being Consistent with World Knowledge",
author = "Aralikatte, Rahul and
Lent, Heather and
Gonzalez, Ana Valeria and
Herschcovich, Daniel and
Qiu, Chen and
Sandholm, Anders and
Ringaard, Michael and
S{\o}gaard, Anders",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)",
month = nov,
year = "2019",
address = "Hong Kong, China",
publisher = "Association for Computational Linguistics",
url = "https://www.aclweb.org/anthology/D19-1118",
doi = "10.18653/v1/D19-1118",
pages = "1229--1235"
}