GitHub - rahular/coref-rl: Rewarding Coreference Resolvers for Being Consistent with World Knowledge

Rewarding Coreference Resolvers for Being Consistent with World Knowledge

In Proceedings of EMNLP 2019.

Datasets

For convenience, create a symlink: cd e2e-coref && ln -s ../wiki ./wiki
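If you prefer to do this from a setup script, the symlink step can be sketched in Python as below. The function name `link_wiki` and the `project_home` argument are illustrative and not part of this repository; the sketch assumes a POSIX filesystem and that `e2e-coref/` and `wiki/` sit side by side under the project root.

```python
import os
from pathlib import Path


def link_wiki(project_home: str) -> Path:
    """Create e2e-coref/wiki -> ../wiki unless something is already there."""
    link = Path(project_home) / "e2e-coref" / "wiki"
    if not link.is_symlink() and not link.exists():
        # The target is relative to the link's own directory,
        # matching `cd e2e-coref && ln -s ../wiki ./wiki`.
        os.symlink("../wiki", link, target_is_directory=True)
    return link
```

Calling the function a second time leaves the existing link alone, so it is safe to rerun.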

For pre-training the coreference resolution system, OntoNotes 5.0 is required. [Download] [Create splits]

Data for training the reward models and fine-tuning the coreference resolver (place in <PROJECT_HOME>/data):

2M triples for RE-Text [Download]
12M triples for RE-KG [Download]
60k triples for RE-Joint [Download]
Development data [Download]
10k Wikipedia summaries for fine-tuning [Download]

Note: If you want to make these files from scratch, follow the instructions in the triples folder.

Pre-trained models

Best performing reward model (RE-Distill) [Download]
Best performing coreference resolver (Coref-Distill) [Download]

Evaluation

Unzip Coref-Distill into the e2e-coref/logs folder and run GPU=x python evaluate.py final
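The `GPU=x` prefix sets an environment variable named `GPU` for that one command. A minimal sketch of launching the same command from Python follows; the helpers `build_run` and `run_in_e2e_coref` are hypothetical names introduced here, and how the scripts interpret `GPU` is not described in this README.

```python
import os
import subprocess
from typing import Dict, List, Tuple


def build_run(script: str, experiment: str, gpu: int) -> Tuple[List[str], Dict[str, str]]:
    """Return argv and environment equivalent to `GPU=<gpu> python <script> <experiment>`."""
    env = dict(os.environ)   # copy, so the parent environment is untouched
    env["GPU"] = str(gpu)    # environment values must be strings
    return ["python", script, experiment], env


def run_in_e2e_coref(script: str, experiment: str, gpu: int, project_home: str = ".") -> None:
    """Run the command from inside <project_home>/e2e-coref, raising on a non-zero exit."""
    cmd, env = build_run(script, experiment, gpu)
    subprocess.run(cmd, env=env, cwd=os.path.join(project_home, "e2e-coref"), check=True)
```

For example, `run_in_e2e_coref("evaluate.py", "final", gpu=0)` corresponds to the evaluation command above with `x` replaced by `0`.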

Training

Reward models

Download the PyTorch-BigGraph embeddings (~40 GB, place in <PROJECT_HOME>/embeddings) [Download]
Run wiki/embs.py to create an index of the embeddings (you need to do this only once)
Run reward module training with cd wiki/reward && python train.py <dataset-name>

Coreference resolver

Pre-training

Follow e2e-coref/README.md to set up the environment, create ELMo embeddings, etc.
Run coreference pre-training with cd e2e-coref && GPU=x python train.py <experiment>

Fine-tuning

Start the SLING server with python wiki/reward/sling_server.py
Change SLING_IP in wiki/reward/reward.py to the IP address of the SLING server
Run coreference fine-tuning with cd e2e-coref && GPU=x python finetune.py <experiment> (see e2e-coref/experiments.conf for the different configurations)
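Before launching a long fine-tuning run, it may help to confirm that the machine can reach the SLING server at all. The sketch below is a generic TCP reachability check, not code from this repository. The README does not state which port or protocol `sling_server.py` uses, so `port` is a placeholder you must fill in yourself, and `server_reachable` is a hypothetical helper name.

```python
import socket


def server_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) can be opened within `timeout` seconds."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

A successful check only shows that something is listening on that address; it does not verify that the listener is the SLING server.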

Misc

wiki/reward/combine_models.py can be used to distill the various reward models
e2e-coref/save_weights.py can be used to save the weights of the fine-tuned coreference models so that they can be combined by setting the distill flag in the configuration file

Citation

@inproceedings{aralikatte-etal-2019-rewarding,
    title = "Rewarding Coreference Resolvers for Being Consistent with World Knowledge",
    author = "Aralikatte, Rahul  and
      Lent, Heather  and
      Gonzalez, Ana Valeria  and
      Hershcovich, Daniel  and
      Qiu, Chen  and
      Sandholm, Anders  and
      Ringaard, Michael  and
      S{\o}gaard, Anders",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)",
    month = nov,
    year = "2019",
    address = "Hong Kong, China",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/D19-1118",
    doi = "10.18653/v1/D19-1118",
    pages = "1229--1235"
}
