- Updated to version 0.2.0: added stand-alone script and example script
- Updated to version 0.1.0: allows replication of our ACL'19 paper results
The recently introduced BERT (Deep Bidirectional Transformers for Language Understanding) [1] model exhibits strong performance on several language understanding benchmarks. In this work, we describe a simple re-implementation of BERT for commonsense reasoning. We show that the attentions produced by BERT can be directly utilized for tasks such as the Pronoun Disambiguation Problem (PDP) and the Winograd Schema Challenge (WSC). Our proposed attention-guided commonsense reasoning method is conceptually simple yet empirically powerful. Experimental analysis on multiple datasets demonstrates that our proposed system performs remarkably well on all cases while outperforming the previously reported state of the art by a margin. While the results suggest that BERT seems to implicitly learn to establish complex relationships between entities, solving commonsense reasoning tasks might require more than unsupervised models learned from huge text corpora. The sample code provided in this repository allows you to replicate the results reported in the paper for PDP and WSC.
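The core idea of the attention-guided method (Maximum Attention Score, MAS, in the paper) can be sketched roughly as follows. For each attention head in each layer, only the candidate that receives the maximum attention from the pronoun keeps its attention mass; the per-candidate totals are then normalized by the overall attention mass. This is an illustrative sketch, not the repository's actual code, and `max_attention_score` is a hypothetical helper name:

```python
import numpy as np

def max_attention_score(attn):
    """Sketch of a MAS-style score.

    attn: array of shape (layers, heads, n_candidates) holding the attention
    each (layer, head) assigns from the pronoun to each candidate token.
    Returns one normalized score per candidate: for every (layer, head),
    only the winning candidate keeps its attention mass.
    """
    layers, heads, _ = attn.shape
    winners = attn.argmax(axis=-1)               # (layers, heads): index of winning candidate
    l_idx, h_idx = np.indices((layers, heads))   # index grids for fancy indexing
    masked = np.zeros_like(attn)
    masked[l_idx, h_idx, winners] = attn[l_idx, h_idx, winners]
    return masked.sum(axis=(0, 1)) / attn.sum()  # normalize by total attention mass
```

The candidate with the highest score is taken as the pronoun's referent; the repository's `MAS.py` implements the actual procedure used in the paper.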
- Install BertViz by cloning its repository and installing its dependencies:
git clone https://github.com/jessevig/bertviz.git
cd bertviz
pip install -r requirements.txt
cd ..
- To replicate the results, proceed to step 3). If you want to run the stand-alone version, you can just use MAS.py. Usage is showcased in the Jupyter notebook example MAS_Example.ipynb.
- Add the BertViz path to your Python path:
export PYTHONPATH=$PYTHONPATH:/home/ubuntu/bertviz/
Alternatively, you can append the path in commonsense.py after importing sys, e.g.:
sys.path.append("/home/ubuntu/bertviz/")
- Clone this repository and install dependencies:
git clone https://github.com/SAP/acl2019-commonsense-reasoning
cd acl2019-commonsense-reasoning
pip install -r requirements.txt
- Create 'data' sub-directory and download files for PDP and WSC challenges:
mkdir data
cd data
wget https://cs.nyu.edu/faculty/davise/papers/WinogradSchemas/PDPChallenge2016.xml
wget https://cs.nyu.edu/faculty/davise/papers/WinogradSchemas/WSCollection.xml
cd ..
- Run the scripts from the paper
For replicating the results on WSC:
python commonsense.py --data_dir=~/acl2019-commonsense-reasoning/data/ --bert_model=bert-base-uncased --do_lower_case --task_name=MNLI
For replicating the results on PDP:
python commonsense.py --data_dir=~/acl2019-commonsense-reasoning/data/ --bert_model=bert-base-uncased --do_lower_case --task_name=pdp
For more information on the individual functions, please refer to their docstrings.
See also our follow-up work, accepted at ACL'20, on commonsense reasoning using contrastive self-supervised learning: arXiv, GitHub.
No known issues.
This project is provided "as-is" and any bug reports are not guaranteed to be fixed.
If you use this code in your research, please cite:
@inproceedings{klein-nabi-2019-attention,
title = "Attention Is (not) All You Need for Commonsense Reasoning",
author = "Klein, Tassilo and
Nabi, Moin",
booktitle = "Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics",
month = jul,
year = "2019",
address = "Florence, Italy",
publisher = "Association for Computational Linguistics",
url = "https://www.aclweb.org/anthology/P19-1477",
doi = "10.18653/v1/P19-1477",
pages = "4831--4836",
abstract = "The recently introduced BERT model exhibits strong performance on several language understanding benchmarks. In this paper, we describe a simple re-implementation of BERT for commonsense reasoning. We show that the attentions produced by BERT can be directly utilized for tasks such as the Pronoun Disambiguation Problem and Winograd Schema Challenge. Our proposed attention-guided commonsense reasoning method is conceptually simple yet empirically powerful. Experimental analysis on multiple datasets demonstrates that our proposed system performs remarkably well on all cases while outperforming the previously reported state of the art by a margin. While results suggest that BERT seems to implicitly learn to establish complex relationships between entities, solving commonsense reasoning tasks might require more than unsupervised models learned from huge text corpora.",
}
- [1] J. Devlin et al., BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2018, https://arxiv.org/abs/1810.04805.
Copyright (c) 2024 SAP SE or an SAP affiliate company. All rights reserved. This project is licensed under the Apache Software License, version 2.0 except as noted otherwise in the LICENSE.