Skip to content

This project is special for SemEval 2020 Task 4 - Commonsense Validation and Explanation (ComVE).

Notifications You must be signed in to change notification settings

LetianLee/MSc-Project-SemEval2020-Task4

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

47 Commits
 
 
 
 
 
 

Repository files navigation

The project code was completed on September 10, 2021.

MSc Project: Commonsense Validation and Explanation in Natural Language Processing

License: MIT

This is the source code for the MSc Project: Commonsense Validation and Explanation in Natural Language Processing. The project is aiming to solve the first two subtasks in SemEval 2020 Task 4 - Commonsense Validation and Explanation (ComVE).

Project Information

  • Title: Commonsense Validation and Explanation in Natural Language Processing
  • Objective Task: SemEval 2020 Task 4 - Commonsense Validation and Explanation (ComVE)
  • Supervisor: Dr Mark Lee
  • Student: Letian Li (2214560)

Task Description

This MSc project contains two subtasks.

Subtask A is a Validation task. The purpose is to determine which of two similar natural language statements is against common sense.

Example:

Task: Which of the two statements is against common sense?
Statement1: He put a turkey into the fridge.
Statement2: He put an elephant into the fridge.

Subtask B is an Explanation (Multi-Choice) task. The purpose is to find the right reason why a given statement is against common sense.

Example:

Task: Select the most appropriate reason as to why this statement is against common sense.
Statement: He put an elephant into the fridge.
A: An elephant is much bigger than a fridge.
B: Elephants are usually white while fridges are usually white.
C: An elephant cannot eat a fridge.

Instructions

Python 3.7 with PyTorch 1.9.0 and Transformers 4.9.1 serve as the backbones of this project.

The code is using one GPU by default, you have to modify the code to make it running on CPU or multiple GPUs.

1. Envs and Dependencies

First, to implement this work, you need to download and install Anaconda3. Here is the official link for Anaconda: https://www.anaconda.com/products/individual-d
After installing the appropriate version of Anaconda3, please execute the following command block on the console. (The example below is based on the Linux operating system.)

# Create a conda envs
conda create -n msc_project_2214560 python=3.7 ipython
conda activate msc_project_2214560

# Install PyTorch 
# Please Note: Install the appropriate version of PyTorch, and be aware of your operating system and CUDA version. Please check https://pytorch.org/get-started/locally/ 
# The following command is for the PyTorch installation of Linux operating system and CUDA 10.2.
pip install torch torchvision torchaudio

# Install HuggingFace Transformers
pip install transformers==4.9.1

# Install other libraries
pip install TensorBoard==2.6.0
pip install jupyter==1.0.0
pip install pandas==1.2.4
pip install matplotlib==3.3.4
pip install seaborn==0.11.1

2. Viewing and Running the Project Code

After installing the project environment, you can run Jupyter Notebook to view and run the project.

3. The Directory Tree of Project

MSc Project (Commonsense Validation and Explanation in Natural Language Processing)
├── MSc Project
│   ├── Best Implementation
│   │   ├── Subtask_A.ipynb -------------------------- This is the best implementation for subtask A.
│   │   └── Subtask_B.ipynb -------------------------- This is the best implementation for subtask B.
│   ├── DataSet -------------------------------------- This is the DataSet of the SemEval-2020 Task 4.
│   └── Experiment
│       ├── ExperimentResult_A ----------------------- This keeps the experiment results of subtask A. (New results will be generated automatically)
│       ├── ExperimentResult_B ----------------------- This keeps the experiment results of subtask B. (New results will be generated automatically)
│       ├── TensorBoard ------------------------------ This keeps the TensorBoard records. (This directory will be generated automatically)
│       ├── Experiment Workflow.ipynb ---------------- This shows the experiment workflow and generates the result figures.
│       ├── Experiment_Subtask_A.ipynb --------------- This is the experiment code for subtask A.
│       └── Experiment_Subtask_B.ipynb --------------- This is the experiment code for subtask B.
├── Project Demonstration (updated).pdf -------------- This is the project demonstration slide.
└── README.md

Suggestion

If you just want a quick look at how to solve the task, please see the Implementation Code:

MSc-Project-SemEval2020-Task4/MSc Project/Best Implementation/Subtask_A.ipynb
MSc-Project-SemEval2020-Task4/MSc Project/Best Implementation/Subtask_B.ipynb

If you want to know how the project conducts experiments, please see the Experiment Code:

MSc-Project-SemEval2020-Task4/MSc Project/Experiment/Experiment_Subtask_A.ipynb
MSc-Project-SemEval2020-Task4/MSc Project/Experiment/Experiment_Subtask_B.ipynb

If you want to take an overview of the experiments results, please see the file Experiment Workflow:
(How different hyperparameters affect the accuracy.)

MSc-Project-SemEval2020-Task4/MSc Project/Experiment/Experiment Workflow.ipynb

If you want to take an overview of this project, please see the Project Demonstration Slide:

MSc-Project-SemEval2020-Task4/Project Demonstration (updated).pdf

About

This project is special for SemEval 2020 Task 4 - Commonsense Validation and Explanation (ComVE).

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published