Language Entailment in NLP 🚀

This project focuses on textual entailment, a core task in Natural Language Processing (NLP): deciding whether the meaning of one sentence (the hypothesis) can be logically inferred from another sentence (the premise). It fine-tunes a transformer-based model, DeBERTa, and explores graph-based fragmentation of text, with the aim of improving entailment detection accuracy.

Project Overview

The project leverages the Stanford Natural Language Inference (SNLI) Corpus, containing sentence pairs classified as:

Neutral
Entailment
Contradiction

We fine-tuned DeBERTa, a transformer-based architecture, to classify each premise/hypothesis pair into one of these three classes. We also used textual graphs to break sentences into smaller fragments, aiming to identify relationships at a more granular level.

Key Features

DeBERTa Integration: Fine-tunes a transformer-based model for entailment classification.
Textual Graph Analysis: Breaks sentences into fragments for more detailed semantic analysis.
Custom Dataset Loader: Tokenizes and preprocesses input sentences dynamically.
Metrics Logging: Tracks training and validation accuracy.
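
To make the metrics-logging feature concrete, here is a minimal sketch of a running accuracy tracker. It is not the project's `Metrics` class from helper_fn.py: the class name `AccuracyTracker` and its methods are hypothetical, and the example assumes predictions and labels are plain lists of integer class ids.

```python
class AccuracyTracker:
    """Hypothetical stand-in for a metrics class; accumulates accuracy over batches."""

    def __init__(self):
        self.correct = 0
        self.total = 0

    def update(self, predictions, labels):
        # Count the positions where the predicted class id matches the gold label.
        if len(predictions) != len(labels):
            raise ValueError("predictions and labels must have the same length")
        self.correct += sum(int(p == y) for p, y in zip(predictions, labels))
        self.total += len(labels)

    def accuracy(self):
        # Return 0.0 before any batch has been seen to avoid dividing by zero.
        return self.correct / self.total if self.total else 0.0

    def reset(self):
        # Call at the start of each epoch so epochs are reported separately.
        self.correct = 0
        self.total = 0


tracker = AccuracyTracker()
tracker.update([0, 1, 2, 1], [0, 1, 1, 1])  # 3 of 4 correct
tracker.update([2, 2], [2, 0])              # 1 of 2 correct
print(f"accuracy: {tracker.accuracy():.3f}")  # accuracy: 0.667
```

Keeping one tracker for training and one for validation, and resetting each at the start of an epoch, would give per-epoch numbers of the kind reported under Results.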

Code Structure

Files and Purpose:

train.py:
- Handles dataset loading, model training, and evaluation.
- Uses a DeBERTa model for fine-tuning on the SNLI dataset.
- Logs metrics using a custom Metrics class and supports Weights & Biases integration.
helper_fn.py:
- Contains helper functions and classes:
  - Metrics: Tracks accuracy and other evaluation metrics.
  - Collate: Dynamically batches and pads tokenized inputs.
textual_graphs.py:
- Implements the Build_Fragments class:
  - Breaks sentences into semantic fragments.
  - Extracts root sentences and entities for graph-based analysis.
Report on Entailment in Language pragmatics.pdf:
- A detailed report discussing the theoretical underpinnings of textual entailment, methodologies, and the use of textual graphs.
Existing README.md:
- Initial placeholder for project documentation.
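
The dynamic batching and padding attributed to `Collate` above can be illustrated with a small, library-free sketch. This is an assumption about the general technique, not the code in helper_fn.py: the function name `collate`, the dictionary keys, the pad id of 0, and the token ids in the example are all illustrative.

```python
def collate(batch, pad_id=0):
    """Pad each example to the longest sequence in this batch (not a global maximum)."""
    max_len = max(len(example["input_ids"]) for example in batch)
    input_ids, attention_mask, labels = [], [], []
    for example in batch:
        ids = list(example["input_ids"])
        n_pad = max_len - len(ids)
        input_ids.append(ids + [pad_id] * n_pad)
        # 1 marks a real token, 0 marks padding the model should ignore.
        attention_mask.append([1] * len(ids) + [0] * n_pad)
        labels.append(example["label"])
    return {"input_ids": input_ids, "attention_mask": attention_mask, "labels": labels}


# Made-up token ids for two already-tokenized premise/hypothesis pairs.
batch = [
    {"input_ids": [1, 10, 11, 2], "label": 0},
    {"input_ids": [1, 12, 2], "label": 2},
]
out = collate(batch)
print(out["input_ids"])       # [[1, 10, 11, 2], [1, 12, 2, 0]]
print(out["attention_mask"])  # [[1, 1, 1, 1], [1, 1, 1, 0]]
```

Padding per batch keeps short batches short, which usually saves computation compared with padding every example to one fixed length. In a training script, the returned lists would typically be converted to tensors before being passed to the model.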

Results

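| Experiment | Train Accuracy | Validation Accuracy |
|---|---|---|
| Without Textual Graphs | 0.882 | 0.891 |
| With Textual Graphs | 0.880 | 0.891 |

Validation accuracy was the same in both runs, so adding textual graphs did not change it in these experiments.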

Installation

Clone the repository:

```
git clone https://github.com/your-username/Language_Entailment.git
cd Language_Entailment
```

Install dependencies:
```
pip install -r requirements.txt
```
Download the SNLI dataset and place it in the project directory:
- Train and validation files should be in .json format.
A GPU (e.g., NVIDIA A100) is recommended for faster training.
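
As a rough guide to what the dataset files might look like once downloaded, the sketch below reads sentence pairs from a JSON Lines file. It assumes the field names used in the official SNLI distribution (`sentence1`, `sentence2`, `gold_label`) and one JSON object per line. The project's own .json layout may differ, and the function name `load_pairs`, the label-to-id order, and the sample sentences are illustrative.

```python
import json
import os
import tempfile

# Assumed id order; the project's own mapping may differ.
LABEL_TO_ID = {"entailment": 0, "neutral": 1, "contradiction": 2}


def load_pairs(path):
    """Read (premise, hypothesis, label_id) triples from a JSON Lines file."""
    pairs = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            record = json.loads(line)
            label = record["gold_label"]
            if label not in LABEL_TO_ID:
                # SNLI uses "-" when annotators reached no consensus; skip those pairs.
                continue
            pairs.append((record["sentence1"], record["sentence2"], LABEL_TO_ID[label]))
    return pairs


# Write a tiny made-up sample to a temporary file to show the expected shape.
sample = [
    {"sentence1": "A man is playing a guitar.", "sentence2": "A person is making music.", "gold_label": "entailment"},
    {"sentence1": "A man is playing a guitar.", "sentence2": "The man is asleep.", "gold_label": "contradiction"},
    {"sentence1": "A dog runs.", "sentence2": "A dog chases a ball.", "gold_label": "-"},
]
with tempfile.NamedTemporaryFile("w", suffix=".jsonl", delete=False, encoding="utf-8") as tmp:
    for record in sample:
        tmp.write(json.dumps(record) + "\n")

pairs = load_pairs(tmp.name)
os.remove(tmp.name)
print(len(pairs))  # 2
```

If the files are stored as a single JSON array rather than one object per line, `json.load` on the whole file would replace the per-line parsing.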
