This repository implements a sentence reordering model using the Transformer architecture, inspired by the "Attention Is All You Need" paper. The project covers developing, training, and evaluating the model on text data to produce coherent sentence orderings for downstream tasks such as summarization and generation.
## Table of Contents

- Introduction
- Requirements
- Dataset
- Notebook Structure
- Custom Callback
- Cosine Decay Restart
- Usage
- Results
- Conclusion
- References
## Introduction

The objective of this project is to reorder shuffled sentences into coherent paragraphs using the Transformer model, which has proven highly effective across NLP tasks. Reliable sentence ordering underpins applications such as text summarization, machine translation, and story generation.
## Requirements

To run this project, ensure you have the following packages installed:
- Python 3.7+
- TensorFlow
- Keras
- NumPy
- Pandas
- Matplotlib
- Scikit-learn
You can install the required packages using the following command:

```bash
pip install tensorflow keras numpy pandas matplotlib scikit-learn
```
## Dataset

The dataset contains text instances whose sentences have been shuffled; the model's task is to restore the original order. The data is split into training, validation, and test sets for comprehensive evaluation.
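To make the task concrete, here is a purely illustrative sketch of a single instance; the field names (`shuffled`, `order`) and the permutation encoding are assumptions, not the actual dataset schema:

```python
# Hypothetical instance: field names and encoding are illustrative only.
instance = {
    # Sentences presented to the model in shuffled order.
    "shuffled": [
        "It was adopted worldwide within a decade.",
        "The protocol was first proposed in 1989.",
        "Early implementations were slow and unreliable.",
    ],
    # Target permutation: order[i] is the original position of shuffled[i].
    "order": [2, 0, 1],
}

# Applying the target permutation recovers the original paragraph.
restored = [None] * len(instance["shuffled"])
for i, pos in enumerate(instance["order"]):
    restored[pos] = instance["shuffled"][i]
print(" ".join(restored))
```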
## Notebook Structure

- Introduction: Project overview and objectives.
- Data Loading and Preprocessing: Loading and preparing the data, including tokenization and padding (see the sketch after this list).
- Model Development: Transformer model implementation for reordering sentences.
- Training and Evaluation: Model training and validation.
- Experiments: Exploration of different architectures and hyperparameters.
- Conclusion: Summary of results and insights.
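The exact preprocessing code lives in the notebook; as a rough sketch, tokenization and padding with Keras typically look like the following (vocabulary size and sequence length are placeholder values):

```python
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

# Placeholder hyperparameters; the notebook's actual values may differ.
VOCAB_SIZE = 10000
MAX_LEN = 128

texts = [
    "the protocol was first proposed in 1989",
    "it was adopted worldwide within a decade",
]

# Build a word-level vocabulary and map each sentence to integer ids.
tokenizer = Tokenizer(num_words=VOCAB_SIZE, oov_token="<unk>")
tokenizer.fit_on_texts(texts)
sequences = tokenizer.texts_to_sequences(texts)

# Pad (or truncate) every sequence to a fixed length for batching.
padded = pad_sequences(sequences, maxlen=MAX_LEN, padding="post")
print(padded.shape)  # (2, 128)
```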
## Custom Callback

A custom callback monitors validation performance during training, enabling dynamic adjustments and early stopping to avoid overfitting and improve generalization.
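The notebook defines its own callback; as a minimal sketch of the pattern (the monitored metric name and patience value are assumptions), a Keras callback that stops training once the validation metric stalls could look like this:

```python
import tensorflow as tf

class ValidationMonitor(tf.keras.callbacks.Callback):
    """Sketch: halt training when the monitored validation metric
    has not improved for `patience` consecutive epochs."""

    def __init__(self, monitor="val_loss", patience=3):
        super().__init__()
        self.monitor = monitor
        self.patience = patience
        self.best = float("inf")
        self.wait = 0

    def on_epoch_end(self, epoch, logs=None):
        current = (logs or {}).get(self.monitor)
        if current is None:
            return  # Metric not logged this epoch; nothing to do.
        if current < self.best:
            self.best = current
            self.wait = 0
        else:
            self.wait += 1
            if self.wait >= self.patience:
                # Tell Keras to stop after this epoch.
                self.model.stop_training = True

# Usage: model.fit(..., callbacks=[ValidationMonitor()])
```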
## Cosine Decay Restart

The project uses a Cosine Decay with Restarts learning rate schedule, which helps the model converge by periodically decaying the learning rate and then restarting it at a higher value to escape local minima.
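TensorFlow provides this schedule out of the box as `tf.keras.optimizers.schedules.CosineDecayRestarts`; here is a sketch with placeholder hyperparameters (the notebook's actual values may differ):

```python
import tensorflow as tf

# Placeholder values; tune to the actual training setup.
lr_schedule = tf.keras.optimizers.schedules.CosineDecayRestarts(
    initial_learning_rate=1e-3,  # peak learning rate at each restart
    first_decay_steps=1000,      # length of the first cosine cycle, in steps
    t_mul=2.0,                   # each subsequent cycle is 2x longer
    m_mul=0.9,                   # each restart peaks at 0.9x the previous peak
    alpha=1e-5,                  # floor, as a fraction of the initial rate
)

optimizer = tf.keras.optimizers.Adam(learning_rate=lr_schedule)
```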
## Usage

- Clone this repository:

  ```bash
  git clone https://github.com/your_username/sentence-reordering-transformers.git
  ```

- Navigate to the project directory:

  ```bash
  cd sentence-reordering-transformers
  ```

- Open the notebook:

  ```bash
  jupyter notebook francesco_baiocchi_Sentence_Reordering.ipynb
  ```

- Run the cells in the notebook to execute the code.
## Results

Model performance is tracked during training with the custom validation callback described above, which reports validation scores in real time. The best model achieved an average score of 0.573 on the test set.
## Conclusion

After numerous experiments and iterations, the Transformer model achieved only moderate performance on the sentence reordering task. Although several architectures and hyperparameter settings were explored, the resulting improvements were marginal.
## References

This project was inspired by the following papers:

- Vaswani, A., et al. (2017). "Attention Is All You Need." https://arxiv.org/abs/1706.03762