Summary Generation for reviews

Introduction

Sequence-to-sequence models take a sequence of items (words, letters, time series, etc.) and output another sequence of items. Transformer-based sequence-to-sequence models are also known as encoder-decoder models because they use both the encoder and the decoder of the Transformer architecture.

Such models are best suited for tasks revolving around generating new sentences depending on a given input, such as summarization, translation, or generative question answering.

About this Project

This project showcases a text summarizer which, as the name suggests, outputs a summary for a given text input. To take it up a notch, this particular summarizer has been fine-tuned specifically to generate a title for a given review.

Input: Review for a product

Output: Meaningful short summary for the review

Dataset Used

Amazon Multilingual Reviews Dataset

Model Description

Overview

A multilingual Text-to-Text Transfer Transformer (mT5) model has been used in this project.

About the Model

mT5 is a multilingual variant of T5 that has been pre-trained on a Common Crawl-based dataset covering 101 languages. The model architecture and training procedure of mT5 closely follow those of T5.

T5 is a pre-trained language model whose primary distinction is its use of a unified “text-to-text” format for all text-based NLP problems. This approach is natural for generative tasks where the task format requires the model to generate text conditioned on some input.

Given the sequence-to-sequence structure of this task format, T5 uses a basic encoder-decoder Transformer architecture as proposed by Vaswani et al. (2017).
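To make the text-to-text format concrete for this task, here is a minimal sketch of how one review record could be cast as an input text and a target text. The field names `review_body` and `review_title`, the helper `to_text_to_text`, and the sample record are assumptions for illustration; the project's actual preprocessing may differ.

```python
def to_text_to_text(example, input_field="review_body", target_field="review_title"):
    """Cast one review record into an (input text, target text) pair.

    The field names are assumed, not taken from this repository.
    """
    return {
        "input_text": example[input_field].strip(),    # what the encoder reads
        "target_text": example[target_field].strip(),  # what the decoder learns to generate
    }


# Made-up record, for illustration only.
example = {
    "review_body": "  Arrived quickly and works exactly as described. ",
    "review_title": "Works as described",
}

pair = to_text_to_text(example)
print(pair["input_text"])
print(pair["target_text"])
```

Both sides are plain strings, so the same model interface covers summarization, translation, and other text generation tasks without task-specific output layers.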

Steps to run the final model

1. Install the required modules

To get started, clone this repository and run the command below to make sure all required modules are installed.

pip install -r requirements.txt

2. Run driver.py

Commonly modified arguments have been configured in argument_parser.py to be passed as command line arguments.

  • model_card: Model to be used, default = "google/mt5-small"
  • batch_size: Size of batch, default = 32
  • weight_decay: Weight decay, default = 0.01
  • learning_rate: Learning rate, default = 5.6e-5
  • save_total_limit: Maximum number of checkpoints to keep, default = 3
  • num_train_epochs: Number of training epochs, default = 3
  • output_dir: Output directory, default = "."

Note: All of the arguments above are optional; pass only the ones you want to override.
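For readers who want to see how such arguments are typically declared, the following is a rough sketch using Python's standard `argparse` module. It is not the contents of argument_parser.py; the function name `build_parser` and the help strings are assumptions, and only the argument names and defaults come from the list above.

```python
import argparse


def build_parser():
    """Build a parser with the arguments and defaults listed in this README (sketch)."""
    parser = argparse.ArgumentParser(description="Fine-tune a review summarizer (sketch)")
    parser.add_argument("--model_card", type=str, default="google/mt5-small",
                        help="Model to be used")
    parser.add_argument("--batch_size", type=int, default=32,
                        help="Size of batch")
    parser.add_argument("--weight_decay", type=float, default=0.01,
                        help="Weight decay")
    parser.add_argument("--learning_rate", type=float, default=5.6e-5,
                        help="Learning rate")
    parser.add_argument("--save_total_limit", type=int, default=3,
                        help="Maximum number of checkpoints to keep")
    parser.add_argument("--num_train_epochs", type=int, default=3,
                        help="Number of training epochs")
    parser.add_argument("--output_dir", type=str, default=".",
                        help="Output directory")
    return parser


# Parsing an empty argument list yields the defaults.
args = build_parser().parse_args([])
print(args.model_card, args.batch_size, args.learning_rate)
```

Because every argument has a default, running the script with no flags is valid, which matches the note that all arguments are optional.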

Example:

python driver.py --model_card "google/mt5-base" --learning_rate 2e-5 --batch_size 16 --num_train_epochs 4
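As a rough illustration of what the example command could configure, the sketch below maps the command-line values onto keyword arguments named after parameters of `Seq2SeqTrainingArguments` in the Hugging Face `transformers` library. This is an assumption about how driver.py might use the values, not a description of its actual code; the helper `training_kwargs` is hypothetical, and applying `batch_size` to both training and evaluation is a guess.

```python
def training_kwargs(batch_size=32, weight_decay=0.01, learning_rate=5.6e-5,
                    save_total_limit=3, num_train_epochs=3, output_dir="."):
    """Map the README's arguments onto training-argument names (hypothetical helper)."""
    return {
        "output_dir": output_dir,
        "learning_rate": learning_rate,
        # Assumption: the same batch size is used for training and evaluation.
        "per_device_train_batch_size": batch_size,
        "per_device_eval_batch_size": batch_size,
        "weight_decay": weight_decay,
        "save_total_limit": save_total_limit,
        "num_train_epochs": num_train_epochs,
    }


# Values from the example command; unspecified arguments keep their defaults.
kwargs = training_kwargs(batch_size=16, learning_rate=2e-5, num_train_epochs=4)
for name, value in kwargs.items():
    print(f"{name} = {value}")

# With transformers installed, these could be passed on as:
#   from transformers import Seq2SeqTrainingArguments
#   training_args = Seq2SeqTrainingArguments(**kwargs)
```

The model name given by `--model_card` is not a training argument; it would instead select which pre-trained checkpoint and tokenizer are loaded.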

Output

Some outputs of the final model are shown below.

Note: "Original label" is the original title from the dataset, and "review" is the input given to the model.

image

image