Code for ACL 2018 paper: "Fast Abstractive Summarization with Reinforce-Selected Sentence Rewriting. Chen and Bansal"


francescodisalvo05/nlp-financial-summarization-rl

 
 


Hybrid Text Summarization through Reinforcement Learning

This repository contains the code for the "Deep Natural Language Processing" final project at Politecnico di Torino during the academic year 2021/2022.
We explored the hybrid neural summarization architecture proposed by Zmandar et al. [1], starting from the codebase of Chen [2]. In this approach, an extractor agent selects the most salient sentences and an abstractor agent paraphrases them. Reinforcement learning then rewards the produced output so that both agents are trained jointly. We evaluated this architecture on two different domains and with two different reinforcement learning policies. We observed that the Rouge-L score drops on a different domain and when fewer summary sentences are available as a reference. Moreover, on a randomly selected subsample we showed that, despite a lower Rouge-L, we obtain comparable results on BERTScore, which takes into account the context and the semantics of the produced summaries.


Install dependencies

A full requirements.txt is already provided. Therefore you can easily create your virtual environment and install the provided dependencies.

python -m venv venv
source venv/bin/activate
python -m pip install --upgrade pip
pip install -r requirements.txt
python setup.py develop

Metrics

Two main metrics are inspected in our study: Rouge-L and BERTScore. The former measures the longest common subsequence between the ground-truth text and the output generated by the model, whereas the latter compares hypothesis and reference through contextual token embeddings, so it accounts for context as well as surface overlap.
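
As a rough illustration of what Rouge-L measures, the sketch below computes a sentence-level score from the longest common subsequence (LCS) of two whitespace-tokenized strings. The function names `lcs_length` and `rouge_l` are hypothetical and are not part of this repository. The scores reported here were computed with pyrouge, whose preprocessing and aggregation may differ, so the numbers are not expected to match exactly.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(reference, hypothesis):
    """Sentence-level Rouge-L (precision, recall, F1) on whitespace tokens.

    Simplified sketch: no stemming, no stopword handling, balanced F1.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    lcs = lcs_length(ref, hyp)
    if lcs == 0:
        return 0.0, 0.0, 0.0
    precision = lcs / len(hyp)
    recall = lcs / len(ref)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

For example, `rouge_l("the cat sat on the mat", "the cat lay on the mat")` shares the subsequence "the cat on the mat" (5 of 6 tokens) with the reference, so precision, recall, and F1 are all 5/6.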

However, computing BERTScore is expensive, so a "small" subsample was extracted from the proposed datasets in order to evaluate the performance.

In order to compute the proposed scores we used pyrouge and bertscore.

Datasets

As reported here, two main datasets are used for the proposed experiments, namely Financial Narrative Summarisation (FNS) and CNN/Daily Mail.

Due to the computational limitations of our machines, we used two different configurations for the experiments, called "small" and "large", reflecting the size of the extracted subsamples. The table below reports the general information. For further details about the motivations behind these settings and the preprocessing, see section X.Y of the report.

Note: for the same reason, the "Large" CNN/Daily Mail dataset is a random subsample of the full dataset, which contains more than 300k news articles.

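| Split | FNS Large | FNS Small | CNN/Daily Large | CNN/Daily Small |
|-------|-----------|-----------|-----------------|-----------------|
| train | 2,550     | 300       | 10,000          | 1,200           |
| val   | 450       | 50        | 1,000           | 100             |
| test  | 363       | 50        | 1,000           | 150             |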

Because the FNS validation set and the CNN/Daily subsample were extracted at random, we report the ids used in each data split under the dataset folder.
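
One way to make such random splits reproducible is to shuffle the ids once with a fixed seed and cut disjoint slices. The sketch below is only an illustration: `make_splits`, the seed value, and the split sizes passed to it are assumptions, not the procedure that produced the id files in the dataset folder.

```python
import random


def make_splits(all_ids, sizes, seed=42):
    """Shuffle ids once with a fixed seed, then cut disjoint splits.

    `sizes` maps a split name to the number of ids it should receive.
    """
    ids = sorted(all_ids)  # sort first so the result does not depend on input order
    if sum(sizes.values()) > len(ids):
        raise ValueError("requested more ids than available")
    random.Random(seed).shuffle(ids)
    splits, start = {}, 0
    for name, size in sizes.items():
        splits[name] = ids[start:start + size]
        start += size
    return splits
```

Calling `make_splits(range(2000), {"train": 1200, "val": 100, "test": 150})` twice returns the same three disjoint id lists, which can then be written to disk (for example with `json.dump`) and shared alongside the data.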

Preprocessing

Once the data has been downloaded, it has to be preprocessed (see the report for further details). Then, the labels will be extracted according to the selected metric.

  1. Split the data (only for FNS)
python ./src/split_train_val.py \
      --dataset_path=<path to the selected dataset> \
      --suffix=<trainval's foldername> \
      --reports_folder=<name of reports subfolder> \
      --summaries_folder=<name of summaries subfolder>
  2. Preprocess the data
python ./scripts/preprocess_text[_dailycnn].py \
      --dataset_path=<path to the selected dataset> \
      --preprocessed_path=<output path of the preprocessed dataset> \
      --filtered_path=<output path of the filtered dataset (FNS only)> \
      --reports_folder=<name of reports subfolder> \
      --summaries_folder=<name of summaries subfolder>
  3. Extract labels
python ./scripts/extract_labels.py \
      --dataset_path=<path to the selected dataset> \
      --destination_path=<output path> \
      --dataset_split=<train/val/test (list)> \
      --reports_folder=<name of reports subfolder> \
      --summaries_folder=<name of summaries subfolder>
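
The label extraction step is not spelled out above. A common approach in extract-then-rewrite pipelines is to match each summary sentence to the document sentence that scores highest under the chosen metric, without reusing a sentence. The sketch below illustrates that idea under stated assumptions: `extract_labels` and `overlap_recall` are hypothetical names, the token-overlap metric is a cheap stand-in for Rouge-L or BERTScore, and the actual behaviour of `scripts/extract_labels.py` may differ.

```python
def overlap_recall(candidate, reference):
    """Stand-in metric: fraction of reference tokens that appear in candidate."""
    ref = reference.lower().split()
    cand = set(candidate.lower().split())
    if not ref:
        return 0.0
    return sum(tok in cand for tok in ref) / len(ref)


def extract_labels(doc_sents, summary_sents, score_fn=overlap_recall):
    """For each summary sentence, pick the best-scoring unused document sentence.

    Returns the selected document indices and their scores, in summary order.
    """
    labels, scores, used = [], [], set()
    for summ in summary_sents:
        best_idx, best_score = None, -1.0
        for i, sent in enumerate(doc_sents):
            if i in used:
                continue
            s = score_fn(sent, summ)
            if s > best_score:
                best_idx, best_score = i, s
        if best_idx is None:  # more summary sentences than document sentences
            break
        used.add(best_idx)
        labels.append(best_idx)
        scores.append(best_score)
    return labels, scores
```

Passing a different `score_fn` is how the choice of metric would change the extracted labels in this sketch.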

Training

The selected hyperparameters are reported in the Appendix of our paper. Moreover, here you can find all the pretrained models!

  1. Train Gensim Word2Vec
python ./scripts/train_word2vec.py \
      --corpus_path=<path to fullcorpus.txt (inside the preprocessed and/or filtered folder)> \
      --destination_path=<output path of the w2v model> \
      --vector_size=<dimension of the embedding>
  2. Train extractor
python ./scripts/train_extractor_ml.py \
      --data_path=<path of the extracted labels> \
      --path=<output path of the checkpoints and logs> \
      --w2v=<w2v filepath | extension .model> \
      --emb_dim=<dimension of the embedding>
  3. Train abstractor
python ./scripts/train_abstractor.py \
      --data_path=<path of the extracted labels> \
      --path=<output path of the checkpoints and logs> \
      --w2v=<w2v filepath> \
      --emb_dim=<dimension of the embedding>
  4. Train Reinforcement Learning
python ./scripts/train_full_rl.py \
      --data_path=<path of the extracted labels> \
      --ext_dir=<path (root) of the extractor checkpoints> \
      --abs_dir=<path (root) of the abstractor checkpoints> \
      --path=<path of the checkpoints and output labels> \
      --ckpt_freq=<number of batches between two checkpoints> \
      --batch=<batch size> \
      --n_sentences=<maximum number of sentences per file> \
      --reward=<bert/rouge>

Inference and evaluation

  1. Run inference on the test set and evaluate the results according to both Rouge-L and BERTScore.
python ./src/inference.py \
      --output_path=<path of the model's outputs> \
      --model_dir=<path (root) of the RL checkpoints> \
      --data_path=<path of the extracted labels> \
      --n_sentences=<maximum number of sentences per file>

Results

The following table summarizes the Rouge scores obtained on the entire datasets (large).

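| FNS Only Extractor | FNS Full pipeline | DailyCNN Only Extractor | DailyCNN Full pipeline |
|--------------------|-------------------|-------------------------|------------------------|
| 0.36               | 0.38              | 0.20                    | 0.23                   |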

The next table reports the cross-evaluation on the dataset samples (small), performed with different reinforcement rewards and extracted labels.

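| Extracted labels & RL Policy | FNS Rouge-L | FNS BERT score | DailyCNN Rouge-L | DailyCNN BERT score |
|------------------------------|-------------|----------------|------------------|---------------------|
| Rouge-L                      | 0.27        | 0.80           | 0.09             | 0.78                |
| BERT score                   | 0.26        | 0.81           | 0.10             | 0.78                |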

Pretrained models

In this shared folder you can find the pretrained models used for all the main experiments reported above and in our paper. In particular, there is one model for each combination of dataset (fns, cnndaily) and metric used for the extracted labels and reinforcement learning policies.

References

The main references followed for the proposed project are:
