This repository contains the ensemble used by the UPB team to achieve 2nd place in FinCausal 2020 Task 1. The ensemble is composed of five Transformer-based models: BERT-Large, RoBERTa-Large, ALBERT-Base, FinBERT-Base, and SciBERT-Base. The models were fine-tuned using the HuggingFace Transformers library.
The ensemble can be downloaded from here.
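Each model was fine-tuned as a binary (causal vs. non-causal) sentence classifier. The snippet below is a minimal sketch of how one such model could be fine-tuned with the HuggingFace Trainer API; the checkpoint name, hyperparameters, and toy dataset are illustrative assumptions rather than the exact training setup from the paper.

```python
# Illustrative sketch only: fine-tune one backbone for causal-sentence
# classification with the HuggingFace Trainer API.
from datasets import Dataset
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)

# Toy stand-in for the FinCausal Task 1 training data (text + 0/1 label).
train_data = Dataset.from_dict({
    "text": ["Profits rose because sales increased.", "The meeting is on Monday."],
    "label": [1, 0],
})

model_name = "bert-large-uncased"  # assumed checkpoint; one of the five backbones
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=128)

train_data = train_data.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="bert-large-fincausal",   # assumed output directory
    num_train_epochs=3,
    per_device_train_batch_size=16,
    learning_rate=2e-5,
)

Trainer(model=model, args=args, train_dataset=train_data).train()
```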
Make sure you have Python 3 and PyTorch installed. Then, install the dependencies via pip:

```
pip install -r requirements.txt
```
To make a prediction, run the `predict.py` script, giving it an input file with one sentence per line and the path to the ensemble of models:

```
python3 predict.py [input_path] [ensemble_path] [--output_path]
```
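For example, assuming the input sentences are in a file named `sentences.txt` and the downloaded ensemble is unpacked to a directory named `ensemble` (both names are hypothetical):

```
python3 predict.py sentences.txt ensemble --output_path predictions.txt
```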
The script outputs a file in which each line holds, for one input sentence, the prediction of each individual model and of the ensemble: 1 if the sentence is causal, 0 otherwise.
```
albert-base bert-large finbert-base roberta-large scibert-base ensemble
0 0 0 0 0 0
0 0 0 0 0 0
1 1 1 1 1 1
1 1 1 1 1 1
1 1 1 1 1 1
```
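The repository's `predict.py` handles this end to end; as a rough illustration only, the sketch below shows one way the five models could be loaded and combined. The per-model directory names mirror the header above, and a simple majority vote is assumed for the ensemble column (the exact combination strategy is described in the paper).

```python
# Illustrative sketch only (not the repository's predict.py): load the five
# fine-tuned classifiers and combine their sentence-level predictions.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Assumed sub-directory names inside the downloaded ensemble.
MODEL_DIRS = ["albert-base", "bert-large", "finbert-base", "roberta-large", "scibert-base"]

def predict(input_path, ensemble_path):
    with open(input_path, encoding="utf-8") as f:
        sentences = [line.strip() for line in f if line.strip()]

    per_model_preds = []
    for model_dir in MODEL_DIRS:
        path = f"{ensemble_path}/{model_dir}"
        tokenizer = AutoTokenizer.from_pretrained(path)
        model = AutoModelForSequenceClassification.from_pretrained(path)
        model.eval()

        preds = []
        with torch.no_grad():
            for sentence in sentences:
                inputs = tokenizer(sentence, return_tensors="pt", truncation=True)
                logits = model(**inputs).logits
                preds.append(int(logits.argmax(dim=-1)))  # 1 = causal, 0 = non-causal
        per_model_preds.append(preds)

    # Assumed combination strategy: majority vote over the five models.
    ensemble_preds = [int(sum(votes) >= 3) for votes in zip(*per_model_preds)]
    return per_model_preds, ensemble_preds
```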
We report the performance of each individual model and of the ensemble on both the validation dataset (the split is explained in the paper) and the evaluation dataset.
Model | Valid-Prec | Valid-Rec | Valid-F1 | Test-Prec | Test-Rec | Test-F1 |
---|---|---|---|---|---|---|
BERT-Large | 95.81 | 96.15 | 95.77 | 97.10 | 97.02 | 97.05 |
RoBERTa-Large | 95.69 | 95.29 | 95.46 | 97.35 | 97.30 | 97.32 |
ALBERT-Base | 95.71 | 95.88 | 95.78 | 96.75 | 96.76 | 96.75 |
FinBERT-Base | 93.88 | 94.71 | 93.92 | 94.08 | 94.30 | 94.18 |
SciBERT-Base | 95.67 | 95.99 | 95.75 | 96.77 | 96.83 | 96.89 |
Ensemble | - | - | - | 97.53 | 97.59 | 97.55 |
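The shared task evaluates submissions with weighted precision, recall, and F1. As a rough sketch (using hypothetical label lists, not the official scorer), these scores can be computed with scikit-learn:

```python
# Sketch: compute weighted precision/recall/F1 from gold labels and predictions.
from sklearn.metrics import precision_recall_fscore_support

gold = [0, 0, 1, 1, 1]       # hypothetical gold labels
predicted = [0, 1, 1, 1, 1]  # hypothetical ensemble predictions

precision, recall, f1, _ = precision_recall_fscore_support(
    gold, predicted, average="weighted"
)
print(f"Precision: {precision:.4f}  Recall: {recall:.4f}  F1: {f1:.4f}")
```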
If you use this repository, please consider citing the following paper:
```bibtex
@inproceedings{ionescu-etal-2020-upb,
title = "{UPB} at {F}in{C}ausal-2020, Tasks 1 {\&} 2: Causality Analysis in Financial Documents using Pretrained Language Models",
author = "Ionescu, Marius and
Avram, Andrei-Marius and
Dima, George-Andrei and
Cercel, Dumitru-Clementin and
Dascalu, Mihai",
booktitle = "Proceedings of the 1st Joint Workshop on Financial Narrative Processing and MultiLing Financial Summarisation",
month = dec,
year = "2020",
address = "Barcelona, Spain (Online)",
publisher = "COLING",
url = "https://www.aclweb.org/anthology/2020.fnp-1.8",
pages = "55--59",
abstract = "Financial causality detection is centered on identifying connections between different assets from financial news in order to improve trading strategies. FinCausal 2020 - Causality Identification in Financial Documents {--} is a competition targeting to boost results in financial causality by obtaining an explanation of how different individual events or chain of events interact and generate subsequent events in a financial environment. The competition is divided into two tasks: (a) a binary classification task for determining whether sentences are causal or not, and (b) a sequence labeling task aimed at identifying elements related to cause and effect. Various Transformer-based language models were fine-tuned for the first task and we obtained the second place in the competition with an F1-score of 97.55{\%} using an ensemble of five such language models. Subsequently, a BERT model was fine-tuned for the second task and a Conditional Random Field model was used on top of the generated language features; the system managed to identify the cause and effect relationships with an F1-score of 73.10{\%}. We open-sourced the code and made it available at: https://github.com/avramandrei/FinCausal2020.",
}
```