Information retrieval on SQuAD

IR + QA system on Wikipedia articles

This repository holds the code for the Information Retrieval part of our project for the NLP2020 class by Paolo Torroni @unibo.

We tackle the IR problem with a classic tf-idf approach and with a contrastive Bi-Encoder model based on ELECTRA.
An in-depth description of the work can be found here: QA-IR-report.pdf.
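
To make the tf-idf baseline concrete, here is a minimal, dependency-free sketch of tf-idf retrieval over a handful of paragraphs. It is not the code in this repository: the function names (`tokenize`, `build_index`, `to_vector`, `retrieve`), the smoothed idf formula, the whitespace tokenizer, and the sample paragraphs are all assumptions made for illustration.

```python
import math
from collections import Counter

PUNCT = ".,;:!?()\"'"


def tokenize(text):
    # Hypothetical tokenizer: lowercase, split on whitespace, strip punctuation.
    tokens = (tok.strip(PUNCT) for tok in text.lower().split())
    return [tok for tok in tokens if tok]


def to_vector(tokens, idf):
    # Sparse, L2-normalized tf-idf vector; terms unseen at indexing time are ignored.
    tf = Counter(tok for tok in tokens if tok in idf)
    vec = {term: count * idf[term] for term, count in tf.items()}
    norm = math.sqrt(sum(weight * weight for weight in vec.values()))
    return {term: weight / norm for term, weight in vec.items()} if norm else {}


def build_index(docs):
    # Smoothed idf (an assumed variant): log((1 + N) / (1 + df)) + 1.
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(docs)
    df = Counter(term for tokens in tokenized for term in set(tokens))
    idf = {term: math.log((1 + n_docs) / (1 + count)) + 1.0 for term, count in df.items()}
    vectors = [to_vector(tokens, idf) for tokens in tokenized]
    return idf, vectors


def retrieve(query, idf, vectors, top_k=1):
    # Rank documents by cosine similarity (dot product of normalized vectors).
    q_vec = to_vector(tokenize(query), idf)
    scores = [sum(w * vec.get(term, 0.0) for term, w in q_vec.items()) for vec in vectors]
    ranked = sorted(range(len(vectors)), key=lambda i: scores[i], reverse=True)
    return ranked[:top_k]


# Made-up sample paragraphs, standing in for Wikipedia passages.
paragraphs = [
    "The Normans were the people who gave their name to Normandy, a region in France.",
    "Photosynthesis converts light energy into chemical energy in plants.",
    "The Amazon rainforest covers much of the Amazon basin in South America.",
]

idf, vectors = build_index(paragraphs)
best = retrieve("Which region in France is named after the Normans?", idf, vectors)
```

Here `best` holds the indices of the top-ranked paragraphs for the question. The Bi-Encoder follows the same retrieve-by-similarity pattern, except that the sparse tf-idf vectors are replaced by dense embeddings produced by the neural encoders.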


Experiment plots

https://wandb.ai/veri/IR/reports/IR--Vmlldzo1Mzk3MDc

Branches

  • main: merged from the tfidf branch
  • tfidf: tf-idf performance on SQuAD v1.1
  • electra: neural model performance on SQuAD v1.1
  • deploy: contains the end-to-end notebook that performs the IR + QA task, along with other files used to deploy the system; here the QA model is trained on SQuAD v2