- 03/22/2025: The Wiki-NFQA dataset has been released on Hugging Face Datasets.
- 03/20/2025: The paper and code for Typed-RAG are available. You can access the paper on arXiv and HF-paper.
- 03/13/2025: The Typed-RAG paper was accepted at NAACL 2025 SRW!
Typed-RAG enhances retrieval-augmented generation for non-factoid question-answering (NFQA) through type-aware multi-aspect query decomposition, delivering more contextually relevant and comprehensive responses.
The Wiki-NFQA Dataset is a curated benchmark designed for evaluating open-domain question answering (ODQA) systems with non-factoid questions. The dataset is available on Hugging Face.
from datasets import load_dataset
# Load the combined dataset with all examples
wiki_nfqa_dataset = load_dataset("oneonlee/Wiki-NFQA", "Wiki-NFQA", split="test")
# Load reference answers for evaluation
reference_answers = load_dataset("oneonlee/Wiki-NFQA", "reference_answer_list", split="test")
conda create -n nfqa python=3.9
conda activate nfqa
pip install -r requirements.txt
sh scripts/download_raw_data.sh
sh scripts/process_data.sh
sh scripts/classify_data.sh
sh scripts/sample_data.sh
python retriever/process_wiki.py
sh scripts/elasticsearch_setup.sh
sh scripts/construct_reference_list.sh
# The Elasticsearch server must be started manually before running the retrieval-based scripts
nohup ./retriever/elasticsearch-7.9.1/bin/elasticsearch > elasticsearch.log &
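Because the server is launched in the background, it may not be accepting requests yet when the next script starts. The following is a minimal readiness check, not part of the repository: the function name `wait_for_elasticsearch` is hypothetical, and the URL assumes Elasticsearch is listening on its default HTTP port 9200 on the local machine.

```python
import http.client
import time
import urllib.error
import urllib.request


def wait_for_elasticsearch(url="http://localhost:9200", timeout=60.0, interval=2.0):
    """Poll `url` until it answers with HTTP 200 or `timeout` seconds elapse.

    Hypothetical helper: the default URL assumes Elasticsearch's default
    HTTP port (9200); adjust it if your setup uses another host or port.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            with urllib.request.urlopen(url, timeout=interval) as response:
                if response.status == 200:
                    return True
        except (urllib.error.URLError, http.client.HTTPException, OSError):
            pass  # Server is not accepting connections yet; retry below.
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

Calling `wait_for_elasticsearch()` before launching the baseline or Typed-RAG scripts returns `True` once the server responds and `False` if it does not come up within the timeout.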
sh scripts/run_baseline_hf.sh
sh scripts/run_baseline_gpt.sh
sh scripts/run_Typed-RAG_hf.sh
sh scripts/run_Typed-RAG_gpt.sh
sh scripts/LINKAGE_mistral.sh
sh scripts/LINKAGE_gpt.sh
python evaluation/evaluate_MRR_MPR.py
python evaluation/extract_results.py
python evaluation/concat_prediction_files.py
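To make the two reported metrics concrete, here is a small sketch of how MRR and MPR can be computed from listwise ranks. It is an illustration under assumptions rather than the code in `evaluation/evaluate_MRR_MPR.py`: the function names are hypothetical, ranks are assumed to be 1-based positions of the generated answer within its reference answer list, and the MPR formula (a rank-1 answer scores 100) is one common convention that the repository's script may define differently.

```python
def mean_reciprocal_rank(ranks):
    """Average of 1/rank over all questions (ranks are 1-based)."""
    if not ranks:
        raise ValueError("ranks must not be empty")
    return sum(1.0 / rank for rank in ranks) / len(ranks)


def mean_percentile_rank(ranks, list_sizes):
    """Average percentile position, in percent.

    Assumed convention: a rank-1 answer in a list of size n scores 100,
    and each lower position loses 100/n points.
    """
    if not ranks or len(ranks) != len(list_sizes):
        raise ValueError("ranks and list_sizes must be non-empty and equal in length")
    total = sum(1.0 - (rank - 1) / size for rank, size in zip(ranks, list_sizes))
    return 100.0 * total / len(ranks)


# Three questions, each ranked against a reference list of four answers.
example_ranks = [1, 2, 4]
example_sizes = [4, 4, 4]
mrr = mean_reciprocal_rank(example_ranks)
mpr = mean_percentile_rank(example_ranks, example_sizes)
```

Higher is better for both metrics. MRR rewards top positions heavily, while MPR degrades linearly with rank.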
@misc{lee2025typedrag,
title={Typed-RAG: Type-aware Multi-Aspect Decomposition for Non-Factoid Question Answering},
author={DongGeon Lee and Ahjeong Park and Hyeri Lee and Hyeonseo Nam and Yunho Maeng},
year={2025},
eprint={2503.15879},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2503.15879},
}
This project is licensed under the CC BY-SA 4.0 license.