Typed-RAG

📣 Latest News

03/22/2025: The Wiki-NFQA dataset has been released on Hugging Face Datasets.
03/20/2025: The paper and code for Typed-RAG are now available. The paper can be accessed on arXiv and HF-paper.
03/13/2025: The Typed-RAG paper has been accepted at NAACL 2025 SRW!

💡 Overview

Typed-RAG enhances retrieval-augmented generation for non-factoid question-answering (NFQA) through type-aware multi-aspect query decomposition, delivering more contextually relevant and comprehensive responses.
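As a rough illustration of what type-aware decomposition could look like, the sketch below routes a question to a type-specific set of sub-queries before retrieval. It is not the repository's implementation: the type labels, the templates, and the `decompose` helper are hypothetical placeholders chosen only for this example.

```python
from typing import Dict, List

# Hypothetical aspect templates keyed by question type (illustrative only;
# these are not the prompts or labels used in the Typed-RAG code base).
ASPECT_TEMPLATES: Dict[str, List[str]] = {
    "debate": [
        "Arguments supporting: {question}",
        "Arguments opposing: {question}",
    ],
    "reason": [
        "Direct causes relevant to: {question}",
        "Background context relevant to: {question}",
    ],
}


def decompose(question: str, question_type: str) -> List[str]:
    """Return one sub-query per aspect, or the question itself if the type has no templates."""
    templates = ASPECT_TEMPLATES.get(question_type)
    if not templates:
        # Unknown type: fall back to a single, undecomposed query.
        return [question]
    return [template.format(question=question) for template in templates]


if __name__ == "__main__":
    for sub_query in decompose("Should cities ban cars downtown?", "debate"):
        print(sub_query)
```

In a pipeline along these lines, each sub-query would presumably be sent to the retriever separately, and the retrieved passages combined before generation.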

📚 Wiki-NFQA Dataset

The Wiki-NFQA Dataset is a curated benchmark designed for evaluating open-domain question answering (ODQA) systems with non-factoid questions. The dataset is available on Hugging Face.

from datasets import load_dataset

# Load the combined dataset with all examples
wiki_nfqa_dataset = load_dataset("oneonlee/Wiki-NFQA", "Wiki-NFQA", split="test")

# Load reference answers for evaluation
reference_answers = load_dataset("oneonlee/Wiki-NFQA", "reference_answer_list", split="test")

🔧 Usage

Installation

conda create -n nfqa python=3.9
conda activate nfqa
pip install -r requirements.txt

Dataset Preparation & Preprocessing

sh scripts/download_raw_data.sh
sh scripts/process_data.sh
sh scripts/classify_data.sh
sh scripts/sample_data.sh

Elasticsearch Setup

python retriever/process_wiki.py
sh scripts/elasticsearch_setup.sh

Reference Answers Construction & Annotation

sh scripts/construct_reference_list.sh

Experiments

# Start the Elasticsearch server manually before running any experiment script
nohup ./retriever/elasticsearch-7.9.1/bin/elasticsearch > elasticsearch.log &

sh scripts/run_baseline_hf.sh
sh scripts/run_baseline_gpt.sh
sh scripts/run_Typed-RAG_hf.sh
sh scripts/run_Typed-RAG_gpt.sh
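Because the Elasticsearch server above is launched in the background, the experiment scripts may start before it is ready to accept queries. The following is a minimal sketch of a readiness check, assuming the server listens on Elasticsearch's default HTTP address `http://localhost:9200`. The helper names `fetch_health` and `wait_for_elasticsearch` are introduced here for illustration and are not part of the repository.

```python
import json
import time
import urllib.request


def fetch_health(base_url: str, timeout: float = 2.0) -> dict:
    """Query the cluster health endpoint and return the parsed JSON body."""
    with urllib.request.urlopen(f"{base_url}/_cluster/health", timeout=timeout) as resp:
        return json.load(resp)


def wait_for_elasticsearch(
    base_url: str = "http://localhost:9200",  # assumed default address
    max_wait: float = 60.0,
    interval: float = 2.0,
    fetch=fetch_health,
) -> bool:
    """Poll until the cluster reports yellow or green status, or max_wait seconds pass."""
    deadline = time.monotonic() + max_wait
    while True:
        try:
            if fetch(base_url).get("status") in ("yellow", "green"):
                return True
        except (OSError, ValueError):
            # Connection refused or invalid JSON: the server is not ready yet.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


if __name__ == "__main__":
    ready = wait_for_elasticsearch()
    print("Elasticsearch is ready" if ready else "Elasticsearch did not become ready in time")
```

Running this check between the `nohup` command and the first `run_*.sh` script would avoid connection errors during the server's startup window.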

Evaluation

sh scripts/LINKAGE_mistral.sh
sh scripts/LINKAGE_gpt.sh

python evaluation/evaluate_MRR_MPR.py
python evaluation/extract_results.py
python evaluation/concat_prediction_files.py
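For readers unfamiliar with the two metrics named in `evaluate_MRR_MPR.py`, the sketch below computes Mean Reciprocal Rank (MRR) and Mean Percentile Rank (MPR) from per-question ranks. It assumes that each candidate answer is ranked against a list of reference answers, so a rank runs from 1 (best) to the number of references plus one (worst). The function names and the exact MPR convention are assumptions made for this example, and the repository's script may differ in detail.

```python
from typing import Sequence, Tuple


def mean_reciprocal_rank(ranks: Sequence[int]) -> float:
    """Average of 1/rank over all questions (rank 1 is best)."""
    if not ranks:
        raise ValueError("ranks must not be empty")
    return sum(1.0 / rank for rank in ranks) / len(ranks)


def mean_percentile_rank(ranked: Sequence[Tuple[int, int]]) -> float:
    """Average percentile, in percent, over (rank, num_references) pairs.

    Assumed convention: rank 1 maps to 100 and rank num_references + 1 maps to 0.
    """
    if not ranked:
        raise ValueError("ranked must not be empty")
    total = sum(1.0 - (rank - 1) / num_refs for rank, num_refs in ranked)
    return 100.0 * total / len(ranked)


if __name__ == "__main__":
    # Three toy questions, each with four reference answers.
    toy = [(1, 4), (3, 4), (5, 4)]
    print("MRR:", mean_reciprocal_rank([rank for rank, _ in toy]))
    print("MPR:", mean_percentile_rank(toy))
```

Higher is better for both metrics: MRR rewards placing the candidate near the top of the list, while MPR normalizes the position by the list length so that questions with different numbers of references are comparable.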

📄 Citation

@misc{lee2025typedrag,
      title={Typed-RAG: Type-aware Multi-Aspect Decomposition for Non-Factoid Question Answering}, 
      author={DongGeon Lee and Ahjeong Park and Hyeri Lee and Hyeonseo Nam and Yunho Maeng},
      year={2025},
      eprint={2503.15879},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2503.15879}, 
}

📄 License

This project is licensed under the CC BY-SA 4.0 license.
