We provide the conflictQA based on different large language models, which utilizes large language models guided parametric memory.
The data is available at conflictQA foloder. This folder contains the data for both POPQA and STRATEGYQA
{"question": "What is George Rankin's occupation?", "popularity": 142, "ground_truth": ["politician", "political leader", "political figure", "polit.", "pol"], "memory_answer": "George Rankin's occupation is a professional photographer.", "parametric_memory": "As a professional photographer, George Rankin...", "counter_answer": "George Rankin's occupation is political figure.", "counter_memory": "George Rankin has been actively involved in politics for over a decade...", "parametric_memory_aligned_evidence": "George Rankin has a website showcasing his photography portfolio...", "counter_memory_aligned_evidence": "George Rankin Major General George James Rankin..."}
- "question": The question in natural language
- "popularity": The monthly page views on Wikipedia for the given question
- "ground_truth": The factual answer to the question, which may include multiple possible answers
- "memory_answer": The answer provided by the LLM to the question
- "parametric_memory": The supportive evidence from LLM's parametric memory for the answer
- "counter_answer": The answer contradicting the "memory_answer"
- "counter_memory": The generation-based evidence supporting the counter_answer
- "parametric_memory_aligned_evidence": Additional evidence supporting the "memory_answer", which could be generated or derived from Wikipedia/human annotation
- "counter_memory_aligned_evidence": Additional evidence supporting the "counter_answer", either generated or sourced from Wikipedia/human annotation
We also release our dataset at: Huggingface datasets: https://huggingface.co/datasets/osunlp/ConflictQA (more details can be found on the dataset page)
#loading dataset
from datasets import load_dataset
# you can choose dataset "ConflictQA-popQA-[PLACEHOLDER]", and the [PLACEHOLDER] is in ["chatgpt","gpt4","palm2","llama2-7b","llama2-70b","qwen7b","vicuna7b","vicuna33b"].
dataset = load_dataset("osunlp/ConflictQA",'ConflictQA-popQA-chatgpt')
Code is available in code foloder.
If our paper or related resources prove valuable to your research, we kindly ask for citation. Please feel free to contact us with any inquiries.
@inproceedings{
xie2024knowledgeconflict,
title={Adaptive Chameleon or Stubborn Sloth: Revealing the Behavior of Large Language Models in Knowledge Conflicts},
author={Jian Xie and Kai Zhang and Jiangjie Chen and Renze Lou and Yu Su},
booktitle={The Twelfth International Conference on Learning Representations},
year={2024},
url={https://openreview.net/forum?id=auKAUJZMO6}
}