This repo contains a list of papers about RAG, especially RAG with Knowledge Graphs.
Large language models (LLMs) have demonstrated impressive reasoning abilities in complex tasks. However, they lack up-to-date knowledge and experience hallucinations during reasoning, which can lead to incorrect reasoning processes and diminish their performance and trustworthiness.
Recently, Retrieval-Augmented Generation (RAG) has achieved remarkable success in addressing the challenges of LLMs without necessitating retraining. By referencing an external knowledge base, RAG refines LLM outputs, effectively mitigating issues such as “hallucination”, lack of domain-specific knowledge, and outdated information. However, in some practical scenarios, traditional RAG fails to capture significant structured relational knowledge, often passes retrieved content to the model as plain text snippets concatenated into the prompt, and fails to grasp global information comprehensively.
Combining RAG with Knowledge Graphs (KGs) emerges as a promising solution to address these challenges. KGs offer a structured and explicit representation of entities and relationships, which can be more accurate than retrieving information through vector similarity alone. Leveraging external structured knowledge graphs can improve the contextual understanding of LLMs and help them generate more informed responses. The entire process typically contains three stages: Indexing, Retrieval and Generation. The overall pipeline is as follows.
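To make the three stages concrete, here is a minimal, self-contained sketch. It is a toy illustration, not the method of any paper listed in this repo. All names (`TRIPLES`, `build_index`, `retrieve`, `build_prompt`) are hypothetical. The sketch assumes the corpus has already been reduced to (head, relation, tail) triples, uses simple substring matching in place of real entity linking, and stops at building the prompt instead of calling an LLM.

```python
from collections import defaultdict

# Toy knowledge, already extracted as (head, relation, tail) triples.
# Assumption: a real system would extract these from raw text.
TRIPLES = [
    ("Marie Curie", "born_in", "Warsaw"),
    ("Marie Curie", "won", "Nobel Prize in Physics"),
    ("Warsaw", "capital_of", "Poland"),
]


def build_index(triples):
    """Indexing: map each entity to every triple it takes part in."""
    index = defaultdict(list)
    for head, relation, tail in triples:
        index[head].append((head, relation, tail))
        index[tail].append((head, relation, tail))
    return index


def retrieve(index, question, hops=2):
    """Retrieval: seed with entities named in the question, then expand
    along graph edges for a fixed number of hops."""
    q = question.lower()
    # Naive entity linking: an entity is a seed if its name appears in the question.
    frontier = {entity for entity in index if entity.lower() in q}
    seen = set(frontier)
    facts = []
    for _ in range(hops):
        next_frontier = set()
        for entity in sorted(frontier):
            for triple in index[entity]:
                if triple not in facts:
                    facts.append(triple)
                for node in (triple[0], triple[2]):
                    if node not in seen:
                        seen.add(node)
                        next_frontier.add(node)
        frontier = next_frontier
    return facts


def build_prompt(question, facts):
    """Generation (input side only): serialize the retrieved triples
    into a prompt that would be sent to an LLM."""
    lines = [f"- {h} {r.replace('_', ' ')} {t}" for h, r, t in facts]
    return "Facts:\n" + "\n".join(lines) + f"\n\nQuestion: {question}\nAnswer:"


index = build_index(TRIPLES)
question = "In which country was Marie Curie born?"
facts = retrieve(index, question)
prompt = build_prompt(question, facts)
print(prompt)
```

The question needs two hops (Marie Curie to Warsaw, Warsaw to Poland). The second fact shares no surface terms with the question, so the graph expansion is what brings it into the prompt.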
We collect recent influential papers about RAG, especially RAG with KGs. The following papers are listed in reverse chronological order of publication.
Date | Venue | Title | Code | Notes |
---|---|---|---|---|
2025-02-20 | arXiv | (HippoRAG 2) From RAG to Memory: Non-Parametric Continual Learning for Large Language Models | Yes | Graphs for Knowledge Indexing & Graphs as Knowledge Carrier (KG construction from corpus) |
2025-02-08 | NAACL 2025 | (KG2RAG) Knowledge Graph-Guided Retrieval Augmented Generation | Yes | Graphs for Knowledge Indexing & Graphs as Knowledge Carrier (KG construction from corpus) |
2025-02-06 | The ACM Web Conference 2025 | MedRAG: Enhancing Retrieval-augmented Generation with Knowledge Graph-Elicited Reasoning for Healthcare Copilot | Yes | Graphs as Knowledge Carrier (KG construction from corpus) |
2024-12-17 | arXiv | SimGRAG: Leveraging Similar Subgraphs for Knowledge Graphs Driven Retrieval-Augmented Generation | Yes | Graphs as Knowledge Carrier (KG construction from corpus & with existing KGs) |
2024-10-28 | ICLR 2025 | Simple Is Effective: The Roles of Graphs and Large Language Models in Knowledge-Graph-Based Retrieval-Augmented Generation | Yes | Graphs as Knowledge Carrier (KG construction from corpus) |
2024-10-08 | arXiv | LightRAG: Simple and Fast Retrieval-Augmented Generation | Yes | Graphs as Knowledge Carrier (KG construction from corpus) |
2024-05-23 | NeurIPS 2024 | HippoRAG: Neurobiologically Inspired Long-Term Memory for Large Language Models | Yes | Graphs as Knowledge Carrier (KG construction from corpus) |
2024-04-24 | arXiv | From Local to Global: A Graph RAG Approach to Query-Focused Summarization | Yes | Graphs as Knowledge Carrier (KG construction from corpus) |
Date | Venue | Title | Repo |
---|---|---|---|
2024-12-31 | arXiv | Retrieval-Augmented Generation with Graphs (GraphRAG) | Yes |
2024-08-15 | arXiv | Graph Retrieval-Augmented Generation: A Survey | Yes |
2024-05-10 | KDD 2024 | A Survey on RAG Meeting LLMs: Towards Retrieval-Augmented Large Language Models | No |
Domain | Task | Dataset | Background Knowledge |
---|---|---|---|
General | Simple QA | NaturalQuestions | Wikipedia dump |
General | Simple QA | PopQA | Wikipedia dump |
General | Simple QA | SimpleQuestions | Freebase |
General | Simple QA | WebQ | Freebase |
General | Simple QA | WebQSP | Freebase |
General | Multi-hop QA | MuSiQue | Inside dataset |
General | Multi-hop QA | 2WikiMultihopQA | Inside dataset |
General | Multi-hop QA | HotpotQA | Wikipedia dump |
General | Multi-hop QA | CWQ | Freebase |
General | Multi-hop QA | MultiHop-RAG | - |
Movie | Multi-hop QA | MetaQA | Movie knowledge base (included in dataset) |
General | Complex QA | Mintaka | Wikidata |
General | Complex QA | GrailQA | Freebase |
18 domains | Complex QA | UltraDomain | - |
General | Complex QA | TriviaQA | - |
General | Large-scale Complex QA | LC-QuAD v2 | Wikidata or DBpedia |
General | Large-scale Complex QA | KQAPro | Wikidata |
General | Fact Verification | FACTKG | DBpedia |
Medical | Medical QA & Diagnostic support | DDXPlus | - |
General | QA & Discourse Understanding | NarrativeQA | Wikipedia dump |
Date | Venue | Title | Homepage | Domain |
---|---|---|---|---|
2023-08-23 | SIGIR 2024 | YAGO 4.5: A Large and Clean Knowledge Base with a Rich Taxonomy | Yes | General |
2023-02-09 | Bioinformatics | The scalable precision medicine open knowledge engine (SPOKE): a massive knowledge graph of biomedical information | Yes | Biomedical |
2018-11-22 | Nucleic Acids Research | STRING v11: protein–protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets | Yes | Protein-protein interaction prediction |
2018-05-12 | LREC workshop | Lynx: Building the Legal Knowledge Graph for Smart Compliance Services in Multilingual Europe | Yes | Legal |