🎯 This project is a Retrieval-Augmented Generation (RAG) application designed to help researchers, students, and AI enthusiasts efficiently extract insights from complex AI research papers. By leveraging semantic search and generative AI, users can upload research papers, ask questions, and receive precise, context-aware answers.

udhaya2823/Retrieval-Augmented-Generation-based-AI-Research-paper-Assistant


📚 AI Research Paper Assistant - RAG-based Q&A System

🚀 Project Overview

The AI Research Paper Assistant is a Retrieval-Augmented Generation (RAG) application that enables users to interact with AI research papers. Users can upload a research paper, ask domain-specific questions, and receive context-aware answers generated using LLMs.

πŸ” Key Features:

  • Upload research papers in PDF format.
  • Semantic search for retrieving relevant sections.
  • Summarization of key paper sections.
  • Interactive Q&A based on user queries.
  • Citation assistance for proper referencing.

🎯 Problem Statement

Understanding complex AI research papers can be challenging. This project aims to simplify knowledge extraction by enabling users to ask questions and receive relevant, summarized insights.

πŸ—οΈ Tech Stack

  • Text Extraction: PyPDF2, PDFMiner, Tesseract (OCR)
  • Vectorization: Sentence Transformers (all-MiniLM-L6-v2)
  • Vector Database: Pinecone, FAISS, or Weaviate
  • LLM Integration: OpenAI GPT-4, Flan-T5
  • Frontend: Streamlit, Flask, or FastAPI
  • Deployment: Streamlit Cloud, Hugging Face Spaces, AWS

🔄 Workflow

  1. Upload & Process Paper 📄

    • Extract text while preserving section hierarchy.
    • Handle multi-column layouts and citations.
  2. Preprocessing & Embedding Creation 🔢

    • Chunk text (200-500 words per segment).
    • Convert text into embeddings using Sentence Transformers.
    • Store embeddings in a vector database.
  3. Query Processing & Response Generation 🎯

    • User submits a query.
    • Perform semantic search to fetch relevant chunks.
    • Use an LLM (GPT-4 or Flan-T5) to generate answers based on retrieved text.
  4. Frontend Interaction 🌐

    • Users upload papers, enter queries, and view AI-generated responses.
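The chunk → embed → retrieve loop in steps 2–3 can be sketched in a few lines. This is a toy illustration, not the repository's code: a bag-of-words vector stands in for the Sentence Transformers embedding so the sketch runs with the standard library alone, and in the real pipeline the vectors would be stored in FAISS, Pinecone, or Weaviate rather than recomputed per query.

```python
# Toy sketch of the chunk -> embed -> retrieve pipeline described above.
# Real embeddings would come from all-MiniLM-L6-v2; a term-frequency
# vector stands in here so the example needs only the standard library.
import math
import re
from collections import Counter

def chunk_text(text, max_words=300):
    """Split text into word-bounded chunks (200-500 words is typical)."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]

def embed(text):
    """Stand-in embedding: a bag-of-words term-frequency vector."""
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, chunks, top_k=2):
    """Return the top_k chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:top_k]

paper = ("The abstract introduces a transformer architecture. "
         "The methods section details attention layers. "
         "The conclusion summarizes the main contribution.")
chunks = chunk_text(paper, max_words=8)
print(retrieve("What is the main contribution?", chunks, top_k=1))
```

Swapping `embed` for a real `SentenceTransformer.encode` call and the sorted list for a FAISS index changes nothing about the shape of this loop.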

🔬 Example Use Case

Query: "What is the main contribution of the paper?"
Processing: Retrieve key sections (Abstract, Conclusion) & generate response.
Output: "The main contribution of this paper is the introduction of a novel transformer-based architecture that improves efficiency by 25% while maintaining state-of-the-art performance."
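Before the LLM call, the retrieved chunks are typically stitched into a single prompt. A minimal sketch of that step (the template wording and the `build_prompt` name are illustrative, not taken from the repository):

```python
# Illustrative only: assemble retrieved chunks into an LLM prompt.
def build_prompt(query, chunks):
    """Number each retrieved excerpt and prepend it to the user's question."""
    context = "\n\n".join(f"[{i + 1}] {c}" for i, c in enumerate(chunks))
    return (
        "Answer the question using only the excerpts below.\n\n"
        f"Excerpts:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )

prompt = build_prompt(
    "What is the main contribution of the paper?",
    ["Abstract: We introduce ...", "Conclusion: Our main contribution is ..."],
)
print(prompt)
```

The resulting string would be passed to GPT-4 or Flan-T5; numbering the excerpts also makes it easy for the model to cite which section an answer came from.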

πŸ› οΈ How to Run Locally

# Clone the repository
git clone https://github.com/your-repo/ai-research-assistant.git
cd ai-research-assistant

# Install dependencies
pip install -r requirements.txt

# Run the application
streamlit run app.py
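The contents of requirements.txt are not shown in this README; a plausible minimal dependency list for the stack above (package names are assumptions, pin versions as needed) might look like:

```text
streamlit              # frontend
PyPDF2                 # PDF text extraction
sentence-transformers  # all-MiniLM-L6-v2 embeddings
faiss-cpu              # local vector index
openai                 # GPT-4 API access
```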

🚀 Future Enhancements

  • Support for multi-paper retrieval 📑
  • Integration with Google Scholar API for external references 🌍
  • Fine-tuned LLMs for improved accuracy 🤖

📜 License

This project is licensed under the MIT License.

🤝 Contributing

Contributions are welcome! Feel free to open issues or submit pull requests.


🌟 Star this repo if you find it useful!
