A Streamlit-based RAG application that answers medical questions using LLM technology and trusted medical resources.

slothJain/MediPediaBot

MediPedia: AI-Powered Medical Knowledge Assistant

MediPedia is an AI-powered application that answers medical questions from trusted medical resources. It uses Retrieval-Augmented Generation (RAG): each question is matched against a database of medical knowledge, and the retrieved passages are used to ground the answer.

MediPedia Screenshot

Features

  • Medical Q&A: Ask a medical question and get an answer grounded in the indexed medical literature
  • Source Attribution: Responses include references to the source documents and page numbers
  • Chat Interface: User-friendly chat interface built with Streamlit
  • Secure Knowledge Base: Information is retrieved from trusted medical resources
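
As a rough illustration of the source attribution feature, the sketch below turns a list of retrieved chunks into a deduplicated reference list. The `format_sources` helper and the `source`/`page` metadata keys are assumptions made for this example (LangChain's PDF loaders commonly attach metadata of this shape); the actual code in medipedia.py may differ.

```python
from typing import Iterable


def format_sources(chunks: Iterable[dict]) -> str:
    """Build a deduplicated, ordered reference list from retrieved chunks.

    Each chunk is assumed to be a dict with a "metadata" entry holding
    "source" (file path) and "page" (0-based page index). These key names
    are an assumption for this sketch.
    """
    seen = set()
    lines = []
    for chunk in chunks:
        meta = chunk.get("metadata", {})
        source = meta.get("source", "unknown source")
        page = meta.get("page")
        key = (source, page)
        if key in seen:
            continue  # skip repeated references to the same page
        seen.add(key)
        # Show 1-based page numbers to readers; omit the page if it is missing.
        label = f"{source}, p. {page + 1}" if page is not None else source
        lines.append(f"- {label}")
    return "\n".join(lines)


# Hypothetical retrieval result: two hits on the same page, one on another.
retrieved = [
    {"text": "...", "metadata": {"source": "data/GaleEncyclopediaOfMedicine.pdf", "page": 41}},
    {"text": "...", "metadata": {"source": "data/GaleEncyclopediaOfMedicine.pdf", "page": 41}},
    {"text": "...", "metadata": {"source": "data/GaleEncyclopediaOfMedicine.pdf", "page": 102}},
]
print(format_sources(retrieved))
```

Deduplicating on the (source, page) pair keeps the reference list short when several retrieved chunks come from the same page.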

Technology Stack

  • Frontend: Streamlit for the interactive web interface
  • NLP & ML:
    • LangChain for orchestrating LLM workflows
    • Hugging Face for embeddings and the LLM (Mistral-7B)
    • FAISS for efficient vector search
  • Document Processing:
    • LangChain for document loading and text splitting
    • PDF processing for medical documents

Project Structure

medipedia/
├── data/                   # Medical PDF documents
│   └── GaleEncyclopediaOfMedicine.pdf
├── vector_store/           # Vector embeddings database
│   └── db_faiss/           # FAISS vector store
├── create_memory_for_llm.py   # Script to create embeddings from PDFs
├── connect_memory_with_llm.py # Script to test LLM with vector store
├── medipedia.py            # Main Streamlit application
├── requirements.txt        # Dependencies
└── README.md               # Project documentation

How It Works

  1. Data Ingestion: Medical PDFs are loaded and processed into chunks
  2. Embedding Generation: Text chunks are converted into vector embeddings
  3. Knowledge Storage: Embeddings are stored in a FAISS vector database
  4. User Query: User enters a medical question through the Streamlit interface
  5. Semantic Search: The system finds relevant information from the knowledge base
  6. Response Generation: The LLM generates a comprehensive answer based on retrieved context
  7. Source Attribution: References to source documents are provided for transparency
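
The steps above can be sketched end to end in a few lines of standard-library Python. This is a toy stand-in, not the project's implementation: a bag-of-words count vector replaces the Hugging Face embedding model, a sorted list replaces the FAISS index, and the final LLM call is left out (the assembled prompt is what would be sent to the model). All function names and the sample chunks are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: a bag-of-words count vector (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def build_index(chunks: list[dict]) -> list[tuple[dict, Counter]]:
    """Steps 2-3: embed every chunk and keep the vectors next to the chunks."""
    return [(chunk, embed(chunk["text"])) for chunk in chunks]


def search(index: list[tuple[dict, Counter]], query: str, k: int = 2) -> list[dict]:
    """Step 5: return the k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


def build_prompt(question: str, context_chunks: list[dict]) -> str:
    """Step 6 (input side): combine retrieved context and the question."""
    context = "\n\n".join(c["text"] for c in context_chunks)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )


# Step 1 is assumed done: these are made-up chunks with their page numbers.
chunks = [
    {"text": "Aspirin is used to reduce fever and relieve mild pain.", "page": 12},
    {"text": "Insulin regulates blood glucose levels in the body.", "page": 87},
    {"text": "A fracture is a break in a bone.", "page": 140},
]
index = build_index(chunks)
question = "What does insulin regulate?"  # step 4: the user's query
top = search(index, question, k=1)
prompt = build_prompt(question, top)
print(prompt)
print("Source pages:", [c["page"] for c in top])  # step 7: attribution
```

A real embedding model matches on meaning rather than shared words, which is why the application relies on learned embeddings and a vector index instead of word counts.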

Installation

  1. Clone the repository:
git clone https://github.com/yourusername/medipedia.git
cd medipedia
  2. Create a virtual environment and activate it:
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
  3. Install dependencies:
pip install -r requirements.txt
  4. Set up your Hugging Face API token by creating a .env file in the project root:
HF_TOKEN=your_huggingface_token_here
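
For reference, here is one way a script could read that token at startup. This is a dependency-free sketch under stated assumptions: the project may instead use a package such as python-dotenv, and the `load_env_file` and `require_hf_token` helpers are hypothetical names, not functions from this repository.

```python
import os
from pathlib import Path


def load_env_file(path: str = ".env") -> dict:
    """Parse simple KEY=VALUE lines and export them to os.environ.

    Blank lines, comments, and lines without "=" are ignored. Variables
    already set in the environment take precedence over the file.
    """
    loaded = {}
    env_path = Path(path)
    if not env_path.exists():
        return loaded
    for raw in env_path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key, value = key.strip(), value.strip().strip("'\"")
        os.environ.setdefault(key, value)
        loaded[key] = value
    return loaded


def require_hf_token() -> str:
    """Return HF_TOKEN, failing early if it is missing or still the placeholder."""
    token = os.environ.get("HF_TOKEN", "")
    if not token or token == "your_huggingface_token_here":
        raise RuntimeError("HF_TOKEN is missing; add it to .env or the environment.")
    return token


# Typical use at the top of a script:
#   load_env_file()
#   token = require_hf_token()
```

Failing early with a clear message avoids a less obvious authentication error later, when the model is first called.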

Usage

Creating the Knowledge Base

  1. Place your medical PDFs in the data/ directory.

  2. Generate the vector database:

python create_memory_for_llm.py

Running the Application

Start the Streamlit application:

streamlit run medipedia.py

The application will be available at http://localhost:8501

Future Improvements

  • Add more medical resources to expand the knowledge base
  • Implement multi-modal support for medical images
  • Add user authentication for personalized medical information
  • Implement search history and favorite responses
  • Optimize for mobile devices

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgements

  • The Gale Encyclopedia of Medicine for the medical knowledge base
  • Hugging Face for providing access to state-of-the-art NLP models
  • LangChain for the powerful RAG framework

About the Author

This project was created to demonstrate how to build AI-powered knowledge systems with retrieval-augmented generation. It showcases skills in natural language processing, vector embeddings, and building interactive AI applications.


Note: This application is for educational purposes only and should not be used as a substitute for professional medical advice, diagnosis, or treatment.
