Paper Snap is a state-of-the-art research paper analysis platform designed to democratize access to academic knowledge. By leveraging the power of Retrieval-Augmented Generation (RAG) and high-performance Large Language Models (LLMs), it transforms dense, complex academic PDFs into clear, actionable insights.
Whether you are a researcher, student, or curious learner, Paper Snap helps you understand papers faster by providing high-fidelity summaries and an interactive Q&A assistant that answers questions based strictly on the source text—eliminating hallucinations.
Reading research papers is often time-consuming and cognitively demanding. Paper Snap solves this by:
- Breaking Down Complexity: Automatically extracting and summarizing core sections like Abstract, Methodology, Results, and Conclusions.
- Ensuring Accuracy: Using a specialized RAG pipeline with Cohere Reranking to ensure that answers are grounded in the most legally relevant parts of the document.
- Speed: Utilizing Groq's LPU™ Inference Engine to deliver near-instant summaries and answers, even for long papers (using
llama-3.3-70b-versatile).
- Smart Ingestion: Uses
PyPDFLoaderand specialized parsing logic to segment papers into meaningful semantic chunks. - Hybrid Search: Combines Supabase's pgvector for semantic search with Cohere's Rerank v3.0 model to prioritize the most accurate context chunks before sending them to the LLM.
- Contextual Awareness: Filters knowledge bases by
file_id, ensuring the AI only calls upon the specific paper you are currently analyzing.
- Targeted Extraction: Specifically isolates critical sections (Abstract, Methodology, Results, Conclusion) to generate a summary that matters.
- Large Context Processing: Capable of processing vast amounts of text to generate a cohesive overview without "tunnel vision."
- Groq Inference: Powered by the
llama-3.3-70b-versatilemodel running on Groq hardware for blazing-fast response times. - Cohere Embeddings: Uses
embed-english-v3.0for industry-leading semantic understanding of academic text. - Supabase Vector Store: Scalable and secure storage for document embeddings.
- React.js: Component-based UI library.
- React Router: For seamless navigation.
- CSS3: Custom responsive styling.
- Supabase Client: For authentication and data interactions.
- Python & Flask: robust API server.
- LangChain: Framework for building LLM applications.
- Cohere: High-quality text embeddings (
embed-english-v3.0). - Groq: Ultra-fast LLM inference engine.
- LangExtract: For specialized unstructured data extraction.
- Supabase: PostgreSQL database with
pgvectorextension for vector storage and file management.
Follow these instructions to set up the project locally.
- Node.js & npm
- Python 3.8+
- Supabase Account
- API Keys for: Groq, Cohere, Supabase, and LangExtract.
git clone https://github.com/your-username/paper-snap.git
cd paper-snapNavigate to the backend directory and set up the environment.
cd rag-backend
# Optional: Create a virtual environment
python -m venv venv
# Windows: venv\Scripts\activate
# Mac/Linux: source venv/bin/activate
# Install dependencies (ensure you have a requirements.txt, otherwise install manually)
pip install flask flask-cors langchain langchain-groq langchain-cohere langchain-community supabase python-dotenv langextractConfigure Environment Variables:
Create a .env file in the rag-backend directory:
SUPABASE_URL=your_supabase_url
SUPABASE_SERVICE_ROLE_KEY=your_supabase_service_key
COHERE_API_KEY=your_cohere_api_key
GROQ_API_KEY=your_groq_api_key
LANGEXTRACT_API_KEY=your_langextract_api_keyRun the Backend:
python main.pyThe server typically runs on http://localhost:5000.
Open a new terminal and navigate to the frontend directory.
cd my-react-app
# Install dependencies
npm install
# Run the development server
npm startThe application should now be running on http://localhost:3000.
Dr-Venom29 |
SriharshaG05 |
vislavathmahesh |