Welcome to the RAG Pipeline with BeyondLLM project! This repository showcases a sophisticated Retrieval-Augmented Generation (RAG) pipeline built using the BeyondLLM framework. Whether you're exploring LLM (Large Language Model) applications or building intelligent systems capable of retrieving and generating human-like responses, this project offers a solid foundation.
This project demonstrates the creation of a RAG pipeline, primarily focused on processing YouTube videos (in English) to generate meaningful insights and responses. With features like real-time query processing, user feedback integration, and advanced retrievers, this project stands out as a powerful tool for developers and AI enthusiasts.
- YouTube Video Processing: Extract and process content from YouTube videos, converting them into manageable data chunks.
- Advanced Retriever: Utilize advanced retrievers like cross-rerank to enhance retrieval accuracy.
- Language Model Integration: Leverage powerful LLMs, including the "mistralai/Mistral-7B-Instruct-v0.2" model from Hugging Face.
- User Feedback Mechanism: Gather user feedback to refine and improve generated responses.
- RAG Triad Evaluation: Implement RAG Triad evaluation metrics to assess the quality of generated responses.
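To build intuition for what the RAG Triad measures, here is a toy illustration of the idea behind the groundedness score: the fraction of the answer's tokens that also appear in the retrieved context. Note that BeyondLLM's actual Triad metrics (context relevancy, answer relevancy, groundedness) are LLM-judged, not lexical; the function below is only a conceptual sketch and is not part of this repository.

```python
def lexical_groundedness(answer: str, context: str) -> float:
    """Toy groundedness proxy: share of answer tokens found in the context.

    The real RAG Triad uses an LLM to judge support; this lexical overlap
    is only meant to illustrate the concept.
    """
    answer_tokens = set(answer.lower().split())
    context_tokens = set(context.lower().split())
    if not answer_tokens:
        return 0.0
    return len(answer_tokens & context_tokens) / len(answer_tokens)
```

A fully supported answer scores 1.0, while an answer containing claims absent from the context scores lower.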
Check out the demo video to see the RAG pipeline in action! (Replace this link with an actual demo video link if available.)
Follow these simple steps to get the project up and running:
- Python 3.10+
- pip (Python package installer)
- Streamlit: For deploying the interactive application
git clone https://github.com/yourusername/rag-pipeline-beyondllm.git
cd rag-pipeline-beyondllm
pip install -r requirements.txt
Add your Hugging Face and Google API keys to the config.py file:
HF_TOKEN = "your-huggingface-token"
GOOGLE_API_KEY = "your-google-api-key"
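Rather than hard-coding secrets, one option (a sketch, not the repository's actual config.py) is to read the keys from environment variables and fall back to placeholders:

```python
# config.py sketch: prefer environment variables over hard-coded secrets.
# The placeholder fallbacks mirror the values shown above.
import os

HF_TOKEN = os.environ.get("HF_TOKEN", "your-huggingface-token")
GOOGLE_API_KEY = os.environ.get("GOOGLE_API_KEY", "your-google-api-key")
```

This keeps real tokens out of version control while leaving the file importable as-is.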
streamlit run app.py
Here's a brief overview of the key files in this project:
- app.py: The main Streamlit application script. Handles data processing, querying, and user interactions.
- config.py: Configuration file where you set your API keys.
- requirements.txt: List of dependencies required to run the project.
- utils.py: Contains utility functions used throughout the project.
Enter the URL of an English YouTube video into the provided text box. Click Process Video to analyze and convert the video content into data chunks.
Enter your query in the text box (e.g., "Which tool is mentioned in the video?"). Click Get Answer to receive a response generated by the language model.
The response will be displayed along with RAG Triad evaluation metrics. Provide feedback on the response quality to help improve future answers.
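One simple way to persist that feedback for later analysis is to append each rating as a JSON line. The sketch below is hypothetical (the function name and record fields are assumptions, not this repository's implementation):

```python
# Hypothetical feedback logger: append one JSON record per rating so the
# log can be streamed line-by-line later without loading it all at once.
import json
from pathlib import Path

def log_feedback(path: Path, query: str, answer: str, rating: int) -> None:
    record = {"query": query, "answer": answer, "rating": rating}
    with path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
```

The JSON Lines format makes it easy to aggregate ratings over time, e.g. to spot queries that consistently score poorly.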
- Processing a YouTube video
- Querying the model for specific information
This project is licensed under the MIT License. See the LICENSE file for details.
Contributions are welcome! Please feel free to submit a Pull Request or open an issue.
For any inquiries or feedback, please reach out to okan.rescue@gmail.com.