Building RAG based applications #54

Open
manisnesan opened this issue Sep 17, 2023 · 35 comments
@manisnesan
Owner

https://www.anyscale.com/blog/a-comprehensive-guide-for-building-rag-based-llm-applications-part-1

Summary

Excited to share our production guide for building RAG-based LLM applications where we bridge the gap between OSS and closed-source LLMs.

  • 💻 Develop a retrieval augmented generation (RAG) based LLM application from scratch.
  • 🚀 Scale the major workloads (load, chunk, embed, index, serve, etc.) across multiple workers.
  • ✅ Evaluate different configurations of our application to optimize for both per-component (e.g., retrieval_score) and overall performance (quality_score).
  • 🔀 Implement LLM hybrid routing approach to bridge the gap between OSS and closed LLMs.
  • 📦 Serve the application in a highly scalable and available manner.
  • 💥 Share the 1st order and 2nd order impacts LLM applications have had on our products.
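
One plausible reading of a per-component metric like retrieval_score is a top-k hit rate: the fraction of evaluation queries whose expected source shows up among the retrieved results. The sketch below assumes that definition (the guide may define it differently); `fake_retrieve`, `_FAKE_INDEX`, and the sample queries are hypothetical stand-ins for a real retriever and eval set.

```python
from typing import Callable, Sequence


def retrieval_score(
    examples: Sequence[tuple[str, str]],
    retrieve: Callable[[str, int], list[str]],
    k: int = 3,
) -> float:
    """Fraction of queries whose expected source appears in the top-k results."""
    if not examples:
        return 0.0
    hits = sum(1 for query, expected in examples if expected in retrieve(query, k))
    return hits / len(examples)


# Hypothetical retriever: a fixed lookup table standing in for a vector index.
_FAKE_INDEX = {
    "how do I scale workers?": ["docs/scaling", "docs/serve"],
    "how do I serve a model?": ["docs/train", "docs/tune"],
}


def fake_retrieve(query: str, k: int) -> list[str]:
    return _FAKE_INDEX.get(query, [])[:k]


examples = [
    ("how do I scale workers?", "docs/scaling"),  # expected source is retrieved
    ("how do I serve a model?", "docs/serve"),    # expected source is missed
]
score = retrieval_score(examples, fake_retrieve, k=2)
print(score)
```

Running the same function over several configurations (chunk size, embedding model, k) gives a like-for-like comparison of the retrieval step on its own, before any generation quality is measured.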
@manisnesan
Owner Author

Update to the above:

Added some new components (fine-tuning embeddings, lexical search, reranking, etc.) to our production guide for building RAG-based LLM applications. The combination of these yielded significant retrieval and quality score boosts (evals included).

https://www.anyscale.com/blog/a-comprehensive-guide-for-building-rag-based-llm-applications-part-1

@manisnesan manisnesan self-assigned this Oct 26, 2023
@manisnesan
Owner Author

6 tips on doing RAG better, from a thread by nirantk

@manisnesan
Owner Author

Courses

RAG course by Activeloop

@manisnesan
Owner Author

Gives a nice overview by building RAG from scratch with OSS bi-encoder and cross-encoder models, hnswlib for the search index, and Llama for generation, then shows how to do the same with LangChain.

https://github.com/pacman100/DHS-LLM-Workshop/tree/main/6_Module
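
The two-stage shape described above (a cheap bi-encoder pass to shortlist candidates, then a cross-encoder pass to rerank them) can be sketched without any model downloads. Everything here is a toy assumption and not the workshop's code: `embed` is a bag-of-words stand-in for a bi-encoder, `pair_score` is a bigram-overlap stand-in for a cross-encoder, and a plain sort replaces the hnswlib index.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for a bi-encoder: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def bigrams(text: str) -> set:
    tokens = text.lower().split()
    return set(zip(tokens, tokens[1:]))


def pair_score(query: str, doc: str) -> int:
    """Toy stand-in for a cross-encoder: looks at query and doc together
    and counts the word bigrams they share."""
    return len(bigrams(query) & bigrams(doc))


def retrieve_then_rerank(query, docs, first_stage_k=2, final_k=1):
    # Stage 1: vector similarity over the whole corpus
    # (a real system would query a prebuilt ANN index here).
    q_vec = embed(query)
    doc_vecs = [embed(d) for d in docs]
    order = sorted(range(len(docs)), key=lambda i: cosine(q_vec, doc_vecs[i]), reverse=True)
    shortlist = order[:first_stage_k]
    # Stage 2: pairwise scoring on the shortlist only.
    reranked = sorted(shortlist, key=lambda i: pair_score(query, docs[i]), reverse=True)
    return [docs[i] for i in reranked[:final_k]]


docs = [
    "index a search to build",
    "how to build a search index with hnswlib",
    "llama generates the final answer",
]
top = retrieve_then_rerank("build a search index", docs)
print(top)
```

In this toy corpus the first stage prefers the short scrambled document because word order is invisible to it, and the second stage promotes the document that actually matches the phrasing, which is the usual motivation for adding a reranker.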

@manisnesan
Owner Author

Full Stack Retrieval

@manisnesan
Owner Author

Interesting tip about literature review

On the literature side: the 2020 Lewis paper is a good starting point. Read the papers that reference it in decreasing order of citations.

@manisnesan
Owner Author

sophiamyang tutorial - On improving RAG

@manisnesan
Owner Author

https://uptrain.ai/blog/a-comprehensive-guide-to-context-retrieval-in-llms

Covers RAG pipeline standards, advanced retrieval techniques such as hybrid search and query rewriting, and how to evaluate the quality of the retrieved context.
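
Hybrid search needs some way to merge a keyword ranking with an embedding ranking. One common option is reciprocal rank fusion (RRF), sketched below; whether the linked article uses RRF or a weighted score is not stated here, so treat the choice as an assumption. The document IDs and the two input rankings are made up for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Merge ranked lists of doc IDs; each list adds 1 / (k + rank) to a doc's score."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


keyword_hits = ["doc_b", "doc_a", "doc_c"]  # hypothetical lexical (e.g. BM25) ranking
vector_hits = ["doc_a", "doc_d", "doc_b"]   # hypothetical embedding ranking
fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(fused)
```

RRF only looks at ranks, so it sidesteps the problem that lexical scores and cosine similarities live on different scales.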

@manisnesan
Owner Author

https://outerbounds.com/blog/retrieval-augmented-generation/

Summary

  • Retrieval Augmented Generation (RAG) combines prompt engineering with a custom dataset.
  • RAG workflows do not require training of the model, unlike fine-tuning workflows.
  • The goal of RAG is to search for and merge relevant context from the dataset into the prompt fed to the model at generation time.
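
The "merge relevant context into the prompt" step from the last bullet is mostly string assembly. A minimal sketch, assuming retrieval has already returned chunks ordered by relevance; the prompt wording, `build_prompt`, and the sample chunks are illustrative choices, not taken from the linked post.

```python
def build_prompt(question: str, chunks: list[str], max_chunks: int = 3) -> str:
    """Merge the top retrieved chunks into the prompt sent to the model."""
    numbered = [f"[{i}] {chunk}" for i, chunk in enumerate(chunks[:max_chunks], start=1)]
    context = "\n\n".join(numbered)
    return (
        "Answer the question using only the context below. "
        "If the context is not sufficient, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


# Hypothetical retrieval output, most relevant chunk first.
retrieved = [
    "The warranty lasts two years.",
    "Claims require the original receipt.",
    "Repairs are handled by the service desk.",
    "Unrelated chunk that should be cut off.",
]
prompt = build_prompt("How long is the warranty?", retrieved)
print(prompt)
```

No model weights change at any point: swapping the dataset only changes what ends up in the `Context:` section.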

@manisnesan
Owner Author

RAG for GitHub issues using LangChain and Zephyr - demonstrates how you can quickly build a RAG (Retrieval Augmented Generation) application for a project’s GitHub issues using the HuggingFaceH4/zephyr-7b-beta model and LangChain.

Follow-up
Advanced RAG on HF docs and LangChain

@manisnesan
Owner Author

RAG survey

@manisnesan
Owner Author

manisnesan commented Mar 10, 2024

Chat with your code - RAG application step by step
Lightning studio

This naive chunking approach does not work for 80% of real company guides, documentation, etc. Say you have 100 products, each with a different return policy. Each policy is long, so you need to chunk it, but the product name does not appear in the policy text. Ask about the return policy for one product and you will get hundreds of chunks without knowing which product each one belongs to. First structure, then embed.
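
"First structure, then embed" can be as simple as carrying the document's structure into every chunk, so that the text being embedded names the product it belongs to. A minimal sketch under that assumption; `chunk_with_context`, the field names, and the sample policy are hypothetical.

```python
def chunk_with_context(product: str, section: str, text: str, max_words: int = 40) -> list[dict]:
    """Split text into word-bounded chunks and prefix each with its product and section."""
    words = text.split()
    chunks = []
    for start in range(0, len(words), max_words):
        body = " ".join(words[start:start + max_words])
        chunks.append({
            # The header is part of the text that gets embedded.
            "text": f"Product: {product}\nSection: {section}\n{body}",
            # The same fields are kept as metadata for exact-match filtering.
            "metadata": {"product": product, "section": section},
        })
    return chunks


policy = " ".join(["Items may be returned within 30 days."] * 20)  # 140 words
chunks = chunk_with_context("Acme Kettle", "Return policy", policy, max_words=40)
print(len(chunks), chunks[0]["metadata"])
```

With the product name in both the embedded text and the metadata, a query about one product's return policy can be filtered or matched to that product's chunks instead of returning near-identical chunks from all 100 policies.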

@manisnesan
Owner Author

manisnesan commented Mar 20, 2024

  • A new long-form (5.5 hour) RAG tutorial is available, building a pipeline from scratch to create "NutriChat" using a 1200 page nutrition PDF. It uses free open-source HF models, no APIs needed. (6k views)
  • LangChain released a 10th video in their RAG From Scratch series, focusing on query routing using logical reasoning with an LLM or semantic similarity. (35k views)
  • LangChain announced an integration with NVIDIA NIM for GPU-optimized LLM inference in RAG applications. (15k views)
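
Query routing by semantic similarity, mentioned in the second bullet, boils down to comparing the query against a description of each data source and picking the closest one. The sketch below is not LangChain code: word-overlap (Jaccard) similarity stands in for embedding similarity, and the route names, descriptions, and threshold are invented for illustration.

```python
def jaccard(a: str, b: str) -> float:
    """Word-overlap similarity, used here as a stand-in for embedding similarity."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


# Hypothetical routes: name -> short description of what the source covers.
ROUTES = {
    "python_docs": "python code functions classes errors",
    "billing_faq": "invoice payment refund billing subscription",
}


def route(query: str, routes: dict = ROUTES, default: str = "fallback", threshold: float = 0.1) -> str:
    """Return the best-matching route, or the default if nothing is similar enough."""
    best = max(routes, key=lambda name: jaccard(query, routes[name]))
    return best if jaccard(query, routes[best]) >= threshold else default


print(route("refund for my subscription payment"))
print(route("python errors in my code"))
print(route("weather tomorrow"))
```

The threshold matters as much as the similarity function: without it, every query is forced into some route even when none of them fits.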

@manisnesan
Owner Author

manisnesan commented Mar 20, 2024

Advanced RAG with Gemma, Weaviate and LlamaIndex

✨ Custom LLM class to use Gemma
✨ weaviate_io with hybrid search & auto-retriever
✨ Reranking
✨ Few-shot prompting

@manisnesan
Owner Author

Is RAG really dead

@manisnesan
Owner Author

What is RAG: Understanding Retrieval Augmented Generation

  • Indexing
  • Query Vectorization
  • Hybrid Search
  • Generator

Next

@manisnesan
Owner Author

Financial tiny rag dataset

Dataset details:
• 100 questions on Airbnb 2023 10-K
• synthetically generated via opus

Dataset
Source: X post from virat

@manisnesan
Owner Author

Mastering RAG series

Pratik Bhavsar authored

  • Architecture
  • Challenges
  • Case studies
  • Failure points of RAG system

@manisnesan
Owner Author

Mastering RAG series

Pratik Bhavsar authored

https://www.rungalileo.io/blog/mastering-rag-llm-prompting-techniques-for-reducing-hallucinations

Prompting techniques for RAG

@manisnesan
Owner Author

LLM Hallucination Index

https://www.rungalileo.io/hallucinationindex

@manisnesan
Owner Author

RAG from scratch - LangChain YouTube series

manisnesan/til#92

@manisnesan
Owner Author

manisnesan commented Apr 2, 2024

Check the RAG recipes in this cookbook: https://huggingface.co/learn/cookbook/index

@manisnesan
Owner Author

manisnesan commented Apr 4, 2024

https://github.com/mistralai/cookbook/tree/main/third_party/LlamaIndex

LlamaIndex + MistralAI Cookbook Series 🧑‍🍳❤️

Here’s a definitive set of cookbooks to build simple-to-advanced RAG, agentic RAG, and agents in general with MistralAI.

It takes you through a tour of our RAG abstractions (including routing and query decomposition), along with our FunctionCallingAgent and ReActAgent.
1️⃣ RAG setup
2️⃣ Routing
3️⃣ Sub-question query decomposition
4️⃣ Agents + Tool Use with native function calling support
5️⃣ Adaptive RAG
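
Sub-question query decomposition (item 3) follows a simple control flow: split the question, answer each part against the retriever, then combine. The sketch below shows only that control flow with plain callables. It is not the LlamaIndex or MistralAI API, and every `fake_*` function and the `FAKE_KB` data are hypothetical stand-ins for LLM and retriever calls.

```python
from typing import Callable


def answer_with_decomposition(
    question: str,
    decompose: Callable[[str], list[str]],
    answer_one: Callable[[str], str],
    synthesize: Callable[[str, list[tuple[str, str]]], str],
) -> str:
    """Split a question into sub-questions, answer each, then combine the answers."""
    sub_questions = decompose(question) or [question]  # fall back to the original question
    partial_answers = [(sq, answer_one(sq)) for sq in sub_questions]
    return synthesize(question, partial_answers)


# Hypothetical stand-ins: a real system would call an LLM or a retriever in each of these.
def fake_decompose(question: str) -> list[str]:
    return ["What was revenue in 2022?", "What was revenue in 2023?"]


FAKE_KB = {
    "What was revenue in 2022?": "100",
    "What was revenue in 2023?": "120",
}


def fake_answer(sub_question: str) -> str:
    return FAKE_KB.get(sub_question, "unknown")


def fake_synthesize(question: str, partials: list[tuple[str, str]]) -> str:
    return "; ".join(f"{sq} -> {ans}" for sq, ans in partials)


result = answer_with_decomposition(
    "How did revenue change from 2022 to 2023?", fake_decompose, fake_answer, fake_synthesize
)
print(result)
```

Keeping the three steps as separate callables makes it easy to swap a single step, for example replacing `answer_one` with a routed retriever, without touching the rest.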

@manisnesan
Owner Author

Chat with your code: RAG with Weaviate and LlamaIndex

This studio guided you through the fundamental steps of building a naive RAG pipeline for a "Chat with your code" application. For this, we used a BGE embedding model via Hugging Face to generate the embeddings to store in a Weaviate vector database. We also used LlamaIndex as the orchestration framework to connect the retrieval component with a local LLM Mistral via Ollama. Finally, everything was wrapped up in a Streamlit app.

@manisnesan
Owner Author

RAG and what does it do for GenAI

Uses a variety of data sources to keep AI models fresh with up-to-date information and organizational knowledge.

@manisnesan
Owner Author

From a LlamaIndex post

RAG from prototype to production - 9 part series

RAG in a notebook is easy, RAG serving live production users is hard. This tutorial series by Marco Bertelli is the perfect step-by-step resource to outline all the architectural components you need to productionize a full RAG server:

  1. Introduction and setting up prototype RAG
  2. Incorporating conversation history, user feedback, advanced RAG
  3. Setting up the right document store for dynamic RAG
  4. Building and deploying backend to Heroku with CI/CD
  5. (Most recently) Building and deploying FE to AWS CloudFront with ACLs on S3

@manisnesan
Owner Author

manisnesan commented May 6, 2024

Improved RAG with Llama3 and Ollama

https://miro.medium.com/v2/resize:fit:786/format:webp/1*r5ukSwg5kBzIx9lqvt-EXg.png

Implement advanced RAG on fully local infrastructure, leveraging Meta's Llama-3, billed as the most advanced openly available large language model.

@manisnesan
Owner Author

Local function calling with Ollama: https://github.com/phidatahq/phidata/tree/main/cookbook/llms/ollama/tools
