Building RAG based applications #54

manisnesan · 2023-09-17T20:04:18Z

https://www.anyscale.com/blog/a-comprehensive-guide-for-building-rag-based-llm-applications-part-1

Summary

Excited to share our production guide for building RAG-based LLM applications where we bridge the gap between OSS and closed-source LLMs.

💻 Develop a retrieval augmented generation (RAG) based LLM application from scratch.
🚀 Scale the major workloads (load, chunk, embed, index, serve, etc.) across multiple workers.
✅ Evaluate different configurations of our application to optimize for both per-component (ex. retrieval_score) and overall performance (quality_score).
🔀 Implement LLM hybrid routing approach to bridge the gap b/w OSS and closed LLMs.
📦 Serve the application in a highly scalable and available manner.
💥 Share the 1st order and 2nd order impacts LLM applications have had on our products.
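
As a rough illustration of the per-component evaluation mentioned above, here is a minimal sketch of a retrieval score, assuming it is defined as the fraction of evaluation questions whose reference source shows up in the top-k retrieved sources. The names `retrieval_score`, `toy_retrieve`, and `TOY_INDEX` are hypothetical and are not taken from the linked guide.

```python
from typing import Callable, Sequence


def retrieval_score(
    eval_set: Sequence[tuple[str, str]],
    retrieve: Callable[[str, int], list[str]],
    k: int = 5,
) -> float:
    """Fraction of questions whose reference source is among the top-k results."""
    if not eval_set:
        return 0.0
    hits = 0
    for question, reference_source in eval_set:
        if reference_source in retrieve(question, k):
            hits += 1
    return hits / len(eval_set)


# Hypothetical stand-in for a real retriever: a fixed lookup table.
TOY_INDEX = {
    "how do I scale workers": ["docs/scaling", "docs/serve"],
    "how do I chunk documents": ["docs/load", "docs/embed"],
}


def toy_retrieve(question: str, k: int) -> list[str]:
    return TOY_INDEX.get(question, [])[:k]


eval_set = [
    ("how do I scale workers", "docs/scaling"),   # reference source is retrieved
    ("how do I chunk documents", "docs/chunk"),   # reference source is missed
]
score = retrieval_score(eval_set, toy_retrieve, k=2)
print(score)
```

Scoring retrieval separately from answer quality makes it easier to tell whether a bad answer comes from the retriever or from the generator.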

manisnesan · 2023-10-26T00:47:06Z

Update to the above:

Added some new components (fine-tuning embeddings, lexical search, reranking, etc.) to our production guide for building RAG-based LLM applications. The combination of these yielded significant retrieval and quality score boosts (evals included).

https://www.anyscale.com/blog/a-comprehensive-guide-for-building-rag-based-llm-applications-part-1

manisnesan · 2023-11-05T20:57:47Z

6 tips on doing RAG better from nirantk Thread

manisnesan · 2023-12-19T04:51:33Z

Update to the above post https://x.com/gokumohandas/status/1736416631725940969?s=46&t=aOEVGBVv9ICQLUYL4fQHlQ

manisnesan · 2023-12-19T23:49:22Z

Courses

RAG course by Activeloop

manisnesan · 2023-12-31T16:20:58Z

Gives a nice overview by building RAG from scratch using OSS bi-encoder and cross-encoder models, with hnswlib for the search index and Llama for generation. It then shows how to do the same with LangChain.

https://github.com/pacman100/DHS-LLM-Workshop/tree/main/6_Module

manisnesan · 2023-12-31T16:21:17Z

Full Stack Retrieval

manisnesan · 2023-12-31T16:22:12Z

Interesting tip about literature review

On the literature side: the 2020 Lewis paper is a good starting point. Read the papers that reference it in decreasing order of citations.

manisnesan · 2023-12-31T16:24:18Z

sophiamyang tutorial - On improving RAG

manisnesan · 2023-12-31T16:24:51Z

https://github.com/microsoft/chat-copilot

manisnesan · 2024-01-02T16:35:57Z

https://uptrain.ai/blog/a-comprehensive-guide-to-context-retrieval-in-llms

Covers RAG pipeline standards, advanced retrieval techniques such as hybrid search and query rewriting, and how to evaluate the quality of the retrieved context.

manisnesan · 2024-01-23T16:15:28Z

https://ai.gopubby.com/transforming-data-orchestration-the-query-pipeline-and-flagembedding-rerank-with-llamaindex-dee5a2e9a797

manisnesan · 2024-01-28T22:55:20Z

https://outerbounds.com/blog/retrieval-augmented-generation/

Summary

Retrieval Augmented Generation (RAG) combines prompt engineering with a custom dataset.
RAG workflows do not require training of the model, unlike fine-tuning workflows.
The goal of RAG is to search for and merge relevant context from the dataset into the prompt fed to the model at generation time.
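
A minimal sketch of the "search, then merge context into the prompt" step described above. It assumes a toy bag-of-words cosine similarity in place of a real embedding model, and the helper names (`retrieve`, `build_prompt`), the prompt wording, and the sample documents are all illustrative rather than taken from the linked post.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Rank documents by similarity to the query and keep the top k."""
    q = Counter(tokenize(query))
    ranked = sorted(
        documents, key=lambda d: cosine(q, Counter(tokenize(d))), reverse=True
    )
    return ranked[:k]


def build_prompt(query: str, documents: list[str], k: int = 2) -> str:
    """Merge the retrieved context into the prompt; no model training involved."""
    context = "\n".join(f"- {doc}" for doc in retrieve(query, documents, k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )


DOCS = [
    "Refunds are issued within 14 days of a return request.",
    "Shipping takes 3 to 5 business days.",
    "Support is available by email on weekdays.",
]
prompt = build_prompt("How long do refunds take?", DOCS, k=1)
print(prompt)
```

The resulting string is what would be sent to the model at generation time; swapping the toy similarity for a real embedding model changes `retrieve` only.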

manisnesan · 2024-02-18T03:50:35Z

RAG for GitHub issues using LangChain and Zephyr - demonstrates how you can quickly build a RAG (Retrieval Augmented Generation) pipeline for a project’s GitHub issues using the HuggingFaceH4/zephyr-7b-beta model and LangChain.

Follow up
Advanced RAG on HF docs and langchain

manisnesan · 2024-02-20T22:38:42Z

RAG survey

manisnesan · 2024-02-26T03:38:41Z

Improving RAG Retrieval Performance with Sub-Document Summaries

manisnesan · 2024-02-26T03:47:19Z

A Comprehensive Guide to RAG Pain Points and Solutions

https://medium.com/towards-data-science/12-rag-pain-points-and-proposed-solutions-43709939a28c?sk=ad208df6b9f8ded8c8ef7257a1daaed7

manisnesan · 2024-03-10T10:06:48Z

Chat with your code - RAG application step by step
Lightning studio

This naive chunking approach does not work for 80% of real company guides, documentation, etc. Say you have 100 products, each with a different return policy. Each policy is long, so you need to chunk it, but the chunks do not contain the product name. Ask about the return policy for one product and you get hundreds of chunks back without knowing which is which. First structure, then embed.
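
One way to read "first structure, then embed" is to attach the product name and section to every chunk before embedding, so each chunk stays identifiable on its own. The sketch below assumes that interpretation; the `Chunk` class, `chunk_policy` helper, and sample policies are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    product: str
    section: str
    text: str

    def text_for_embedding(self) -> str:
        # Prefix the structural metadata so the embedded text names its product.
        return f"Product: {self.product}\nSection: {self.section}\n{self.text}"


def chunk_policy(product: str, section: str, body: str, max_words: int = 40) -> list[Chunk]:
    """Split a long policy into word-bounded chunks that keep their metadata."""
    words = body.split()
    return [
        Chunk(product, section, " ".join(words[i : i + max_words]))
        for i in range(0, len(words), max_words)
    ]


# Illustrative data: two products whose policies differ only in one detail.
policies = {
    "Widget A": "Returns accepted within 30 days. " * 10,
    "Widget B": "Returns accepted within 7 days. " * 10,
}
chunks = [
    chunk
    for product, body in policies.items()
    for chunk in chunk_policy(product, "Return policy", body)
]
# Metadata also allows filtering before or after the vector search.
widget_b = [c for c in chunks if c.product == "Widget B"]
print(widget_b[0].text_for_embedding())
```

Keeping the metadata as separate fields, rather than only inside the text, also lets a vector store filter by product when it supports metadata filters.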

manisnesan · 2024-03-10T10:08:21Z

Chat with your Video

https://docs.llamaindex.ai/en/stable/examples/retrievers/videodb_retriever.html

manisnesan · 2024-03-20T12:03:57Z

A new long-form (5.5 hour) RAG tutorial is available, building a pipeline from scratch to create "NutriChat" using a 1200 page nutrition PDF. It uses free open-source HF models, no APIs needed. (6k views)
LangChain released a 10th video in their RAG From Scratch series, focusing on query routing using logical reasoning with an LLM or semantic similarity. (35k views)
LangChain announced an integration with NVIDIA NIM for GPU-optimized LLM inference in RAG applications. (15k views)
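
For the semantic-similarity flavour of query routing mentioned above, a minimal sketch might look like the following. It assumes a toy bag-of-words embedding instead of a real model and does not use LangChain; the route names, route descriptions, and threshold are made up for illustration.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    # Stand-in for a real embedding model: token counts.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


# Hypothetical routes, each described by a short text that gets embedded.
ROUTES = {
    "code": "python code function error bug stack trace",
    "billing": "invoice payment refund price subscription",
}


def route(query: str, routes: dict[str, str] = ROUTES,
          default: str = "general", threshold: float = 0.1) -> str:
    """Send the query to the most similar route, or to a default if none is close."""
    q = embed(query)
    best_name, best_score = default, 0.0
    for name, description in routes.items():
        score = cosine(q, embed(description))
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score >= threshold else default


print(route("Why does my python function raise an error?"))
print(route("How do I get a refund on my invoice?"))
print(route("What is the weather today?"))
```

The other approach named in the list, logical routing, would replace `route` with an LLM call that picks a route name from a fixed set.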

manisnesan · 2024-03-20T20:42:12Z

Advanced RAG with Gemma, Weaviate and LlamaIndex

✨ Custom LLM class to use Gemma
✨ weaviate_io with hybrid search & auto-retriever
✨ Reranking
✨ Few-shot prompting

manisnesan · 2024-03-20T20:52:56Z

https://github.com/run-llama/llama_index/blob/main/docs/examples/vector_stores/WeaviateIndex_auto_retriever.ipynb

manisnesan · 2024-03-21T04:45:21Z

Is RAG really dead

manisnesan · 2024-03-21T04:58:25Z

What is RAG: Understanding Retrieval Augmented Generation

Indexing

Query Vectorization

Hybrid Search

Generator

Creating your first RAG chatbot with Langchain, Groq, and OpenAI - Here
Evaluating RAG, a framework for assessment: https://superlinked.com/vectorhub/evaluating-retrieval-augmented-generation-a-framework-for-assessment
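
Hybrid search, listed among the stages above, combines a lexical ranking with a vector ranking. One common way to merge the two is reciprocal rank fusion (RRF); the sketch below assumes that method, which is not necessarily what the linked articles use, and the document ids are placeholders.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse ranked lists: each document scores sum(1 / (k + rank)) over the lists."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score, breaking ties by id for a deterministic order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Placeholder results from a keyword search and a vector search.
lexical = ["doc_a", "doc_b", "doc_c"]
vector = ["doc_c", "doc_a", "doc_d"]
fused = reciprocal_rank_fusion([lexical, vector])
print(fused)
```

RRF uses only ranks, so it avoids having to normalise lexical scores against cosine similarities, which live on different scales.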

manisnesan · 2024-03-22T14:39:37Z

Financial tiny rag dataset

Dataset details:
• 100 questions on Airbnb 2023 10-K
• synthetically generated via opus

Dataset
Source : X post from virat

manisnesan · 2024-03-24T15:01:27Z

Mastering RAG series

Pratik Bhavsar authored

https://www.rungalileo.io/blog/mastering-rag-how-to-architect-an-enterprise-rag-system

Architecture

Challenges

Case studies

Failure points of RAG system

manisnesan · 2024-03-24T15:04:41Z

Mastering RAG series

Pratik Bhavsar authored

https://www.rungalileo.io/blog/mastering-rag-llm-prompting-techniques-for-reducing-hallucinations

Prompting techniques for RAG

manisnesan · 2024-04-02T02:35:08Z

LLM Hallucination Index

https://www.rungalileo.io/hallucinationindex

manisnesan · 2024-04-02T02:36:01Z

RAG from scratch - langchain YouTube series

manisnesan/til#92

manisnesan · 2024-04-02T02:49:33Z

Check rag recipes in this cookbook https://huggingface.co/learn/cookbook/index

manisnesan · 2024-04-04T21:56:54Z

https://github.com/mistralai/cookbook/tree/main/third_party/LlamaIndex

LlamaIndex + MistralAI Cookbook Series 🧑‍🍳❤️

Here’s a definitive set of cookbooks to build simple-to-advanced RAG, agentic RAG, and agents in general with MistralAI.

It takes you through a tour of our RAG abstractions (including routing and query decomposition), along with our FunctionCallingAgent and ReActAgent.
1️⃣ RAG setup
2️⃣ Routing
3️⃣ Sub-question query decomposition
4️⃣ Agents + Tool Use with native function calling support
5️⃣ Adaptive RAG

manisnesan · 2024-04-12T08:37:14Z

Chat with your code: RAG with Weaviate and LlamaIndex

This studio guided you through the fundamental steps of building a naive RAG pipeline for a "Chat with your code" application. For this, we used a BGE embedding model via Hugging Face to generate the embeddings to store in a Weaviate vector database. We also used LlamaIndex as the orchestration framework to connect the retrieval component with a local LLM Mistral via Ollama. Finally, everything was wrapped up in a Streamlit app.

manisnesan · 2024-04-15T11:14:35Z

RAG and what does it do for GenAI

Uses a variety of data sources to keep AI models fresh with up-to-date information and organizational knowledge.

manisnesan · 2024-04-28T00:45:16Z

From Llamaindex post

RAG from prototype to production - 9 part series

RAG in a notebook is easy, RAG serving live production users is hard. This tutorial series by Marco Bertelli is the perfect step-by-step resource to outline all the architectural components you need to productionize a full RAG server:

Introduction and setting up prototype RAG
Incorporating conversation history, user feedback, advanced RAG
Setting up the right document store for dynamic RAG
Building and deploying backend to Heroku with CI/CD
(Most recently) Building and deploying FE to AWS Cloudfront with ACLs on S3

manisnesan · 2024-05-06T01:08:33Z

Improved RAG with Llama3 and Ollama

https://miro.medium.com/v2/resize:fit:786/format:webp/1*r5ukSwg5kBzIx9lqvt-EXg.png

Implement an advanced RAG pipeline on fully local infrastructure, using Llama-3 from Meta, the most advanced openly available large language model.

manisnesan · 2024-05-06T01:11:38Z

Local function calling with ollama. https://github.com/phidatahq/phidata/tree/main/cookbook/llms/ollama/tools

manisnesan self-assigned this Oct 26, 2023
