Haystack

Haystack is a framework for building custom RAG pipelines with LLMs.

What is RAG?

RAG, or Retrieval-Augmented Generation, integrates retrieval-based models, which search large datasets to extract relevant information, and generation-based models, which use this information to produce coherent and contextually accurate text.

This approach is especially valuable in scenarios requiring access to extensive external information, such as customer support, knowledge-based systems, and other applications where precise and accurate information retrieval is essential.

The RAG process consists of four stages:

1. Indexing: The data to be referenced is first transformed into embeddings, numerical representations in the form of large vectors. These embeddings are then stored in a vector database, enabling efficient document retrieval.
2. Retrieval: Given a user query, a document retriever selects the most relevant documents. This step typically uses a keyword-based ranking function such as BM25, a dense (embedding-based) retriever, or another retrieval technique to find documents or passages related to the query.
3. Augmentation: The retrieved information is fed into the LLM through prompt engineering, augmenting the user’s original query. This additional context helps the model generate more informed and accurate responses.
4. Generation: The LLM generates output based on both the user query and the retrieved documents. Models like GPT process this combined input to produce a coherent and contextually appropriate response.
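
The four stages can be sketched in plain Python. This is only a toy illustration, not Haystack code: the bag-of-words `embed` function stands in for a real embedding model, the in-memory `index` list stands in for a vector database, and `generate` is a placeholder for an actual LLM call. All function names and sample documents are made up for this example.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: bag-of-words term counts (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


# 1. Indexing: embed every document and keep the vectors in a simple in-memory index.
documents = [
    "Haystack is a framework for building RAG pipelines with LLMs.",
    "BM25 is a ranking function based on term frequencies.",
    "A vector database stores embeddings for similarity search.",
]
index = [(doc, embed(doc)) for doc in documents]


# 2. Retrieval: rank the indexed documents by similarity to the query.
def retrieve(query, top_k=1):
    query_vector = embed(query)
    ranked = sorted(index, key=lambda item: cosine(query_vector, item[1]), reverse=True)
    return [doc for doc, _ in ranked[:top_k]]


# 3. Augmentation: put the retrieved documents into the prompt as context.
def augment(query, context_docs):
    context = "\n".join(f"- {doc}" for doc in context_docs)
    return f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {query}\nAnswer:"


# 4. Generation: placeholder for the LLM call that would consume the prompt.
def generate(prompt):
    return f"[LLM response to a {len(prompt)}-character prompt]"


query = "What is Haystack?"
prompt = augment(query, retrieve(query))
answer = generate(prompt)
print(prompt)
```

In a real system, each of these functions would be replaced by a dedicated component, which is the role Haystack's building blocks play.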

Haystack Concepts

Pipelines: Directed multigraphs of Haystack components and integrations, used to design and scale your interactions with LLMs.
- The Pipeline class in Haystack lets you connect components, each performing a specific task, into a data processing pipeline. This modular approach makes it easy to build complex pipelines by chaining together different components.
Document Stores: Objects that store your documents in Haystack and act as an interface to a storage database.
Data Classes
Components: Building blocks of a pipeline that can be connected to each other.
- Generators: Generate text responses from a prompt using a specific LLM technology. There are two types: chat and non-chat.
- Retrievers: Go through the documents in a Document Store, select the ones that match the user query, and pass them on to the next component.

