The Research Agent API helps you find and understand information from research documents. It's designed for researchers and professionals, combining general knowledge with document-specific details to give clear and helpful answers.
- Smart Question Routing: Automatically directs queries to the most suitable processing mechanism: conversational memory for context-aware responses, vector-based retrieval for document-specific answers, or general LLM-based handling for broader questions.
- Hybrid Retrieval System: Combines keyword search (BM25) with semantic search using FAISS and OpenAI Embeddings for comprehensive and accurate document discovery (see the sketch after this list).
- Enhanced Query Rewriting: Improves search precision by rephrasing user queries before retrieval.
- Layered Answer Generation: Produces high-quality responses through a multi-stage process that includes self-assessment and quality checks.
- Flexible Retrieval Pipeline: Supports real-time indexing of newly uploaded documents, seamlessly updating the retrieval process.
- REST API Accessibility: Provides easy integration through endpoints for document uploads, query handling, and conversational memory management.
- Advanced Monitoring: Incorporates Langfuse for in-depth logging, debugging, and performance monitoring, giving detailed insight into system operations.
- Failsafe Mechanism: Keeps the service running by automatically switching to a backup API when the primary API is down.
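To make the hybrid retrieval feature concrete, here is a minimal sketch using LangChain's `EnsembleRetriever` to blend BM25 keyword results with FAISS semantic results. The class names are real LangChain APIs, but the sample documents and the 50/50 weights are illustrative assumptions, not the project's actual configuration.

```python
from langchain.retrievers import EnsembleRetriever
from langchain_community.retrievers import BM25Retriever
from langchain_community.vectorstores import FAISS
from langchain_core.documents import Document
from langchain_openai import OpenAIEmbeddings

docs = [
    Document(page_content="FAISS enables fast similarity search over embeddings."),
    Document(page_content="BM25 ranks documents by keyword overlap with the query."),
]

# Keyword retriever (BM25) and semantic retriever (FAISS + OpenAI embeddings)
bm25 = BM25Retriever.from_documents(docs)
faiss_store = FAISS.from_documents(docs, OpenAIEmbeddings())
semantic = faiss_store.as_retriever(search_kwargs={"k": 2})

# Blend both ranked lists; the weights control each retriever's contribution
hybrid = EnsembleRetriever(retrievers=[bm25, semantic], weights=[0.5, 0.5])
results = hybrid.invoke("How does keyword ranking work?")
```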
- Language Models:
  - Primary: Llama 3 70B Instruct via the TogetherAI API.
  - Backup: GPT-4o (via the OpenAI API) for guaranteed response fallback in case of primary LLM issues (see the fallback sketch after this stack listing).
- Vector Database: FAISS (Facebook AI Similarity Search) for indexing and searching semantic embeddings.
- Retrieval Techniques:
  - Keyword Search: BM25 (Best Matching 25) for keyword-based document retrieval.
  - Semantic Search: OpenAI Embeddings for vector-based semantic search.
  - Ensemble Search: Combines BM25 and semantic search results for enhanced retrieval performance.
- Orchestration: FastAPI for API management and efficient handling of web requests.
- Tracing: Langfuse for observability of application flow and debugging insights.
- Document Handling: PyPDF for processing PDF documents.
- Runtime: Python 3.12
- Containerization: Docker (via `docker-compose`) for reproducible deployments.
- Key Libraries:
  - `langchain`: Framework for developing applications with language models.
  - `langgraph`: Library for building robust conversational flows using graph architectures.
  - `faiss-cpu`: Library for creating and searching vector indices efficiently.
  - `fastapi`: Modern, high-performance web framework for building APIs.
  - `python-multipart`: For handling multipart/form-data file uploads.
  - `langfuse`: Library for tracing LLM applications.
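As a rough illustration of the primary/backup arrangement, the sketch below uses LangChain's `with_fallbacks` chaining. `ChatTogether` and `ChatOpenAI` are real LangChain integrations, but the exact model identifier and parameters here are assumptions, not the project's verified configuration.

```python
from langchain_openai import ChatOpenAI
from langchain_together import ChatTogether

# Primary model served by TogetherAI (the model id here is an assumption)
primary = ChatTogether(model="meta-llama/Llama-3-70b-chat-hf", temperature=0)
# Backup model served by OpenAI
backup = ChatOpenAI(model="gpt-4o", temperature=0)

# If the TogetherAI call raises, the same prompt is retried against OpenAI
llm = primary.with_fallbacks([backup])
reply = llm.invoke("Summarize the role of a fallback LLM in one sentence.")
print(reply.content)
```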
- Endpoint: `POST /api/docs/upload`
- Description: Uploads multiple PDF documents to be added to the system's knowledge base.
- Request Body: Accepts `multipart/form-data` with files under the key `files`.
- Response: JSON object detailing each file's upload status (successful, failed, pages processed), along with the total number of files processed.
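For example, a client could upload two documents like this (a minimal sketch assuming the API runs locally on port 8000; the file names are placeholders):

```python
import requests

# Each tuple uses the field name "files", matching the endpoint's expected key
files = [
    ("files", ("paper1.pdf", open("paper1.pdf", "rb"), "application/pdf")),
    ("files", ("paper2.pdf", open("paper2.pdf", "rb"), "application/pdf")),
]
resp = requests.post("http://localhost:8000/api/docs/upload", files=files)
print(resp.json())
```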
- Endpoint: `POST /api/query`
- Description: Handles user queries against loaded documents or general knowledge. Supports conversation continuity via `thread_id`.
- Request Body:
{
  "question": "Your question here",
  "thread_id": "Optional thread ID for conversation continuity",
  "config": {
    // Optional custom config options, such as temperature
  }
}
- Response: A JSON object containing the answer, a list of document references, and the thread ID.
{
"answer": "The answer to your question",
"references": [
{
"source": "paper1.pdf",
"relevance_score": 0.95,
"snippet": "A relevant excerpt from the paper"
},
...
],
"thread_id": "UUID of the conversation thread"
}
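A minimal client-side sketch (assuming a local server on port 8000; the questions are placeholders) showing how the returned `thread_id` carries a conversation across calls:

```python
import requests

payload = {"question": "What method does paper1.pdf propose?"}
data = requests.post("http://localhost:8000/api/query", json=payload).json()
print(data["answer"])

# Reuse the returned thread_id so the follow-up shares conversation context
follow_up = {"question": "How was it evaluated?", "thread_id": data["thread_id"]}
print(requests.post("http://localhost:8000/api/query", json=follow_up).json()["answer"])
```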
- Endpoint: `POST /api/thread/reset`
- Description: Clears the memory for a specific conversation thread.
- Request Params: Accepts `thread_id` as a query parameter.
- Response: Confirmation message and thread ID of the reset thread.
{
"status": "success",
"message": "Thread reset successfully",
"thread_id": "your-thread-id"
}
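Because `thread_id` is passed as a query parameter rather than in the body, a call might look like this sketch (assuming a local server):

```python
import requests

resp = requests.post(
    "http://localhost:8000/api/thread/reset",
    params={"thread_id": "your-thread-id"},  # sent as ?thread_id=...
)
print(resp.json()["message"])
```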
- Endpoint: `POST /api/vectordb/reset`
- Description: Clears the FAISS vector database, removing all document indexes and resetting the retrieval pipeline.
- Response: Confirmation message of the database reset, along with document status.
{
"status": "success",
"message": "Vector database reset successfully",
"document_count": 0,
"has_documents": false
}
- Endpoint: `GET /api/status`
- Description: Gets the status of documents currently loaded in the vector database.
- Response: JSON object containing a boolean indicating whether documents are available and the document count.
{
"has_documents": true,
"document_count": 4
}
- Endpoint: `GET /api/thread/status/{thread_id}`
- Description: Returns the status and history of a specific thread, including all messages exchanged.
- Response: JSON object with the message count, a boolean indicating whether history is available, and the list of messages exchanged.
{
"thread_id": "your-thread-id",
"message_count": 3,
"has_history": true,
"messages": [
{
"type": "human",
"content": "your question ?"
},
{
"type": "ai",
"content": "Your response to the question"
},
{
"type": "human",
"content": "second question ?"
}
]
}
├── docker-compose.yml
├── postman_test_cases.json
├── requirements.txt
├── tests/
│   ├── conftest.py
│   ├── unit/
│   │   ├── test_graph_service.py
│   │   ├── test_llm_service.py
│   │   ├── test_memory_service.py
│   │   └── test_retrieval_service.py
│   └── fixtures/
├── docs/
│   └── papers/          <--- PDF documents are stored here
├── scripts/
│   └── export_codebase.py
└── src/
    ├── main.py
    ├── routers/
    │   ├── docs_router.py
    │   └── query_router.py
    ├── utils/
    │   ├── config.py
    │   ├── logging.py
    │   └── pdf_utils.py
    ├── models/
    │   ├── request_models.py
    │   └── response_models.py
    └── services/
        ├── graph_service.py
        ├── llm_service.py
        ├── memory_service.py
        ├── retrieval_service.py
        └── tracing_service.py
src/main.py
The `src/main.py` file serves as the entry point for the FastAPI application. It initializes the app, setting up essential components such as CORS and routing. On startup, the application preloads PDF documents from the `docs` directory to make them immediately accessible. Additionally, it exposes a root path `/` that can be used for system health checks.

src/routers/docs_router.py
This module handles document upload and status retrieval functionality. It includes the `POST /api/docs/upload` endpoint, which allows users to upload PDF documents. Uploaded PDFs are processed and text is extracted using utilities from `src/utils/pdf_utils.py`. The router also provides the `GET /api/status` endpoint for checking the status of loaded documents.

src/routers/query_router.py
The `query_router` defines endpoints for query processing, conversation thread management, and resetting the vector database. The `POST /api/query` endpoint dynamically routes user queries to the memory service, vector store, or general LLM handler based on the characteristics of the question. It uses `src/services/memory_service.py` to maintain and manage conversation context. Additionally, it includes the `POST /api/thread/reset` endpoint for clearing conversation history and the `POST /api/vectordb/reset` endpoint for clearing all indexed documents from the vector store.

src/utils/config.py
This module manages configuration settings for the application by loading environment variables from `.env` files. It defines key settings for OpenAI and Langfuse integrations, as well as paths for default directories used throughout the application.
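A minimal sketch of what such `.env`-driven settings could look like, assuming `python-dotenv`; the variable names and default directory are illustrative, not the module's actual contents:

```python
import os
from dotenv import load_dotenv

# Load key/value pairs from a .env file into the process environment
load_dotenv()

OPENAI_API_KEY = os.getenv("OPENAI_API_KEY", "")
TOGETHER_API_KEY = os.getenv("TOGETHER_API_KEY", "")
LANGFUSE_PUBLIC_KEY = os.getenv("LANGFUSE_PUBLIC_KEY", "")
LANGFUSE_SECRET_KEY = os.getenv("LANGFUSE_SECRET_KEY", "")
DOCS_DIR = os.getenv("DOCS_DIR", "docs/papers")  # assumed default directory
```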

src/utils/logging.py
The `logging` module configures the `loguru` library for logging. It sets up both console and file-based logging to ensure that system activity is appropriately recorded for debugging and monitoring purposes.

src/utils/pdf_utils.py
The `pdf_utils` module provides utility functions for loading and processing PDF files. It includes functions like `load_pdfs_from_directory` to load multiple PDFs from a specified directory and `load_pdf` to load a single PDF file. Additionally, it contains asynchronous functions for processing uploaded PDF files.
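To illustrate the kind of extraction these utilities perform, here is a minimal sketch using `pypdf`; the helper names are placeholders and the module's actual signatures may differ:

```python
from pathlib import Path
from pypdf import PdfReader

def extract_text(path: str) -> str:
    """Concatenate the text of every page in a PDF."""
    reader = PdfReader(path)
    return "\n".join(page.extract_text() or "" for page in reader.pages)

def load_pdfs(directory: str) -> dict[str, str]:
    """Map each PDF file name in a directory to its extracted text."""
    return {p.name: extract_text(str(p)) for p in Path(directory).glob("*.pdf")}
```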

src/models/request_models.py
This module defines Pydantic models used for validating request payloads. It includes the `QueryRequest` model for queries, which incorporates fields for the question, user ID, and thread ID. Additionally, the `UploadRequest` model is used for specifying the file type during document uploads.
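Based on the `/api/query` payload documented above and the fields listed here, `QueryRequest` plausibly looks something like this sketch (the optional defaults are assumptions):

```python
from pydantic import BaseModel

class QueryRequest(BaseModel):
    question: str
    user_id: str | None = None     # per the field list above
    thread_id: str | None = None   # omit to start a new conversation
    config: dict | None = None     # optional overrides, e.g. temperature
```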

src/models/response_models.py
Response models for API interactions are defined in this module. It includes the `QueryResponse` model for returning answers to queries, as well as `UploadResponse` and `DocumentReference` models for document-related operations. An `ErrorResponse` model is also included to standardize error handling.

src/services/graph_service.py
The `graph_service` module implements the core logic for processing research questions. It manages the overall flow, including question rewriting, document grading, and answer generation. It leverages `src/services/llm_service.py` for LLM interactions and `src/services/retrieval_service.py` for document retrieval, ensuring a seamless question-answering pipeline.
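To give a feel for the graph architecture, here is a schematic `langgraph` sketch of a rewrite, retrieve, generate pipeline. The node functions are illustrative stubs, not the project's actual implementation.

```python
from typing import TypedDict

from langgraph.graph import END, START, StateGraph

class QAState(TypedDict):
    question: str
    documents: list[str]
    answer: str

def rewrite(state: QAState) -> dict:
    # Stand-in for LLM-based query rewriting
    return {"question": state["question"].strip().rstrip("?") + "?"}

def retrieve(state: QAState) -> dict:
    # Stand-in for the hybrid BM25 + FAISS retriever
    return {"documents": ["stub document matching: " + state["question"]]}

def generate(state: QAState) -> dict:
    # Stand-in for the LLM answer-generation and self-assessment step
    return {"answer": f"Answer based on {len(state['documents'])} document(s)."}

graph = StateGraph(QAState)
graph.add_node("rewrite", rewrite)
graph.add_node("retrieve", retrieve)
graph.add_node("generate", generate)
graph.add_edge(START, "rewrite")
graph.add_edge("rewrite", "retrieve")
graph.add_edge("retrieve", "generate")
graph.add_edge("generate", END)

app = graph.compile()
print(app.invoke({"question": "what is hybrid retrieval"}))
```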

src/services/llm_service.py
This module provides methods for interacting with Large Language Models (LLMs). It manages OpenAI and TogetherAI integrations with fallback logic to guarantee responses using `gpt-4o-mini` if `Llama-3.3-70B` models fail. It includes system prompts to define LLM behavior and incorporates error handling and tracing using `langfuse`.

src/services/memory_service.py
The `memory_service` module is responsible for managing conversation history. It allows messages to be added to a thread, retrieves conversation messages when needed, and provides functionality to clear conversation threads entirely.

src/services/retrieval_service.py
This module manages the document retrieval pipeline. It sets up and indexes documents using both BM25 keyword-based retrieval and FAISS for semantic search. The `rebuild` method allows the indexes to be updated with new documents. The module also includes functionality for extracting relevant document snippets to improve the relevance of retrieved content.

src/services/tracing_service.py
The `tracing_service` module initializes the Langfuse client for observability. It provides methods to trace interactions and log events during API calls. Context-managed traces are implemented, allowing for detailed monitoring of system interactions, including quality assessments and scoring.
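A minimal sketch of decorator-based tracing with Langfuse's `@observe` (the service's actual context-managed traces are more involved, and the decorator's import path varies across SDK versions):

```python
from langfuse.decorators import observe

@observe()  # records inputs, outputs, and timing of the call as a trace
def answer_question(question: str) -> str:
    # Stand-in for routing the question through the graph pipeline
    return "stub answer"

print(answer_question("What is hybrid retrieval?"))
```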

Clone the repository and navigate to the project directory:
git clone https://github.com/RThaweewat/research-agent-api.git
cd research-agent-api
Copy the environment variables file and add your API keys:
cp .env.example .env
# Add your OpenAI, Together, and Langfuse API keys in the .env file
Install the required Python packages:
pip install -r requirements.txt
- Install Docker: If not already installed, follow the instructions for your OS in the Docker Installation Guide.
- Install Docker Compose: Usually included with Docker Desktop. For standalone installation, refer to the Docker Compose Installation Guide.
Run the application using Uvicorn:
uvicorn src.main:app --host 0.0.0.0 --port 8000 --reload
- Navigate to the project root where `docker-compose.yml` is located.
- Build and run the Docker containers in detached mode:
  docker-compose up -d --build
- To stop the Docker instance:
  docker-compose down

Open your browser and navigate to http://localhost:8000/docs to view the interactive API documentation.
Run unit tests using Pytest:
pytest tests/
These tests focus on the core logic of each module in the `src/services` directory.
- `test_graph_service.py`: Tests the end-to-end flow and error handling within the graph architecture.
- `test_llm_service.py`: Tests that the LLM is initialized correctly and handles prompts.
- `test_memory_service.py`: Tests the storage and retrieval functionality of the conversational memory system.
- `test_retrieval_service.py`: Tests that the document retrieval pipeline works correctly.
These tests are crucial for ensuring the reliability of individual components.
The `postman_test_cases.json` file provides comprehensive API tests which you can use with the Postman application. Import this collection into Postman to check the API functionality and confirm proper integration.