DeepSearch AI is a fully local, full-stack conversational agent capable of real-time web search. It runs a quantized Large Language Model (LLM) directly on your machine using Docker, ensuring privacy and control without relying on external cloud APIs for the core logic.
This project is a production-ready Local AI Agent designed to bridge the gap between offline LLMs and real-time internet data. Built with FastAPI and `llama-cpp-python`, the agent detects when a user needs up-to-date information (e.g., "search iPhone 16 price") and triggers a web search via DuckDuckGo. The results are then synthesized by the local Qwen2.5 model into a comprehensive answer.
The project is fully containerized, allowing for a seamless "write once, run anywhere" experience using Docker, while keeping the heavy model files managed locally.
The primary goal of this project is to provide a private, low-latency, and cost-effective alternative to cloud-based AI assistants. It is designed for developers and privacy enthusiasts who want to run powerful AI agents on consumer hardware.
Key capabilities include:
- Smart Intent Detection: Automatically switches between "Chat Mode" and "Search Mode" based on user input.
- Real-Time Knowledge: Overcomes the knowledge cutoff of static LLMs by fetching live data from the web.
- Local Inference: Uses a 4-bit quantized GGUF model (`Qwen2.5-0.5B`) to run efficiently on CPU/RAM.
- Full-Stack Experience: Provides a clean, dark-mode chat interface built with Vanilla JS, connected to a robust Python backend.
The system follows a microservice-like architecture encapsulated within a Docker container:
- Frontend: Captures user input and handles UI state (Thinking/Searching animations).
- API Layer: FastAPI receives the request.
- Agent Logic: Analyzes the prompt to decide if a search tool is needed.
- Tool Execution: If needed, queries DuckDuckGo (`ddgs`) for live results.
- LLM Inference: The context + query is fed into `llama-cpp-python` running the Qwen model.
- Response: The final answer is streamed back to the user.
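The routing step in the pipeline above can be sketched roughly as follows. This is a hedged illustration, not the repository's actual code: the function names (`needs_search`, `answer`), the keyword list, and the prompt template are all assumptions.

```python
# Hypothetical sketch of the agent's routing logic.
# Names and prompt format are illustrative assumptions.
SEARCH_KEYWORDS = ("search", "ara", "bul")  # English and Turkish trigger words

def needs_search(prompt: str) -> bool:
    """Return True when the prompt starts with a search trigger word."""
    words = prompt.strip().lower().split()
    return bool(words) and words[0] in SEARCH_KEYWORDS

def answer(prompt: str, llm, search_tool) -> str:
    """Route the prompt: inject live search results as context when needed."""
    if needs_search(prompt):
        parts = prompt.strip().split(maxsplit=1)
        query = parts[1] if len(parts) > 1 else parts[0]
        context = search_tool(query)  # live results from DuckDuckGo
        prompt = f"Using these search results:\n{context}\n\nAnswer: {query}"
    return llm(prompt)
```

In Chat Mode the prompt goes straight to the LLM; in Search Mode the search results are prepended as context so the model can summarize live data it was never trained on.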
- LLM Engine: `llama-cpp-python` (Python binding for llama.cpp)
- Model: `Qwen2.5-0.5B-Instruct` (GGUF format, quantized)
- Web Search Tool: `duckduckgo-search`
- Model Management: Hugging Face Hub (for downloading the GGUF file)
- Framework: `FastAPI`
- Server: `Uvicorn`
- Data Validation: `Pydantic`
- Core: `HTML5`, `CSS3`, Vanilla JavaScript
- Styling: Custom CSS with dark mode & responsive design
- Containerization: `Docker`
- Volume Mapping: Docker volumes (for mounting local models into the container)
- `ai/main.py` - Script to download the GGUF model.
- `ai/models/` - Directory where the model file will be stored.
- `agent.py` - Core logic for the AI agent (switching between search and chat).
- `main.py` - FastAPI application entry point.
- `tools.py` - Implementation of the internet search tool.
- `index.html` - The frontend chat interface.
- `Dockerfile` - Configuration for building the application image.
- `requirements.txt` - Python dependencies.
Since the AI model file (.gguf) is large, it is NOT included in the GitHub repository. You must download it manually using the provided script before running the Docker container.
```bash
git clone https://github.com/YOUR_USERNAME/DeepSearch-AI.git
cd DeepSearch-AI
```

You need to download the `qwen2.5-0.5b-instruct-q4_k_m.gguf` model. A script is provided to do this automatically.
First, install the necessary library:

```bash
pip install huggingface_hub
```

Then, run the download script:

```bash
python ai/main.py
```

This will download the model (~400 MB) and place it into the `./ai/models/` directory.
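For reference, `ai/main.py` likely amounts to little more than a call to `hf_hub_download`. This is a hedged sketch: the exact `repo_id` is an assumption (Qwen publishes official GGUF builds on the Hub, but check it against the real script):

```python
from pathlib import Path

# Assumed Hub coordinates; verify against the actual ai/main.py.
REPO_ID = "Qwen/Qwen2.5-0.5B-Instruct-GGUF"
FILENAME = "qwen2.5-0.5b-instruct-q4_k_m.gguf"
MODELS_DIR = Path("ai/models")

def model_path() -> Path:
    """Where the quantized model will live after download."""
    return MODELS_DIR / FILENAME

def download() -> Path:
    # Imported lazily so the path helpers work without huggingface_hub installed.
    from huggingface_hub import hf_hub_download
    MODELS_DIR.mkdir(parents=True, exist_ok=True)
    return Path(hf_hub_download(repo_id=REPO_ID, filename=FILENAME,
                                local_dir=str(MODELS_DIR)))

if __name__ == "__main__":
    print(f"Model saved to {download()}")
```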
```bash
docker build -t ai-agent .
```

We use Docker volumes to map the downloaded model into the container. This keeps the image light and lets you swap models easily.

```bash
docker run -d -p 8000:8000 --name ai-agent \
  -v $(pwd)/ai/models:/app/ai/models \
  ai-agent
```

Open your browser and go to: http://localhost:8000
- Chat Mode: If you ask general questions (e.g., "Write a poem"), the local LLM answers directly.
- Search Mode: If you start your sentence with a keyword like `search`, `ara`, or `bul` (Turkish for "search" and "find"), e.g., "search Tesla stock price", the agent:
  1. Parses your query.
  2. Searches DuckDuckGo for live results.
  3. Reads the content.
  4. Summarizes the answer using the LLM.
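The search step can be sketched as below. This is an assumption-laden illustration of what `tools.py` might look like, not its actual contents; the `duckduckgo-search` package exposes a `DDGS` client whose `text()` method returns hits with `title`, `body`, and `href` keys:

```python
def format_results(results: list[dict]) -> str:
    """Flatten raw search hits into a text block the LLM can read as context."""
    lines = [
        f"- {r.get('title', '')}: {r.get('body', '')} ({r.get('href', '')})"
        for r in results
    ]
    return "\n".join(lines)

def web_search(query: str, max_results: int = 5) -> str:
    """Fetch live DuckDuckGo results and format them for the prompt."""
    # Imported lazily so the formatter above stays testable offline.
    from duckduckgo_search import DDGS
    with DDGS() as ddgs:
        hits = list(ddgs.text(query, max_results=max_results))
    return format_results(hits)
```

Keeping the formatting separate from the network call makes the tool easy to unit-test with canned results.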
DeepSearch AI demonstrates how to build a functional, privacy-focused AI Agent without relying on paid cloud APIs. By combining FastAPI for the backend, Docker for deployment, and GGUF quantization for performance, it brings the power of modern LLMs to your local machine.