Multi-agent AI system for controlling an NVIDIA JetBot with vision-based navigation.
```
Frontend (Next.js) → ADK API (port 8003) → Director/Observer/Pilot Agents
                                                  ↓
                        JetBot API (port 8000)    ← Movement Commands
                                                  ↓
                        YOLO-E Vision (port 8002) ← Detection/Telemetry
                                                  ↓
                                          JetBot Hardware
```
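A minimal sketch of how the two robot-side services are reached over HTTP. The `/current-detections` endpoint is the one shown in the Troubleshooting section below; the `/move` endpoint and its payload are assumptions for illustration, not the project's actual route:

```python
# Sketch of the service topology. /current-detections is confirmed in
# Troubleshooting below; /move and its payload are hypothetical.
import requests

JETBOT_API = "http://localhost:8000"  # hardware control
YOLOE_API = "http://localhost:8002"   # vision / telemetry

# Hypothetical movement command (endpoint name and payload are assumptions).
requests.post(f"{JETBOT_API}/move", json={"direction": "forward", "distance_m": 1.0})

# Poll the latest YOLO-E detections.
detections = requests.get(f"{YOLOE_API}/current-detections").json()
print(detections)
```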
```bash
# Create virtual environment
python3 -m venv .venv
source .venv/bin/activate

# Run setup script
./setup_dependencies.sh --system76  # or --desktop or --jetson
```

Create a `.env` file in `adk-backend/`:

```
OPENROUTER_API_KEY=your_openrouter_key_here
GOOGLE_API_KEY=your_google_key_here
```

**Terminal 1 - JetBot Backend (Hardware Control):**

```bash
cd jetbot-backend
source ../.venv/bin/activate
python main.py
# Runs on port 8000
```

**Terminal 2 - YOLO-E Backend (Vision):**

```bash
cd yoloe-backend
source ../.venv/bin/activate
python main.py
# Runs on port 8002 (CUDA accelerated on RTX 5080+)
```

**Terminal 3 - ADK Agent API:**

```bash
cd adk-backend
source ../.venv/bin/activate
./run_api.sh
# Runs on port 8003
```

**Terminal 4 - Frontend:**

```bash
cd frontend
npm install  # First time only
npm run dev
# Runs on port 3000
```

Open http://localhost:3000 and use the Agent Chat panel:
- Type commands like: "find a bottle", "scan the room", "move forward 1 meter"
- Watch the camera feed and telemetry
- See the agent reasoning and tool calls in the chat
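For headless testing, you can also send a goal straight to the ADK API on port 8003. The sketch below assumes `run_api.sh` starts the stock ADK dev API server and that the agent app is named `jetbot`; both are assumptions, so check `adk-backend/` for the real routes and app name:

```python
# Hedged sketch: drive the agents without the UI, assuming the stock ADK
# dev-server routes and a hypothetical app name "jetbot".
import requests

BASE = "http://localhost:8003"
APP, USER, SESSION = "jetbot", "u1", "s1"

# Create a session, then submit a goal to the Director.
requests.post(f"{BASE}/apps/{APP}/users/{USER}/sessions/{SESSION}", json={})
events = requests.post(f"{BASE}/run", json={
    "app_name": APP,
    "user_id": USER,
    "session_id": SESSION,
    "new_message": {"role": "user", "parts": [{"text": "find a bottle"}]},
}).json()
print(events)  # agent reasoning and tool calls, as mirrored in the chat panel
```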
- `jetbot-backend/`: Hardware control API (motors, ultrasonic sensor)
- `yoloe-backend/`: YOLO-E vision system with WebSocket telemetry
- `adk-backend/`: Multi-agent AI system using Google ADK + OpenRouter
- `frontend/`: Next.js web UI with:
  - Live camera feed from JetBot
  - Agent chat interface
  - YOLO prompts configuration
  - Keyboard controls
- Director (Nvidia Nemotron): Receives goals, decides simple vs complex
- Observer (Qwen 2.5): Vision specialist, finds objects
- Pilot (Qwen 2.5): Movement specialist, executes navigation
- ✅ Real-time camera feed with YOLO-E object detection
- ✅ Multi-agent reasoning system (Director/Observer/Pilot)
- ✅ Ultrasonic distance sensing for collision avoidance
- ✅ GPU-accelerated vision (RTX 5080 with PyTorch nightly)
- ✅ Open-vocabulary detection (detect any object by text prompt)
- ✅ WebSocket telemetry streaming
- ✅ Modern React UI with live updates
- Director: `openrouter/nvidia/llama-3.1-nemotron-70b-instruct`
- Observer: `openrouter/qwen/qwen-2.5-72b-instruct`
- Pilot: `openrouter/qwen/qwen-2.5-72b-instruct`
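A minimal sketch of how a Director/Observer/Pilot tree can be wired with Google ADK and LiteLLM model strings. The instructions here are placeholders, not the project's actual prompts or tool lists; see `adk-backend/` for the real agent definitions:

```python
# Sketch of the agent tree using Google ADK with LiteLLM (OpenRouter) models.
# Instructions are placeholders; the real prompts live in adk-backend/.
from google.adk.agents import Agent
from google.adk.models.lite_llm import LiteLlm

observer = Agent(
    name="observer",
    model=LiteLlm(model="openrouter/qwen/qwen-2.5-72b-instruct"),
    instruction="Vision specialist: use YOLO-E detections to find objects.",
)

pilot = Agent(
    name="pilot",
    model=LiteLlm(model="openrouter/qwen/qwen-2.5-72b-instruct"),
    instruction="Movement specialist: execute navigation via the JetBot API.",
)

director = Agent(
    name="director",
    model=LiteLlm(model="openrouter/nvidia/llama-3.1-nemotron-70b-instruct"),
    instruction="Receive goals; handle simple ones, delegate complex ones.",
    sub_agents=[observer, pilot],
)
```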
- 8000: JetBot hardware control API
- 8002: YOLO-E vision backend (HTTP + WebSocket)
- 8003: ADK agent API
- 3000: Frontend web UI
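To confirm all four services are listening, a plain socket probe works with no project-specific assumptions:

```python
# Probe each service port on localhost and report whether it is reachable.
import socket

PORTS = {8000: "JetBot API", 8002: "YOLO-E", 8003: "ADK API", 3000: "Frontend"}

for port, name in PORTS.items():
    with socket.socket() as s:
        s.settimeout(1.0)
        status = "up" if s.connect_ex(("localhost", port)) == 0 else "DOWN"
    print(f"{name} (:{port}): {status}")
```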
**Test WebSocket telemetry:**

```bash
cd yoloe-backend
python test_websocket.py
```
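To inspect the telemetry stream by hand, a minimal client along these lines should work; the `ws://localhost:8002/ws` path is an assumption, so check `yoloe-backend` for the actual route:

```python
# Minimal telemetry listener. The /ws path is an assumption; check
# yoloe-backend/main.py for the real WebSocket route.
import asyncio
import websockets

async def listen():
    async with websockets.connect("ws://localhost:8002/ws") as ws:
        async for message in ws:
            print(message)  # raw detection/telemetry payloads

asyncio.run(listen())
```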
**CUDA not working:**

- Make sure graphics mode is set to NVIDIA: `sudo system76-power graphics nvidia`
- Reboot after changing graphics mode
- Verify with `nvidia-smi`
**Agents not triggering:**

- Ensure `OPENROUTER_API_KEY` is in `.env`
- Check logs for model errors
- Verify backends are running (ports 8000, 8002)
**WebSocket not connecting:**

- Make sure yoloe-backend is running
- Check if the JetBot is sending frames
- Test with `curl http://localhost:8002/current-detections`