A CLI chat agent powered by GPT-OSS 120B on AWS Bedrock with semantic tool discovery via Titan Text Embeddings V2.
Instead of sending all tool definitions in every request, the agent uses a gateway pattern: the LLM sees only a single `discover_tools` function and searches for relevant tools by describing what it needs. An embedding-based similarity search returns the top matches, which are then injected into the conversation dynamically.
```
User message
     │
     ▼
LLM receives toolConfig = [discover_tools]
     │
     ▼
LLM calls: discover_tools({ description: "run a shell command" })
     │
     ▼
Gateway: embed query → cosine similarity vs tool embeddings → top-6
     │
     ▼
toolConfig updated: [discover_tools + matched tools]
     │
     ▼
LLM calls the appropriate tool → executes → responds
```
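The similarity step above can be sketched in plain JavaScript. This is an illustrative reconstruction, not the project's actual `gateway.js` code: `cosineSimilarity` and `discoverTools` are hypothetical names, and the tool embeddings are assumed to be precomputed vectors.

```javascript
// Cosine similarity between two equal-length embedding vectors.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Rank precomputed tool embeddings against the query embedding
// and keep the top-k matches (the gateway uses top-6).
function discoverTools(queryEmbedding, toolEmbeddings, k = 6) {
  return toolEmbeddings
    .map(({ name, embedding }) => ({
      name,
      score: cosineSimilarity(queryEmbedding, embedding),
    }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k);
}
```

The matched names are then used to rebuild `toolConfig` so the next model turn can call the discovered tools directly.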
| Tool | Description |
|---|---|
| get_datetime | Current date, time and timezone |
| calculate | Math expression evaluator (supports functions, exponentiation, parentheses) |
| generate_qr_code | Generates a QR code image URL |
| query_files | Reads files matching a regex and answers questions about them using AI |
| edit_file | Creates or modifies files via AI-generated search/replace diffs |
| shell | Executes shell commands in a specified directory |
| system_info | Machine details: OS, CPU, RAM, disk, uptime |
| web_search | Search the web via Brave Search API |
| web_fetch | Fetch a URL and answer questions about its content using AI |
`edit_file` and `shell` require user confirmation before executing. You can approve, decline, or provide feedback explaining why (the agent will see your feedback and adjust).
- Node.js 18+
- An AWS Bedrock account with access to:
  - `openai.gpt-oss-120b-1:0` (or the model configured in `config.js`)
  - `amazon.titan-embed-text-v2:0`
- A bearer token for Bedrock API access
- Clone the repository:

  ```bash
  git clone https://github.com/your-username/TalkingAgentsBedrock.git
  cd TalkingAgentsBedrock
  ```

- Install dependencies:

  ```bash
  npm install
  ```

- Create a `.env` file with your credentials:

  ```
  AWS_BEARER_TOKEN_BEDROCK=your_bearer_token_here
  AWS_BEDROCK_REGION=us-west-2
  BRAVE_API_KEY=your_brave_api_key_here
  ```

  `BRAVE_API_KEY` is optional. Get one free at brave.com/search/api.

- Run:

  ```bash
  npm start
  ```

  On first run, the gateway will compute embeddings for all tools (cached in `.embeddings-cache.json` for subsequent runs).
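The embedding cache mentioned above can be sketched as content-hash keying: each tool description is hashed with SHA-256, and the stored embedding is reused unless the description changes. This is a minimal sketch, not the project's exact `embeddings.js` code; `getEmbedding` and `embedFn` are illustrative names.

```javascript
import { createHash } from "node:crypto";

// Key each tool description by its SHA-256 hash, so editing a
// description invalidates only that tool's cached embedding.
function cacheKey(description) {
  return createHash("sha256").update(description).digest("hex");
}

// Return a cached embedding, or compute one via embedFn
// (e.g. a Bedrock Titan V2 call) and store it.
async function getEmbedding(description, cache, embedFn) {
  const key = cacheKey(description);
  if (!(key in cache)) {
    cache[key] = await embedFn(description);
  }
  return cache[key];
}
```

Persisting `cache` as JSON (as in `.embeddings-cache.json`) makes the first run the only one that pays the embedding cost.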
Edit `config.js` to customize:

```js
export default {
  mainModel: "openai.gpt-oss-120b-1:0", // Chat + edit_file
  fastModel: "openai.gpt-oss-120b-1:0", // query_files (swap for a cheaper model)
  maxMessages: 40, // Conversation sliding window
};
```

You can also change the terminal theme in `main.js`:

```js
const THEME = "light"; // "dark" or "light"
```

```
├── main.js        # CLI chat loop and display
├── agent.js       # Agent class: streaming, tool loop, sliding window
├── tools.js       # Tool definitions and implementations
├── gateway.js     # discover_tools meta-tool factory
├── embeddings.js  # Titan V2 embedding service with SHA-256 cache
├── config.js      # Model and conversation settings
├── package.json
├── .env           # Credentials (not committed)
└── .gitignore
```
```
You: What time is it and how much RAM does this machine have?

[10:32:15.421] calling discover_tools({"description":"get current date and time"})
[10:32:16.003] result  discover_tools -> {"discovered_tools":[{"name":"get_datetime",...}]}
[10:32:16.450] calling get_datetime({})
[10:32:16.451] result  get_datetime -> {"iso":"2025-06-15T10:32:16.451Z",...}
[10:32:16.800] calling discover_tools({"description":"system memory RAM information"})
[10:32:17.200] result  discover_tools -> {"discovered_tools":[{"name":"system_info",...}]}
[10:32:17.500] calling system_info({})
[10:32:17.502] result  system_info -> {"ram":{"total":"16.0 GB","free":"8.2 GB",...},...}

Assistant: It's 10:32 AM (UTC). Your machine has 16 GB of RAM total with 8.2 GB free.
```
Add your tool definition to the `toolDefinitions` array in `tools.js` and its implementation to the `toolImplementations` object. The embedding cache will automatically detect the new tool on next startup and compute its embedding.
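As an illustration, a hypothetical `coin_flip` tool might look like the following. The `toolSpec` shape here follows the Bedrock Converse API convention; check `tools.js` for the exact fields this project expects.

```javascript
// Hypothetical new tool definition for tools.js (coin_flip is not
// part of the project; it only illustrates the shape).
const coinFlipDefinition = {
  toolSpec: {
    name: "coin_flip",
    description: "Flips a fair coin and returns 'heads' or 'tails'",
    inputSchema: { json: { type: "object", properties: {} } },
  },
};

// Matching implementation, keyed by tool name.
const coinFlipImplementation = {
  coin_flip: async () => ({
    result: Math.random() < 0.5 ? "heads" : "tails",
  }),
};
```

Because discovery is embedding-based, the description text is what makes the tool findable; write it the way a user would describe the task.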
MIT
Made with ❤️ by Juan Pablo Gramajo