Looking for the Python version? Check out Saiku.py.
- About
- Features
- Prerequisites
- 1. Using Saiku in Your Own Projects
- 2. Using the Project Itself
- 3. Global Installation (Not Recommended Yet)
- Demo
- Setting Up Environment Variables
- Available Commands
- Use Cases (via MCP & Extensions)
- Workflows
- Future Features
- Contributing
- Support Saiku
- Feedback and Issues
- API Rate Limits/Cost
- Note
- License
This project aims to create a robust, intelligent AI agent capable of automating a wide range of tasks. The agent is designed following the PEAS (Performance measure, Environment, Actuators, Sensors) framework to ensure it is scalable, efficient, and reliable.
Saiku leverages the Model Context Protocol (MCP), a cutting-edge standard for enabling AI models to interact with external tools and resources securely and efficiently. MCP is becoming increasingly vital in the AI landscape, allowing agents like Saiku to:
- Extend Capabilities: Seamlessly integrate with various tools (filesystem access, web browsing, API interactions, code execution, etc.) provided by connected MCP servers.
- Access Real-time Data: Utilize dynamic information from connected resources (e.g., databases, APIs, system information).
- Perform Complex Actions: Go beyond text generation to execute commands, manipulate files, interact with external systems, and orchestrate multi-step processes.
By building on MCP, Saiku ensures a flexible, extensible, and future-proof architecture for AI agent development. Learn more about MCP here.
"Saiku" (細工) in Japanese refers to detailed or delicate work, symbolizing the intricate and intelligent workings of our AI agent.
- S: Smart
- A: Artificial
- I: Intelligent
- K: Knowledgeable
- U: Unmatched
We chose a Japanese name to symbolize precision, innovation, and advanced technology, attributes highly respected in Japanese culture. Even though we are based in Tunisia, we believe in global collaboration and the universal appeal and understanding of technology.
PEAS stands for Performance measure, Environment, Actuators, and Sensors. It's a framework used to describe the various components of an intelligent agent:
- Performance Measure: How well is the agent doing in its environment? (e.g., task completion rate, efficiency)
- Environment: Where the agent operates (e.g., user's local machine, specific software, web)
- Actuators: What actions the agent can take via MCP tools (e.g., writing files, executing commands, calling APIs)
- Sensors: How the agent perceives its environment via MCP resources and tool outputs (e.g., reading files, getting system status, receiving API responses)
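To make the PEAS decomposition concrete, here is a purely illustrative sketch in TypeScript. These interface and field names are not part of Saiku's actual API; they only mirror the four PEAS components described above.

```typescript
// Illustrative PEAS decomposition; these names are NOT part of Saiku's API.
interface PeasAgent {
  performance(history: string[]): number; // performance measure, e.g. task completion rate
  environment: string;                    // where the agent operates
  actuators: string[];                    // actions available via MCP tools
  sensors: string[];                      // observations available via MCP resources
}

const example: PeasAgent = {
  // Fraction of past tasks that completed successfully.
  performance: (history) =>
    history.filter((outcome) => outcome === "done").length / Math.max(history.length, 1),
  environment: "user's local machine",
  actuators: ["write_file", "execute_command", "call_api"],
  sensors: ["read_file", "system_status", "api_response"],
};
```
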
- MCP-Powered: Core architecture based on the Model Context Protocol for secure and extensible tool/resource integration.
- Multi-LLM Support: Integrates with various Large Language Models (OpenAI, Vertex AI, Ollama, Hugging Face, Mistral, Anthropic).
- Workflow Engine: Define and run complex, multi-step automations using JSON-based workflows.
- Extensible: Easily add new capabilities by connecting new MCP servers.
- VS Code Extension: Interact with Saiku using voice commands directly within your editor via the Cline Voice Assistant extension.
- Web Interface: Chat with Saiku through a browser interface.
- Node.js: Version 18 or higher recommended.
- LLM API Key: An API key for at least one supported Large Language Model (e.g., OpenAI).
- MCP Servers: For extended capabilities (like file system access, web browsing, code execution, specific API interactions, text-to-speech, speech-to-text), you need to run the corresponding MCP servers. Many servers require their own API keys or setup (e.g., ElevenLabs API key for TTS/STT, Google Cloud credentials for Vision/Calendar). Configure these in your MCP settings.
- Git: Required for using git-related MCP tools.
Saiku can be integrated into your applications to leverage its agent capabilities.
- Install: Run `npm install saiku` in your project directory.
- Import:
  ```typescript
  import Agent from 'saiku'; // Or specific components if needed
  ```
- Example:
  ```typescript
  async function main(opts) {
    // Ensure MCP client/server setup is handled appropriately
    const agent = new Agent(opts); // Initialize the agent
    // ...
  }
  ```
- AgentOptions:
  - `systemMessage` (`string`, optional): Default system message or instructions for the LLM.
  - `allowCodeExecution` (`boolean`, optional): Flag to enable or disable code execution (typically handled by a dedicated MCP server now).
  - `interactive` (`boolean | string`, optional): Interactive mode setting for CLI usage.
  - `llm` (`'openai' | 'vertexai' | 'ollama' | 'huggingface' | 'mistral' | 'anthropic'`): Specifies the language model. Default is `'openai'`.
  - `[key: string]: any` (optional): Allows additional custom properties.
- Example Configuration:
  ```typescript
  let opts = {
    systemMessage: "You are Saiku, an AI assistant.",
    interactive: true,
    llm: "openai",
    // Custom options
  };
  ```
- Process:
  - User Input: Provide user queries or tasks.
  - MCP Interaction: The agent interacts with connected MCP servers to use tools and access resources based on the query.
  - Response Generation: Generates responses based on LLM processing and tool results.
- Example Interaction:
  ```typescript
  // Assuming 'agent' is an initialized Agent instance
  async function runInteraction(agent, userQuery) {
    agent.messages.push({ role: "user", content: userQuery });
    await agent.interact(); // Agent processes the query, potentially using MCP tools
    // Handle the agent's response (last message in agent.messages)
  }
  ```
- Clone the Repository:
  ```bash
  git clone https://github.com/nooqta/saiku.git
  ```
- Navigate to the Project Folder:
  ```bash
  cd saiku
  ```
- Install Dependencies:
  ```bash
  npm install
  ```
- Run the Project Locally:
  Before starting Saiku locally, build the project:
  ```bash
  npm run build
  ```
  To start the agent in interactive CLI mode:
  ```bash
  npm start
  ```
  For automated building during development:
  ```bash
  npm run watch
  ```
Global installation is possible but not recommended due to ongoing development:
```bash
npm install -g saiku
```
Demo video: `saiku-browser.mp4` (note: may require updates for MCP compatibility).
Configure necessary environment variables for the core agent and any MCP servers you intend to use. Copy the example environment file:
```bash
cp .env.example .env
```
Edit the `.env` file. At a minimum, you need an LLM API key:
```bash
# OpenAI (example)
OPENAI_API_KEY=your_openai_api_key
OPENAI_MODEL=gpt-4-turbo # Or another model

# Add other API keys as needed for specific MCP servers
# e.g., ELEVENLABS_API_KEY=... for the ElevenLabs MCP server
# e.g., GOOGLE_APPLICATION_CREDENTIALS=path/to/your/keyfile.json for Google Cloud servers
```
Refer to the documentation of individual MCP servers for their specific environment variable requirements.
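Since a missing key typically only surfaces later as a failed API call, it can help to validate required variables at startup. A minimal sketch (assuming `process.env` has already been populated by the shell or a loader such as dotenv; `requireEnv` is an illustrative helper, not part of Saiku):

```typescript
// Sketch: fail fast when a required environment variable is missing.
// Assumes process.env is populated (by the shell or a loader like dotenv).
function requireEnv(name: string): string {
  const value = process.env[name];
  if (!value) {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}
```

Calling, for example, `requireEnv("OPENAI_API_KEY")` once at startup surfaces a clear configuration error instead of an opaque failure mid-run.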
Use the Saiku CLI with various options:
```
AI agent to help automate your tasks

Options:
  -v, --version                Output the current version.
  -exec, --allowCodeExecution  (Deprecated - handled by MCP) Execute code without prompting.
  -role, --systemMessage       The model system role message.
  -m, --llm <model>            Specify the language model (openai, vertexai, ollama,
                               huggingface, mistral, anthropic). Default: openai.
  -h, --help                   Display help for command.

Commands:
  mcp [options]        Manage MCP servers.
  workflow [options]   Manage and run workflows.
  autopilot [options]  (Experimental) Run Saiku in autopilot mode.
  serve                Chat with the Saiku agent in the browser.
  help [command]       Display help for a specific command.
```
To start the interactive CLI with a specific LLM:
```bash
npm start -- -m ollama
```
To run a specific workflow:
```bash
npm start -- workflow run <workflow_name>
```
To list connected MCP servers:
```bash
npm start -- mcp list
```
To chat with Saiku in the browser:
```bash
npm start -- serve
```
Saiku achieves tasks by leveraging tools provided by connected MCP servers or through specific extensions.
- Transcribe Audio to Text: Use an STT MCP server (e.g., ElevenLabs, Whisper) to transcribe audio files.
- Extract Text from Image: Use a Vision MCP server (e.g., Google Vision) to perform OCR on images.
- Summarize Long Articles: The core LLM can summarize provided text, potentially fetched via a Filesystem or HTTP MCP server.
- HTML to PDF Conversion: Use a Puppeteer or similar MCP server with HTML-to-PDF capabilities.
- Take Screenshot of Webpage: Use a Puppeteer MCP server.
- Text to Speech: Use a TTS MCP server (e.g., ElevenLabs).
- File Actions (Read/Write/List): Use a Filesystem MCP server.
- Database Queries: Use a custom MCP server connected to your database.
- Git Operations: Use a Git MCP server.
- API Interactions (GitLab, GitHub, etc.): Use specific MCP servers designed for those APIs.
- Voice Interaction: Use the VS Code Cline Voice Assistant extension, which coordinates with STT/TTS MCP servers.
Saiku includes a workflow engine that allows you to define complex, multi-step tasks in a JSON format. These workflows can chain together multiple LLM calls and MCP tool uses to automate sophisticated processes.
- Define: Create workflow JSON files (see `workflows.json` for examples).
- List:
  ```bash
  npm start -- workflow list
  ```
- Run:
  ```bash
  npm start -- workflow run <workflow_name> [input_data]
  ```
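As a purely hypothetical illustration of what chaining LLM calls and MCP tool uses might look like, consider the object below. The step and tool names here are invented for the example; the real schema is whatever Saiku's workflow engine defines in `workflows.json`.

```typescript
// Hypothetical workflow shape for illustration only; the actual schema is
// defined by Saiku's workflow engine (see workflows.json in the repo).
const summarizeAndSave = {
  name: "summarize_and_save",
  steps: [
    { tool: "filesystem/read_file",  args: { path: "article.txt" } },
    { tool: "llm/summarize",         args: { input: "{{steps[0].output}}" } },
    { tool: "filesystem/write_file", args: { path: "summary.txt", content: "{{steps[1].output}}" } },
  ],
};
```

The idea is that each step consumes the output of an earlier one, so a single workflow run reads a file, summarizes it with the LLM, and writes the result back via MCP tools.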
- Enhanced Workflow Engine: More complex logic, error handling, and dynamic step generation in workflows.
- Improved MCP Server Management: Easier discovery, installation, and configuration of MCP servers.
- Multi-Agent Collaboration: Exploring scenarios where multiple Saiku agents (or other MCP-compatible agents) can collaborate.
- Advanced Memory/Context Management: More sophisticated techniques for handling long-running tasks and large contexts.
- Proactive Assistance: Developing capabilities for the agent to suggest actions or workflows proactively.
- Refined PEAS Implementation: Continuously improving how the agent senses its environment and acts within it via MCP.
- Comprehensive Tests: Expanding test coverage for core agent logic, MCP interactions, and workflows.
- Cost Tracking & Budgeting: Integrating better mechanisms for tracking and managing API costs.
We welcome contributions! Please follow these steps:
- Fork the repository
- Create your feature branch (`git checkout -b feature/YourFeature`)
- Commit your changes (`git commit -m 'Add some feature'`)
- Push to the branch (`git push origin feature/YourFeature`)
- Create a new Pull Request
We are actively seeking sponsors and contributors. Your support helps accelerate development.
Please open an issue on our GitHub repository for feedback or bug reports.
Be mindful of the rate limits and costs associated with the LLM APIs and any external services used by MCP servers.
Saiku is under active development. Expect changes to the architecture and features.
This project is licensed under the MIT License - see the LICENSE.md file for details.