opensearch-project · smar-sean-sekora · Oct 9, 2025 · Oct 9, 2025 · Oct 9, 2025 · Nov 4, 2025
diff --git a/CLAUDE.md b/CLAUDE.md
@@ -0,0 +1,214 @@
+# CLAUDE.md
+
+This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
+
+## Project Overview
+
+This is an OpenSearch MCP (Model Context Protocol) Server implemented in Python. It provides a bridge between AI assistants and OpenSearch clusters, supporting both stdio and streaming transports (SSE/HTTP streaming).
+
+## Development Commands
+
+### Setup
+```bash
+# Create & activate virtual environment
+uv venv
+source .venv/bin/activate
+
+# Install dependencies
+uv sync
+```
+
+### Running the Server
+**Important**: Server commands must be run from the `src/` directory.
+
+```bash
+cd src
+
+# Run stdio server (default)
+uv run python -m mcp_server_opensearch
+
+# Run streaming server
+uv run python -m mcp_server_opensearch --transport stream
+
+# Run multi-cluster mode with config
+uv run python -m mcp_server_opensearch --mode multi --config ../config/dev-clusters.yml
+
+# Run with AWS profile
+uv run python -m mcp_server_opensearch --profile my-profile
+```
+
+### Testing
+```bash
+# Run all tests
+uv run pytest
+
+# Run with coverage
+uv run pytest --cov=mcp_server_opensearch
+
+# Run specific test file
+uv run pytest tests/test_tools.py
+
+# Run with verbose output
+uv run pytest -v
+```
+
+### Code Quality
+```bash
+# Format code (required before commits)
+uv run ruff format .
+
+# Check code quality
+uv run ruff check .
+
+# Type checking
+uv run mypy src/
+```
+
+### Dependency Management
+```bash
+# Add new package
+uv add <package-name>
+
+# Add development dependency
+uv add --dev <package-name>
+
+# Update dependencies after manual pyproject.toml changes
+uv lock
+uv sync
+
+# Update all dependencies to latest versions
+uv lock --upgrade
+uv sync
+```
+
+## Architecture
+
+### Core Components
+
+- **Server Layer** (`src/mcp_server_opensearch/`)
+  - `stdio_server.py`: Standard input/output transport
+  - `streaming_server.py`: SSE and HTTP streaming transport
+  - `clusters_information.py`: Multi-cluster configuration management
+  - `__main__.py`: Entry point with argument parsing
+
+- **OpenSearch Integration** (`src/opensearch/`)
+  - `client.py`: OpenSearch client initialization with authentication
+  - `helper.py`: Functions that each make a single REST call to the OpenSearch API (one function = one API call)
+
+- **Tool System** (`src/tools/`)
+  - `tools.py`: Tool definitions and implementations using `TOOL_REGISTRY` dictionary
+  - `tool_params.py`: Pydantic models for tool arguments (all extend `baseToolArgs`)
+  - `tool_filter.py`: Tool filtering by name, category, or regex pattern
+  - `tool_generator.py`: Dynamic tool schema generation
+  - `config.py`: YAML configuration parsing for tool filters and customization
+  - `utils.py`: Tool compatibility checking based on OpenSearch version
+  - `index_filter.py`: Index-level access control with pattern-based filtering
+
+### Server Modes
+
+1. **Single Mode** (default)
+   - Connects to one OpenSearch cluster via environment variables
+   - Automatically filters tools based on OpenSearch version compatibility
+   - Tools do not require `opensearch_cluster_name` parameter
+
+2. **Multi Mode**
+   - Supports multiple OpenSearch clusters defined in YAML config
+   - All tools available regardless of version (compatibility checked at execution)
+   - All tools require `opensearch_cluster_name` parameter
+   - Tool filtering not supported
+
+### Authentication Flow
+
+Priority order: No Auth → IAM Role → Basic Auth → AWS Credentials
+
+### Index Security
+
+The server supports index-level access control to restrict which indexes can be accessed:
+- **Allowed Patterns**: Whitelist approach using wildcards (`logs-*`) or regex (`regex:^logs-\d{4}$`)
+- **Denied Patterns**: Blacklist approach with same pattern types
+- **Priority**: Denied patterns checked first and take precedence over allowed patterns
+- **Configuration**: Via YAML `index_security` section or environment variables
+- **Validation**: Applied before OpenSearch queries in all tools with index parameters
+- **Wildcard Bypass**: Index names with wildcards in tool calls bypass validation (OpenSearch expands them)
+
+Example configuration:
+```yaml
+index_security:
+  allowed_index_patterns:
+    - "logs-*"
+    - "metrics-*"
+  denied_index_patterns:
+    - "sensitive-*"
+    - ".security*"
+```
+
+### Tool Architecture
+
+**Key Design Principle**: Each helper function in `opensearch/helper.py` performs a single REST call to OpenSearch. This promotes:
+- Clear separation of concerns
+- Easy testing and maintenance
+- Reusable OpenSearch operations
+
+Tools in `tools/tools.py` orchestrate these helper functions and are registered in the `TOOL_REGISTRY` dictionary with:
+- `description`: Tool documentation
+- `input_schema`: Pydantic model JSON schema
+- `function`: Async tool implementation
+- `args_model`: Pydantic model class
+- Optional: `min_version`, `max_version` for version compatibility
+
+## Adding New Tools
+
+1. **Create Tool Arguments Model** in `src/tools/tool_params.py`:
+   ```python
+   class YourToolArgs(baseToolArgs):
+       """Arguments for YourTool."""
+       param1: str = Field(description="Description")
+   ```
+
+2. **Add Helper Function** in `src/opensearch/helper.py`:
+   ```python
+   def your_helper_function(args: YourToolArgs) -> dict:
+       """Perform single REST call to OpenSearch."""
+       from .client import initialize_client
+       client = initialize_client(args)
+       response = client.your_api_call()
+       return response
+   ```
+
+3. **Implement Tool Function** in `src/tools/tools.py`:
+   ```python
+   async def your_tool_function(args: YourToolArgs) -> list[dict]:
+       try:
+           check_tool_compatibility('YourToolName', args)
+           # Add index validation if tool accepts index parameter
+           if hasattr(args, 'index') and args.index:
+               validate_index_access(args.index)
+           result = your_helper_function(args)
+           return [{"type": "text", "text": json.dumps(result, indent=2)}]
+       except Exception as e:
+           return [{"type": "text", "text": f"Error: {str(e)}"}]
+   ```
+
+4. **Register Tool** in `TOOL_REGISTRY` dict in `src/tools/tools.py`:
+   ```python
+   TOOL_REGISTRY = {
+       "YourToolName": {
+           "description": "What this tool does",
+           "input_schema": YourToolArgs.model_json_schema(),
+           "function": your_tool_function,
+           "args_model": YourToolArgs,
+       }
+   }
+   ```
+
+## Important Notes
+
+- By default, only **core tools** are enabled (tool filtering only works in single mode)
+- Tool filtering supports: exact names, categories, regex patterns, and write operation controls
+- Tool customization (display names, descriptions) works in both single and multi modes
+- Server commands must be run from the `src/` directory
+- The server supports both stdio and streaming (SSE/HTTP) transports
+- In single mode, incompatible tools are filtered out automatically; in multi mode, version compatibility is checked when a tool executes
+- Ruff line length is set to 99 characters
+- Pydocstyle convention: Google
+- Quote style: single quotes
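
Review note on the Index Security section of CLAUDE.md above: the documented precedence (denied patterns first, then allowed patterns, everything accessible when nothing is configured) can be sketched roughly as follows. This is a minimal illustration, not the code in `src/tools/index_filter.py`. The names `pattern_matches` and `is_index_allowed` are hypothetical, and using `fnmatch.fnmatchcase` for wildcards and `re.search` for `regex:` patterns is an assumption about the matching semantics.

```python
import re
from fnmatch import fnmatchcase

REGEX_PREFIX = 'regex:'


def pattern_matches(index: str, pattern: str) -> bool:
    """Return True if index matches a wildcard or 'regex:'-prefixed pattern."""
    if pattern.startswith(REGEX_PREFIX):
        # Assumption: search semantics, so patterns anchor explicitly with ^ and $.
        return re.search(pattern[len(REGEX_PREFIX):], index) is not None
    # Wildcards: '*' matches any run of characters, '?' matches exactly one.
    return fnmatchcase(index, pattern)


def is_index_allowed(index: str, allowed: list[str], denied: list[str]) -> bool:
    """Apply denied patterns first, then allowed patterns (hypothetical helper)."""
    if any(pattern_matches(index, p) for p in denied):
        return False
    if not allowed:
        # No allow-list configured: anything not denied is accessible.
        return True
    return any(pattern_matches(index, p) for p in allowed)
```

With `allowed=['logs-*']` and `denied=['logs-sensitive-*']`, this sketch accepts `logs-2024` and rejects both `logs-sensitive-1` and `metrics-1`, which is the behavior the section describes.
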
diff --git a/README.md b/README.md
@@ -20,6 +20,7 @@
 - Built-in tools for common OpenSearch operations
 - Easy integration with Claude Desktop and LangChain
 - Secure authentication using basic auth or IAM roles
+- Index-level access control with pattern-based filtering
 
 ## Installing opensearch-mcp-server-py
 

diff --git a/USER_GUIDE.md b/USER_GUIDE.md
@@ -11,6 +11,7 @@
 - [Running the Server](#running-the-server)
 - [Tool Filter](#tool-filter)
 - [Tool Customization](#tool-customization)
+- [Index Security](#index-security)
 - [LangChain Integration](#langchain-integration)
 
 ## Overview
@@ -419,6 +420,13 @@ python -m mcp_server_opensearch --mode multi
 | `OPENSEARCH_DISABLED_TOOLS_REGEX` | No | `''` | Comma-separated list of regex patterns for disabled tools |
 | `OPENSEARCH_SETTINGS_ALLOW_WRITE` | No | `"true"` | Enable/disable write operations (`"true"` or `"false"`) |
 
+### Index Security Variables
+
+| Variable | Required | Default | Description |
+|----------|----------|---------|-------------|
+| `OPENSEARCH_ALLOWED_INDEX_PATTERNS` | No | `''` | Allowed index patterns (JSON array or comma-separated) |
+| `OPENSEARCH_DENIED_INDEX_PATTERNS` | No | `''` | Denied index patterns (JSON array or comma-separated) |
+
 *Required in single mode or when not using multi-mode config file
 
 ## Multi-Mode Cluster Configuration
@@ -568,6 +576,144 @@ Configuration file settings have higher priority than runtime parameters. If bot
 - Changes take effect immediately when the server starts
 - Invalid tool names or properties will throw an error
 
+## Index Security
+
+The OpenSearch MCP server supports index-level access control, which restricts the indexes that can be reached through it. This adds a security layer on top of OpenSearch's built-in security features.
+
+**Supported in Both Single and Multi Mode**
+
+### How It Works
+
+- **Allowed Patterns**: Define which indexes are accessible (whitelist approach)
+- **Denied Patterns**: Define which indexes are blocked (blacklist approach)
+- **Priority**: Denied patterns are checked first and take precedence over allowed patterns
+- **Default Behavior**: If no patterns are configured, all indexes are accessible
+
+### Pattern Types
+
+The index security feature supports two types of patterns:
+
+1. **Wildcard Patterns**: Use `*` and `?` for simple pattern matching
+   - `logs-*` - Matches all indexes starting with "logs-"
+   - `test-?-index` - Matches indexes like "test-1-index", "test-a-index"
+   - `*-production` - Matches all indexes ending with "-production"
+
+2. **Regex Patterns**: Prefix with `regex:` for complex patterns
+   - `regex:^logs-\d{4}-\d{2}$` - Matches "logs-2024-01", "logs-2023-12"
+   - `regex:.*-dev-.*` - Matches indexes containing "-dev-"
+
+### Configuration Methods
+
+#### 1. YAML Configuration File
+
+Add an `index_security` section to your configuration file:
+
+```yaml
+version: "1.0"
+description: "OpenSearch MCP Server Configuration"
+
+# Index security configuration
+index_security:
+  allowed_index_patterns:
+    - "logs-*"              # Allow all logs indexes
+    - "metrics-*"           # Allow all metrics indexes
+    - "app-production-*"    # Allow production app indexes
+    - 'regex:^test-\d+$'    # Allow test-123, test-456, etc.
+  denied_index_patterns:
+    - "sensitive-*"         # Block sensitive data indexes
+    - ".security*"          # Block security system indexes
+    - "*-internal"          # Block internal indexes
+    - "regex:.*-dev-.*"     # Block development indexes
+```
+
+Start the server with the configuration file:
+
+```bash
+# Single Mode
+python -m mcp_server_opensearch --config config.yml
+
+# Multi Mode
+python -m mcp_server_opensearch --mode multi --config config.yml
+```
+
+#### 2. Environment Variables
+
+Configure index security using environment variables:
+
+```bash
+# JSON array format
+export OPENSEARCH_ALLOWED_INDEX_PATTERNS='["logs-*", "metrics-*"]'
+export OPENSEARCH_DENIED_INDEX_PATTERNS='["sensitive-*", ".security*"]'
+
+# Or comma-separated format
+export OPENSEARCH_ALLOWED_INDEX_PATTERNS="logs-*, metrics-*, app-*"
+export OPENSEARCH_DENIED_INDEX_PATTERNS="sensitive-*, .security*, *-internal"
+```
+
+### Configuration Priority
+
+The YAML configuration file takes priority over environment variables. If both are provided, the YAML settings are used.
+
+### Use Cases
+
+1. **Whitelist Approach** - Only allow specific index patterns:
+   ```yaml
+   index_security:
+     allowed_index_patterns:
+       - "logs-*"
+       - "metrics-*"
+   ```
+   Result: Only indexes matching "logs-\*" or "metrics-\*" are accessible.
+
+2. **Blacklist Approach** - Block specific index patterns:
+   ```yaml
+   index_security:
+     denied_index_patterns:
+       - "sensitive-*"
+       - ".security*"
+   ```
+   Result: All indexes except those matching denied patterns are accessible.
+
+3. **Combined Approach** - Allow specific patterns but deny others:
+   ```yaml
+   index_security:
+     allowed_index_patterns:
+       - "logs-*"
+     denied_index_patterns:
+       - "logs-sensitive-*"
+   ```
+   Result: Only logs indexes are accessible, except those starting with "logs-sensitive-".
+
+### Important Notes
+
+- **Wildcard Expansion**: Index names containing wildcards (like `logs-*`) in tool calls bypass validation, as OpenSearch expands them at query time
+- **Comma-Separated Indexes**: When tools receive comma-separated index names, each is validated individually
+- **Error Handling**: If an index is denied, the tool will return an error message indicating access is blocked
+- **Scope**: Index filtering applies to all tools that accept an `index` parameter
+- **Performance**: Pattern matching is performed before OpenSearch queries, adding minimal overhead
+
+### Example: Production Use Case
+
+For a production environment where you want to restrict access to only production logs and metrics:
+
+```yaml
+index_security:
+  allowed_index_patterns:
+    - "logs-production-*"
+    - "metrics-production-*"
+    - "app-prod-*"
+  denied_index_patterns:
+    - "*-dev-*"
+    - "*-test-*"
+    - "*-staging-*"
+    - "temp-*"
+    - ".internal-*"
+```
+
+This configuration:
+- ✅ Allows: `logs-production-2024`, `metrics-production-cpu`, `app-prod-users`
+- ❌ Blocks: `logs-dev-2024`, `metrics-test-memory`, `temp-debug`, `.internal-cache`
+
 ## LangChain Integration
 
 The OpenSearch MCP server can be easily integrated with LangChain using the SSE server transport.
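
Review note on the Index Security environment variables documented in USER_GUIDE.md: `OPENSEARCH_ALLOWED_INDEX_PATTERNS` and `OPENSEARCH_DENIED_INDEX_PATTERNS` are described as accepting either a JSON array or a comma-separated string. A minimal sketch of a parser for both formats is shown here. The function name `parse_patterns` is hypothetical and the detection rule (a leading `[` means JSON) is an assumption, not necessarily what the PR implements.

```python
import json


def parse_patterns(raw: str) -> list[str]:
    """Parse a JSON array or a comma-separated string into a list of patterns."""
    raw = raw.strip()
    if not raw:
        # Unset or empty variable: no patterns configured.
        return []
    if raw.startswith('['):
        # Assumption: a leading '[' signals the JSON array format.
        parsed = json.loads(raw)
        return [str(p).strip() for p in parsed if str(p).strip()]
    # Comma-separated format: split, trim whitespace, drop empty entries.
    return [p.strip() for p in raw.split(',') if p.strip()]
```

It could be called as `parse_patterns(os.environ.get('OPENSEARCH_ALLOWED_INDEX_PATTERNS', ''))`. One consequence of naive comma splitting is that a regex pattern containing a comma, such as `regex:^logs-\d{2,4}$`, would be cut in two, so the JSON array format is the safer choice for such patterns.
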