Functional interface with MedCP, using Claude Sonnet 4 with streaming API and embedded MCP server
fMedCP provides a streamlined Python interface for medical AI analysis by combining Claude Sonnet 4's advanced reasoning capabilities with specialized medical databases. It integrates electronic health records (EHR) and biomedical knowledge graphs to answer complex medical questions with evidence-based responses and real-time streaming output.
- Dual Database Integration: Query both clinical records (SQL Server) and biomedical knowledge graphs (Neo4j)
- Claude Sonnet 4 Integration: Leverage the latest Claude Sonnet 4 model with streaming API for real-time responses
- Read-Only Safety: All database queries are strictly read-only for data protection
- Real-Time Streaming: Live output streaming with customizable callback functions
- Embedded MCP Server: Integrated Model Context Protocol server with optimized resource management
- Comprehensive Logging: Detailed reasoning steps and performance metrics with suppressed verbose output
- Flexible Configuration: Enable/disable specific data sources as needed
- Automatic Result Export: Responses saved to structured markdown files with enhanced formatting
fMedCP consists of three main components:
- Embedded MCP Server: Provides secure, read-only access to medical databases using FastMCP framework
- Claude Sonnet 4 Integration: Orchestrates AI-driven medical analysis with streaming API support
- Resource Management: Automatic connection pooling, cleanup, and optimization for production use
The system uses an embedded Model Context Protocol (MCP) server to safely expose medical data to Claude while maintaining strict security controls and optimal performance through in-process communication.
pip install "anthropic>=0.7.0" "neo4j>=5.28.2" "pymssql>=2.3.7" "fastmcp>=2.11.2" "python-dotenv>=1.0.0" "pydantic>=2.0.0" "mcp>=1.0.0" asyncio-mqtt
- Claude API: Get your API key from the Anthropic Console (requires access to Claude Sonnet 4)
- Neo4j Database: Biomedical knowledge graph (requires APOC plugin for schema introspection)
- SQL Server: Electronic health records database (read-only access sufficient)
- APOC Plugin: Required for schema introspection
- Database: Default expects "spoke" database
- Access: Read-only Cypher queries supported
- Tables: Must have accessible clinical data tables
- Permissions: Read-only SELECT permissions required
- Connection: Standard SQL Server connection parameters
fMedCP can be installed using either uv (recommended for faster installation) or pip. Both methods are fully supported.
uv is a fast Python package manager that can significantly speed up installation and dependency resolution.
First, install uv if you don't have it already:
# macOS and Linux
curl -LsSf https://astral.sh/uv/install.sh | sh
# Windows (PowerShell)
powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
# Alternatively, with pip
pip install uv
# Or with Homebrew
brew install uv
- Clone the repository:
git clone https://github.com/yourusername/fMedCP.git
cd fMedCP
- Create and activate a virtual environment:
uv venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
- Install dependencies with uv:
uv add "anthropic>=0.7.0" "neo4j>=5.28.2" "pymssql>=2.3.7" "fastmcp>=2.11.2" "python-dotenv>=1.0.0" "pydantic>=2.0.0" "mcp>=1.0.0" asyncio-mqtt
- Create environment configuration:
cp example.env .env
- Clone the repository:
git clone https://github.com/yourusername/fMedCP.git
cd fMedCP
- Create and activate a virtual environment (recommended):
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
- Install dependencies with pip:
pip install "anthropic>=0.7.0" "neo4j>=5.28.2" "pymssql>=2.3.7" "fastmcp>=2.11.2" "python-dotenv>=1.0.0" "pydantic>=2.0.0" "mcp>=1.0.0" asyncio-mqtt
- Create environment configuration:
cp example.env .env
For either method, you can also use a requirements file:
anthropic>=0.7.0
neo4j>=5.28.2
pymssql>=2.3.7
fastmcp>=2.11.2
python-dotenv>=1.0.0
pydantic>=2.0.0
mcp>=1.0.0
asyncio-mqtt
uv add -r requirements.txt
pip install -r requirements.txt

| Method | Installation Time | Dependency Resolution | Virtual Environment |
|---|---|---|---|
| uv | ~10-30 seconds | Very fast | Automatic |
| pip | ~2-5 minutes | Standard | Manual setup recommended |
Recommendation: Use uv for faster installation and better dependency management, especially in development environments.
Regardless of installation method, configure your .env file with your credentials:
# Claude API Configuration
CLAUDE_API_KEY=your_claude_api_key_here
# Neo4j Knowledge Graph Configuration
KNOWLEDGE_GRAPH_URI=neo4j://localhost:7687
KNOWLEDGE_GRAPH_USERNAME=your_kg_username
KNOWLEDGE_GRAPH_PASSWORD=your_kg_password
KNOWLEDGE_GRAPH_DATABASE=spoke
# SQL Server Clinical Records Configuration
CLINICAL_RECORDS_SERVER=your_clinical_server_address
CLINICAL_RECORDS_DATABASE=your_clinical_database_name
CLINICAL_RECORDS_USERNAME=your_clinical_username
CLINICAL_RECORDS_PASSWORD=your_clinical_password
# MedCP Configuration
MEDCP_LOG_LEVEL=INFO
MEDCP_NAMESPACE=MedCP
Test your installation by running the example:
python example.py
If successful, you should see streaming output from Claude analyzing your medical question.
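If the example stops at startup, a quick check of the variables from the .env template can narrow things down. The following is a minimal sketch, not part of fMedCP: the helper name `missing_env_vars` is hypothetical, and the list of required names simply mirrors the template above.

```python
import os

# Variable names copied from the .env template; adjust if your template differs.
REQUIRED_VARS = [
    "CLAUDE_API_KEY",
    "KNOWLEDGE_GRAPH_URI",
    "KNOWLEDGE_GRAPH_USERNAME",
    "KNOWLEDGE_GRAPH_PASSWORD",
    "CLINICAL_RECORDS_SERVER",
    "CLINICAL_RECORDS_DATABASE",
    "CLINICAL_RECORDS_USERNAME",
    "CLINICAL_RECORDS_PASSWORD",
]


def missing_env_vars(env=None, required=REQUIRED_VARS):
    """Return the names of required variables that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in required if not str(env.get(name, "")).strip()]


# Report anything missing from the current process environment.
missing = missing_env_vars()
if missing:
    print("Missing configuration:", ", ".join(missing))
```

Call `load_dotenv()` first if the values live in a .env file, since the check reads the process environment.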
The provided example.py demonstrates a complete workflow with real-time streaming:
import os
from dotenv import load_dotenv
from MedCP import run_medcp
# Load environment variables from .env file
load_dotenv()
# Run a medical question analysis with streaming output
result = run_medcp(
    question_name="test",
    question_text="How many patients with diabetes were prescribed metformin in 2022?",
    claude_api_key=os.getenv("CLAUDE_API_KEY"),
    # Knowledge graph configuration
    kg_uri=os.getenv("KNOWLEDGE_GRAPH_URI"),
    kg_username=os.getenv("KNOWLEDGE_GRAPH_USERNAME"),
    kg_password=os.getenv("KNOWLEDGE_GRAPH_PASSWORD"),
    kg_database=os.getenv("KNOWLEDGE_GRAPH_DATABASE", "spoke"),
    # Clinical records configuration
    clinical_server=os.getenv("CLINICAL_RECORDS_SERVER"),
    clinical_database=os.getenv("CLINICAL_RECORDS_DATABASE"),
    clinical_username=os.getenv("CLINICAL_RECORDS_USERNAME"),
    clinical_password=os.getenv("CLINICAL_RECORDS_PASSWORD"),
    # Optional parameters
    log_level=os.getenv("MEDCP_LOG_LEVEL", "INFO"),
    namespace=os.getenv("MEDCP_NAMESPACE", "MedCP"),
    max_tokens=20000,
    max_iterations=50,
    use_clinical_records=True,
    use_knowledge_graph=True,
    output_dir="results"
)
# Check results with enhanced output
if result["success"]:
    print(f"✅ Analysis complete! Results saved to: {result['file_path']}")
    print(f"📊 Performance: {result['elapsed_time']:.2f}s, {result['usage']['total_tokens']:,} tokens")
    print(f"🔧 Tools used: {len(result['tool_calls'])} calls")
else:
    print(f"❌ Error: {result['error']}")
- Set up your environment:
# Copy and configure environment file
cp example.env .env
# Edit .env with your actual credentials
- Run the example:
python example.py
- View results:
- Real-time streaming: See Claude's analysis as it happens with live text output
- Console metrics: Performance data including token usage and tool calls
- Detailed results: Saved to results/test.md with complete reasoning steps
- Enhanced formatting: Structured markdown with performance metrics and tool usage summary
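When several questions are run in one session, the per-run token counts can be added up to keep an eye on cost. The sketch below assumes only the `usage` fields documented for the return value (`input_tokens`, `output_tokens`); the `UsageTracker` class and its `token_budget` argument are hypothetical helpers, not part of fMedCP.

```python
class UsageTracker:
    """Accumulate token counts from run_medcp-style result dictionaries."""

    def __init__(self, token_budget=None):
        self.token_budget = token_budget  # optional soft limit, in tokens
        self.input_tokens = 0
        self.output_tokens = 0
        self.runs = 0

    def add(self, result):
        # Failed runs may carry no usage data, so fall back to zeros.
        usage = result.get("usage") or {}
        self.input_tokens += usage.get("input_tokens", 0)
        self.output_tokens += usage.get("output_tokens", 0)
        self.runs += 1

    @property
    def total_tokens(self):
        return self.input_tokens + self.output_tokens

    def over_budget(self):
        return self.token_budget is not None and self.total_tokens > self.token_budget


# Hand-written dictionaries standing in for real run_medcp results.
tracker = UsageTracker(token_budget=20000)
tracker.add({"success": True, "usage": {"input_tokens": 9000, "output_tokens": 3000}})
tracker.add({"success": False, "error": "example failure"})
print(f"{tracker.runs} runs, {tracker.total_tokens:,} tokens")
```

In real use, each `tracker.add(...)` call would receive the dictionary returned by `run_medcp`.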
You can enable/disable specific data sources:
# Use only knowledge graph (no clinical records)
result = run_medcp(
    question_name="drug_interactions",
    question_text="What are the known interactions between warfarin and NSAIDs?",
    # ... credentials ...
    use_knowledge_graph=True,
    use_clinical_records=False
)

# Use only clinical records (no knowledge graph)
result = run_medcp(
    question_name="patient_demographics",
    question_text="What is the age distribution of patients with diabetes?",
    # ... credentials ...
    use_knowledge_graph=False,
    use_clinical_records=True
)

# Organize results by date
from datetime import datetime
output_dir = f"results/{datetime.now().strftime('%Y-%m-%d')}"
result = run_medcp(
    question_name="daily_analysis",
    question_text="Your medical question here",
    # ... credentials ...
    output_dir=output_dir
)

# Custom streaming callback for processing output in real-time
def my_stream_handler(text_chunk):
    # Process each chunk as it arrives
    print(f"[STREAM] {text_chunk}", end="")
    # Could log to file, send to UI, etc.

result = run_medcp(
    question_name="streaming_analysis",
    question_text="What are the biomarkers for early Alzheimer's detection?",
    # ... credentials ...
    stream_callback=my_stream_handler  # Custom callback function
)

# For complex questions requiring extensive analysis
result = run_medcp(
    question_name="complex_analysis",
    question_text="Comprehensive drug-disease interaction analysis for polypharmacy patients",
    # ... credentials ...
    max_tokens=30000,   # Allow longer responses (Claude Sonnet 4's 200k figure is its context window, not its output limit)
    max_iterations=75,  # More tool calls for thorough analysis
    log_level="DEBUG"   # Detailed logging for debugging
)

| Parameter | Type | Description |
|---|---|---|
| question_name | str | Unique identifier for the question |
| question_text | str | The medical question to analyze |
| claude_api_key | str | Your Claude API key |

| Parameter | Type | Default | Description |
|---|---|---|---|
| kg_uri | str | Required | Neo4j connection URI |
| kg_username | str | Required | Neo4j username |
| kg_password | str | Required | Neo4j password |
| kg_database | str | "spoke" | Neo4j database name |

| Parameter | Type | Default | Description |
|---|---|---|---|
| clinical_server | str | Required | SQL Server host |
| clinical_database | str | Required | EHR database name |
| clinical_username | str | Required | SQL Server username |
| clinical_password | str | Required | SQL Server password |

| Parameter | Type | Default | Description |
|---|---|---|---|
| use_knowledge_graph | bool | True | Enable biomedical knowledge graph |
| use_clinical_records | bool | True | Enable clinical records access |
| output_dir | str | "." | Directory for result files |
| max_tokens | int | 20000 | Maximum response tokens (200k is Claude Sonnet 4's context window; the output limit is lower) |
| max_iterations | int | 50 | Maximum tool call iterations |
| log_level | str | "INFO" | Logging verbosity (DEBUG, INFO, WARNING, ERROR) |
| namespace | str | "MedCP" | Tool namespace prefix for MCP tools |
| stream_callback | function | None | Custom callback function for real-time text streaming |
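
The stream_callback parameter takes a callable that receives one text chunk at a time, as in the `my_stream_handler` example. A callable object can also keep the chunks for later use. This is a minimal sketch under that assumption; `ChunkCollector` is a hypothetical helper, and the loop only simulates a stream.

```python
class ChunkCollector:
    """Callable that optionally echoes streamed text and keeps every chunk."""

    def __init__(self, echo=True):
        self.echo = echo
        self.chunks = []

    def __call__(self, text_chunk):
        self.chunks.append(text_chunk)
        if self.echo:
            print(text_chunk, end="", flush=True)

    @property
    def text(self):
        # Full text received so far, in arrival order.
        return "".join(self.chunks)


# Simulated stream; in real use, pass stream_callback=collector to run_medcp.
collector = ChunkCollector(echo=False)
for chunk in ["Metformin ", "is a first-line ", "therapy."]:
    collector(chunk)
```

After the run, `collector.text` holds the concatenated output, which could be logged or sent to a UI.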
{
    "success": bool,            # Whether analysis completed successfully
    "error": str,               # Error message if success=False
    "response": str,            # Claude Sonnet 4's complete response with streaming
    "file_path": str,           # Path to saved markdown file
    "usage": {                  # Enhanced token usage statistics
        "input_tokens": int,    # Tokens sent to Claude
        "output_tokens": int,   # Tokens generated by Claude
        "total_tokens": int     # Combined token usage
    },
    "elapsed_time": float,      # Analysis duration in seconds (includes streaming time)
    "tool_calls": List[Dict]    # Details of all MCP tool calls made with embedded server
}
fMedCP provides Claude with these specialized medical tools:
- Purpose: Discover available biomedical entities and relationships
- Returns: Schema of nodes, properties, and relationship types
- Use Case: Understanding what medical knowledge is available
- Purpose: Execute read-only Cypher queries on biomedical knowledge
- Parameters: cypher_query (Cypher query string), parameters (query parameters dictionary)
- Use Case: Drug interactions, pathway analysis, disease relationships
- Purpose: List all available clinical data tables
- Returns: Table schemas, names, and types
- Use Case: Understanding available clinical data structure
- Purpose: Execute read-only SQL queries on patient data
- Parameters: sql_query (SELECT statement for clinical data)
- Use Case: Patient demographics, diagnosis patterns, treatment outcomes
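The Cypher tool's two parameters keep query text and values apart. The sketch below shows what such a pair might look like; the function name, node labels, and relationship type are hypothetical and should be checked against the schema your knowledge graph actually reports.

```python
def build_cypher_arguments(disease_name, limit=10):
    """Return arguments shaped like the documented cypher_query/parameters pair."""
    # Hypothetical labels and relationship type, for illustration only.
    query = (
        "MATCH (c:Compound)-[:TREATS]->(d:Disease) "
        "WHERE d.name = $disease_name "
        "RETURN c.name AS compound LIMIT $limit"
    )
    # Values travel separately from the query text, so they are never
    # interpolated into the Cypher string.
    return {
        "cypher_query": query,
        "parameters": {"disease_name": disease_name, "limit": int(limit)},
    }


args = build_cypher_arguments("type 2 diabetes mellitus", limit=5)
print(args["cypher_query"])
```

Because the disease name appears only in the parameters dictionary, unusual characters in it cannot change the structure of the query.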
Here are examples of medical questions fMedCP can handle:
result = run_medcp(
    question_name="ms_prevalence",
    question_text="What is the current prevalence of multiple sclerosis in our adult population?",
    # ... credentials ...
)

result = run_medcp(
    question_name="drug_interactions",
    question_text="What are the contraindications and interactions for patients taking both warfarin and metformin?",
    # ... credentials ...
)

result = run_medcp(
    question_name="diabetes_outcomes",
    question_text="Compare treatment outcomes for Type 2 diabetes patients using metformin vs combination therapy in our patient population.",
    # ... credentials ...
)

result = run_medcp(
    question_name="biomarkers",
    question_text="What biomarkers are associated with cardiovascular risk in patients with chronic kidney disease?",
    # ... credentials ...
)
- All database queries are strictly validated to prevent modifications
- Knowledge graph queries filtered to exclude write operations (MERGE, CREATE, SET, DELETE)
- Clinical queries limited to SELECT statements only
- SQL injection protection through parameterized queries
- Cypher injection prevention with parameter binding
- Automatic query sanitization and validation
- No patient data is stored or cached by the system
- All connections use encrypted protocols
- Credentials managed through environment variables
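The read-only checks listed above can be pictured with a simple keyword filter. This is an illustrative sketch only, not fMedCP's actual validation code: the function names are hypothetical, and the keyword list extends the four clauses named above with REMOVE and DROP as an assumption.

```python
import re

# MERGE, CREATE, SET, DELETE come from the list above; REMOVE and DROP are added here.
CYPHER_WRITE_KEYWORDS = ("MERGE", "CREATE", "SET", "DELETE", "REMOVE", "DROP")
_CYPHER_WRITE_RE = re.compile(
    r"\b(" + "|".join(CYPHER_WRITE_KEYWORDS) + r")\b", re.IGNORECASE
)


def is_read_only_cypher(query):
    """Reject Cypher text that contains a write keyword as a whole word."""
    return _CYPHER_WRITE_RE.search(query) is None


def is_select_only_sql(query):
    """Accept a single statement that begins with SELECT."""
    stripped = query.strip().rstrip(";").strip()
    starts_with_select = re.match(r"SELECT\b", stripped, re.IGNORECASE) is not None
    # A remaining semicolon suggests a second, stacked statement.
    return starts_with_select and ";" not in stripped


print(is_read_only_cypher("MATCH (d:Disease) RETURN d.name LIMIT 5"))
print(is_select_only_sql("SELECT 1; DROP TABLE patients"))
```

A filter like this is deliberately conservative: it also rejects harmless queries that mention a keyword inside a string literal, and it does not catch forms such as SELECT ... INTO. Read-only database accounts remain the stronger safeguard.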
Each analysis generates:
- Console Output: Real-time progress and final results
- Markdown File: Comprehensive report including:
- Original question and timestamp
- Claude's complete response
- Detailed reasoning steps
- Performance metrics
- Tool usage summary
results/
├── Q1.2.md # Question analysis results
├── drug_interactions.md # Drug safety analysis
└── patient_demographics.md # Population analysis
The biomedical knowledge graph requires the APOC plugin for Neo4j:
# Install APOC plugin in Neo4j
# Add to neo4j.conf:
dbms.security.procedures.unrestricted=apoc.*
For large datasets, increase the token and iteration limits:
result = run_medcp(
    # ... other parameters ...
    max_tokens=30000,
    max_iterations=75
)
If you encounter Claude API rate limits:
- Reduce max_tokens to 15000 or lower (Claude Sonnet 4 has generous limits)
- Decrease max_iterations to 25-30 for simpler questions
- The streaming implementation handles rate limits with automatic retries
- Consider using selective data sources to reduce complexity
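For callers who want an additional retry layer around their own code, exponential backoff is a common pattern. The sketch below is generic and is not fMedCP's built-in retry logic; `call_with_backoff` is a hypothetical helper, and whether `run_medcp` raises or returns `success=False` on a rate limit is not stated here, so the example wraps a stand-in function that raises.

```python
import random
import time


def call_with_backoff(func, max_attempts=4, base_delay=1.0,
                      retry_on=(Exception,), sleep=time.sleep):
    """Call func(); on a listed exception, wait base_delay * 2**attempt plus jitter, then retry."""
    for attempt in range(max_attempts):
        try:
            return func()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.1 * base_delay)
            sleep(delay)


# Stand-in for a rate-limited call: fails twice, then succeeds.
attempts = {"count": 0}


def flaky_call():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RuntimeError("rate limited (simulated)")
    return "ok"


recorded_delays = []
outcome = call_with_backoff(flaky_call, sleep=recorded_delays.append)
```

Passing `sleep=recorded_delays.append` records the waits instead of sleeping, which keeps the demonstration instant; omit it to get real delays. Narrowing `retry_on` to the specific exception type you expect avoids retrying unrelated failures.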
fMedCP provides detailed error messages for:
- Missing credentials or configuration
- Database connection failures (embedded MCP server handles reconnections)
- Invalid query syntax with specific validation errors
- API rate limit issues (automatic retry with streaming API)
- Tool execution errors with embedded server diagnostics
- Streaming interruptions and callback errors
Check the console output (with real-time streaming) and generated markdown files for specific error details. The embedded MCP server provides enhanced error reporting.
- Be specific about the clinical population or context
- Include relevant time frames when applicable
- Specify desired analysis depth or output format
- Start with smaller token limits for exploratory questions (Claude Sonnet 4 is more efficient)
- Use selective data sources (use_knowledge_graph=False or use_clinical_records=False) for focused analysis
- Monitor token usage for cost management with real-time streaming feedback
- Leverage the embedded MCP server's optimized performance for faster tool execution
- Use custom streaming callbacks to process output in real-time for better user experience
- Store credentials in environment files, never in code
- Use read-only database accounts when possible
- Regularly rotate API keys and database passwords
- Review generated outputs before sharing
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Make your changes with appropriate tests
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
For issues, questions, or contributions:
- Issues: GitHub Issues
- Documentation: This README and inline code documentation
- Examples: See example.py for complete usage patterns
- Built on Anthropic's Claude Sonnet 4 with streaming API support
- Uses FastMCP for embedded Model Context Protocol server implementation
- Integrates with Neo4j for biomedical knowledge graphs (requires APOC plugin)
- Supports SQL Server for electronic health records
- Enhanced with real-time streaming capabilities and optimized resource management
Disclaimer: This tool is designed for research and analytical purposes. All medical decisions should involve qualified healthcare professionals. The system provides informational analysis only and should not be used for direct patient care without proper medical oversight.
