BaranziniLab/fMedCP

fMedCP - Functional Medical Context Protocol


Functional interface with MedCP, using Claude Sonnet 4 with streaming API and embedded MCP server

fMedCP provides a streamlined Python interface for medical AI analysis by combining Claude Sonnet 4's advanced reasoning capabilities with specialized medical databases. It integrates electronic health records (EHR) and biomedical knowledge graphs to answer complex medical questions with evidence-based responses and real-time streaming output.

Features

  • Dual Database Integration: Query both clinical records (SQL Server) and biomedical knowledge graphs (Neo4j)
  • Claude Sonnet 4 Integration: Leverage the latest Claude Sonnet 4 model with streaming API for real-time responses
  • Read-Only Safety: All database queries are strictly read-only for data protection
  • Real-Time Streaming: Live output streaming with customizable callback functions
  • Embedded MCP Server: Integrated Model Context Protocol server with optimized resource management
  • Comprehensive Logging: Detailed reasoning steps and performance metrics with suppressed verbose output
  • Flexible Configuration: Enable/disable specific data sources as needed
  • Automatic Result Export: Responses saved to structured markdown files with enhanced formatting

Architecture

fMedCP consists of three main components:

  1. Embedded MCP Server: Provides secure, read-only access to medical databases using FastMCP framework
  2. Claude Sonnet 4 Integration: Orchestrates AI-driven medical analysis with streaming API support
  3. Resource Management: Automatic connection pooling, cleanup, and optimization for production use

The system uses an embedded Model Context Protocol (MCP) server to safely expose medical data to Claude while maintaining strict security controls and optimal performance through in-process communication.

Prerequisites

Dependencies

pip install "anthropic>=0.7.0" "neo4j>=5.28.2" "pymssql>=2.3.7" "fastmcp>=2.11.2" "python-dotenv>=1.0.0" "pydantic>=2.0.0" "mcp>=1.0.0" asyncio-mqtt

Required Services

  1. Claude API: Get your API key from Anthropic Console - requires access to Claude Sonnet 4
  2. Neo4j Database: Biomedical knowledge graph (requires APOC plugin for schema introspection)
  3. SQL Server: Electronic health records database (read-only access sufficient)

Database Requirements

Knowledge Graph (Neo4j)

  • APOC Plugin: Required for schema introspection
  • Database: Default expects "spoke" database
  • Access: Read-only Cypher queries supported

Clinical Records (SQL Server)

  • Tables: Must have accessible clinical data tables
  • Permissions: Read-only SELECT permissions required
  • Connection: Standard SQL Server connection parameters

Installation

fMedCP can be installed using either uv (recommended for faster installation) or pip. Both methods are fully supported.

Method 1: Using uv (Recommended)

uv is a fast Python package manager that can significantly speed up installation and dependency resolution.

Installing uv

First, install uv if you don't have it already:

On macOS and Linux:
curl -LsSf https://astral.sh/uv/install.sh | sh
On Windows (PowerShell):
powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
Alternative: Using pip
pip install uv
Alternative: Using Homebrew (macOS)
brew install uv

Installing fMedCP with uv

  1. Clone the repository:
git clone https://github.com/BaranziniLab/fMedCP.git
cd fMedCP
  2. Create and activate a virtual environment:
uv venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
  3. Install dependencies with uv (the quotes keep the shell from treating >= as an output redirection):
uv add "anthropic>=0.7.0" "neo4j>=5.28.2" "pymssql>=2.3.7" "fastmcp>=2.11.2" "python-dotenv>=1.0.0" "pydantic>=2.0.0" "mcp>=1.0.0" asyncio-mqtt
  4. Create environment configuration:
cp example.env .env

Method 2: Using pip (Traditional)

Installing fMedCP with pip

  1. Clone the repository:
git clone https://github.com/BaranziniLab/fMedCP.git
cd fMedCP
  2. Create and activate a virtual environment (recommended):
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
  3. Install dependencies with pip (the quotes keep the shell from treating >= as an output redirection):
pip install "anthropic>=0.7.0" "neo4j>=5.28.2" "pymssql>=2.3.7" "fastmcp>=2.11.2" "python-dotenv>=1.0.0" "pydantic>=2.0.0" "mcp>=1.0.0" asyncio-mqtt
  4. Create environment configuration:
cp example.env .env

Alternative: Using requirements.txt

For either method, you can also use a requirements file:

Create requirements.txt:

anthropic>=0.7.0
neo4j>=5.28.2
pymssql>=2.3.7
fastmcp>=2.11.2
python-dotenv>=1.0.0
pydantic>=2.0.0
mcp>=1.0.0
asyncio-mqtt

Install with uv:

uv add -r requirements.txt

Install with pip:

pip install -r requirements.txt

Performance Comparison

| Method | Installation Time | Dependency Resolution | Virtual Environment |
| --- | --- | --- | --- |
| uv | ~10-30 seconds | Very fast | Automatic |
| pip | ~2-5 minutes | Standard | Manual setup recommended |

Recommendation: Use uv for faster installation and better dependency management, especially in development environments.

Configuration

Regardless of installation method, configure your .env file with your credentials:

# Claude API Configuration
CLAUDE_API_KEY=your_claude_api_key_here

# Neo4j Knowledge Graph Configuration
KNOWLEDGE_GRAPH_URI=neo4j://localhost:7687
KNOWLEDGE_GRAPH_USERNAME=your_kg_username
KNOWLEDGE_GRAPH_PASSWORD=your_kg_password
KNOWLEDGE_GRAPH_DATABASE=spoke

# SQL Server Clinical Records Configuration
CLINICAL_RECORDS_SERVER=your_clinical_server_address
CLINICAL_RECORDS_DATABASE=your_clinical_database_name
CLINICAL_RECORDS_USERNAME=your_clinical_username
CLINICAL_RECORDS_PASSWORD=your_clinical_password

# MedCP Configuration
MEDCP_LOG_LEVEL=INFO
MEDCP_NAMESPACE=MedCP
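
Before calling run_medcp, it can help to confirm that the variables listed above are actually set. The following is a minimal sketch and not part of fMedCP: the helper name missing_settings and the grouping of variables are assumptions made for illustration, and only the variable names come from the .env listing. KNOWLEDGE_GRAPH_DATABASE is left out because it has a documented default of "spoke".

```python
import os

# Variable names come from the .env listing; the grouping is this sketch's own.
REQUIRED_ALWAYS = ["CLAUDE_API_KEY"]
REQUIRED_KG = [
    "KNOWLEDGE_GRAPH_URI",
    "KNOWLEDGE_GRAPH_USERNAME",
    "KNOWLEDGE_GRAPH_PASSWORD",
]
REQUIRED_CLINICAL = [
    "CLINICAL_RECORDS_SERVER",
    "CLINICAL_RECORDS_DATABASE",
    "CLINICAL_RECORDS_USERNAME",
    "CLINICAL_RECORDS_PASSWORD",
]


def missing_settings(env=None, use_knowledge_graph=True, use_clinical_records=True):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    names = list(REQUIRED_ALWAYS)
    if use_knowledge_graph:
        names += REQUIRED_KG
    if use_clinical_records:
        names += REQUIRED_CLINICAL
    return [name for name in names if not env.get(name)]
```

Calling missing_settings() after load_dotenv() returns an empty list when everything needed for the enabled data sources is present.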

Verifying Installation

Test your installation by running the example:

python example.py

If successful, you should see streaming output from Claude analyzing your medical question.

Usage

Basic Example with Streaming

The provided example.py demonstrates a complete workflow with real-time streaming:

import os
from dotenv import load_dotenv
from MedCP import run_medcp

# Load environment variables from .env file
load_dotenv()

# Run a medical question analysis with streaming output
result = run_medcp(
    question_name="test",
    question_text="How many patients with diabetes were prescribed metformin in 2022?",
    claude_api_key=os.getenv("CLAUDE_API_KEY"),
    
    # Knowledge graph configuration
    kg_uri=os.getenv("KNOWLEDGE_GRAPH_URI"),
    kg_username=os.getenv("KNOWLEDGE_GRAPH_USERNAME"),
    kg_password=os.getenv("KNOWLEDGE_GRAPH_PASSWORD"),
    kg_database=os.getenv("KNOWLEDGE_GRAPH_DATABASE", "spoke"),
    
    # Clinical records configuration
    clinical_server=os.getenv("CLINICAL_RECORDS_SERVER"),
    clinical_database=os.getenv("CLINICAL_RECORDS_DATABASE"),
    clinical_username=os.getenv("CLINICAL_RECORDS_USERNAME"),
    clinical_password=os.getenv("CLINICAL_RECORDS_PASSWORD"),
    
    # Optional parameters
    log_level=os.getenv("MEDCP_LOG_LEVEL", "INFO"),
    namespace=os.getenv("MEDCP_NAMESPACE", "MedCP"),
    max_tokens=20000,
    max_iterations=50,
    use_clinical_records=True,
    use_knowledge_graph=True,
    output_dir="results"
)

# Check results with enhanced output
if result["success"]:
    print(f"✅ Analysis complete! Results saved to: {result['file_path']}")
    print(f"📊 Performance: {result['elapsed_time']:.2f}s, {result['usage']['total_tokens']:,} tokens")
    print(f"🔧 Tools used: {len(result['tool_calls'])} calls")
else:
    print(f"❌ Error: {result['error']}")

Running the Example

  1. Set up your environment:
# Copy and configure environment file
cp example.env .env
# Edit .env with your actual credentials
  2. Run the example:
python example.py
  3. View results:
  • Real-time streaming: See Claude's analysis as it happens with live text output
  • Console metrics: Performance data including token usage and tool calls
  • Detailed results: Saved to results/test.md with complete reasoning steps
  • Enhanced formatting: Structured markdown with performance metrics and tool usage summary

Advanced Configuration

Selective Data Source Usage

You can enable/disable specific data sources:

# Use only knowledge graph (no clinical records)
result = run_medcp(
    question_name="drug_interactions",
    question_text="What are the known interactions between warfarin and NSAIDs?",
    # ... credentials ...
    use_knowledge_graph=True,
    use_clinical_records=False
)

# Use only clinical records (no knowledge graph)
result = run_medcp(
    question_name="patient_demographics",
    question_text="What is the age distribution of patients with diabetes?",
    # ... credentials ...
    use_knowledge_graph=False,
    use_clinical_records=True
)

Custom Output Directory

# Organize results by date
from datetime import datetime
output_dir = f"results/{datetime.now().strftime('%Y-%m-%d')}"

result = run_medcp(
    question_name="daily_analysis",
    question_text="Your medical question here",
    # ... credentials ...
    output_dir=output_dir
)

Custom Streaming Callbacks

# Custom streaming callback for processing output in real-time
def my_stream_handler(text_chunk):
    # Process each chunk as it arrives
    print(f"[STREAM] {text_chunk}", end="")
    # Could log to file, send to UI, etc.

result = run_medcp(
    question_name="streaming_analysis",
    question_text="What are the biomarkers for early Alzheimer's detection?",
    # ... credentials ...
    stream_callback=my_stream_handler  # Custom callback function
)
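
A callback can also be a callable object, which is convenient when the streamed text needs to be kept as well as displayed. The sketch below assumes, as in my_stream_handler above, that the callback receives one text chunk (a str) per call; the class name ChunkCollector is hypothetical and not part of fMedCP.

```python
import io


class ChunkCollector:
    """Callable that forwards each text chunk to a stream and keeps a copy."""

    def __init__(self, stream):
        self.stream = stream
        self.chunks = []

    def __call__(self, text_chunk):
        self.chunks.append(text_chunk)
        self.stream.write(text_chunk)
        self.stream.flush()

    @property
    def text(self):
        """All chunks received so far, joined in arrival order."""
        return "".join(self.chunks)


# Simulated stream; in real use, pass stream_callback=collector to run_medcp.
collector = ChunkCollector(io.StringIO())
for chunk in ["Early ", "biomarkers ", "include..."]:
    collector(chunk)
```

Passing sys.stdout or an open log file instead of io.StringIO() sends the live output there while collector.text accumulates the full response.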

Performance Tuning

# For complex questions requiring extensive analysis
result = run_medcp(
    question_name="complex_analysis",
    question_text="Comprehensive drug-disease interaction analysis for polypharmacy patients",
    # ... credentials ...
    max_tokens=30000,          # Allow longer responses (200k is Claude Sonnet 4's context window, not its output limit)
    max_iterations=75,         # More tool calls for thorough analysis
    log_level="DEBUG"          # Detailed logging for debugging
)

API Reference

Main Function: run_medcp()

Required Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| question_name | str | Unique identifier for the question |
| question_text | str | The medical question to analyze |
| claude_api_key | str | Your Claude API key |

Knowledge Graph Parameters (when use_knowledge_graph=True)

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| kg_uri | str | Required | Neo4j connection URI |
| kg_username | str | Required | Neo4j username |
| kg_password | str | Required | Neo4j password |
| kg_database | str | "spoke" | Neo4j database name |

Clinical Records Parameters (when use_clinical_records=True)

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| clinical_server | str | Required | SQL Server host |
| clinical_database | str | Required | EHR database name |
| clinical_username | str | Required | SQL Server username |
| clinical_password | str | Required | SQL Server password |

Optional Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| use_knowledge_graph | bool | True | Enable biomedical knowledge graph |
| use_clinical_records | bool | True | Enable clinical records access |
| output_dir | str | "." | Directory for result files |
| max_tokens | int | 20000 | Maximum response tokens (bounded by the model's output limit; 200k is Claude Sonnet 4's context window) |
| max_iterations | int | 50 | Maximum tool call iterations |
| log_level | str | "INFO" | Logging verbosity (DEBUG, INFO, WARNING, ERROR) |
| namespace | str | "MedCP" | Tool namespace prefix for MCP tools |
| stream_callback | callable | None | Custom callback function for real-time text streaming |

Return Value

{
    "success": bool,              # Whether analysis completed successfully
    "error": str,                 # Error message if success=False
    "response": str,              # Claude's complete response text
    "file_path": str,             # Path to saved markdown file
    "usage": {                    # Token usage statistics
        "input_tokens": int,      # Tokens sent to Claude
        "output_tokens": int,     # Tokens generated by Claude
        "total_tokens": int       # Combined token usage
    },
    "elapsed_time": float,        # Analysis duration in seconds (includes streaming time)
    "tool_calls": List[Dict]      # Details of each MCP tool call made during the analysis
}
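
The return value can be reduced to a one-line summary for logs or dashboards. This is a sketch that relies only on the keys documented above; the helper name summarize_result is hypothetical, and the .get() fallbacks are a defensive assumption in case a failed run omits some keys.

```python
def summarize_result(result):
    """Build a one-line summary from a run_medcp-style result dictionary."""
    if not result.get("success"):
        return f"FAILED: {result.get('error', 'unknown error')}"
    usage = result.get("usage", {})
    return (
        f"OK: {result.get('elapsed_time', 0.0):.2f}s, "
        f"{usage.get('total_tokens', 0):,} tokens, "
        f"{len(result.get('tool_calls', []))} tool calls, "
        f"saved to {result.get('file_path')}"
    )
```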

Available Tools

fMedCP provides Claude with these specialized medical tools:

Knowledge Graph Tools

get_knowledge_graph_schema()

  • Purpose: Discover available biomedical entities and relationships
  • Returns: Schema of nodes, properties, and relationship types
  • Use Case: Understanding what medical knowledge is available

query_knowledge_graph(cypher_query, parameters={})

  • Purpose: Execute read-only Cypher queries on biomedical knowledge
  • Parameters:
    • cypher_query: Cypher query string
    • parameters: Query parameters dictionary
  • Use Case: Drug interactions, pathway analysis, disease relationships
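
To illustrate how cypher_query and parameters fit together, the sketch below builds a parameterized query and checks that every $placeholder has a value. The node labels, relationship type, and property names are hypothetical stand-ins, so the real schema should be read from get_knowledge_graph_schema() first. The helper placeholder_names is likewise not part of fMedCP.

```python
import re


def placeholder_names(cypher_query):
    """Return the set of $name placeholders that appear in a Cypher query."""
    return set(re.findall(r"\$([A-Za-z_][A-Za-z0-9_]*)", cypher_query))


# Hypothetical labels and properties, used only to show the shape of a call.
cypher_query = (
    "MATCH (c:Compound)-[:TREATS]->(d:Disease) "
    "WHERE d.name = $disease_name "
    "RETURN c.name LIMIT $limit"
)
parameters = {"disease_name": "multiple sclerosis", "limit": 10}

# Placeholders without a matching key would fail at query time.
unbound = placeholder_names(cypher_query) - set(parameters)
```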

Clinical Records Tools

list_clinical_tables()

  • Purpose: List all available clinical data tables
  • Returns: Table schemas, names, and types
  • Use Case: Understanding available clinical data structure

query_clinical_records(sql_query)

  • Purpose: Execute read-only SQL queries on patient data
  • Parameters:
    • sql_query: SELECT statement for clinical data
  • Use Case: Patient demographics, diagnosis patterns, treatment outcomes

Example Questions

Here are examples of medical questions fMedCP can handle:

Clinical Epidemiology

result = run_medcp(
    question_name="ms_prevalence",
    question_text="What is the current prevalence of multiple sclerosis in our adult population?",
    # ... credentials ...
)

Drug Safety Analysis

result = run_medcp(
    question_name="drug_interactions",
    question_text="What are the contraindications and interactions for patients taking both warfarin and metformin?",
    # ... credentials ...
)

Treatment Outcomes

result = run_medcp(
    question_name="diabetes_outcomes",
    question_text="Compare treatment outcomes for Type 2 diabetes patients using metformin vs combination therapy in our patient population.",
    # ... credentials ...
)

Biomarker Analysis

result = run_medcp(
    question_name="biomarkers",
    question_text="What biomarkers are associated with cardiovascular risk in patients with chronic kidney disease?",
    # ... credentials ...
)
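
Several questions like these can be run in one pass with a small loop. The sketch below takes the runner as an argument, so it can be exercised without any credentials; run_batch is a hypothetical helper, and in real use runner would be run_medcp with the shared credentials passed as keyword arguments.

```python
def run_batch(questions, runner, **shared_kwargs):
    """Run each (name, text) pair through `runner` and key the results by name."""
    results = {}
    for question_name, question_text in questions:
        results[question_name] = runner(
            question_name=question_name,
            question_text=question_text,
            **shared_kwargs,
        )
    return results
```

For example, run_batch(questions, run_medcp, claude_api_key=..., output_dir="results") would write one markdown file per question name, assuming the behavior described under Output Format.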

Security Features

Read-Only Access

  • All database queries are strictly validated to prevent modifications
  • Knowledge graph queries filtered to exclude write operations (MERGE, CREATE, SET, DELETE)
  • Clinical queries limited to SELECT statements only
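
The README does not show how fMedCP implements these checks, so the following is only an illustration of the keyword-filtering idea. The function names are hypothetical, and REMOVE and DROP are added to the keyword list as an assumption beyond the four operations named above.

```python
import re

# Write keywords matched as whole words, case-insensitively.
CYPHER_WRITE_KEYWORDS = re.compile(
    r"\b(CREATE|MERGE|SET|DELETE|REMOVE|DROP)\b", re.IGNORECASE
)


def is_read_only_cypher(query):
    """Reject queries containing a write keyword as a standalone word."""
    return CYPHER_WRITE_KEYWORDS.search(query) is None


def is_single_select(sql):
    """Accept a single statement that begins with the SELECT keyword."""
    stripped = sql.strip().rstrip(";").strip()
    starts_with_select = re.match(r"SELECT\b", stripped, re.IGNORECASE) is not None
    return starts_with_select and ";" not in stripped
```

A filter like this is deliberately coarse: it rejects a harmless query whose string literal contains a write keyword, and it does not catch SELECT ... INTO, which creates a table in SQL Server. A database account with read-only permissions remains the stronger safeguard.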

Query Validation

  • SQL injection protection through parameterized queries
  • Cypher injection prevention with parameter binding
  • Automatic query sanitization and validation
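
Why parameter binding matters can be shown with a small stand-in. The sketch below uses Python's built-in sqlite3 module with an in-memory table rather than SQL Server, so the table, its rows, and the placeholder style (? here, whereas pymssql uses %s) are assumptions for illustration only.

```python
import sqlite3

# In-memory stand-in for a clinical table; not real patient data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE patients (id INTEGER, diagnosis TEXT)")
conn.executemany(
    "INSERT INTO patients VALUES (?, ?)",
    [(1, "diabetes"), (2, "asthma")],
)

hostile = "diabetes' OR '1'='1"

# Bound as data: the whole string is compared with the column, so nothing matches.
bound_rows = conn.execute(
    "SELECT id FROM patients WHERE diagnosis = ?", (hostile,)
).fetchall()

# Spliced into the SQL text: the quote closes the literal and the OR clause
# becomes part of the query, so every row matches.
spliced_rows = conn.execute(
    f"SELECT id FROM patients WHERE diagnosis = '{hostile}'"
).fetchall()

conn.close()
```

The bound query returns no rows, while the spliced query returns both, which is the behavior parameter binding is meant to prevent.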

Data Privacy

  • No patient data is stored or cached by the system
  • All connections use encrypted protocols
  • Credentials managed through environment variables

Output Format

Each analysis generates:

  1. Console Output: Real-time progress and final results
  2. Markdown File: Comprehensive report including:
    • Original question and timestamp
    • Claude's complete response
    • Detailed reasoning steps
    • Performance metrics
    • Tool usage summary

Sample Output Structure

results/
├── Q1.2.md                    # Question analysis results
├── drug_interactions.md       # Drug safety analysis
└── patient_demographics.md    # Population analysis

Troubleshooting

Common Issues

"APOC plugin not installed"

The biomedical knowledge graph requires the APOC plugin for Neo4j:

# After installing the APOC plugin, allow its procedures in neo4j.conf:
dbms.security.procedures.unrestricted=apoc.*

Connection timeouts

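The parameters documented above do not include a timeout setting. For large datasets, raise the token and iteration budgets so that a long analysis can run to completion: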

result = run_medcp(
    # ... other parameters ...
    max_tokens=30000,
    max_iterations=75
)

Rate limits

If you encounter Claude API rate limits:

  • Reduce max_tokens to 15000 or lower
  • Decrease max_iterations to 25-30 for simpler questions
  • The streaming implementation handles rate limits with automatic retries
  • Consider using selective data sources to reduce complexity
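
If a run still fails after the built-in retries, a caller can wrap the whole call in an outer retry with exponential backoff. This is a sketch under stated assumptions: run_with_backoff is a hypothetical helper, it relies only on the documented "success" key, and it retries on any failure rather than inspecting the error text for a rate-limit message.

```python
import time


def run_with_backoff(call, max_attempts=3, base_delay=2.0, sleep=time.sleep):
    """Call `call()` until its result reports success or attempts run out.

    Waits base_delay, 2 * base_delay, 4 * base_delay, ... seconds between
    attempts and returns the last result either way.
    """
    result = None
    for attempt in range(max_attempts):
        result = call()
        if result.get("success"):
            return result
        if attempt < max_attempts - 1:
            sleep(base_delay * (2 ** attempt))
    return result
```

Usage would look like run_with_backoff(lambda: run_medcp(...)). Each retry reruns the analysis from the start and consumes tokens again, so a small max_attempts is advisable.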

Error Handling

fMedCP provides detailed error messages for:

  • Missing credentials or configuration
  • Database connection failures (embedded MCP server handles reconnections)
  • Invalid query syntax with specific validation errors
  • API rate limit issues (automatic retry with streaming API)
  • Tool execution errors with embedded server diagnostics
  • Streaming interruptions and callback errors

Check the console output (with real-time streaming) and generated markdown files for specific error details. The embedded MCP server provides enhanced error reporting.

Best Practices

Question Formulation

  • Be specific about the clinical population or context
  • Include relevant time frames when applicable
  • Specify desired analysis depth or output format

Performance Optimization

  • Start with smaller token limits for exploratory questions and raise them only when responses are cut short
  • Use selective data sources (use_knowledge_graph=False or use_clinical_records=False) for focused analysis
  • Monitor token usage for cost management with real-time streaming feedback
  • Leverage the embedded MCP server's optimized performance for faster tool execution
  • Use custom streaming callbacks to process output in real-time for better user experience
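
Token usage can be tracked across runs by summing the usage dictionaries from each result. The helper total_usage below is a hypothetical sketch that relies only on the documented "usage" keys; results without usage data, such as failed runs, are assumed to count as zero.

```python
def total_usage(results):
    """Sum the token counts across an iterable of run_medcp-style results."""
    totals = {"input_tokens": 0, "output_tokens": 0, "total_tokens": 0}
    for result in results:
        usage = result.get("usage") or {}
        for key in totals:
            totals[key] += usage.get(key, 0)
    return totals
```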

Data Security

  • Store credentials in environment files, never in code
  • Use read-only database accounts when possible
  • Regularly rotate API keys and database passwords
  • Review generated outputs before sharing

Contributing

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Make your changes with appropriate tests
  4. Commit your changes (git commit -m 'Add amazing feature')
  5. Push to the branch (git push origin feature/amazing-feature)
  6. Open a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Support

For issues, questions, or contributions:

  • Issues: GitHub Issues
  • Documentation: This README and inline code documentation
  • Examples: See example.py for complete usage patterns

Acknowledgments

  • Built on Anthropic's Claude Sonnet 4 with streaming API support
  • Uses FastMCP for embedded Model Context Protocol server implementation
  • Integrates with Neo4j for biomedical knowledge graphs (requires APOC plugin)
  • Supports SQL Server for electronic health records
  • Enhanced with real-time streaming capabilities and optimized resource management

Disclaimer: This tool is designed for research and analytical purposes. All medical decisions should involve qualified healthcare professionals. The system provides informational analysis only and should not be used for direct patient care without proper medical oversight.
