fMedCP - Functional Medical Context Protocol

A functional interface to MedCP that uses Claude Sonnet 4 through the streaming API and an embedded MCP server

fMedCP provides a streamlined Python interface for medical AI analysis by combining Claude Sonnet 4's advanced reasoning capabilities with specialized medical databases. It integrates electronic health records (EHR) and biomedical knowledge graphs to answer complex medical questions with evidence-based responses and real-time streaming output.

Features

Dual Database Integration: Query both clinical records (SQL Server) and biomedical knowledge graphs (Neo4j)
Claude Sonnet 4 Integration: Leverage the latest Claude Sonnet 4 model with streaming API for real-time responses
Read-Only Safety: All database queries are strictly read-only for data protection
Real-Time Streaming: Live output streaming with customizable callback functions
Embedded MCP Server: Integrated Model Context Protocol server with optimized resource management
Comprehensive Logging: Detailed reasoning steps and performance metrics with suppressed verbose output
Flexible Configuration: Enable/disable specific data sources as needed
Automatic Result Export: Responses saved to structured markdown files with enhanced formatting

Architecture

fMedCP consists of three main components:

Embedded MCP Server: Provides secure, read-only access to medical databases using FastMCP framework
Claude Sonnet 4 Integration: Orchestrates AI-driven medical analysis with streaming API support
Resource Management: Automatic connection pooling, cleanup, and optimization for production use

The system uses an embedded Model Context Protocol (MCP) server to safely expose medical data to Claude while maintaining strict security controls and optimal performance through in-process communication.

Prerequisites

Dependencies

pip install "anthropic>=0.7.0" "neo4j>=5.28.2" "pymssql>=2.3.7" "fastmcp>=2.11.2" "python-dotenv>=1.0.0" "pydantic>=2.0.0" "mcp>=1.0.0" asyncio-mqtt

Required Services

Claude API: Get your API key from Anthropic Console - requires access to Claude Sonnet 4
Neo4j Database: Biomedical knowledge graph (requires APOC plugin for schema introspection)
SQL Server: Electronic health records database (read-only access sufficient)

Database Requirements

Knowledge Graph (Neo4j)

APOC Plugin: Required for schema introspection
Database: Default expects "spoke" database
Access: Read-only Cypher queries supported

Clinical Records (SQL Server)

Tables: Must have accessible clinical data tables
Permissions: Read-only SELECT permissions required
Connection: Standard SQL Server connection parameters

Installation

fMedCP can be installed using either uv (recommended for faster installation) or pip. Both methods are fully supported.

Method 1: Using uv (Recommended)

uv is a fast Python package manager that can significantly speed up installation and dependency resolution.

Installing uv

First, install uv if you don't have it already:

On macOS and Linux:

curl -LsSf https://astral.sh/uv/install.sh | sh

On Windows (PowerShell):

powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"

Alternative: Using pip

pip install uv

Alternative: Using Homebrew (macOS)

brew install uv

Installing fMedCP with uv

Clone the repository:

git clone https://github.com/yourusername/fMedCP.git
cd fMedCP

Create and activate a virtual environment:

uv venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

Install dependencies with uv:

uv add "anthropic>=0.7.0" "neo4j>=5.28.2" "pymssql>=2.3.7" "fastmcp>=2.11.2" "python-dotenv>=1.0.0" "pydantic>=2.0.0" "mcp>=1.0.0" asyncio-mqtt

Create environment configuration:

cp example.env .env

Method 2: Using pip (Traditional)

Installing fMedCP with pip

Clone the repository:

git clone https://github.com/yourusername/fMedCP.git
cd fMedCP

Create and activate a virtual environment (recommended):

python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

Install dependencies with pip:

pip install "anthropic>=0.7.0" "neo4j>=5.28.2" "pymssql>=2.3.7" "fastmcp>=2.11.2" "python-dotenv>=1.0.0" "pydantic>=2.0.0" "mcp>=1.0.0" asyncio-mqtt

Create environment configuration:

cp example.env .env

Alternative: Using requirements.txt

For either method, you can also use a requirements file:

Create requirements.txt:

anthropic>=0.7.0
neo4j>=5.28.2
pymssql>=2.3.7
fastmcp>=2.11.2
python-dotenv>=1.0.0
pydantic>=2.0.0
mcp>=1.0.0
asyncio-mqtt

Install with uv:

uv add -r requirements.txt

Install with pip:

pip install -r requirements.txt

Performance Comparison

Method	Installation Time	Dependency Resolution	Virtual Environment
uv	~10-30 seconds	Very fast	Automatic
pip	~2-5 minutes	Standard	Manual setup recommended

Recommendation: Use uv for faster installation and better dependency management, especially in development environments.

Configuration

Regardless of installation method, configure your .env file with your credentials:

# Claude API Configuration
CLAUDE_API_KEY=your_claude_api_key_here

# Neo4j Knowledge Graph Configuration
KNOWLEDGE_GRAPH_URI=neo4j://localhost:7687
KNOWLEDGE_GRAPH_USERNAME=your_kg_username
KNOWLEDGE_GRAPH_PASSWORD=your_kg_password
KNOWLEDGE_GRAPH_DATABASE=spoke

# SQL Server Clinical Records Configuration
CLINICAL_RECORDS_SERVER=your_clinical_server_address
CLINICAL_RECORDS_DATABASE=your_clinical_database_name
CLINICAL_RECORDS_USERNAME=your_clinical_username
CLINICAL_RECORDS_PASSWORD=your_clinical_password

# MedCP Configuration
MEDCP_LOG_LEVEL=INFO
MEDCP_NAMESPACE=MedCP
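
Before calling run_medcp(), it can help to confirm that every variable needed by the enabled data sources is actually set. The following is a minimal sketch, not part of fMedCP: the helper name `missing_settings` is hypothetical, and the variable names are the ones from the .env listing above. KNOWLEDGE_GRAPH_DATABASE is left out because it has a default ("spoke").

```python
import os

# Variable names as listed in the .env example above.
ALWAYS_REQUIRED = ["CLAUDE_API_KEY"]
KG_REQUIRED = [
    "KNOWLEDGE_GRAPH_URI",
    "KNOWLEDGE_GRAPH_USERNAME",
    "KNOWLEDGE_GRAPH_PASSWORD",
]
CLINICAL_REQUIRED = [
    "CLINICAL_RECORDS_SERVER",
    "CLINICAL_RECORDS_DATABASE",
    "CLINICAL_RECORDS_USERNAME",
    "CLINICAL_RECORDS_PASSWORD",
]


def missing_settings(env=None, use_knowledge_graph=True, use_clinical_records=True):
    """Return the names of required variables that are unset or empty.

    Hypothetical helper for illustration only.
    """
    env = os.environ if env is None else env
    required = list(ALWAYS_REQUIRED)
    if use_knowledge_graph:
        required += KG_REQUIRED
    if use_clinical_records:
        required += CLINICAL_REQUIRED
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    missing = missing_settings()
    if missing:
        print("Missing settings:", ", ".join(missing))
    else:
        print("All required settings are present.")
```

Passing the same `use_knowledge_graph` and `use_clinical_records` flags that will later go to run_medcp() keeps the check aligned with the data sources in use.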

Verifying Installation

Test your installation by running the example:

python example.py

If successful, you should see streaming output from Claude analyzing your medical question.

Usage

Basic Example with Streaming

The provided example.py demonstrates a complete workflow with real-time streaming:

import os
from dotenv import load_dotenv
from MedCP import run_medcp

# Load environment variables from .env file
load_dotenv()

# Run a medical question analysis with streaming output
result = run_medcp(
    question_name="test",
    question_text="How many patients with diabetes were prescribed metformin in 2022?",
    claude_api_key=os.getenv("CLAUDE_API_KEY"),
    
    # Knowledge graph configuration
    kg_uri=os.getenv("KNOWLEDGE_GRAPH_URI"),
    kg_username=os.getenv("KNOWLEDGE_GRAPH_USERNAME"),
    kg_password=os.getenv("KNOWLEDGE_GRAPH_PASSWORD"),
    kg_database=os.getenv("KNOWLEDGE_GRAPH_DATABASE", "spoke"),
    
    # Clinical records configuration
    clinical_server=os.getenv("CLINICAL_RECORDS_SERVER"),
    clinical_database=os.getenv("CLINICAL_RECORDS_DATABASE"),
    clinical_username=os.getenv("CLINICAL_RECORDS_USERNAME"),
    clinical_password=os.getenv("CLINICAL_RECORDS_PASSWORD"),
    
    # Optional parameters
    log_level=os.getenv("MEDCP_LOG_LEVEL", "INFO"),
    namespace=os.getenv("MEDCP_NAMESPACE", "MedCP"),
    max_tokens=20000,
    max_iterations=50,
    use_clinical_records=True,
    use_knowledge_graph=True,
    output_dir="results"
)

# Check results with enhanced output
if result["success"]:
    print(f"✅ Analysis complete! Results saved to: {result['file_path']}")
    print(f"📊 Performance: {result['elapsed_time']:.2f}s, {result['usage']['total_tokens']:,} tokens")
    print(f"🔧 Tools used: {len(result['tool_calls'])} calls")
else:
    print(f"❌ Error: {result['error']}")

Running the Example

Set up your environment:

# Copy and configure environment file
cp example.env .env
# Edit .env with your actual credentials

Run the example:

python example.py

View results:

Real-time streaming: See Claude's analysis as it happens with live text output
Console metrics: Performance data including token usage and tool calls
Detailed results: Saved to results/test.md with complete reasoning steps
Enhanced formatting: Structured markdown with performance metrics and tool usage summary

Advanced Configuration

Selective Data Source Usage

You can enable/disable specific data sources:

# Use only knowledge graph (no clinical records)
result = run_medcp(
    question_name="drug_interactions",
    question_text="What are the known interactions between warfarin and NSAIDs?",
    # ... credentials ...
    use_knowledge_graph=True,
    use_clinical_records=False
)

# Use only clinical records (no knowledge graph)
result = run_medcp(
    question_name="patient_demographics",
    question_text="What is the age distribution of patients with diabetes?",
    # ... credentials ...
    use_knowledge_graph=False,
    use_clinical_records=True
)

Custom Output Directory

# Organize results by date
from datetime import datetime
output_dir = f"results/{datetime.now().strftime('%Y-%m-%d')}"

result = run_medcp(
    question_name="daily_analysis",
    question_text="Your medical question here",
    # ... credentials ...
    output_dir=output_dir
)

Custom Streaming Callbacks

# Custom streaming callback for processing output in real-time
def my_stream_handler(text_chunk):
    # Process each chunk as it arrives
    print(f"[STREAM] {text_chunk}", end="")
    # Could log to file, send to UI, etc.

result = run_medcp(
    question_name="streaming_analysis",
    question_text="What are the biomarkers for early Alzheimer's detection?",
    # ... credentials ...
    stream_callback=my_stream_handler  # Custom callback function
)
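
A callback can also keep the streamed text for later use. The sketch below assumes, as in `my_stream_handler` above, that the callback receives one string per chunk; the class name `ChunkCollector` is hypothetical and the loop merely stands in for chunks arriving from the stream.

```python
class ChunkCollector:
    """Callable that stores each streamed chunk and optionally echoes it."""

    def __init__(self, echo=True):
        self.chunks = []
        self.echo = echo

    def __call__(self, text_chunk):
        # Keep a copy so the full text can be rebuilt after the run.
        self.chunks.append(text_chunk)
        if self.echo:
            print(text_chunk, end="", flush=True)

    @property
    def text(self):
        """All chunks received so far, joined in arrival order."""
        return "".join(self.chunks)


collector = ChunkCollector(echo=False)
for piece in ["Amyloid ", "and tau ", "markers"]:  # stand-in for streamed chunks
    collector(piece)
```

An instance can be passed as `stream_callback=collector`; afterwards `collector.text` holds everything that was streamed.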

Performance Tuning

# For complex questions requiring extensive analysis
result = run_medcp(
    question_name="complex_analysis",
    question_text="Comprehensive drug-disease interaction analysis for polypharmacy patients",
    # ... credentials ...
    max_tokens=30000,          # Allow longer responses (must stay within the model's output-token limit; 200k is Claude Sonnet 4's context window)
    max_iterations=75,         # More tool calls for thorough analysis
    log_level="DEBUG"          # Detailed logging for debugging
)

API Reference

Main Function: `run_medcp()`

Required Parameters

Parameter	Type	Description
`question_name`	str	Unique identifier for the question
`question_text`	str	The medical question to analyze
`claude_api_key`	str	Your Claude API key

Knowledge Graph Parameters (when `use_knowledge_graph=True`)

Parameter	Type	Default	Description
`kg_uri`	str	Required	Neo4j connection URI
`kg_username`	str	Required	Neo4j username
`kg_password`	str	Required	Neo4j password
`kg_database`	str	"spoke"	Neo4j database name

Clinical Records Parameters (when `use_clinical_records=True`)

Parameter	Type	Default	Description
`clinical_server`	str	Required	SQL Server host
`clinical_database`	str	Required	EHR database name
`clinical_username`	str	Required	SQL Server username
`clinical_password`	str	Required	SQL Server password

Optional Parameters

Parameter	Type	Default	Description
`use_knowledge_graph`	bool	True	Enable biomedical knowledge graph
`use_clinical_records`	bool	True	Enable clinical records access
`output_dir`	str	"."	Directory for result files
`max_tokens`	int	20000	Maximum response tokens (bounded by the model's output-token limit; 200k is Claude Sonnet 4's context window)
`max_iterations`	int	50	Maximum tool call iterations
`log_level`	str	"INFO"	Logging verbosity (DEBUG, INFO, WARNING, ERROR)
`namespace`	str	"MedCP"	Tool namespace prefix for MCP tools
`stream_callback`	function	None	Custom callback function for real-time text streaming

Return Value

{
    "success": bool,              # Whether analysis completed successfully
    "error": str,                 # Error message if success=False
    "response": str,              # Claude Sonnet 4's complete response with streaming
    "file_path": str,             # Path to saved markdown file
    "usage": {                    # Enhanced token usage statistics
        "input_tokens": int,      # Tokens sent to Claude
        "output_tokens": int,     # Tokens generated by Claude
        "total_tokens": int       # Combined token usage
    },
    "elapsed_time": float,        # Analysis duration in seconds (includes streaming time)
    "tool_calls": List[Dict]      # Details of all MCP tool calls made with embedded server
}
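
As an illustration of consuming this structure, the following sketch builds a one-line summary from the documented keys. The function name `summarize_result` is hypothetical, and `.get()` is used throughout because the documentation does not state which keys are present when `success` is False.

```python
def summarize_result(result):
    """Build a one-line summary from a run_medcp-style result dict."""
    if not result.get("success"):
        return f"failed: {result.get('error', 'unknown error')}"
    usage = result.get("usage", {})
    total = usage.get("total_tokens")
    if total is None:
        # Fall back to summing the two documented components.
        total = usage.get("input_tokens", 0) + usage.get("output_tokens", 0)
    calls = len(result.get("tool_calls", []))
    elapsed = result.get("elapsed_time", 0.0)
    return f"ok: {total:,} tokens, {calls} tool calls, {elapsed:.2f}s -> {result.get('file_path')}"


# Hand-written sample values, not output from a real run.
example = {
    "success": True,
    "error": "",
    "response": "...",
    "file_path": "results/test.md",
    "usage": {"input_tokens": 1200, "output_tokens": 300, "total_tokens": 1500},
    "elapsed_time": 12.5,
    "tool_calls": [{}, {}],
}
print(summarize_result(example))  # ok: 1,500 tokens, 2 tool calls, 12.50s -> results/test.md
```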

Available Tools

fMedCP provides Claude with these specialized medical tools:

Knowledge Graph Tools

`get_knowledge_graph_schema()`

Purpose: Discover available biomedical entities and relationships
Returns: Schema of nodes, properties, and relationship types
Use Case: Understanding what medical knowledge is available

`query_knowledge_graph(cypher_query, parameters={})`

Purpose: Execute read-only Cypher queries on biomedical knowledge
Parameters:
- cypher_query: Cypher query string
- parameters: Query parameters dictionary
Use Case: Drug interactions, pathway analysis, disease relationships

Clinical Records Tools

`list_clinical_tables()`

Purpose: List all available clinical data tables
Returns: Table schemas, names, and types
Use Case: Understanding available clinical data structure

`query_clinical_records(sql_query)`

Purpose: Execute read-only SQL queries on patient data
Parameters:
- sql_query: SELECT statement for clinical data
Use Case: Patient demographics, diagnosis patterns, treatment outcomes

Example Questions

Here are examples of medical questions fMedCP can handle:

Clinical Epidemiology

result = run_medcp(
    question_name="ms_prevalence",
    question_text="What is the current prevalence of multiple sclerosis in our adult population?",
    # ... credentials ...
)

Drug Safety Analysis

result = run_medcp(
    question_name="drug_interactions",
    question_text="What are the contraindications and interactions for patients taking both warfarin and metformin?",
    # ... credentials ...
)

Treatment Outcomes

result = run_medcp(
    question_name="diabetes_outcomes",
    question_text="Compare treatment outcomes for Type 2 diabetes patients using metformin vs combination therapy in our patient population.",
    # ... credentials ...
)

Biomarker Analysis

result = run_medcp(
    question_name="biomarkers",
    question_text="What biomarkers are associated with cardiovascular risk in patients with chronic kidney disease?",
    # ... credentials ...
)
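
Several questions like these can be run with one shared set of connection settings. The following is a minimal sketch under stated assumptions: `run_batch` is a hypothetical helper, and `fake_runner` stands in for run_medcp() so the snippet runs without API or database access.

```python
def run_batch(questions, runner, **shared_kwargs):
    """Run each question through runner and collect the results by name.

    questions maps question_name -> question_text; shared_kwargs holds the
    settings every call has in common (credentials, output_dir, and so on).
    """
    results = {}
    for name, text in questions.items():
        results[name] = runner(question_name=name, question_text=text, **shared_kwargs)
    return results


def fake_runner(question_name, question_text, **kwargs):
    """Stand-in for run_medcp that only echoes its inputs."""
    return {"success": True, "file_path": f"{kwargs.get('output_dir', '.')}/{question_name}.md"}


questions = {
    "ms_prevalence": "What is the current prevalence of multiple sclerosis in our adult population?",
    "biomarkers": "What biomarkers are associated with cardiovascular risk in patients with chronic kidney disease?",
}
batch = run_batch(questions, fake_runner, output_dir="results")
```

With real credentials, `fake_runner` would be replaced by `run_medcp` and the connection parameters passed as keyword arguments.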

Security Features

Read-Only Access

All database queries are strictly validated to prevent modifications
Knowledge graph queries filtered to exclude write operations (MERGE, CREATE, SET, DELETE)
Clinical queries limited to SELECT statements only
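
The kind of filtering described above can be sketched as follows. This is an illustration only, not fMedCP's actual validator: the function names are hypothetical, REMOVE and DROP are added to the keyword list as an assumption, and the `patients` table in the usage lines is made up. The checks are deliberately conservative (a keyword inside a string literal is rejected, as are CTE queries that begin with WITH), and they do not catch everything (for example SELECT ... INTO in SQL Server), so a read-only database account remains the stronger safeguard.

```python
import re

# Write clauses named in the list above, plus REMOVE and DROP as an assumption.
CYPHER_WRITE_KEYWORDS = ("MERGE", "CREATE", "SET", "DELETE", "REMOVE", "DROP")
_WRITE_PATTERN = re.compile(
    r"\b(" + "|".join(CYPHER_WRITE_KEYWORDS) + r")\b", re.IGNORECASE
)


def is_read_only_cypher(query):
    """Return True if no write keyword appears anywhere in the query."""
    return _WRITE_PATTERN.search(query) is None


def is_select_only_sql(query):
    """Return True for a single statement that starts with SELECT."""
    statement = query.strip().rstrip(";").strip()
    if ";" in statement:  # reject stacked statements
        return False
    return re.match(r"SELECT\b", statement, re.IGNORECASE) is not None


print(is_read_only_cypher("MATCH (d:Disease) RETURN d.name LIMIT 5"))
print(is_read_only_cypher("MATCH (n) DETACH DELETE n"))
print(is_select_only_sql("SELECT COUNT(*) FROM patients;"))
```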

Query Validation

SQL injection protection through parameterized queries
Cypher injection prevention with parameter binding
Automatic query sanitization and validation
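
Parameter binding keeps user-supplied values out of the query text: the Cypher string contains `$name` placeholders and the values travel separately in the `parameters` dictionary of query_knowledge_graph(). The sketch below checks that every placeholder has a binding before a query is sent. The helper `unbound_parameters` is hypothetical, and the node labels and property names are illustrative rather than taken from a real schema.

```python
import re

# Matches $name style placeholders in a Cypher query string.
PLACEHOLDER = re.compile(r"\$([A-Za-z_][A-Za-z0-9_]*)")


def unbound_parameters(cypher_query, parameters):
    """Return placeholder names used in the query but absent from parameters."""
    used = set(PLACEHOLDER.findall(cypher_query))
    return sorted(used - set(parameters))


# Labels and properties below are illustrative only.
query = (
    "MATCH (c:Compound)-[r]-(d:Disease) "
    "WHERE c.name = $drug AND d.name = $disease "
    "RETURN type(r)"
)
params = {"drug": "warfarin"}
print(unbound_parameters(query, params))  # ['disease']
```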

Data Privacy

No patient data is stored or cached by the system
All connections use encrypted protocols
Credentials managed through environment variables

Output Format

Each analysis generates:

Console Output: Real-time progress and final results
Markdown File: Comprehensive report including:
- Original question and timestamp
- Claude's complete response
- Detailed reasoning steps
- Performance metrics
- Tool usage summary

Sample Output Structure

results/
├── Q1.2.md                    # Question analysis results
├── drug_interactions.md       # Drug safety analysis
└── patient_demographics.md    # Population analysis

Troubleshooting

Common Issues

"APOC plugin not installed"

The biomedical knowledge graph requires the APOC plugin for Neo4j:

# Install APOC plugin in Neo4j
# Add to neo4j.conf:
dbms.security.procedures.unrestricted=apoc.*

Connection timeouts

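The parameters documented for run_medcp() do not include a timeout setting. For large datasets, give the analysis a larger token and iteration budget instead: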

result = run_medcp(
    # ... other parameters ...
    max_tokens=30000,
    max_iterations=75
)

Rate limits

If you encounter Claude API rate limits:

Reduce max_tokens to 15000 or lower (Claude Sonnet 4 has generous limits)
Decrease max_iterations to 25-30 for simpler questions
The streaming implementation handles rate limits with automatic retries
Consider using selective data sources to reduce complexity
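
If retries are also wanted at the caller level, for example around a long batch, exponential backoff is a common pattern. The sketch below is generic and hypothetical (`call_with_backoff` is not part of fMedCP). Because the documentation does not say how a rate-limit failure is distinguished in the result, it retries on any result whose "success" is falsy.

```python
import time


def call_with_backoff(func, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call func() until it returns a result whose "success" is truthy.

    Waits base_delay * 2**attempt seconds between attempts and returns
    the last result if every attempt fails.
    """
    result = None
    for attempt in range(max_attempts):
        result = func()
        if result.get("success"):
            return result
        if attempt < max_attempts - 1:
            # Double the wait after each failed attempt.
            sleep(base_delay * (2 ** attempt))
    return result
```

A call would look like `call_with_backoff(lambda: run_medcp(...))`. The `sleep` argument exists so the delay can be replaced in tests.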

Error Handling

fMedCP provides detailed error messages for:

Missing credentials or configuration
Database connection failures (embedded MCP server handles reconnections)
Invalid query syntax with specific validation errors
API rate limit issues (automatic retry with streaming API)
Tool execution errors with embedded server diagnostics
Streaming interruptions and callback errors

Check the console output (with real-time streaming) and generated markdown files for specific error details. The embedded MCP server provides enhanced error reporting.

Best Practices

Question Formulation

Be specific about the clinical population or context
Include relevant time frames when applicable
Specify desired analysis depth or output format

Performance Optimization

Start with smaller token limits for exploratory questions
Use selective data sources (use_knowledge_graph=False or use_clinical_records=False) for focused analysis
Monitor token usage for cost management with real-time streaming feedback
Leverage the embedded MCP server's optimized performance for faster tool execution
Use custom streaming callbacks to process output in real-time for better user experience

Data Security

Store credentials in environment files, never in code
Use read-only database accounts when possible
Regularly rotate API keys and database passwords
Review generated outputs before sharing

Contributing

Fork the repository
Create a feature branch (git checkout -b feature/amazing-feature)
Make your changes with appropriate tests
Commit your changes (git commit -m 'Add amazing feature')
Push to the branch (git push origin feature/amazing-feature)
Open a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Support

For issues, questions, or contributions:

Issues: GitHub Issues
Documentation: This README and inline code documentation
Examples: See example.py for complete usage patterns

Acknowledgments

Built on Anthropic's Claude Sonnet 4 with streaming API support
Uses FastMCP for embedded Model Context Protocol server implementation
Integrates with Neo4j for biomedical knowledge graphs (requires APOC plugin)
Supports SQL Server for electronic health records
Enhanced with real-time streaming capabilities and optimized resource management

Disclaimer: This tool is designed for research and analytical purposes. All medical decisions should involve qualified healthcare professionals. The system provides informational analysis only and should not be used for direct patient care without proper medical oversight.
