Docker Model Runner, Docker MCP Toolkit, and Promptfoo

This repo contains a few examples of how to use Docker Model Runner, Docker MCP Toolkit, and Promptfoo together to compare models, evaluate MCP servers, and even perform LLM red-teaming from the comfort of your own dev machine.

Prerequisites

Enable Docker MCP Toolkit in Docker Desktop per https://docs.docker.com/ai/mcp-catalog-and-toolkit/get-started/#enable-docker-mcp-toolkit.
Enable Docker Model Runner in Docker Desktop or Docker Engine per https://docs.docker.com/ai/model-runner/#enable-docker-model-runner.
Use the Docker Model Runner CLI to pull the following models:

docker model pull ai/gemma3:4B-Q4_K_M
docker model pull ai/smollm3:Q4_K_M
docker model pull ai/mxbai-embed-large:335M-F16
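
The three pulls above can also be scripted. The following is a minimal sketch, assuming a POSIX shell; `pull_models` is a hypothetical helper name, not part of the Docker CLI, and it only wraps the `docker model pull` command shown above.

```shell
# Hypothetical helper (not part of the Docker CLI): pull each model named
# in the arguments with `docker model pull`, stopping at the first failure.
pull_models() {
  for model in "$@"; do
    docker model pull "$model" || return 1
  done
}
```

It could be called as `pull_models ai/gemma3:4B-Q4_K_M ai/smollm3:Q4_K_M ai/mxbai-embed-large:335M-F16`. Returning on the first failure keeps a failed pull from being buried under the output of later ones.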

Install Promptfoo

npm install -g promptfoo

Run the model comparison evaluation

export ANTHROPIC_API_KEY="<your_api_key_here>"
promptfoo eval -c promptfooconfig.comparison.yaml
promptfoo view
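
The commands above export `ANTHROPIC_API_KEY` before running the evaluation. A small guard can fail fast when the variable is missing. This is a minimal sketch, assuming a POSIX shell; `require_api_key` is a hypothetical helper name, not a Promptfoo feature.

```shell
# Hypothetical helper (not part of Promptfoo): succeed only when
# ANTHROPIC_API_KEY is set to a non-empty value.
require_api_key() {
  if [ -z "${ANTHROPIC_API_KEY:-}" ]; then
    echo "ANTHROPIC_API_KEY is not set" >&2
    return 1
  fi
}
```

One way to use it is `require_api_key && promptfoo eval -c promptfooconfig.comparison.yaml`, so the evaluation starts only when the key is present.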

Run the MCP Direct example

promptfoo eval -c promptfooconfig.mcp-direct.yaml

Run the MCP Red-Team example

export ANTHROPIC_API_KEY="<your_api_key_here>"
promptfoo redteam run -c promptfooconfig.mcp-repo-summarizer.yaml
