This guide helps you install, run, and integrate Microsoft Foundry Local on Windows and macOS. All steps and commands are validated against Microsoft Learn docs.
- Get Started: https://learn.microsoft.com/azure/ai-foundry/foundry-local/get-started
- Architecture: https://learn.microsoft.com/azure/ai-foundry/foundry-local/concepts/foundry-local-architecture
- CLI Reference: https://learn.microsoft.com/azure/ai-foundry/foundry-local/reference/reference-cli
- Integrate SDKs: https://learn.microsoft.com/azure/ai-foundry/foundry-local/how-to/how-to-integrate-with-inference-sdks
- Compile HF Models (BYOM): https://learn.microsoft.com/azure/ai-foundry/foundry-local/how-to/how-to-compile-hugging-face-models
- Windows AI: Local vs Cloud: https://learn.microsoft.com/windows/ai/cloud-ai#key-decision-factors-for-app-developers
- Install:
  ```
  winget install Microsoft.FoundryLocal
  ```
- Upgrade:
  ```
  winget upgrade --id Microsoft.FoundryLocal
  ```
- Version check:
  ```
  foundry --version
  ```

Install / Mac
macOS: Open a terminal and run the following commands:

```
brew tap microsoft/foundrylocal
brew install foundrylocal
```

- Model:
  ```
  foundry model --help
  foundry model list
  foundry model run gpt-oss-20b
  ```
- Service:
  ```
  foundry service --help
  foundry service status
  foundry service ps
  ```
- Cache:
  ```
  foundry cache --help
  foundry cache list
  ```

Notes:
- The service exposes an OpenAI-compatible REST API. The endpoint port is dynamically allocated; run `foundry service status` to discover it.
- Use the SDKs for convenience; they handle endpoint discovery automatically where supported.
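To show the SDK-based discovery mentioned above, here is a minimal Python sketch assuming the `foundry-local-sdk` and `openai` packages (see the SDK integration link). The `openai_base_url` helper is a hypothetical convenience added here, and the manager attributes should be checked against the SDK reference:

```python
def openai_base_url(endpoint: str) -> str:
    # Hypothetical helper: normalize the reported endpoint so it ends in /v1,
    # which the OpenAI client expects as its base_url.
    endpoint = endpoint.rstrip("/")
    return endpoint if endpoint.endswith("/v1") else endpoint + "/v1"

def connect(alias: str = "gpt-oss-20b"):
    # Imports are local so the sketch reads without the packages installed:
    #   pip install foundry-local-sdk openai
    from foundry_local import FoundryLocalManager
    from openai import OpenAI

    manager = FoundryLocalManager(alias)  # starts the service / loads the model if needed
    client = OpenAI(base_url=openai_base_url(manager.endpoint), api_key=manager.api_key)
    return client, manager.get_model_info(alias).id

if __name__ == "__main__":
    try:
        client, model_id = connect()
        resp = client.chat.completions.create(
            model=model_id,
            messages=[{"role": "user", "content": "Hello from the SDK sketch."}],
        )
        print(resp.choices[0].message.content)
    except ImportError:
        print("Install foundry-local-sdk and openai to run this sketch.")
```

Because the manager reports the endpoint itself, no port needs to be hard-coded even though the service picks one dynamically.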
Foundry Local assigns a dynamic port each time the service starts:

```
foundry service status
```

Use the reported http://localhost:<PORT> as your base_url with OpenAI-compatible paths (for example, /v1/chat/completions).
```
export BASE_URL=http://localhost:PORT   # substitute the port reported by `foundry service status`
python - <<PY
import os
from openai import OpenAI

client = OpenAI(base_url=os.environ["BASE_URL"] + "/v1", api_key="")
resp = client.chat.completions.create(
    model="gpt-oss-20b",
    messages=[{"role": "user", "content": "Say hello from Foundry Local."}],
    max_tokens=64,
)
print(resp.choices[0].message.content)
PY
```

References:
- SDK Integration: https://learn.microsoft.com/azure/ai-foundry/foundry-local/how-to/how-to-integrate-with-inference-sdks
If you need a model not in the catalog, compile it to ONNX for Foundry Local using Olive.
High-level flow (see docs for steps):
```
foundry cache cd models
foundry cache list
foundry model run llama-3.2 --verbose
```

Docs:
- BYOM compile: https://learn.microsoft.com/azure/ai-foundry/foundry-local/how-to/how-to-compile-hugging-face-models
- Check service status and logs:
  ```
  foundry service status
  foundry service diag
  ```
- Clear or move cache:
  ```
  foundry cache list
  foundry cache remove <model>
  foundry cache cd <path>
  ```
- Update to latest preview:
  ```
  winget upgrade --id Microsoft.FoundryLocal
  ```
- Windows local vs cloud AI choices, including Foundry Local and Windows ML: https://learn.microsoft.com/windows/ai/cloud-ai#key-decision-factors-for-app-developers
- VS Code AI Toolkit with Foundry Local (run `foundry service status` to get the chat endpoint URL): https://learn.microsoft.com/azure/ai-foundry/foundry-local/concepts/foundry-local-architecture#key-components