A Swift command-line HTTP server that exposes the Apple Intelligence system language model over a simple JSON REST API using Apple's FoundationModels framework. Run it locally on supported devices to perform on-device language model inference.
This package is intended for experimentation and local development with Apple Intelligence.
- macOS 26.0+ (or iOS 26.0+, iPadOS 26.0+, visionOS 26.0+)
- Swift 6.2+
- Apple Intelligence enabled on your device
- Xcode 17+ (for building)
Only devices that support Apple Intelligence can use this server. You must enable Apple Intelligence in System Settings.
```
apple-intelligence-foundation-server/
├── Package.swift        # Swift Package Manager configuration
├── Sources/
│   └── App/
│       └── main.swift   # Server implementation
└── README.md
```
- Clone or navigate to the project directory
- Resolve dependencies:

```bash
swift package resolve
```

- Run the server:

```bash
swift run
```

The server will start on:

```
http://localhost:8080
```
All responses are JSON. Errors are also returned as JSON with a consistent shape:
```json
{
  "error": "Human-readable error message"
}
```

| Method | Path | Description |
|---|---|---|
| POST | /inference | Run text generation via Apple Intelligence |
| GET | /health | Basic liveness/health check |
Send a prompt and receive a generated response from Apple Intelligence.
Request:
```bash
curl -X POST http://localhost:8080/inference \
  -H "Content-Type: application/json" \
  -d '{"prompt": "What is Swift programming?"}'
```

Request body:

```json
{
  "prompt": "Your prompt text here"
}
```

Successful response:

```json
{
  "response": "Generated text from Apple Intelligence..."
}
```

If the model is not available, you will receive an error response describing the issue.
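The request and response bodies above can be mirrored client-side with Codable. This is a minimal sketch: the field names match the documented API, but the type names are illustrative and not part of the server.

```swift
import Foundation

// Client-side mirrors of the documented JSON bodies.
// Field names match the API above; type names are illustrative.
struct InferenceRequest: Codable {
    let prompt: String
}

struct InferenceResponse: Codable {
    let response: String
}

struct APIError: Codable {
    let error: String
}

// Encode a request body ready to POST to /inference.
let requestData = try JSONEncoder().encode(
    InferenceRequest(prompt: "What is Swift programming?")
)
print(String(data: requestData, encoding: .utf8)!)
```

Decoding `InferenceResponse` on success and falling back to `APIError` on failure matches the consistent error shape described above.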
Health check endpoint to verify the server is running.
Request:
```bash
curl http://localhost:8080/health
```

Response:

```json
{
  "status": "ok"
}
```

- Web framework: Vapor 4.89.0
- AI integration: FoundationModels framework (Apple's on-device language model)
- Architecture: async/await with an actor-based inference service for concurrency safety
- Port: 8080 (default Vapor HTTP port)
- Context window: Up to 4,096 tokens per session (approximately 12,000-16,000 characters for English)
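The actor-based design mentioned above can be sketched as follows. This only illustrates the concurrency pattern, not the server's actual implementation: the commented-out session call shows roughly where the FoundationModels API would go (it requires macOS 26.0+, so the body is stubbed here).

```swift
// Sketch of the pattern: an actor serializes access to its mutable
// state, so concurrent HTTP handlers cannot race on a shared session.
actor InferenceService {
    private var completedRequests = 0  // state protected by the actor

    func generate(prompt: String) async -> String {
        // On macOS 26.0+ the body would be roughly:
        //   let session = LanguageModelSession()
        //   return try await session.respond(to: prompt).content
        // Stubbed so the pattern is runnable anywhere:
        completedRequests += 1
        return "reply #\(completedRequests) to: \(prompt)"
    }
}

let service = InferenceService()
let reply = await service.generate(prompt: "Hello")
print(reply)
```

Because every call hops onto the actor, two simultaneous `/inference` requests can never mutate the same session state at once.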
The Apple Intelligence system language model excels at:
- Text generation
- Summarization
- Entity extraction
- Creative writing
- Classification
Not suitable for: basic math, code generation, or complex logical reasoning
- Keep prompts focused and specific
- Limit prompt length for faster responses
- Use phrases like "in three sentences" to get concise responses
- Create a new session for each independent request
- Stay within the 4,096 token context limit
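One way to act on the last point is a simple client-side length guard, using the rough characters-per-token figure quoted above. The function name and threshold are illustrative, not part of the server.

```swift
// Hypothetical pre-flight check: ~4,096 tokens is roughly
// 12,000-16,000 English characters, so reject prompts past the
// conservative end of that range before sending them.
func fitsContextWindow(_ prompt: String, maxCharacters: Int = 12_000) -> Bool {
    prompt.count <= maxCharacters
}

print(fitsContextWindow("Summarize this paragraph in three sentences."))  // true
```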
If you get "Model is unavailable" errors:
- Device not eligible: Your device may not support Apple Intelligence. Check Apple's compatibility list.
- Apple Intelligence not enabled: Go to System Settings → Apple Intelligence and enable it.
- Model not ready: The model may still be downloading. Wait a few minutes and try again.
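These three conditions can also be distinguished programmatically. The sketch below uses the FoundationModels availability API as commonly documented (`SystemLanguageModel.default.availability`); treat the exact case names as assumptions, and note the framework itself needs macOS 26.0+, so the snippet falls back gracefully elsewhere.

```swift
#if canImport(FoundationModels)
import FoundationModels
#endif

// Map the three troubleshooting cases above to a diagnostic string.
func availabilityMessage() -> String {
    #if canImport(FoundationModels)
    if #available(macOS 26.0, iOS 26.0, *) {
        switch SystemLanguageModel.default.availability {
        case .available:
            return "Model is available"
        case .unavailable(let reason):
            // reason covers device eligibility, the Apple Intelligence
            // toggle, and the model still downloading
            return "Model is unavailable: \(reason)"
        @unknown default:
            return "Model availability is unknown"
        }
    }
    return "FoundationModels requires macOS 26.0 or later"
    #else
    return "FoundationModels is not present on this platform"
    #endif
}

print(availabilityMessage())
```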
If you encounter module import errors:
- Ensure you're running macOS 26.0+ or equivalent platform version
- Verify you're using Xcode 17+
- Re-resolve dependencies:

```bash
swift package resolve
```

- If necessary, clean and rebuild:

```bash
swift package clean
swift build
```
- FoundationModels Framework Documentation
- Generating content and performing tasks with Foundation Models
- Apple Intelligence
This project is licensed under the MIT License.
See the LICENSE file for details.