Skip to content

Apple Intelligence

Apple Intelligence is the default provider for Human on Mac. It runs entirely on your device using Apple’s built-in foundation model — no API keys, no cloud accounts, no per-token billing.

  • Apple Silicon Mac (M1 or later)
  • macOS 26 (Tahoe) or newer
  • Apple Intelligence enabled in System Settings

No third-party dependencies — Human includes its own on-device inference server (human-ondevice) built on Apple’s FoundationModels framework via Network.framework.

If you installed Human via Homebrew, human-ondevice was built and installed automatically:

Terminal window
brew install human

Verify it works:

Terminal window
human-ondevice --help

If you built from source, build the on-device server separately:

Terminal window
cd apps/tools/human-ondevice
swift build -c release
cp .build/release/human-ondevice /usr/local/bin/

Apple Intelligence is selected by default when you run human onboard on a Mac:

Terminal window
human onboard

Or skip all prompts:

Terminal window
human onboard --apple

This creates ~/.human/config.json with:

{
"default_provider": "apple",
"default_model": "apple-foundationmodel"
}

Human talks to the on-device inference server’s OpenAI-compatible HTTP endpoint:

human agent → HTTP → human-ondevice → FoundationModels API → on-device LLM

All inference runs locally on the Neural Engine. No data leaves your machine. The macOS app also embeds the server in-process — no separate binary needed.

Human’s adaptive model router uses Apple Intelligence for reflexive-tier messages — quick replies, acknowledgments, and simple questions. More complex messages (emotional, analytical, multi-step reasoning) automatically route to a cloud provider if one is configured.

To configure a cloud fallback for complex messages:

{
"default_provider": "apple",
"default_model": "apple-foundationmodel",
"providers": [{ "name": "gemini" }],
"agent": {
"model_router": {
"conversational_model": "gemini-3-flash-preview",
"analytical_model": "gemini-3-flash-preview",
"deep_model": "gemini-3.1-pro-preview"
}
}
}

| Constraint | Detail | | -------------- | ------------------------------------------------------ | | Context window | 4,096 tokens (~3,000 words) | | Model | Single model (apple-foundationmodel), not configurable | | Guardrails | Apple’s safety system may block some prompts | | Speed | On-device inference, not cloud-scale | | No embeddings | Vector embeddings not available | | No vision | Image/multimodal input not supported |

| Field | Default | Description | | -------------------------------------- | ------------------------- | ------------------------------------------- | | default_provider | "apple" | Set to "apple" to use Apple Intelligence | | default_model | "apple-foundationmodel" | The on-device model name | | agent.model_router.on_device_model | "apple-foundationmodel" | Override the on-device model name | | agent.model_router.on_device_enabled | true (macOS) | Set to false to disable on-device routing |

“Model unavailable” — Ensure Apple Intelligence is enabled in System Settings > Apple Intelligence & Siri, and that the on-device server is running (human-ondevice or the macOS app).

Slow responses — On-device inference depends on your Mac’s Neural Engine. First responses may be slower while the model loads.

Guardrail blocks — Apple’s safety system is conservative. Try rephrasing the prompt.

Server not found — The on-device server listens on port 11435 by default. Check with curl http://127.0.0.1:11435/health. If using the third-party apfel instead, it runs on port 11434 and is also detected automatically.