Apple Intelligence: Inside the New Apple Foundation Model
Apple Intelligence, the on‑device AI suite debuting with the iOS 18.1 beta, is built around the Apple Foundation Model: a 3‑billion‑parameter on‑device LLM, paired with a larger undisclosed cloud variant, trained entirely on Google TPUs and refined with novel RL algorithms and mixed‑precision quantization. It powers the revamped Siri, writing assistance, and natural‑language photo search, surpasses GPT‑4 on several benchmarks, and is currently limited to paid developer accounts.
Apple has launched Apple Intelligence, an on‑device AI suite available to developers with the iOS 18.1 beta. The rollout includes a revamped Siri that supports both voice and text interaction, a writing assistant that can polish tweets and comments, and a photo search feature powered by natural‑language queries.
The core of Apple Intelligence is the Apple Foundation Model (AFM), a family of large language models. The on‑device version has roughly 3 B parameters, while a larger cloud version is kept undisclosed. Both models use a 32 k context window.
AFM is trained exclusively on Google TPU hardware (8192 TPUv4 chips for the cloud model and 2048 TPUv5p chips for the on‑device model), with no Nvidia GPUs involved. Training is performed with Apple’s JAX‑based AXLearn framework, employing tensor‑parallelism and pipeline‑parallelism.
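To make the tensor‑parallelism idea concrete, here is a minimal pure‑Python sketch of column‑parallel matrix multiplication: the weight matrix of a linear layer is split column‑wise into shards, each shard computes its slice of the output, and the slices are concatenated. The function names and layout here are illustrative only, not Apple's AXLearn API.

```python
def matmul(x, w):
    """Multiply a vector x (length k) by a k x n weight matrix w."""
    n = len(w[0])
    return [sum(x[i] * w[i][j] for i in range(len(x))) for j in range(n)]

def shard_columns(w, num_shards):
    """Split the weight matrix's columns into contiguous shards."""
    n = len(w[0])
    step = n // num_shards
    return [[row[s * step:(s + 1) * step] for row in w] for s in range(num_shards)]

def parallel_matmul(x, w, num_shards):
    """Column-parallel matmul: each shard's output slice is computed
    independently (on separate chips, in a real system), then concatenated."""
    out = []
    for shard in shard_columns(w, num_shards):
        out.extend(matmul(x, shard))
    return out

x = [1.0, 2.0]
w = [[1.0, 2.0, 3.0, 4.0],
     [5.0, 6.0, 7.0, 8.0]]
assert parallel_matmul(x, w, 2) == matmul(x, w)  # sharded result matches the full matmul
```

Pipeline parallelism is complementary: instead of splitting one layer's weights, it places consecutive layers on different chips and streams micro‑batches through them.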
Data for pre‑training comes from Applebot‑crawled web pages and publicly licensed code and math datasets, all under permissive licenses (MIT, Apache, CC0). The training pipeline consists of three stages: core training (6.3 T tokens, 4096‑token window), continued training (1 T tokens, 8192‑token window) and context‑extension (up to 32 k tokens, 100 B tokens).
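The three stages above can be written out as a small config table; summing the stated budgets gives the total pre‑training token count (the stage names are shorthand from this article, not official identifiers):

```python
# Pre-training stages as described: token budget and context window per stage.
STAGES = [
    {"name": "core",              "tokens": 6.3e12, "context": 4096},
    {"name": "continued",         "tokens": 1.0e12, "context": 8192},
    {"name": "context-extension", "tokens": 1.0e11, "context": 32768},
]

total = sum(s["tokens"] for s in STAGES)
print(f"total pre-training tokens: {total / 1e12:.1f}T")  # 7.4T
```

Note how the token budget shrinks roughly an order of magnitude per stage while the context window grows, which keeps the expensive long‑context training to a small fraction of total compute.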
After pre‑training, AFM undergoes supervised fine‑tuning (SFT) and reinforcement learning from human feedback (RLHF). Apple introduced two novel RL algorithms: iTeC (Iterative Teaching Committee) and MDLOO (Mirror Descent with Leave‑One‑Out advantage estimation), which together combine preference‑based optimization, DPO, and online policy updates.
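The leave‑one‑out idea behind MDLOO can be sketched in a few lines: for a batch of sampled responses, each sample's advantage is its reward minus the mean reward of the *other* samples, giving an unbiased baseline without a learned value network. This is a generic sketch of the estimator, not Apple's implementation, which is not public.

```python
def loo_advantages(rewards):
    """Leave-one-out baseline: advantage_i = r_i - mean(r_j for j != i)."""
    n = len(rewards)
    total = sum(rewards)
    return [r - (total - r) / (n - 1) for r in rewards]

# Three sampled responses to the same prompt, scored by a reward model:
adv = loo_advantages([1.0, 2.0, 3.0])
print(adv)  # [-1.5, 0.0, 1.5] — above-average samples get positive advantage
```

In online RL, these advantages would weight the policy‑gradient update for each sampled response; the mirror‑descent part of MDLOO constrains how far each update moves the policy.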
For on‑device efficiency, Apple applies a mixed‑precision quantization “palette” strategy: projection weights share 4‑bit constants per 16‑column group, embeddings use 8‑bit per‑channel quantization, and less critical layers are compressed to 2‑bit. Accuracy‑Recovery Adapters are added to mitigate quantization loss.
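A toy version of the shared‑constant idea: each small group of weights shares one scale, and individual weights are stored as low‑bit signed integers. This is a generic grouped min‑max quantizer for illustration; Apple's palettization scheme differs in detail.

```python
def quantize_group(weights, bits=4):
    """Quantize one group of weights to signed ints with a single shared scale."""
    qmax = 2 ** (bits - 1) - 1          # 7 for 4-bit signed
    scale = max(abs(w) for w in weights) / qmax or 1.0  # avoid div-by-zero on all-zero groups
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Reconstruct approximate weights from integers and the shared scale."""
    return [v * scale for v in q]

group = [0.5, -1.0, 0.25, 0.75]       # one (tiny) weight group
q, s = quantize_group(group)
approx = dequantize(q, s)
# Reconstruction error is bounded by half a quantization step per weight:
assert all(abs(a - b) <= s / 2 + 1e-9 for a, b in zip(group, approx))
```

Scaling this up, a 16‑column group needs only one shared constant plus 4 bits per weight, which is where the large memory savings come from; the Accuracy‑Recovery Adapters are then fine‑tuned to compensate for the residual error.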
Evaluation shows AFM surpasses GPT‑4 on several instruction‑following and summarization benchmarks, achieving state‑of‑the‑art results on IFEval and strong results on AlpacaEval, GSM8K, and MATH. Safety tests indicate lower violation rates under adversarial prompts than competing models.
Access to Apple Intelligence is limited to registered developers (US$99/year) on devices with M‑series or A17 Pro chips, and requires US regional settings and English language configuration. The full feature set is expected to roll out later, with the public release possibly delayed.