On Monday at WWDC, Apple unveiled Apple Intelligence, a suite of features that brings generative AI tools, such as rewriting a draft email, summarizing notifications, and creating custom emoji, to iPhones, iPads, and Macs. Apple spent a significant portion of its keynote explaining how useful the tools are, and an almost equal share of time assuring customers that the new AI system keeps their data private.
This privacy is possible thanks to a dual approach to generative AI that Apple began explaining in its keynote and has since detailed in follow-up articles and presentations. Apple Intelligence is built around the idea that the device itself should quickly handle the common AI tasks users want, like transcribing calls and organizing their schedules. For more complex requests, including ones that draw on personal contextual data, Apple Intelligence can reach out to cloud servers, and making both paths deliver good results while keeping that data private is where Apple has concentrated its efforts.
The big news is that Apple is using its own in-house AI models for Apple Intelligence. Apple says it does not train its models on private data or user interactions, unlike some other companies. Instead, it uses licensed material and publicly available online data collected by the company’s Applebot web crawler. Publishers must opt out if they do not want their data ingested by Apple, which is similar to Google’s and OpenAI’s policies. Apple also says it filters out Social Security and credit card numbers that circulate online, and ignores “profanity and other low-quality content.”
One of Apple Intelligence’s biggest selling points is its deep integration into Apple’s operating systems and apps, along with how the company optimizes its models for energy efficiency and size so they can run on an iPhone. Keeping AI requests local alleviates many privacy concerns, but the tradeoff is using smaller, less capable models on the device.
To make these local models useful, Apple relies on fine-tuning, which trains models to get better at specific tasks such as proofreading or summarizing text. The skills are packaged as “adapters,” which can be loaded onto the base model and swapped out for the task at hand, similar to applying power-up attributes to a character in a role-playing game. Likewise, Apple’s diffusion model for Image Playground and Genmoji uses adapters to achieve different art styles like illustration or animation (which makes people and animals look like cheap Pixar characters).
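If you’re curious what a swappable adapter looks like under the hood, here is a rough sketch in the style of the low-rank (LoRA-like) adapters common across the industry. The class, shapes, and task names below are illustrative assumptions, not Apple’s code; the point is just that a small set of extra weights can be loaded onto a frozen base model and swapped per task.

```python
import numpy as np

class AdapterLayer:
    """A frozen base weight plus an optional low-rank, task-specific adapter."""

    def __init__(self, d_in: int, d_out: int, rank: int = 8):
        rng = np.random.default_rng(0)
        self.base_w = rng.standard_normal((d_in, d_out))  # frozen base model weight
        self.rank = rank
        self.adapter = None  # no task-specific skill loaded yet

    def load_adapter(self, a: np.ndarray, b: np.ndarray):
        """Swap in a small adapter (e.g. a hypothetical 'proofreading' skill)."""
        assert a.shape == (self.base_w.shape[0], self.rank)
        assert b.shape == (self.rank, self.base_w.shape[1])
        self.adapter = (a, b)

    def unload_adapter(self):
        self.adapter = None

    def forward(self, x: np.ndarray) -> np.ndarray:
        y = x @ self.base_w
        if self.adapter is not None:
            a, b = self.adapter
            y = y + x @ a @ b  # low-rank update adds the task-specific behavior
        return y

# Usage: the same base layer, with different "skills" swapped in per task.
layer = AdapterLayer(d_in=16, d_out=16)
rng = np.random.default_rng(1)
proofread = (rng.standard_normal((16, 8)) * 0.01, rng.standard_normal((8, 16)) * 0.01)
layer.load_adapter(*proofread)                 # behave like the proofreading model
print(layer.forward(np.ones((1, 16))).shape)
layer.unload_adapter()                         # back to the plain base model
```

Because the adapters are tiny compared to the base model, the device can keep one base model in memory and switch skills on the fly rather than storing a separate model per feature.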
Apple claims to have optimized its models to shorten the time between sending a prompt and receiving a response, using techniques such as “speculative decoding,” “context pruning,” and “grouped-query attention” to take advantage of Apple silicon’s Neural Engine. Chipmakers have only recently started adding neural cores (NPUs) to their chips, which helps relieve CPU and GPU load when processing machine learning and AI algorithms. That partly explains why only Macs and iPads with M-series chips, and only the iPhone 15 Pro and Pro Max, support Apple Intelligence.
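Speculative decoding, for example, pairs a small “draft” model with the main model: the draft cheaply proposes a few tokens ahead, and the larger model only has to verify them, which cuts down on slow big-model steps. The toy sketch below reduces both models to placeholder functions; everything here is an illustration of the general technique, not Apple’s implementation (in a real system the big model scores a whole draft batch in one pass rather than token by token).

```python
def draft_model(context):
    # Hypothetical cheap model: greedily propose the next four tokens.
    return [(len(context) + i) % 50 for i in range(4)]

def target_model_agrees(context, token):
    # Hypothetical check standing in for the big model's verification.
    return token % 3 != 0

def speculative_decode(context, max_new_tokens=12):
    output = list(context)
    while len(output) - len(context) < max_new_tokens:
        for tok in draft_model(output):      # cheap draft proposals
            if target_model_agrees(output, tok):
                output.append(tok)           # accepted: a cheap token survives verification
            else:
                output.append(0)             # rejected: the big model supplies its own token
                break                        # stop this draft batch and re-draft from here
    return output[len(context):]

print(speculative_decode([1, 2, 3]))
```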
The approach is similar to what we see in the Windows world: Intel launched its 14th generation Meteor Lake architecture featuring a chip with an NPU, and Qualcomm’s new Snapdragon X chips designed for Microsoft’s Copilot Plus PCs are also equipped with them. As a result, many AI features on Windows are reserved for new devices that can perform work locally on those chips.
According to Apple’s research, out of 750 responses tested for text summarization, its on-device AI (with the appropriate adapter) produced results that humans preferred over Microsoft’s Phi-3-mini model. That sounds like a great achievement, but most chatbot services today use much larger models in the cloud to get better results, and that’s where Apple is trying to walk a careful line on privacy. To compete with those larger models, Apple has devised a process that sends complex queries to cloud servers while trying to prove to users that their data remains private.
If a user request requires a more capable AI model, Apple sends it to its Private Cloud Compute (PCC) servers. PCC runs on its own operating system based on “iOS foundations” and has its own machine learning stack that powers Apple Intelligence. According to Apple, PCC servers have their own Secure Boot and Secure Enclave to hold encryption keys that only work with the requesting device, and a Trusted Execution Monitor ensures that only signed and verified code runs.
Apple says the user’s device creates an end-to-end encrypted connection to a PCC cluster before sending the request. Apple says it can’t access data on PCC because the servers are stripped of remote management tools and have no remote shell. PCC also has no persistent storage, so queries and any personal contextual data pulled from Apple Intelligence’s Semantic Index are apparently wiped from the cloud server afterward.
Each build of PCC will have a virtual image that the public and security researchers can inspect, and only builds that are signed and logged as available for inspection will go into production.
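Put together, the device-side flow Apple describes looks roughly like this: confirm that the node is running a signed, publicly logged build, encrypt the request for that specific node, and send it, with nothing retained afterward. The sketch below is a loose illustration under those assumptions; the function names, the digest check, and the placeholder “encryption” are made up for clarity and are not Apple’s actual protocol.

```python
import hashlib

PUBLISHED_BUILDS = {"a1b2c3"}          # digests of builds logged as inspectable (hypothetical)

def fetch_attestation(node):
    # Pretend the node returns a measurement of the software it is running.
    return {"node": node, "build_digest": "a1b2c3", "public_key": b"node-pubkey"}

def verify_against_transparency_log(attestation) -> bool:
    # The device only talks to nodes running a build that was signed and logged.
    return attestation["build_digest"] in PUBLISHED_BUILDS

def encrypt_for_node(request: str, public_key: bytes) -> bytes:
    # Placeholder only: a hash, not real encryption, marking where end-to-end
    # encryption keyed to this specific node would happen.
    return hashlib.sha256(public_key + request.encode()).digest()

def send_to_pcc(request: str, node: str) -> bytes:
    att = fetch_attestation(node)
    if not verify_against_transparency_log(att):
        raise RuntimeError("node is not running a published, inspectable build")
    ciphertext = encrypt_for_node(request, att["public_key"])
    # The request is processed and nothing persists: no storage, no remote shell.
    return ciphertext

send_to_pcc("plan my trip using my calendar and messages", "pcc-node-17")
```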
One of the big outstanding questions is exactly what types of requests will be sent to the cloud. When processing a request, Apple Intelligence has a step called Orchestration, where it decides whether to proceed on the device or use PCC. We don’t yet know exactly what constitutes a query complex enough to trigger a cloud process, and we probably won’t know until Apple Intelligence is available in the fall.
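Conceptually, though, Orchestration amounts to a router that weighs how demanding a request is and whether it needs personal context. Here is a purely hypothetical sketch of what that decision could look like; the threshold, the fields, and the criteria are invented for illustration, since Apple hasn’t published any of them.

```python
from dataclasses import dataclass

@dataclass
class Request:
    prompt: str
    needs_personal_context: bool   # e.g. pulls from the on-device Semantic Index
    estimated_tokens: int          # rough size of the prompt plus retrieved context

ON_DEVICE_TOKEN_BUDGET = 2_000     # made-up threshold; Apple hasn't published one

def orchestrate(req: Request) -> str:
    """Hypothetical routing: prefer the local model, escalate to PCC when the
    request looks too complex for the smaller on-device model."""
    if req.estimated_tokens <= ON_DEVICE_TOKEN_BUDGET:
        return "on-device model (with task adapter)"
    if req.needs_personal_context:
        # Personal context would travel with the request over the encrypted
        # connection and be discarded after processing.
        return "Private Cloud Compute (with personal context attached)"
    return "Private Cloud Compute"

print(orchestrate(Request("summarize this email thread", True, 600)))
print(orchestrate(Request("plan a week from my calendar, mail, and maps history", True, 9_000)))
```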
There’s another way Apple handles privacy issues: making them someone else’s problem. Apple’s revamped Siri can send certain requests to ChatGPT in the cloud, but only with the user’s permission, and only when a question is tough enough to warrant it. That process shifts the privacy question into the hands of OpenAI, which has its own policies, and the user, who has to agree to offload their request. In an interview with Marques Brownlee, Apple CEO Tim Cook said ChatGPT would be tapped for requests involving “knowledge of the world” that are “outside the realm of personal context.”
Apple’s on-device and cloud approach to Apple Intelligence isn’t entirely new. Google offers a Gemini Nano model that can run locally on Android devices, as well as Pro and Flash models that process in the cloud. Meanwhile, Microsoft’s Copilot Plus PCs can handle AI requests locally while the company continues to build on its deal with OpenAI and develops its own MAI-1 model in-house. Still, none of Apple’s competitors have placed as much emphasis on their privacy commitments.
Of course, this all looks great in staged demos and carefully edited articles. The real test will come later this year, when we see Apple Intelligence in action. We’ll have to see whether Apple can pull off this balance between the quality of its AI experiences and privacy, and keep delivering on it in the years to come.