Background

The Complex Plan: How Apple Aims To Improve Its AI By Privately Analyzing User Data


The race to create the best and most useful large language models rages on, and Apple is being left behind.

Despite announcing updates and improvements to Siri and Apple Intelligence, the Cupertino-based company simply couldn't keep up with the generative AI trend spearheaded by the likes of OpenAI and its ChatGPT.

In a bid to regain ground, Apple said that it plans to use user data to improve its AI.

Apple is going to such lengths because, the company said in a blog post, its more traditional approach of training AI models on synthetic data alone has proven insufficient.

But training its AI on user data is difficult, since that data can contain personal and sensitive information.

This is where Apple turns to what it calls the "differential privacy" approach.

"At Apple, we believe privacy is a fundamental human right. And we believe in giving our users a great experience while protecting their privacy. For years, we’ve used techniques like differential privacy as part of our opt-in device analytics program. This lets us gain insights into how our products are used, so we can improve them, while protecting user privacy by preventing Apple from seeing individual-level data from those users."

Traditionally, Apple enhances its AI features, like summarization and writing tools, using synthetic data.

The approach includes the creation of synthetic emails with the help of large language models that mimic real messages in style and topic, but contain no actual user data. These synthetic messages are transformed into embeddings (mathematical representations), which are then shared with opted-in devices. Each device compares these embeddings with its own private email data—without sharing actual content—then selects the closest matches.
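To make that matching step concrete, here is a minimal Python sketch of how a device might pick the synthetic variant closest to a locally sampled email. The embedding values, function names, and the use of cosine similarity are illustrative assumptions; Apple has not published its actual implementation.

```python
import math

# Hypothetical sketch of the on-device matching step described above.
# Embedding values and helper names are illustrative, not Apple's code.

def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def closest_variant(synthetic_embeddings: dict[str, list[float]],
                    local_embedding: list[float]) -> str:
    """Compare server-provided synthetic embeddings against the embedding
    of a locally sampled email; only the ID of the winner is reported."""
    return max(synthetic_embeddings,
               key=lambda vid: cosine_similarity(synthetic_embeddings[vid],
                                                 local_embedding))

# Toy example: three synthetic-message embeddings vs. one local email.
variants = {
    "variant_a": [0.9, 0.1, 0.0],
    "variant_b": [0.1, 0.8, 0.3],
    "variant_c": [0.2, 0.2, 0.9],
}
local = [0.15, 0.75, 0.35]  # embedding of a private, on-device email
print(closest_variant(variants, local))  # -> "variant_b"
```

The key property is that the raw email never leaves the device: only the identifier of the winning synthetic variant is ever sent back.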

But that approach alone isn't sufficient.

Using differential privacy, Apple can dig deeper into user data without compromising privacy.

For users who opt into Device Analytics, Apple collects anonymized, randomized signals from their devices to understand which prompt patterns are most common, such as whimsical combinations like "dinosaur in a cowboy hat."

Each device may send a true signal or a deliberately noisy one, meaning Apple can only detect trends when a term is used by many people, never exposing rare or unique prompts.
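This is the classic randomized-response technique from local differential privacy. Below is a hedged Python sketch of how such noisy reporting and server-side debiasing could work; the vocabulary, truth probability, and debiasing formula are illustrative assumptions, not Apple's published parameters.

```python
import random
from collections import Counter

# Illustrative randomized-response sketch consistent with the description
# above. VOCAB and P_TRUTH are made-up values, not Apple's actual ones.

VOCAB = ["dinosaur in a cowboy hat", "cat astronaut", "robot surfing"]
P_TRUTH = 0.75  # probability a device reports its true term

def device_report(true_term: str) -> str:
    """Each device sends either its true signal or a deliberately noisy one."""
    if random.random() < P_TRUTH:
        return true_term
    return random.choice(VOCAB)

def estimate_counts(reports: list[str]) -> dict[str, float]:
    """Server-side debiasing: recover approximate true frequencies.
    E[observed_t] = P_TRUTH * true_t + (1 - P_TRUTH) * n / len(VOCAB)
    """
    n = len(reports)
    observed = Counter(reports)
    noise_per_term = (1 - P_TRUTH) * n / len(VOCAB)
    return {t: (observed[t] - noise_per_term) / P_TRUTH for t in VOCAB}

# Simulate many opted-in devices: popular terms emerge clearly in the
# aggregate, while rare or unique prompts stay hidden in the noise.
true_terms = (["dinosaur in a cowboy hat"] * 800
              + ["cat astronaut"] * 150
              + ["robot surfing"] * 50)
reports = [device_report(t) for t in true_terms]
print(estimate_counts(reports))
```

Because every individual report may be noise, no single signal reveals what its sender actually typed; only aggregate trends across many devices are recoverable.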

To ensure privacy and anonymity, the method strips the collected data of any linked IP addresses, Apple IDs, or device identifiers.

Diagram showing how Apple generates different variants of synthetic messages. (Image: Apple)
"Only users who have opted-in to send Device Analytics information to Apple participate. The contents of the sampled emails never leave the device and are never shared with Apple. A participating device will send only a signal indicating which of the variants is closest to the sampled data on the device, and Apple learns which selected synthetic emails are most often selected across all devices, not which embedding was selected by any individual device."

According to Apple, the company "currently uses differential privacy to improve Genmoji."

But in upcoming releases, the company plans to use this approach, with the same privacy protections, to enhance features like Image Playground, Image Wand, Memories Creation and Writing Tools in Apple Intelligence, as well as in Visual Intelligence.

Meanwhile, Apple Intelligence has struggled to provide users with accurate summaries, and senior executives at the company reportedly said in an internal meeting that the company's delays to key updates for Siri, its AI-powered virtual voice assistant, have been "ugly and embarrassing."

Siri, once a pioneer in the realm of virtual assistants, is now lagging behind rivals, most of which already offer far more advanced AI features.

Published: 14/04/2025