Apple WWDC24 3-minute summary: is this what happens with a 3B model?

Apple's WWDC24 highlighted the introduction of Apple Intelligence, which integrates Siri with generative AI for tasks like email management and image generation using a 3B-parameter model. The on-device model focuses on specialized tasks with a compact foundation, while cloud AI handles more complex requests on Apple Silicon servers. This dual approach improves the user experience while maintaining data security.

Have you seen the Apple Intelligence announced at WWDC24? Siri meets generative AI to handle tasks ranging from email to image generation, and reportedly all of this was possible with just a 3B model. Here is a short, bold summary of the LLM technology behind Apple Intelligence.

Let's briefly summarize the notable LLM-related technical content from Apple's Worldwide Developers Conference (WWDC) 24, held from June 10th to 14th.

Does this look like the future king of on-device AI? This is what a 3B model can do!

Have you seen the Apple Intelligence announced at WWDC24?

Apple introduced it as a personal intelligence system integrated into iOS 18, iPadOS 18, and macOS Sequoia. Here's what it can do, broadly divided into image and text features.


[Images]

• AI-generated personalized "Genmoji" (emoji based on real photos of you and your friends) and illustrations

• Create detailed images from rough sketches, or generate images from text

• Organize your photos automatically and search for photos using natural language


[Text]

• Things existing generative AI already does well, such as correcting email grammar and style and summarizing key points

• Find and forward schedules, prioritize notifications, and summarize texts and emails

• iPad's new Smart Script feature cleans up your handwriting in real time and can convert text in messages or web pages into your own handwriting

If you'd like to see an actual implementation screen, watch the Apple Intelligence 5-minute summary video below.

Apple Intelligence is made up of several generative AI models, two of which Apple has detailed.

🍎 Introducing Apple's On-Device and Server-Based Models

The on-device model is a 3B (3 billion) parameter language model, and the server-based model runs on Apple Silicon servers. Let's summarize what they've revealed.

[On-Device AI]

• A small foundation model was built at the 3B scale, then eight major tasks were selected and fine-tuned individually so each is handled well.

• Because it is a small model, it is specialized for specific tasks rather than handling many tasks with a single model.

• The foundation model is trained in 16-bit precision and then quantized to 4 bits.

• Fine-tuning uses LoRA adapters combined with a mix of 2-bit and 4-bit quantization.

• Speed was improved through optimization (about 30 tokens generated per second on iPhone 15 Pro).

• Adapters dynamically determine which fine-tuned layers to use for the task at hand.
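The combination described above — a frozen, quantized base model with a small low-rank (LoRA) adapter swapped in per task — can be sketched for a single linear layer as follows. This is a minimal illustration, not Apple's implementation: the layer sizes, task names, and the simple symmetric 4-bit quantizer are all assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, rank = 64, 64, 4

# Frozen base weight of one linear layer (hypothetical sizes).
W = (rng.standard_normal((d_out, d_in)) * 0.02).astype(np.float32)

# Symmetric 4-bit quantization: 16 integer levels with one shared scale.
scale = float(np.abs(W).max()) / 7.0
W_q = np.clip(np.round(W / scale), -8, 7).astype(np.int8)  # stored compactly
W_deq = W_q.astype(np.float32) * scale                     # dequantized at load time

# One low-rank LoRA adapter per task; only A and B are trained per task,
# so each adapter adds just rank*(d_in + d_out) parameters.
adapters = {
    task: (
        (rng.standard_normal((rank, d_in)) * 0.02).astype(np.float32),   # A
        (rng.standard_normal((d_out, rank)) * 0.02).astype(np.float32),  # B
    )
    for task in ("summarize", "proofread", "reply")
}

def forward(x, task):
    """Apply the quantized base layer plus the adapter picked for this task."""
    A, B = adapters[task]
    return x @ (W_deq + B @ A).T

x = (rng.standard_normal((1, d_in))).astype(np.float32)
y = forward(x, "summarize")
```

The key point is that selecting a "fine-tuning layer" at runtime is cheap: the base weights never change, and switching tasks only swaps the tiny `(A, B)` pair.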

[Cloud AI]

• Tasks too difficult for the on-device model are handled on Apple's own cloud servers.

• The cloud runs on Apple Silicon servers designed and developed by Apple itself, with a dedicated cloud OS.

• There is no persistent storage and no remote access, to prevent personal information leaks.

• Everything from uploading to the cloud to receiving output on the device is encrypted, so Apple cannot tell which user's request is being processed.

Another surprising detail is that user information within the system is converted into app entities, stored on the local device, and used for retrieval-augmented generation (RAG).
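As a rough sketch of what RAG over local app entities could look like: index snippets from on-device apps, retrieve the most relevant ones for a query, and prepend them to the model's prompt. Everything here is a hypothetical stand-in — the example entities, the toy bag-of-words embedding, and the cosine scoring are illustrations, not Apple's undisclosed pipeline (which would use learned embeddings).

```python
import math
import re
from collections import Counter

# Hypothetical local "app entities" a device-side RAG step might index.
entities = [
    {"app": "Calendar", "text": "Dinner with Mia on Friday at 7pm"},
    {"app": "Mail", "text": "Flight BA117 to London departs Sunday"},
    {"app": "Notes", "text": "Gift ideas for Mia: hiking boots"},
]

def embed(text):
    """Toy bag-of-words vector; a real system would use a learned embedder."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, k=2):
    """Return the k entities most similar to the query."""
    q = embed(query)
    ranked = sorted(entities, key=lambda e: cosine(q, embed(e["text"])), reverse=True)
    return ranked[:k]

hits = retrieve("when is dinner with Mia")
prompt = "Context:\n" + "\n".join(f"[{h['app']}] {h['text']}" for h in hits)
```

Because the entity store and the retrieval both live on the device, personal context never has to leave it just to ground the model's answer.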

As expected from Apple, which pursues responsible development, model evaluation also covers harmful content and the safety of answers.

The performance evaluation compares the on-device model with open-source models (Phi-3, Gemma, Mistral, DBRX) and the server model with similar-sized commercial models (GPT-3.5-Turbo, GPT-4-Turbo). In Apple's benchmarks, the 3B on-device model outperforms Phi-3-mini, Mistral-7B, and Gemma-7B, and Apple rates its server model as very efficient compared to DBRX-Instruct, Mixtral-8x22B, and GPT-3.5-Turbo.

Apple's Small-Model Strategy? A Strategy for Safe Generative AI!

While OpenAI and Google compete on performance by building huge foundation models and collecting data from around the world, why did Apple choose a small model? It needed a model that balanced size, speed, and compute for the more than one billion iPhone users worldwide, and the choice fits the smooth user experience Apple values. By building multiple models on data customized for each function, Apple also appears to be trying to increase transparency about specific decisions and to counter the hallucinations and biased content of generative AI. It is a practical and cautious approach, and the "adapters" seem to be part of a strategy to keep each model focused on its specialized task.

If you look at the Apple Intelligence demonstration, you can see Siri connecting to ChatGPT. Going forward, Apple plans to connect ChatGPT, Google Gemini, and other external large language models (LLMs). OpenAI is just one of the partners in Apple's ecosystem.

This two-track strategy of combining on-device AI with external models seems designed to capture both responsible AI and scalability. Apple applies Apple Intelligence only to tasks that produce predictable results, limiting AI misuse, risks, and safety issues, and it hands off questions Siri cannot answer by offering to switch to ChatGPT.

When you create an image in Image Playground (the on-device model), the image styles are limited to animation, illustration, and sketch, which prevents the creation of deepfake images.

If you would like to learn more about Apple's implementation of on-device AI and cloud AI, please check out the links below.

Introducing Apple's official on-device and server models

Will Apple Become a Powerhouse in the Generative AI Market?

If 2023 was the year of testing generative AI, this year can be seen as the stage of realizing generative AI.

Allganize is a B2B AI company that makes generative AI accessible to businesses of all sizes through best-in-class RAG, avoiding the cost of running industry- or business-specific LLMs while delivering the highest accuracy at the lowest operating cost.

B2B AI solutions are evolving in the direction of solving problems by complexly combining LLMs in the workplace and creating sophisticated and accurate results.

Finding answers in the complex tables of corporate documents is an area where Allganize outperforms OpenAI's retriever. You can select among various LLMs to suit your company's work, and Allganize's Alli LLM App Market, which lets you use over 100 work-automation tools at once, is evolving toward a full-stack AI tool.

If you are curious about AI native workflow tools, contact Allganize!