🍎 Is Apple also entering the generative AI market? Multimodal LLM with iPhone?

Apple has quietly released its multimodal LLM paper. Has Apple entered the generative AI race in earnest? Will Apple's own generative AI be available on iPhones, beyond the partnership with ChatGPT? We analyze whether Apple could become a full-fledged powerhouse in generative AI.

Apple has seemed to be one step behind the generative AI craze. Today, we're going to take a look at MM1, a paper that Apple quietly released, and introduce the multimodal generative AI model with which Apple has now joined the race.

1. Apple Announces the MM1 AI Model, Competing on Multimodal LLMs Rather Than Text-Only LLMs

While big tech companies such as OpenAI, Google, and Meta were significantly growing the generative AI and LLM market, Apple was oddly silent. Many people assumed that the hardware and mobile device powerhouse was slowly following along.

In January of this year, Samsung equipped the Galaxy S24 with Google's Gemini, which sparked interest in "on-device AI". Samsung adopted Google's Gemini Pro along with Imagen 2, Google DeepMind's most advanced text-to-image generation tool. Apple was said to be in preliminary negotiations with Google to put Gemini on the iPhone before it partnered with OpenAI to bring ChatGPT to the iPhone.

I've been watching to see when Apple would release its own LLM and AI models.
Apple's engineers quietly posted a research paper online on March 14. It describes a new generative AI model called MM1, which analyzes and understands the text and images you give it as input. MM1 probably stands for MultiModal 1.

The paper discusses how to build highly efficient multimodal large language models (MLLMs). The design looks similar to that of the latest AI models such as OpenAI's GPT-4o, Google's Gemini, and Anthropic's Claude 3. The latest models tend to be strong at OCR and image interpretation, and Apple has likewise unveiled a model that works with both text and images.
Let me summarize the contents of the paper. (The summary was written with the help of GPT-4.)

Key findings:

The image above is from the paper. The model is given a picture of two bottles of beer on a table and an image of a menu, and is asked how much the user should pay for all the beer on the table.
The MM1 30B model answers exactly $12, and it explains its rationale well.
Interpreting both images, understanding the relationship between them, and even doing the arithmetic is a strong result for a 30B model, which has relatively few parameters.

What makes the MM1 paper interesting is that it reveals details about how the model was trained. It describes how to increase image resolution, blend text and image data, and improve model performance. Apple has a reputation for being secretive about its technology. Recently, however, companies that disclose their research methods and data have had an advantage in attracting AI engineers, which seems to have contributed to Apple's decision to publish the paper.

2. When Will Generative AI Come to the iPhone?

Apple's iPhone already has Siri, an AI assistant. However, since the advent of ChatGPT, Siri seems to have been somewhat overshadowed. Amazon and Google have announced that they are integrating LLM technology into Alexa and Google Assistant, and Google is helping Android phone users take advantage of Gemini.

Apple already partners with Google instead of developing its own web search technology. Google reportedly paid more than $18 billion to be the default search engine on iPhones. Google Maps used to be a standard feature on the iPhone, but in 2012 Apple replaced it with its own Maps app.

Apple CEO Tim Cook said at the company's annual shareholder meeting in February that Apple would unveil more details about its plans to use generative AI later this year. Apple has been facing pressure to keep up with rival smartphone makers like Samsung and Google, which are pushing on-device AI.

Apple is already using generative AI in its internal processes and customer service. Users could eventually see AI-generated playlists in Apple Music, or AI-powered productivity tools in Pages or Keynote similar to what Microsoft has built.

At its Worldwide Developers Conference in June of this year, Apple announced new AI-related features powered by OpenAI's ChatGPT. While Apple introduces ChatGPT, it is likely also building generative AI tools on top of MM1 and its other in-house models, so that it can leverage both a partner's AI and its own.

3. Can Apple Be a Winner in the Generative AI Market?

Apple reportedly invests around $1 billion annually in LLMs. On March 15, the company also acquired DarwinAI, a generative AI startup that specializes in making AI models smaller and faster. DarwinAI focuses on vision-based technologies that monitor parts in the manufacturing process, but according to Bloomberg, it also has technologies that increase the efficiency of AI applications. Small, fast AI models of this kind are useful for on-device generative AI.

iOS 18 will have more task automation features through Siri. Once a model can understand and respond to text and images together, it becomes possible to build an "agent" that follows what the user is doing and operates the applications on the phone by itself, or chains its own outputs across a series of steps to complete a task.

Apple has a clear advantage in combining generative AI with hardware. Apple controls the entire stack, from the software to the hardware. When Apple launched the iPhone X in 2017, it included a custom Neural Engine in the phone's A11 Bionic chip. The Neural Engine is designed to speed up speech and image processing.

The lead author of the MM1 paper wrote on X that this model is just the beginning, and that the team is already working on the next-generation model. We can look forward to seeing how Apple will stand out in generative AI.

B2B AI solutions are evolving to bring LLMs into the workplace to solve problems and produce sophisticated, accurate results. Finding answers in the complex tables of corporate documents is an area where Allganize is doing better than OpenAI's retriever.

Allganize's Alli LLM App Market, which lets you select and use various LLMs to suit your company's work and gives you access to more than 100 work automation tools at once, is also evolving into a full-stack AI tool.

If you're curious about AI-native workflow tools, contact Allganize!
