Apple builds a slimmed-down AI model using Stanford, Google innovations

The world is watching to see what Apple will do to counter the dominance of Microsoft and Google in generative AI. Most assume the tech giant’s innovations will take the form of neural nets on the iPhone and other iOS devices. Small clues are popping up here and there.

Also: How Apple’s AI advances could make or break the iPhone 16

Apple has just introduced its own “embedded” large language model (LLM) to run on mobile devices: OpenELM, built essentially by mashing together breakthroughs from several research institutions, including Google’s deep-learning scholars and academics at Stanford and elsewhere. 

All of the code for the OpenELM program is posted on GitHub, along with various documentation for the training approach. 
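
For readers who want to poke at the model itself, the released checkpoints can also be loaded through Hugging Face’s transformers library. A minimal sketch, assuming the “apple/OpenELM-1_1B” repository id that Apple published alongside the paper; treat the id as an assumption and verify it before use:

```python
# Hedged sketch: loading an OpenELM checkpoint via Hugging Face transformers.
# The repo id "apple/OpenELM-1_1B" reflects Apple's published checkpoints at
# the time of writing; verify it before relying on it. OpenELM's modeling
# code ships with the checkpoint, hence trust_remote_code=True.
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "apple/OpenELM-1_1B",
    trust_remote_code=True,  # model code is bundled with the checkpoint
)
print(sum(p.numel() for p in model.parameters()))  # roughly 1.08 billion
```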

Apple’s work, detailed in a paper by Sachin Mehta and team, “OpenELM: An Efficient Language Model Family with Open Training and Inference Framework”, posted on the arXiv pre-print server, is focused on mobile devices: the largest variant Apple benchmarks against rivals has just 1.1 billion neural weights, or parameters. 

That number is far below the hundreds of billions of parameters in models such as OpenAI’s GPT-4 or Google’s Gemini. Memory requirements grow in direct proportion to parameter count, so a smaller neural net fits far more easily into a mobile device’s limited RAM. 
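
A quick back-of-envelope calculation shows why the parameter count matters on a phone. A minimal sketch, assuming 16-bit weights; the model sizes are round illustrative figures, not exact:

```python
# Back-of-envelope memory footprint: parameters x bytes per parameter.
# Model sizes are illustrative; 2 bytes assumes 16-bit weights.

def footprint_gb(params, bytes_per_param=2):
    """Approximate weight storage in gigabytes."""
    return params * bytes_per_param / 1e9

for name, params in [("OpenELM-class 1.1B", 1.1e9), ("GPT-class 175B", 175e9)]:
    print(f"{name}: ~{footprint_gb(params):.1f} GB at 16-bit precision")
# A ~1B-parameter model needs about 2 GB; a 175B model needs about 350 GB.
```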

Mehta and team’s mashup would be rather unremarkable without a key contribution: efficiency. The researchers adjust the layers of the deep neural network so that the model reaches the same accuracy as earlier models while churning through far less data during training. 

Also: 2024 may be the year AI learns in the palm of your hand

Specifically, they can meet or beat the results of a slew of neural nets for mobile computing “while requiring 2× fewer pre-training tokens”, where tokens are the individual characters, words, or word fragments that make up the training data. 
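
To make the notion of tokens concrete, here is a toy sketch of greedy subword tokenization; the vocabulary is invented for the example and bears no relation to the tokenizer OpenELM actually uses:

```python
# Toy illustration of subword tokenization. The vocabulary below is
# made up for the example, not any real model's vocabulary.

vocab = ["un", "break", "able", "the", "rule", "s", " "]

def tokenize(text, vocab):
    """Greedy longest-match segmentation of text into subword tokens."""
    tokens = []
    while text:
        # Pick the longest vocabulary entry that matches the start of text;
        # fall back to a single character for anything out-of-vocabulary.
        match = max((v for v in vocab if text.startswith(v)),
                    key=len, default=text[0])
        tokens.append(match)
        text = text[len(match):]
    return tokens

print(tokenize("unbreakable rules", vocab))
# ['un', 'break', 'able', ' ', 'rule', 's']
```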

Apple starts from the same approach as many LLMs: a transformer. The transformer is the signature neural net in language understanding, introduced by Google scientists in 2017. Every major language model since, including Google’s BERT and OpenAI’s GPT family of models, has adopted the transformer. 

Apple achieves high efficiency by melding the transformer with a technique called DeLighT, introduced in 2021 by researchers at the University of Washington, Facebook AI Research, and the Allen Institute for AI. That work broke away from the conventional approach, in which every “layer” of the network, the successive mathematical computations through which the data pass, gets the same number of neural weights. 

Instead, the researchers selectively adjusted each layer to have a different number of parameters. Because some layers have relatively few parameters, they called their approach a “deep and light-weight transformer”, hence the name, DeLighT.

Also: Snowflake says its new LLM outperforms Meta’s Llama 3 on half the training

The researchers write: “DeLighT matches or improves the performance of baseline Transformers with 2 to 3 times fewer parameters on average.”

Using DeLighT, Apple creates OpenELM, in which each layer of the neural net has a distinct number of parameters, a non-uniform allocation across the network. 

“Existing LLMs use the same configuration for each transformer layer in the model, resulting in a uniform allocation of parameters across layers,” write Mehta and team. “Unlike these models, each transformer layer in OpenELM has a different configuration (e.g., number of heads and feed forward network dimension), resulting in variable number of parameters in each layer of the model.” 

The non-uniform approach, they write, “lets OpenELM better utilize the available parameter budget for achieving higher accuracies.”
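
A minimal sketch of this layer-wise scaling idea, in the spirit of DeLighT and OpenELM: interpolate a width multiplier from the first layer to the last, so each layer gets its own head count and feed-forward dimension. The ranges, rounding, and dimensions below are illustrative assumptions, not Apple’s published configuration:

```python
# Minimal sketch of layer-wise scaling in the spirit of DeLighT/OpenELM.
# The interpolation ranges and rounding rules here are illustrative
# assumptions, not the exact configuration from Apple's paper.

def layer_configs(num_layers=16, d_model=1280, head_dim=64,
                  alpha=(0.5, 1.0), beta=(0.5, 4.0)):
    """Assign each transformer layer its own head count and FFN width.

    alpha scales the attention width and beta scales the feed-forward
    width; both are interpolated linearly from the first layer to the
    last, so early layers stay "light" and later layers grow "deep".
    """
    configs = []
    for i in range(num_layers):
        t = i / (num_layers - 1)          # 0.0 at first layer, 1.0 at last
        a = alpha[0] + t * (alpha[1] - alpha[0])
        b = beta[0] + t * (beta[1] - beta[0])
        num_heads = max(1, round(a * d_model / head_dim))
        ffn_dim = int(b * d_model)
        configs.append({"layer": i, "num_heads": num_heads, "ffn_dim": ffn_dim})
    return configs

cfgs = layer_configs()
for cfg in cfgs[:2] + cfgs[-2:]:
    print(cfg)  # head count and FFN width grow with depth
```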

Also: Yikes! Microsoft Copilot failed every single one of my coding tests

The competition Apple measures itself against uses similarly small neural nets. These competitors include MobiLlama from Mohamed bin Zayed University of AI and collaborating institutions, and OLMo, introduced this year by researchers at the Allen Institute for Artificial Intelligence and scholars from the University of Washington, Yale University, New York University, and Carnegie Mellon University.

The experiments by Apple are not carried out on a mobile device. Instead, the company uses an Intel-based workstation with a single Nvidia GPU and Ubuntu Linux. 

On numerous benchmark tests, the OpenELM program achieves better scores, despite being smaller and/or using fewer tokens. For example, on six out of seven tests, OpenELM beats OLMo despite having fewer parameters — 1.08 billion versus 1.18 billion — and only 1.5 trillion training tokens versus 3 trillion for OLMo.

Also: How to avoid the headaches of AI skills development

Although OpenELM can be more accurate than those models while training more efficiently, the authors note that it is slower in some cases to produce its predictions, which they flag as an area for further research. 

An open question for Apple’s iOS AI work has been whether the tech giant will license technology from Google or another party that leads AI development. The company’s investment in open-source software raises the intriguing possibility that Apple is instead trying to reinforce an open ecosystem from which its own devices can benefit. 
