The world of artificial intelligence (AI) exists mostly in cloud-computing facilities and rarely touches your smartphone. When you use a tool like ChatGPT to answer prompts, the hard work of training the program so that it works correctly has been done behind the scenes days, weeks, or months earlier, in massive AI data centers built by Microsoft and others.
However, 2024 may be the year the divide is crossed — and it may be when AI starts learning in your pocket. Efforts are underway to make it possible to train a neural net—even a large language model (LLM)—on your personal device with little or no connection to the cloud.
Also: 7 ways to make sure your data is ready for generative AI
The most obvious benefits of on-device training include avoiding the delays of connecting to the cloud, learning continually and personally from local information, and preserving the privacy that would be compromised by sending personal data to a cloud data center.
The effect of on-device training can be a transformation of neural network capabilities. AI can be personalized to your own actions as you move, tap, scroll and drag. AI can learn from the environment you pass through during your daily routine, gathering cues about the world.
Also: How Apple’s AI Advances Could Make or Break the iPhone 16
Recent work by Apple engineers suggests that the company is looking to bring larger “generative” neural networks of the type represented by OpenAI’s ChatGPT to run natively on the iPhone.
More broadly, Google introduced a radically scaled-down AI approach called TinyML several years ago. TinyML can run neural nets on devices drawing as little as a few milliwatts of power, such as smart sensors embedded in machines.
The biggest challenge for tech companies is to have such neural networks not only make predictions on a phone but also learn new things on the phone, that is, to run the training locally.
Training a neural net demands far more processing power, memory, and bandwidth from any computer than using a finished neural net to make predictions, a task known as inference.
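The article doesn't quantify that gap, but some back-of-envelope arithmetic makes it concrete. The sketch below uses illustrative numbers (a hypothetical 7-billion-parameter model, half-precision weights, the common Adam optimizer) that are my assumptions, not figures from the article; activation memory would add still more on top.

```python
# Rough memory arithmetic, illustrative numbers only: inference needs just
# the weights, while training with the Adam optimizer also stores a
# gradient plus two optimizer states per weight (activation memory is
# extra and omitted here).
params = 7e9            # a hypothetical 7B-parameter model
bytes_fp16 = 2          # half-precision weights

inference_gb = params * bytes_fp16 / 1e9           # weights only
training_gb = params * (bytes_fp16                 # weights
                        + bytes_fp16               # gradients
                        + 2 * 4) / 1e9             # two Adam moments in fp32

print(f"inference: ~{inference_gb:.0f} GB, training: ~{training_gb:.0f} GB")
```

Even under these generous assumptions, training needs several times the memory of inference, which is why phones can run models they could never train in full.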
Also: Machine learning on the edge: TinyML is growing up
Efforts are underway to conquer that computing mountain by doing things like selectively updating only parts of the neural net’s “weights” or “parameters.” A signature effort there is MIT’s TinyTL, which uses a technique known as transfer learning as a way to refine a neural net that is already mostly trained.
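Neither the article nor this summary includes code, but the core idea of updating only a small subset of parameters can be sketched with a toy example. Below, a single linear layer stands in for a real network: the large pretrained weight matrix stays frozen and only the small bias vector is trained, so no gradient storage is needed for the full weights. All names and sizes are illustrative assumptions, not TinyTL's actual implementation.

```python
import numpy as np

# Minimal sketch of selective updating: the "pretrained" weight matrix W
# is frozen, and only the tiny bias vector b is trained on local data.
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 8))          # pretrained weights, never updated
b = np.zeros(4)                      # small trainable parameter set

x = rng.normal(size=8)               # one local training example
target = np.ones(4)                  # desired output for that example

for _ in range(200):
    y = W @ x + b                    # forward pass
    grad_b = 2 * (y - target)        # gradient of squared error w.r.t. b only
    b -= 0.1 * grad_b                # update just the bias

print(np.allclose(W @ x + b, target, atol=1e-3))  # the tiny b adapts the frozen W
```

The memory saved is the point: gradients and optimizer state exist only for the handful of trainable values, not for the full weight matrix.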
TinyTL has so far been used for modest tasks, such as facial recognition. But the state of the art is now tackling the LLMs behind generative AI, such as OpenAI’s GPT-4. An LLM has hundreds of billions of neural weights that must be stored in memory and streamed to the processor to be updated as new information arrives, a training challenge at a scale never attempted before.
A research report this month by staff at European chip-making giant STMicroelectronics argues that merely running a trained model on the mobile device isn’t enough: the client device must also train the neural network to keep it fresh.
“Just enabling model inference on the device is not enough,” write Danilo Pietro Pau and Fabrizio Maria Aymone. “The performance of AI models, in fact, tends to deteriorate as time passes since the last training cycle; a phenomenon known as concept drift,” the solution to which is to retrain the program on fresh data.
Also: How Google and OpenAI pushed GPT-4 to provide more timely answers
The authors recommend making a neural net slimmer so that it’s easier to train on a memory-constrained device. Specifically, they tested replacing “back-propagation,” the mathematical procedure that is the most computationally intensive part of training.
Pau and Aymone found that replacing back-propagation with simpler math can reduce the amount of on-device memory required for neural weights by 94%.
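The article doesn't spell out what that simpler math is. One family of backprop-free methods, sketched below purely as an illustration and not as STMicroelectronics' actual technique, estimates gradients by nudging each weight and re-running the forward pass; because there is no backward pass, no activations or gradient buffers need to be stored.

```python
import numpy as np

# Illustrative zeroth-order (forward-only) training: the gradient is
# approximated by finite differences, so training needs only repeated
# forward passes and no back-propagation machinery.
rng = np.random.default_rng(1)
w = rng.normal(size=3)               # a toy three-weight "network"
x = np.array([1.0, 2.0, 3.0])
target = 5.0
eps, lr = 1e-4, 0.01

def loss(w):
    return (w @ x - target) ** 2     # squared error of a linear model

for _ in range(500):
    g = np.zeros_like(w)
    for i in range(len(w)):          # finite-difference gradient estimate
        wp = w.copy()
        wp[i] += eps
        g[i] = (loss(wp) - loss(w)) / eps
    w -= lr * g                      # update with no backward pass

print(loss(w))                       # driven close to zero
```

The trade-off is extra forward passes per step, which is why such methods suit devices that are memory-poor but can tolerate slower training.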
Some scientists advocate dividing the training task among many client devices, which is called “federated learning”.
Kyung Hee University researcher Chu Myet Thawal and team this month adapted a form of LLM used for image recognition to run across more than 50 workstation computers, each with a single Nvidia gaming GPU. Their code uses less on-device memory during training than standard versions of the neural nets, without losing accuracy.
Some experts, meanwhile, argue that network communications must be adapted so mobile devices can exchange updates efficiently when performing federated learning.
Also: MongoDB CTO says AI will drastically change software development
Scholars of the Institute of Electrical and Electronics Engineers this month proposed a communication network using the upcoming 6G standard, in which most LLM training is first completed in a data center. The cloud then coordinates with a set of client devices that “fine-tune” the LLM with local data.
This type of “federated fine-tuning”, where each device learns parts of an LLM, rather than starting from scratch, can be done with much less processing power on battery-powered devices than full training.
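The specific 6G protocol isn't detailed in this article, but the underlying recipe, often called federated averaging, can be sketched: a cloud-trained model is sent to clients, each takes a few gradient steps on private local data, and the server averages the resulting weights. All names, sizes, and learning rates below are illustrative assumptions.

```python
import numpy as np

# Illustrative federated averaging with a toy linear model standing in for
# an LLM: local data never leaves each device; only weights travel.
rng = np.random.default_rng(2)
true_w = np.array([2.0, -1.0])       # the pattern hidden in the data
global_w = np.zeros(2)               # "pretrained" model from the cloud

def local_finetune(w, X, y, steps=100, lr=0.05):
    """A few gradient-descent steps on one device's private data."""
    w = w.copy()
    for _ in range(steps):
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

for _ in range(5):                   # cloud-coordinated rounds
    updates = []
    for _ in range(4):               # four client devices
        X = rng.normal(size=(20, 2))         # data that stays on the device
        y = X @ true_w
        updates.append(local_finetune(global_w, X, y))
    global_w = np.mean(updates, axis=0)      # server averages the weights

print(global_w)                      # converges toward true_w
```

Because only weight updates cross the network, the bandwidth and privacy cost is far lower than shipping raw user data to the cloud.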
Many methods aim to reduce the memory and processing required for each neural weight. The most extreme approach is the “binary neural network,” in which each weight, instead of holding a full numerical value, is just a single bit, greatly reducing the amount of storage required on the device.
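A minimal sketch of the storage saving, under the common convention of snapping each weight to its sign (scaled by the weights' mean magnitude, an assumption on my part rather than a detail from the article):

```python
import numpy as np

# Binarizing weights: each 32-bit float collapses to one bit, a 32x
# reduction in storage, at the cost of coarser weight values.
rng = np.random.default_rng(3)
w = rng.normal(size=1024).astype(np.float32)   # full-precision weights

scale = np.abs(w).mean()             # one shared scaling factor
bits = (w >= 0)                      # one bit per weight: its sign
packed = np.packbits(bits)           # 1024 bits packed into bytes

w_binary = scale * np.where(bits, 1.0, -1.0)   # reconstructed weights

print(w.nbytes, packed.nbytes)       # 4096 bytes vs 128 bytes
```

The reconstructed weights are crude, so accuracy usually drops somewhat, but the 32x memory reduction is exactly what makes storage-starved devices viable.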
Also: AI scaling problems? MIT offers sub-photon optical deep learning at the edge
Many of the technical efforts mentioned above remain abstract, but some researchers are considering concrete use cases for training a neural net locally.
A team from Nanyang Technological University in Singapore this month applied on-device learning to cyber threats, training each individual device with its own local version of an AI-based “intrusion-detection system,” or IDS, a common kind of cybersecurity program.
Instead of having client devices send their data to a central server, each device downloads an initial version of the IDS model and then fine-tunes it for its local security situation. Not only is this kind of training more specific to local security threats, it also keeps sensitive security information from traveling over the network and back, where it could be intercepted by malicious parties.
Apple is rumored to be eyeing greater on-board AI functionality for iOS devices and has hinted at what could be accomplished in a mobile context.
In a paper from August, Apple scientists describe a way to automatically learn mobile app attributes, called the Never-ending UI Learner. The program runs on a smartphone and automatically performs button presses and other interactions to determine what kinds of controls an app’s user interface offers.
The goal is to use each device to learn automatically, rather than relying on a bunch of human workers who spend their time pressing buttons and annotating app functions.
The test was conducted in a controlled setting by Apple employees. If trials were to be attempted in the wild using real consumers’ iPhones, “a privacy-preserving approach (e.g., training on the device) would be required,” the authors wrote.
Another mobile-based concept was described by Apple scientists in 2022, in a paper titled “Training Large Vocabulary Neural Language Models by Private Federated Learning for Resource-Constrained Devices.”
Their goal was to train speech-recognition AI on mobile devices using a federated learning approach.
Also: Nvidia makes the case for AI PCs at CES 2024
Each person’s device uses samples of interactions with a “voice assistant” (presumably Siri) to train the neural net. The neural-network parameters developed by each phone are then sent over the network, where they are combined into an improved neural net.
A big takeaway from all these research efforts is that scientists are working hard to find ways to compress and partition the training task so that it’s feasible on battery-powered devices with less memory and less processing power than workstations and servers.
Whether this research effort breaks through in 2024 remains to be seen. However, what is already clear is that the training of neural networks is moving beyond the cloud and, possibly, into the palm of your hand.