Real-Time Video AI Arrives on Smartphones

Real-time video AI technology has now arrived on smartphones, bringing advanced editing and enhancement tools directly to mobile devices. This development allows users to process and improve videos instantly, opening up new possibilities for creativity and communication on the go.
AI-Generated Video Moves to Mobile Devices
It is a shift few would have predicted even a year ago: AI-driven video creation, once confined to powerful desktops or specialized servers, is now finding its way directly onto the latest smartphones. At the recent MWC 2024, companies like Qualcomm, MediaTek, and various Asian network operators offered live demonstrations, underscoring just how quickly the field is evolving. Most notably, researchers at Snap showcased an impressive achievement: generating AI video at ten frames per second on an iPhone 16 Pro Max. Such progress hints at a future where anyone can craft complex video content on a device that fits in a pocket.
A Compact Yet Potent Model
Getting there was no small feat. The standard architecture for cutting-edge video generation has been the Diffusion Transformer (DiT), a class of models renowned for their power but notorious for their appetite for computational resources. The research team started with a massive two-billion-parameter model. To make mobile deployment feasible, they turned to "pruning", which removes the less important parts of a network, and reduced the model to below one billion parameters without sacrificing too much quality. This was coupled with a meticulous period of "fine-tuning", the essential step that ensures a compressed model still delivers visually appealing results.
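To get a feel for why halving the parameter count matters on a phone, the short sketch below estimates the memory needed for the weights alone. It assumes 16-bit weights (2 bytes per parameter), which is an illustrative assumption and not a figure from the researchers; the function name is likewise hypothetical.

```python
def weight_memory_gib(num_params: float, bytes_per_param: int = 2) -> float:
    """Approximate memory for the model weights alone, in GiB.

    bytes_per_param=2 assumes 16-bit weights (an assumption for illustration).
    Activations, caches, and runtime overhead are not included.
    """
    return num_params * bytes_per_param / 2**30


original = weight_memory_gib(2e9)  # the two-billion-parameter starting point
pruned = weight_memory_gib(1e9)    # upper bound for the pruned model

print(f"original weights: about {original:.2f} GiB")
print(f"pruned weights:   under {pruned:.2f} GiB")
```

Under that assumption, the weights shrink from roughly 3.7 GiB to under 1.9 GiB, which is the difference between crowding a phone's memory and leaving room for the rest of the system.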
Several technical strategies proved essential to this success:
Adopting three-level ("tri-level") pruning guided by model sensitivity and knowledge distillation (KD).
Applying step distillation, which cuts the number of inference steps required to just four.
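The idea behind sensitivity-guided pruning can be sketched with a toy example. This is not the Snap team's method: the "model" here is just a chain of scalar functions, and every name (`run`, `sensitivity`, `prune`, `teacher`, `calibration`) is hypothetical. Each block is scored by how much the output changes when it is removed, the least sensitive blocks are dropped, and the remaining gap between the original "teacher" and the pruned "student" is the kind of quantity a knowledge-distillation loss would then try to shrink during fine-tuning.

```python
def mse(a, b):
    """Mean squared error between two equal-length lists of floats."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def run(blocks, x, skip=frozenset()):
    """Apply each block in order to a list of floats, skipping pruned indices."""
    for i, block in enumerate(blocks):
        if i not in skip:
            x = [block(v) for v in x]
    return x


def sensitivity(blocks, inputs):
    """Score each block by how much the output changes when it is removed."""
    reference = run(blocks, inputs)
    return [mse(reference, run(blocks, inputs, skip={i}))
            for i in range(len(blocks))]


def prune(blocks, inputs, keep):
    """Keep the `keep` most sensitive blocks, preserving their original order."""
    scores = sensitivity(blocks, inputs)
    ranked = sorted(range(len(blocks)), key=lambda i: scores[i], reverse=True)
    kept = sorted(ranked[:keep])
    return [blocks[i] for i in kept], kept


# Toy "teacher": four scalar blocks, two of which barely change their input.
teacher = [
    lambda v: 2.0 * v,
    lambda v: v + 0.001,
    lambda v: v - 3.0,
    lambda v: 1.0001 * v,
]
calibration = [0.5, 1.0, 1.5, 2.0]  # stand-in for a small calibration set

student, kept = prune(teacher, calibration, keep=2)
gap = mse(run(teacher, calibration), run(student, calibration))

print("kept blocks:", kept)
print("teacher/student output gap:", gap)
```

In this toy setup the two near-identity blocks are removed and the student's output stays very close to the teacher's. A real diffusion transformer would be pruned at a finer granularity and then fine-tuned to close whatever gap remains.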
Paving the Way for Creative Freedom
While the focus so far has been on technological milestones, another dimension merits attention. If these advances become widely adopted, smartphones might soon handle far more than basic text or static images: they could generate personalized video sequences instantly, and even construct entire virtual worlds as optimization continues. Researchers have dubbed this vision "instant imagination", emphasizing not only speed but also new creative possibilities.
Towards Widespread Democratization?
Making real-time AI video creation possible on mobile devices demands the right combination of robust hardware (CPU, GPU, and NPU), heavily optimized models, and cutting-edge compression and distillation techniques. The upshot? A silent revolution may be underway. Thanks to these breakthroughs, users everywhere could access powerful creative tools locally, with no cloud required, opening doors previously reserved for professionals with high-end equipment. The real test will come as app developers and device makers decide whether, and how, to weave this capability into daily life.