We’ve always known it was coming. What we couldn’t predict is where it would come from and how fast. The best AI video generation technology has now moved from a trickle to a tidal wave of product releases and research.
One name in particular has exploded onto our screens in the past few weeks: Bytedance. The company has just released two stunningly good text-to-video AI models, which rival the best in the world.
For those who don’t know, Bytedance is the infamous owner of TikTok. And now the company has released OmniHuman-1, a new multimodal video generation framework which can take a single image and generate extremely sophisticated video with audio attached. The model is special because of its ability to combine video, audio and lip-syncing in a near perfect match.
We’re not talking about pretty good video here; we’re talking about extremely high quality output in every way. The project’s GitHub demo page features a raft of beautifully crafted videos, all generated from a single image plus an audio file. The lip syncing is almost perfect, the image resolution is spectacular, and there are remarkably few glitches in the output that we can see.
The platform is not limited to photorealistic video either. It can produce cartoons, artificial animated objects, animals and even some quite complicated and challenging poses.
In the past few days the company has also dropped Goku, which offers similar text-to-video quality, but with an interesting twist. First, the Goku model features only 8B parameters, which is incredibly small for this kind of quality. Second, it’s clear the company is specifically targeting the advertising market, based no doubt on its massive back catalog of TikTok videos and shopping experiences.
These moves propel the Chinese company into the AI big league, alongside other Chinese AI giants Alibaba, Tencent and DeepSeek. Suddenly the landscape has changed completely in ways no one could have imagined even a year ago.
Other Chinese companies like Kling AI have already shown what’s possible, but the Bytedance tech is different because it comes from a company which probably owns the largest video media library on earth after Facebook.
Meanwhile, Goku also moves AI generation further down the yellow brick road, and targets one of the biggest industries in the world: advertising.
The demo videos on the project page show a range of clips which are quite clearly aimed at short or long form social media advertising applications. Women and men using body products and other cheesy demo clips predominate.
These video tools are not just destined to sell us more products; it’s obvious there’s a much larger agenda at work here. After advertising, the next domino to fall is almost certainly going to be animated art in all its forms. Even if we don’t see full length animations using this technology in the short term, there’s no question that it’s already being deployed as part of the production process.
Before we get too excited we should remember that the computing demands of this kind of AI model are still colossal. There’s a reason it took Sora so long to appear on the market. It’s also important to note that both OmniHuman and Goku exist only in the lab, with no public facing application for anybody to play with. Yet.
However, anyone needing a glimpse into the massive disruption that Chinese AI is bringing to the video animation world should take a look at Kling AI.
The AI generated video that’s coming out of this publicly accessible commercial service is nothing less than staggering. All generated from a simple text prompt. And in case you think it’s all just ten second clips, take a look at this mock up of a well-known television show.
Bottom line
Last summer, mega industry publication The Hollywood Reporter ran a front page story entitled ‘Hollywood at a Crossroads: Everyone Is Using AI, But They Are Scared to Admit It’. The undertone of the article was fatalistic: basic movie workers’ labor would inevitably be “displaced” first by AI, followed later by a creeping AI penetration which would consume everything in its path over time. This process has definitely already started.
In the same way digital technology has almost completely unseated analog movie making, AI with its massive cost efficiencies will inevitably perform the same kind of industry disruption. The one certainty is we’ll see these impacts much sooner than any of us expect.
Polish director Besaleel sums up the mood: “I foresee that film and TV productions will eventually employ only leading and perhaps supporting actors, while the entire world of background and minor characters will be created digitally.” The video and movie world is changing, folks, and at warp speed 9.
This article was originally published by www.tomsguide.com. Read the original article here.