Google’s new anything-to-anything AI model is wild

Recorded: May 23, 2026, 11 a.m.

Original

Summarized

Google’s new anything-to-anything AI model is wild | The Verge

Omni sent my kid’s stuffie rafting and deepfaked me in front of the Eiffel Tower. But it’s not quite the singularity.

by Allison Johnson
May 23, 2026, 11:00 AM UTC

Just a stuffed deer having the time of his life. | Image: Gemini / The Verge

Part of Google I/O 2026: All the news and announcements

Allison Johnson is a senior reviewer with over a decade of experience writing about consumer tech. She has a special interest in mobile photography and telecom. Previously, she worked at DPReview.

Last year I deepfaked my kid’s stuffed animal to make it look like his plush deer was on vacation.

It was an experiment to see if I could re-create the events depicted in a Gemini ad Google was running, and I never showed the videos of Buddy the deer on his adventures to my four-year-old. But it was a revealing exercise that made me think a lot about the difference between some harmless fun with generative AI and full-on slop. Maybe that Venn diagram is a perfect circle! Maybe not. But what I know for sure is that the tools to make realistic videos are surprisingly good, requiring surprisingly little effort and know-how. And that trend is continuing hot into Gemini’s Omni era.

Omni is a new family of generative models that will allegedly one day be able to turn any kind of input — photo, video, text — into anything else. But for starters, it’s just creating video. Omni Flash is the first of these models Google has released, now available in the company’s AI video generation and editing platform, Flow. You can still use the previous model, Veo, if you want, but Omni improves on Veo in a few ways.

With Omni, you can upload a video and use that along with a text prompt as the starting point for your AI-generated creation.
Google also claims Omni incorporates more real-world knowledge when producing videos and can do a better job of keeping characters consistent throughout a video as a result. There was only one way to really know if those claims are true: I brought back AI Buddy to pack his little AI-generated bags for another adventure.

The results are such a mixed bag they’re baffling. Some were very good — much more consistent and true to my prompt than when I was testing out Veo five months ago. But even the best clips Omni cooked up for me still have certain AI jump scares, like when Buddy suddenly switches orientation while he’s skydiving.

For another video, I gave Omni some artistic freedom. “Create a montage of Buddy packing for a vacation and embarking on a cruise ship for a tropical vacation. The mood is cute and playful. Buddy packs something funny in his suitcase that comes into play later in the clip.” It had Buddy pack a jar of honey; later in the clip he reaches for it as if it’s a bottle of sunscreen. “Uh oh,” the character says as he squirts honey onto his hoof.

Honestly, not a bad bit. Except that the bottle of honey constantly changes throughout the video, from a jar, to a clear squirt bottle filled with water, then back to a squeeze bottle filled with honey. And I can’t even begin to describe how the model came up with the final frame of the video — almost as if it just barfed up a bunch of elements of the sequence it just made.

You can use text-based prompts to suggest edits to your videos, and I’ll give Google credit: This works better with Omni than it did when I tested Veo 3. But the results were bad with Veo — so bad that I found it way easier to just prompt a new video from scratch every time I wanted something changed. Omni will actually take your edits on board, but the results don’t always hit.

I had it emphasize Buddy’s facial reactions in his vacation clips, and the results just wound up looking strange.
It would also give Buddy antlers from time to time, which he does not have. Buddy is a baby, thank you very much. When I prompted it to remove the antlers that appeared in one scene, it obliged — and then added antlers in all the other ones.

The thing is, none of this is free. Generating videos costs credits, varying from 15 to 40 credits based on the length of the scene and the “ingredients” you start with. One round of edits costs 40 credits. I have the $20-per-month AI Pro plan that comes with 1,000 credits each month. After around 20 clips generated with a few edits on some, I’m down to 145. If you have specific ideas about the video you want Omni to generate, you might be looking at a lot of costly back-and-forth with the model to get a video that’s close to your vision.

One of Omni’s purported strengths is adding AI-generated stuff to real videos, so I gave Buddy a break and deepfaked myself. Starting with a selfie video with a neutral expression, I prompted Omni to generate videos of me eating a plate of spaghetti, sitting in an airplane seat, and standing in front of the Eiffel Tower taking a bite out of a baguette. And I can genuinely say I wasn’t prepared for what I saw.

There are AI tells in my deepfake videos. The clink of the fork hitting the bowl of pasta is a little too manufactured. There’s a woman in the background of the airplane video who shows up twice. But aside from those little glitches and a vaguely uncanny sense about them, they’re convincing as hell.

I showed my husband the pasta clip; he knew I was testing an AI video tool but I didn’t tell him what in the scene had been generated by AI. Without knowing what was AI-generated about it, he bought that I was sitting in front of a camera eating pasta, and said that his only clue something was up was that the bowl looked unfamiliar. The pasta-eating itself looked real enough to convince my husband.
A man who has looked at me in real life basically every single day for the last decade.

My other deepfakes are varying levels of “good enough to fool people on social media.” A couple of the Eiffel Tower clips look slightly cartoonish, but one of them is convincing enough that you might need to rewatch it a few times to clock that it’s AI. I know it’s not me when the AI me turns her head and reveals her hair pulled back in a ponytail. But I’m not sure anyone else would know the difference, and that makes me feel weird.

I’m a little exhausted by it all, to be honest. I was shocked when I tested Veo 3 at the realism it could produce. I’ve been shocked at how easy it is to make fake people in fake photos again and again over the past few years. I should probably be shocked by Omni too, and I guess I am, but the edge has worn off.

It’s still not quite as easy to make an AI-generated cinematic masterpiece as Google would like you to believe. But Omni does improve on Veo in some recognizable ways. If you have a Google account and a credit card, then you can take a video of yourself sitting at home and make it look like you’re on a flight to Maui with a trivial amount of effort.
I don’t think we’re at the “foothills of the singularity” exactly, but we’re definitely deep in the uncanny valley.

All images and videos in this story were generated by Google Gemini.

Google's Omni is a new family of generative models that the company says will eventually turn any kind of input, such as photos, video, or text, into any other kind of output. For now it only creates video. Omni Flash, the first model in the family, is available in Flow, Google's AI video generation and editing platform, alongside the previous model, Veo. Allison Johnson tested it by deepfaking her child's stuffed deer, Buddy, and then herself, and found that making realistic video takes surprisingly little effort or know-how.

Google positions Omni as an improvement over Veo: it accepts an uploaded video plus a text prompt as a starting point, and Google claims it incorporates more real-world knowledge and keeps characters more consistent throughout a video. Johnson's results were mixed. Some clips were much more consistent and truer to her prompts than what Veo produced five months earlier, but even the best ones contained what she calls AI jump scares, such as Buddy suddenly switching orientation while skydiving. In a cruise montage, a jar of honey turned into a clear squirt bottle of water and then into a squeeze bottle of honey, and the final frame was a jumble of elements from the sequence. The model also gave Buddy antlers he does not have; when asked to remove them from one scene, it did so and then added them to all the others.

Text-prompt editing worked better with Omni than with Veo 3, where results were bad enough that Johnson found it easier to prompt a new video from scratch. Omni does take edits on board, but the results don't always hit: asking it to emphasize Buddy's facial reactions produced strange-looking clips. None of this is free. Generating a video costs 15 to 40 credits depending on the length of the scene and the "ingredients" it starts from, and one round of edits costs 40 credits. Johnson's $20-per-month AI Pro plan includes 1,000 credits each month; after around 20 clips, some with a few edits, she had 145 left. Users with a specific vision may face a lot of costly back-and-forth with the model.
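
The credit figures reported in the article (15 to 40 credits per generated clip, 40 credits per round of edits, 1,000 credits on the $20-per-month AI Pro plan, 145 credits left after around 20 clips) can be turned into a rough budget check. The sketch below is only illustrative: the function and constant names are hypothetical, and valuing a credit at the plan price divided by the monthly allotment is an assumption, not something Google or the article states.

```python
# Credit figures as reported in the article; all names here are hypothetical.
MONTHLY_CREDITS = 1000     # credits included in the AI Pro plan each month
PLAN_PRICE_USD = 20.0      # monthly price of the AI Pro plan
CLIP_COST_MIN = 15         # lowest reported cost to generate one clip
CLIP_COST_MAX = 40         # highest reported cost to generate one clip
EDIT_COST = 40             # reported cost of one round of edits


def credits_used(clips: int, edit_rounds: int, credits_per_clip: int) -> int:
    """Credits spent on `clips` generations plus `edit_rounds` rounds of edits."""
    return clips * credits_per_clip + edit_rounds * EDIT_COST


def credits_to_usd(credits: int) -> float:
    """Dollar value of credits, ASSUMING the plan price is spread evenly over the allotment."""
    return round(credits * PLAN_PRICE_USD / MONTHLY_CREDITS, 2)


if __name__ == "__main__":
    remaining = 145                        # balance the author reports
    spent = MONTHLY_CREDITS - remaining    # credits consumed so far
    print("spent:", spent, "credits, about $", credits_to_usd(spent))
    # Range for 20 clips with no edits at all, cheapest versus most expensive.
    print("20 clips:", credits_used(20, 0, CLIP_COST_MIN), "to",
          credits_used(20, 0, CLIP_COST_MAX), "credits")
    # What the remaining balance still covers.
    print("edit rounds left:", remaining // EDIT_COST)
    print("cheapest clips left:", remaining // CLIP_COST_MIN)
```

Under these assumptions, 20 clips alone would account for somewhere between 300 and 800 credits, so the reported balance is plausible once a few 40-credit edit rounds are added, and the remaining 145 credits would cover at most three more rounds of edits.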

Johnson also tested one of Omni's purported strengths, adding AI-generated elements to real video. Starting from a selfie video with a neutral expression, she prompted it to show her eating a plate of spaghetti, sitting in an airplane seat, and standing in front of the Eiffel Tower biting into a baguette. The clips have AI tells, including a fork clink that sounds too manufactured and a background passenger who appears twice, but aside from those glitches and a vaguely uncanny feel she found them convincing. Her husband, who knew she was testing an AI video tool but not which parts were generated, believed the pasta clip was real and noticed only that the bowl looked unfamiliar. The other deepfakes range in quality: some Eiffel Tower clips look slightly cartoonish, while one might take several viewings to identify as AI. Johnson concludes that making an AI-generated cinematic masterpiece is still not as easy as Google suggests, but that Omni does improve on Veo in recognizable ways, and that anyone with a Google account and a credit card can make a video shot at home look like a flight to Maui with trivial effort. In her words, this is not the "foothills of the singularity," but it is deep in the uncanny valley.