Gemini task automation is slow, clunky, and super impressive
Recorded: March 21, 2026, 3 p.m.
Original:
Gemini task automation is slow, clunky, and super impressive
It took nine minutes to order my dinner, but it still feels like the future.
By Allison Johnson | Mar 21, 2026, 11:30 AM UTC
Allison Johnson is a senior reviewer with over a decade of experience writing about consumer tech. She has a special interest in mobile photography and telecom. Previously, she worked at DPReview.

[Photo: An AI assistant that can actually get things done. Allison Johnson / The Verge]

I’ve been testing out Gemini’s new task automation on the Pixel 10 Pro and the Galaxy S26 Ultra, which for the first time lets Gemini take the wheel and use apps for you. It’s limited to a small subset right now — a handful of food delivery and rideshare services — and it’s still in beta. It’s slow, it’s clunky at times, and it doesn’t solve any serious problem you had using your phone. But it’s impressive as hell, and I don’t think it’s hyperbole to say this is a glimpse of the future. We’re still a long way off, but this is the first time I’ve seen a true AI assistant actually working on a phone — not in a keynote presentation or a carefully controlled demo inside a convention hall.

First off: Gemini is much slower than you, or me, or most anyone at using their phone. If you need to order an Uber right this second, you’re still the best person for the job.
Before you write it off, though, remember that task automation is designed to run in the background while you do other things on your phone. Even better, it keeps working while you’re not looking at your phone, so you can do things like check that your passport is in your bag for the 10th time.

But if you’re curious, like I am, you can watch the whole thing happen. While it’s working, text appears at the bottom of the screen indicating what Gemini is doing. Stuff like “Selecting a second portion of Chicken Teriyaki for the combo,” which it did when I directed it to order my dinner on Saturday night. Watching Gemini figure things out on the fly honestly kinda rules. I asked for a chicken combo plate; the menu presented options in half-portion increments, so it correctly added two half servings of chicken.

[Photo: Gemini figured out that two half portions would equal one order of chicken teriyaki.]
[Photo: Gemini had more trouble finding the side of greens featured right in the middle of the screen here.]

It’s for the best that when you start an automation with Gemini, the default behavior is for it to run in the background. You have to tap a button and open another window if you want to watch Gemini working through the task. And it can be excruciating. Watching the computer try to find a side of greens on a menu in Uber Eats when it’s sitting right there at the top of the screen is like watching a horror movie and knowing the murderer is in the closet right next to the protagonist. I mean, except for the murder part. Gemini made a couple of wrong turns as it put together my teriyaki order, which it eventually figured out on its own, but the whole episode took about nine minutes. Not ideal.

Gemini is supposed to carry out your task right up to the point where it’s time to hit confirm and order your car or dinner, so you can double-check its work. This, I think, is the only sane way to use this feature right now, and I don’t mind the added friction of completing the order. In the tests I’ve run over the past five days, I’ve never had it go rogue and finish my order for me. And it is surprisingly accurate; I’ve had to make very few adjustments to the final order. If it fails — which I have seen happen a couple of times — it tends to be within the first minute or two, when something about the app needs my attention, like giving it permission to use my location, or changing the delivery location to home rather than Nevada, which was the last place I used that app. I had to figure out what the problem was in cases like this, but once it was sorted out, I was able to restart the automation without an issue.

Here’s the one that really got me. I put an event on my calendar for a flight to San Francisco the following day (a pretend trip for me, but real flight details). I gave Gemini a vague prompt to schedule an Uber that would get me to the airport in time for my flight tomorrow. Because Gemini has access to my email and calendar, it can go find that information. It did need a little extra guidance — possibly because the flight wasn’t in my email like it expected. But with that, it found the flight information, suggested leaving by 11:30 or 11:45AM (logical timing for a 1:45PM flight given I live close to the airport), and asked if I wanted to schedule a ride for one of those times.
I confirmed the time, and it set up the ride in about three minutes with no further input required on my part. It’s a little more impressive when you consider that Uber doesn’t even refer to it as scheduling a ride — you reserve a ride. That’s the key difference between the digital assistants we’ve been using and the AI assistants emerging now. Being able to use natural language when talking to the computer makes a huge difference when you’re controlling your smart home or placing your dinner order. If the computer is going to get tripped up and ask for clarification when you forget that the restaurant calls your meal a “plate” and not a “combo,” or if you ask for “slaw” instead of “shredded cabbage,” then it’s no more useful than the assistants we’ve been using for the past decade to set timers and play music.

That said, watching Gemini tap and scroll around Uber Eats makes one thing painfully obvious: if you were designing an application for AI to use, it would look nothing like the ones we have today. You know, apps designed for humans. An AI assistant won’t be tempted by a big ad in the middle of a page to save 30 percent on your order. An appetizing, well-staged photo of the dish it’s ordering isn’t any more convincing than a low-quality one. You would give it a database, not a bunch of clutter to weed through — something the industry is working toward with the Model Context Protocol, or MCP.

An AI model reasoning its way through a human-centric interface feels like the most impractical and brittle way to place a pizza order. It does hit a snag occasionally, and it’s not great at telling you why it couldn’t do something. This version of task automation feels like a stopgap until app developers adopt more robust methods: MCP or Android’s app functions. Google’s head of Android, Sameer Samat, told me recently that Gemini takes the reasoning approach in the absence of the other two. Maybe this version of task automation is our preview of what’s possible, or a way to prod developers into adopting one of the other methods.
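To make that contrast concrete, here is a minimal sketch of what a delivery app might expose through MCP instead of a tappable, ad-laden menu. It uses the official TypeScript SDK (@modelcontextprotocol/sdk), but the server name, tool name, parameters, and cart logic are all invented for illustration; a real integration would call the app’s actual ordering backend.

```typescript
// Hypothetical MCP server exposing a structured "add to cart" tool, so an
// assistant can place an order without scrolling past ads or interpreting
// screenshots. All names and fields here are invented for illustration.
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";

const server = new McpServer({ name: "food-ordering", version: "0.1.0" });

// The input schema is the "database, not clutter" part: the model fills in
// typed fields instead of reasoning its way through a screen made for humans.
server.tool(
  "add_to_cart",
  "Add a menu item to the current order",
  {
    itemId: z.string().describe("Menu item ID, e.g. 'chicken-teriyaki-half'"),
    quantity: z.number().int().min(1).describe("Number of portions"),
  },
  async ({ itemId, quantity }) => {
    // A real server would call the delivery app's ordering API here.
    return {
      content: [{ type: "text", text: `Added ${quantity} x ${itemId} to the cart` }],
    };
  },
);

// Serve over stdio so an MCP-capable assistant can discover and call the tool.
await server.connect(new StdioServerTransport());
```

With a schema like this, the half-portion arithmetic Gemini worked out visually becomes an explicit parameter, and an order becomes a couple of structured tool calls rather than nine minutes of tapping and scrolling.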
Either way, this feels like a notable first step toward a new way of using our mobile assistants — awkward, slow, but very promising.

Photography by Allison Johnson / The Verge
Summarized:

Gemini’s task automation, as demonstrated by Allison Johnson in The Verge, is a nascent but potentially transformative step toward truly intelligent mobile assistants. The feature, currently in beta and limited to a handful of food delivery and rideshare services, runs on the Pixel 10 Pro and Galaxy S26 Ultra and lets Gemini operate apps on the user’s behalf. The current implementation is slow and clunky: a simple dinner order took about nine minutes, far longer than a person would need. Johnson details how Gemini can reason correctly on the fly, such as adding two half portions to assemble a full chicken combo, but can also stumble, as when it struggled to locate a side of greens that was plainly visible on the menu. Both behaviors expose the limits of AI reasoning over a complex, visually driven interface built for humans.

Automations run in the background by default; watching one work requires tapping a button and opening a separate window, and the process can be excruciating to observe. Gemini stops short of the final confirmation so the user can double-check its work, and in Johnson’s tests it never completed an order on its own. Failures tended to occur within the first minute or two, when the app needed something only the user could provide, such as location permissions or a corrected delivery address; once resolved, the automation restarted without issue. The most compelling demonstration was scheduling a ride to the airport: drawing on its access to Johnson’s email and calendar, Gemini found the flight details, proposed sensible departure times, and reserved an Uber with minimal further guidance, showing its ability to synthesize information across sources.

The article closes on app design: interfaces made for humans are a poor fit for AI agents, and the current reasoning-based approach looks like a stopgap until developers adopt more structured methods, such as the industry’s Model Context Protocol (MCP) or Android’s app functions. The feature remains awkward and unpolished, but Johnson’s account offers a genuine glimpse of where mobile assistants are headed.