Xiaomi MiMo-V2.5 Series API Permanent Price Reduction of Up to 99%
Recorded: May 26, 2026, 7:03 p.m.
| Original | Summarized |
Xiaomi MiMo API Open Platform
MiMo-V2.5 prices cut by up to 99%. Token Plan gets more Credits: quota increased 5–8×, and all used Credits within the validity period are fully reset. Effective May 27, 2026 at 00:00 CST.
MiMo-V2.5 Series Price Adjustment Announcement | 100 Trillion Token Creator Incentive Plan Concludes
MiMo-V2.5 Series API Permanent Price Reduction
Token Plan billing system optimization, with usage increased to 5-8 times the original
The 100 Trillion Token Creator Incentive Program Concludes Successfully
Full reset of the current effective Token Plan user quota
Effective Time: 0:00, May 27, 2026, Beijing Time
This price adjustment officially takes effect at 0:00 on May 27, Beijing time, simultaneously worldwide.
We sincerely invite all developers to integrate and try it.
More quota at the same price: usage is increased to 5-8 times the original, unlocking more productivity for you. Billing rules have been adjusted to be clearer and easier to understand: what you see is what you get.
The 100 Trillion Token Creator Incentive Program Concluded Successfully
Surprise: all existing Token Plan user quotas have been fully reset |
The Xiaomi MiMo platform has announced a permanent price reduction for its MiMo-V2.5 Series API, together with changes to its Token Plan billing system, effective May 27, 2026. The announcement attributes the update to continued technical improvement and an effort to support large-scale use of the models.

MiMo-V2.5 Series API prices are cut by up to 99% compared with the original API pricing, and pricing no longer differs by input length. The Token Plan billing system is revised as well: quotas are expanded to five to eight times the original amount at no extra cost, and all Credits already used within the validity period are fully reset. The billing rules are described as clearer and more straightforward for subscribers.

The announcement also confirms the successful conclusion of the 100 Trillion Token Creator Incentive Program, whose one hundred trillion tokens were all distributed ahead of schedule. The platform presents this as part of its commitment to broader access to MiMo for developers and users. All existing Token Plan user quotas are fully reset at the effective time, which covers both users who participated in the incentive program and those who had previously subscribed to the plan.

The technical improvements behind these changes focus on the inference system. Xiaomi's technical team added support for Sliding Window Attention on top of SGLang HiCache. This reduces the volume of Key-Value (KV) Cache data transferred between storage tiers (GPU memory, CPU memory, and SSD) to nearly one-seventh of the previous volume.
At the same time, the system can now cache nearly five times as many tokens as before, which significantly raises the cache hit rate and overall inference efficiency. The cluster's input throughput was further improved through optimizations to the expert parallelism scheme and the input length bucketing strategy, all aimed at continuing to reduce the service cost per token while maintaining service quality.

Ultimately, these changes aim to help build a complete AI infrastructure chain by offering model services that combine low cost with high quality. The stated mission is to leverage sustained, large-scale inference demand, through continuous technological innovation, so that more people can use better models. |
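
The Sliding Window Attention caching described in the summary can be illustrated with rough KV Cache arithmetic. The following is a minimal sketch, not Xiaomi's implementation and not SGLang's API: it assumes a hypothetical hybrid model in which some layers use full attention and the rest use a sliding window, so the windowed layers only need to keep the most recent `window` tokens. The layer counts, window size, head dimensions, and all class and function names are illustrative assumptions and do not come from the announcement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HybridAttentionConfig:
    """Hypothetical model shape; none of these values come from the announcement."""
    n_full_layers: int    # layers using full (global) attention
    n_swa_layers: int     # layers using sliding window attention
    window: int           # sliding window size, in tokens
    n_kv_heads: int
    head_dim: int
    dtype_bytes: int = 2  # e.g. a 16-bit floating point format


def kv_bytes_per_token_per_layer(cfg: HybridAttentionConfig) -> int:
    # One key tensor and one value tensor, each n_kv_heads * head_dim elements.
    return 2 * cfg.n_kv_heads * cfg.head_dim * cfg.dtype_bytes


def kv_bytes_full(cfg: HybridAttentionConfig, seq_len: int) -> int:
    """KV bytes if every layer's cache keeps all seq_len tokens."""
    n_layers = cfg.n_full_layers + cfg.n_swa_layers
    return n_layers * seq_len * kv_bytes_per_token_per_layer(cfg)


def kv_bytes_swa_aware(cfg: HybridAttentionConfig, seq_len: int) -> int:
    """KV bytes if sliding-window layers keep only the last `window` tokens."""
    per_token = kv_bytes_per_token_per_layer(cfg)
    full_part = cfg.n_full_layers * seq_len * per_token
    swa_part = cfg.n_swa_layers * min(seq_len, cfg.window) * per_token
    return full_part + swa_part


def transfer_ratio(cfg: HybridAttentionConfig, seq_len: int) -> float:
    """Fraction of the full KV volume that a window-aware cache has to move."""
    return kv_bytes_swa_aware(cfg, seq_len) / kv_bytes_full(cfg, seq_len)


# Made-up configuration, chosen only to show the shape of the calculation.
cfg = HybridAttentionConfig(
    n_full_layers=8, n_swa_layers=40, window=128, n_kv_heads=8, head_dim=128
)
ratio = transfer_ratio(cfg, seq_len=32_768)
print(f"window-aware cache is {ratio:.3f} of the full cache")
print(f"tokens that fit in the same memory: about {1 / ratio:.1f}x")
```

Once the sequence is much longer than the window, the ratio in this sketch is driven mainly by the share of full-attention layers. The same ratio, inverted, gives a rough idea of how many more tokens fit in a fixed cache budget, which is why a smaller per-token footprint and a higher cache hit rate tend to go together.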