Xiaomi MiMo-V2.5 Series API: Permanent Price Reduction of Up to 99%

Recorded: May 26, 2026, 7:03 p.m.

Original

Summarized

Xiaomi MiMo API Open Platform

MiMo-V2.5 prices cut by up to 99%. Token Plan gets more Credits: quota increased 5–8×, all used Credits within validity period fully reset. Effective May 27, 2026 at 00:00 CST.

MiMo-V2.5 Series Price Adjustment Announcement | 100 Trillion Token Creator Incentive Plan Concludes
Over the past few months, through activities such as MiMo Orbit and the 100 Trillion Token Creator Incentive Program, we have enabled more people to experience MiMo and solve real problems. This is MiMo's first step on the path to large-scale application.
Now, with the continuous improvement of the underlying technology, we can finally do something more thorough: permanently overhaul the entire model pricing system.
Quick Overview of This Announcement:

MiMo-V2.5 Series API Permanent Price Reduction

Token Plan Billing System Optimized, with Usage Quota Increased to 5–8 Times the Original

The 100 Trillion Token Creator Incentive Program Concludes Successfully

Full Reset of the Credits Quota for All Currently Active Token Plan Users

Effective Time: 0:00, May 27, 2026, Beijing Time
MiMo-V2.5 Series API Permanent Price Reduction
Compared with the original API pricing, the new pricing is reduced by up to 99% and no longer varies with input length.

This price adjustment officially takes effect worldwide at 0:00 on May 27, Beijing Time. We sincerely invite all developers to integrate and try it.
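
To make the difference between length-based and flat pricing concrete, here is a minimal sketch. The announcement publishes no price table in this text, so the threshold and every rate below are hypothetical placeholders, and the function names are invented for illustration.

```python
def tiered_cost(input_tokens: int, threshold: int,
                low_price: float, high_price: float) -> float:
    """Cost when the per-million-token rate depends on input length."""
    rate = low_price if input_tokens <= threshold else high_price
    return input_tokens / 1_000_000 * rate


def flat_cost(input_tokens: int, price: float) -> float:
    """Cost when one per-million-token rate applies to every request."""
    return input_tokens / 1_000_000 * price


def reduction_pct(old: float, new: float) -> float:
    """Percentage reduction going from the old cost to the new cost."""
    return (old - new) / old * 100


# Placeholder values only; these are NOT MiMo prices.
old = tiered_cost(500_000, threshold=256_000, low_price=1.0, high_price=2.0)
new = flat_cost(500_000, price=0.5)
print(f"old={old:.2f} new={new:.2f} reduction={reduction_pct(old, new):.0f}%")
```

Under flat pricing, the cost of a request scales only with its token count, so a long prompt is no longer billed at a higher per-token rate than a short one.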
Optimization of Token Plan Billing System

More quota at the same price: usage volume is increased to 5–8 times the original, giving you more room to be productive.

Billing rules have been adjusted to be clearer and easier to understand: what you see is what you get.

The 100 Trillion Token Creator Incentive Program Concluded Successfully
Since its launch on April 28, the "100 Trillion Token Creator Incentive Program" has drawn enthusiastic participation and wide attention from users worldwide. As of 16:08 on May 26, Beijing Time, all 100T tokens had been distributed, and the event concluded ahead of schedule. We thank all developers for their enthusiastic participation!
Note: The exclusive benefits program for members of the Apache Software Foundation remains valid long term, can still be applied for, and is not affected by the conclusion of this event.

Surprise: All Existing Token Plan User Quotas Will Be Fully Reset
Regardless of current usage, the Credits quota of every user who has subscribed to the Token Plan and is still within the validity period will be fully reset at 0:00 on May 27, Beijing Time, and the new billing rules will apply from then on. This includes users who obtained the Token Plan through the 100 Trillion Token Creator Incentive Program and users with exclusive benefits from the Apache Software Foundation.
One More Thing: For past paying users whose Token Plan has expired, we have also prepared surprise gifts, which will be announced within the next week. Please stay tuned.
Notes on Inference Technology Optimization
Behind this price adjustment is the continuous optimization of the inference system by Xiaomi's technical team.
We now fully support SWA (Sliding Window Attention) on top of SGLang HiCache. This reduces the volume of KV Cache data transferred between storage tiers (GPU memory, CPU memory, and SSD) to nearly 1/7 of its pre-optimization level and increases the number of cacheable tokens to nearly 5 times the previous figure, significantly improving cache hit rate and inference efficiency.
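
As a rough illustration of why SWA-aware caching shrinks the KV Cache, the sketch below compares the cache size of one sequence when sliding-window layers keep every token versus only the last window of tokens. The model shape (layer counts, window size, head sizes) is entirely hypothetical and is not MiMo-V2.5's configuration; the sketch is not meant to reproduce the 1/7 or 5× figures above, and it does not model HiCache itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KVLayout:
    """Hypothetical model shape; none of these values come from MiMo."""
    full_layers: int      # layers using full (global) attention
    swa_layers: int       # layers using sliding window attention
    window: int           # SWA window size in tokens
    kv_heads: int         # key/value heads per layer
    head_dim: int         # dimension of each head
    bytes_per_value: int  # e.g. 2 for fp16/bf16


def kv_bytes_per_layer_token(cfg: KVLayout) -> int:
    # Keys and values are both stored, hence the factor of 2.
    return 2 * cfg.kv_heads * cfg.head_dim * cfg.bytes_per_value


def kv_cache_bytes(cfg: KVLayout, seq_len: int, swa_aware: bool) -> int:
    """KV cache size in bytes for one sequence of seq_len tokens."""
    per = kv_bytes_per_layer_token(cfg)
    full = cfg.full_layers * seq_len * per
    # A SWA layer only attends to the last `window` tokens, so a SWA-aware
    # cache can drop older entries for those layers.
    swa_tokens = min(seq_len, cfg.window) if swa_aware else seq_len
    swa = cfg.swa_layers * swa_tokens * per
    return full + swa


# Illustrative shape only (assumption, not a published configuration).
cfg = KVLayout(full_layers=8, swa_layers=40, window=128,
               kv_heads=8, head_dim=128, bytes_per_value=2)
naive = kv_cache_bytes(cfg, seq_len=32_768, swa_aware=False)
aware = kv_cache_bytes(cfg, seq_len=32_768, swa_aware=True)
print(f"naive: {naive / 2**20:.0f} MiB, SWA-aware: {aware / 2**20:.0f} MiB")
print(f"ratio: {naive / aware:.1f}x")
```

A smaller per-sequence footprint means less data to move between GPU memory, CPU memory, and SSD, and more sequences that fit in the same cache budget, which is the general direction of the improvement described above.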
We also further increased the cluster's input throughput by optimizing the expert parallelism scheme, the input length bucketing strategy, and other components, continuously reducing the service cost per token while maintaining service quality.
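
The announcement does not describe its bucketing strategy, so the following is only a generic sketch of one common reading of input length bucketing: grouping requests of similar length so that short inputs are not padded or scheduled alongside very long ones. The bucket boundaries, function names, and the padding-based cost model are all assumptions made for illustration.

```python
from bisect import bisect_left
from collections import defaultdict

# Hypothetical bucket upper bounds in tokens; real boundaries are not published.
BUCKET_BOUNDS = [512, 2_048, 8_192, 32_768]


def bucket_for(length: int) -> int:
    """Return the smallest bucket bound that fits `length` tokens."""
    idx = bisect_left(BUCKET_BOUNDS, length)
    if idx == len(BUCKET_BOUNDS):
        raise ValueError(f"input of {length} tokens exceeds the largest bucket")
    return BUCKET_BOUNDS[idx]


def group_by_bucket(lengths: list[int]) -> dict[int, list[int]]:
    """Group request lengths so each batch holds inputs of similar size."""
    groups: dict[int, list[int]] = defaultdict(list)
    for n in lengths:
        groups[bucket_for(n)].append(n)
    return dict(groups)


def padded_tokens(batch: list[int]) -> int:
    """Tokens processed if every input is padded to the longest in the batch."""
    return max(batch) * len(batch)


requests = [120, 300, 450, 1_900, 7_000, 30_000]
one_batch = padded_tokens(requests)
bucketed = sum(padded_tokens(b) for b in group_by_bucket(requests).values())
print(f"single batch: {one_batch} tokens, bucketed: {bucketed} tokens")
```

In this toy cost model, bucketing avoids padding three short prompts up to the length of the longest one, which is the kind of wasted input work a bucketing strategy is meant to cut.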
Conclusion
The value of technology ultimately lies in the breadth of its use.
Relying on continuous technological innovation, we hope to unlock real, sustainable, large-scale inference demand by providing model services that combine low cost with top-tier capabilities, thereby helping build out a complete AI infrastructure chain.
Enabling more people to use better models: this is MiMo's unwavering mission.

Update Time: May 27, 2026

The Xiaomi MiMo platform has announced permanent price reductions and significant system optimizations for its MiMo-V2.5 Series API, alongside adjustments to its Token Plan billing system, effective May 27, 2026. This update stems from continuous technological improvement and efforts to facilitate large-scale application of the models.

A central component of the announcement is the MiMo-V2.5 Series API price adjustment: prices drop by up to 99 percent compared with the original API pricing, and pricing no longer varies with input length. Alongside it, the Token Plan billing system has been revised to give users five to eight times the original usage quota at no extra cost, with billing rules rewritten to be clearer and more straightforward. In addition, the Credits quota of every Token Plan subscriber still within the validity period will be fully reset.

The announcement also confirms the conclusion of the 100 Trillion Token Creator Incentive Program, whose full allocation of one hundred trillion tokens had been distributed ahead of schedule as of 16:08 on May 26, Beijing Time. The quota reset at 0:00 on May 27 covers all Token Plan users still within their validity period, including those who obtained the plan through the incentive program and those with Apache Software Foundation benefits. Past paying users whose plans have expired are promised separate gifts, to be announced within the following week.

The technical improvements behind these changes focus on the inference system. Xiaomi's technical team added full support for Sliding Window Attention (SWA) on top of SGLang HiCache, which reduces the volume of Key-Value (KV) Cache data transferred between storage tiers (GPU memory, CPU memory, and SSD) to nearly one-seventh of the previous volume. The same work raises the number of cacheable tokens to nearly five times the previous figure, which significantly improves the cache hit rate and overall inference efficiency. The team also increased the cluster's input throughput by optimizing the expert parallelism scheme and the input length bucketing strategy, with the aim of continuously reducing the service cost per token while maintaining service quality.

Ultimately, these changes aim to support the construction of a complete AI infrastructure chain by providing model services that combine low cost with top-tier capabilities. The stated mission is to enable more people to use better models, with continuous technological innovation expected to unlock sustained, large-scale inference demand.