A nameless AI model appeared on OpenRouter on March 11, 2026, consumed over one trillion tokens in a single week, and topped the platform’s daily usage charts — all before anyone knew who built it. The developer community assumed DeepSeek. They were wrong. Xiaomi’s MiMo team, led by former DeepSeek researcher Luo Fuli, confirmed on March 18 that Hunter Alpha was an early internal test build of MiMo-V2-Pro, a trillion-parameter flagship model engineered specifically as the “brain” of AI agents.
MiMo-V2-Pro features 42 billion active parameters, a one-million-token context window, and a hybrid attention mechanism with a 7:1 ratio. On ClawEval, it scores 61.5 — approaching Claude Opus 4.6’s 66.3. On SWE-bench Verified, it hits 78.0%, surpassing Claude Sonnet 4.6 in coding ability. The pricing: $1 per million input tokens, $3 per million output tokens. That represents a 67% cost reduction compared to Claude Sonnet 4.6 for input and 80% for output.
The Stealth Launch That Sparked a Global Guessing Game
Hunter Alpha scored 61.5 on ClawEval and 84.0 on PinchBench within days of its anonymous deployment on OpenRouter, triggering immediate speculation that DeepSeek was secretly testing its next-generation V4 system. The reality — that a smartphone manufacturer had quietly built one of the world’s most capable agent models — took a full week to surface.
Anonymous Arrival on OpenRouter
OpenRouter listed the model on March 11, 2026, with zero developer attribution. No company name, no model card, no affiliated repository. The platform described it simply as a “stealth model.” A notice on its profile stated that all user prompts and completions were logged and could be used for model improvement — standard language for a test deployment, but unusual for a production release.
Within 48 hours, the model was processing billions of tokens daily. Developers integrating it into agent frameworks like OpenClaw reported strong multi-step reasoning and unusually stable tool-call behavior. By March 15, according to community tracking, Hunter Alpha had processed over 160 billion tokens and was climbing toward the trillion mark.
The DeepSeek V4 Speculation
When prompted about its identity, the model described itself as a Chinese AI system trained primarily on Chinese-language data, with a training cutoff matching DeepSeek’s publicly available systems. Chinese media had reported that DeepSeek’s V4 model could launch as early as April 2026. The timing seemed too convenient to be coincidental.
AI engineers pointed to the model’s chain-of-thought reasoning patterns as further evidence of DeepSeek lineage. According to Mashable and India Today, the speculation reached mainstream tech coverage within days. Independent benchmark tester Umur Ozkul, however, pushed back. His analysis of token behavior and architectural patterns suggested Hunter Alpha was likely not DeepSeek V4 — a conclusion that proved correct.
Xiaomi Breaks Silence
On March 18, 2026, Xiaomi’s MiMo team issued the confirmation. According to Reuters, Luo Fuli’s team described the rapid anonymous adoption as a “quiet ambush,” reflecting the broader industry shift from chatbot-centric AI toward agent-based systems. The model had been internally designated MiMo-V2-Pro since before its OpenRouter deployment. A separate anonymous model called Healer Alpha, which had also appeared on OpenRouter, was simultaneously confirmed as MiMo-V2-Omni — Xiaomi’s multimodal variant supporting text, image, audio, and video inputs.
| Date | Event | Source |
|---|---|---|
| March 11, 2026 | Hunter Alpha appears anonymously on OpenRouter | OpenRouter platform |
| March 12-14 | DeepSeek V4 speculation spreads across tech media | Mashable, India Today, Reddit r/LocalLLaMA |
| March 15 | Model surpasses 160 billion tokens processed | Community tracking, OpenRouter data |
| March 18, 2026 | Xiaomi MiMo team confirms Hunter Alpha = MiMo-V2-Pro | Reuters, VentureBeat |
| March 18, 2026 | Official public release with API access and framework partnerships | Xiaomi, OpenRouter |
MiMo-V2-Pro Architecture and Technical Specifications
MiMo-V2-Pro is a trillion-parameter model with 42 billion active parameters, a one-million-token context window, and a maximum output length of 32,000 tokens. The architecture is built around a hybrid attention mechanism with a 7:1 ratio and includes a Multi-Token Prediction (MTP) layer — design choices that prioritize sustained reasoning over conversational fluency.
Hybrid Attention and MTP Layer
The 7:1 hybrid attention ratio means the model allocates seven attention layers optimized for local pattern recognition for every one layer devoted to global context aggregation. This balance allows MiMo-V2-Pro to maintain coherent goal-state tracking across extremely long task horizons — the exact failure mode where smaller models degrade in multi-step agent pipelines.
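The split can be pictured as a simple layer schedule. This is an illustrative sketch only: Xiaomi has not published MiMo-V2-Pro's actual layer layout, so the interleaving rule (one global layer after every seven local ones) is an assumption.

```python
# Illustrative sketch of a 7:1 local/global hybrid attention schedule.
# The real MiMo-V2-Pro layer arrangement is not public; this only shows
# what a 7:1 ratio could look like as an interleaving pattern.
def hybrid_schedule(num_layers: int, local_per_global: int = 7) -> list:
    period = local_per_global + 1  # one global layer per period of 8
    return ["global" if (i + 1) % period == 0 else "local"
            for i in range(num_layers)]

print(hybrid_schedule(8))  # seven 'local' layers, then one 'global'
```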
The Multi-Token Prediction layer accelerates inference by predicting multiple tokens simultaneously rather than one at a time. According to Xiaomi’s technical documentation referenced by CTOL Digital Solutions, this architectural choice improves generation efficiency without sacrificing output quality on reasoning-heavy tasks.
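As a rough intuition for why MTP accelerates decoding: if each forward pass drafts k extra tokens and each draft survives verification with some probability, the expected tokens emitted per pass exceeds one. The formula below is the common speculative-decoding estimate, not Xiaomi's published math, and `accept_rate` is a hypothetical parameter.

```python
# Back-of-envelope sketch: expected tokens emitted per forward pass when an
# MTP-style head drafts k extra tokens and the i-th draft survives with
# independent probability accept_rate**i (drafting stops at first rejection).
# This mirrors the standard speculative-decoding approximation; it is NOT
# taken from Xiaomi's documentation.
def expected_tokens_per_pass(k: int, accept_rate: float) -> float:
    # 1 guaranteed token, plus each surviving drafted token
    return 1 + sum(accept_rate ** i for i in range(1, k + 1))

print(expected_tokens_per_pass(3, 0.8))  # ≈ 2.95 tokens per pass instead of 1
```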
The “Agent Brain” Design Philosophy
Xiaomi positions MiMo-V2-Pro explicitly as an “AI agent brain” — not a chatbot, not a general-purpose assistant. According to BigGo Finance’s coverage of Xiaomi’s announcement, the company emphasized that “the model’s optimization focus is not on benchmark scores but on real-world task completion ability.”
What does that look like in practice? The model holds its thread across dozens of sequential tool calls without drifting off-task. It decomposes problems into sub-steps reliably and corrects its own mistakes mid-chain. During the Hunter Alpha stealth period, Xiaomi iterated on these exact behaviors using real developer feedback from OpenClaw integrations — effectively running a live stress test with no safety net of brand goodwill.
| Specification | MiMo-V2-Pro | Claude Opus 4.6 | Claude Sonnet 4.6 |
|---|---|---|---|
| Total Parameters | 1 trillion | Undisclosed | Undisclosed |
| Active Parameters | 42 billion | Undisclosed | Undisclosed |
| Context Window | 1,048,576 tokens | 200,000 tokens | 200,000 tokens |
| Max Output | 32,000 tokens | 32,000 tokens | 16,000 tokens |
| Multimodal | Text only | Text + Image | Text + Image |
| Architecture | Hybrid attention (7:1), MTP | Transformer | Transformer |
Benchmark Performance: Where Hunter Alpha Stands
MiMo-V2-Pro ranks eighth globally and second in China on the Artificial Analysis Intelligence Index with a score of 49. On agent-specific benchmarks, it places third globally on both PinchBench (84.0) and ClawEval (61.5), approaching Claude Opus 4.6’s leading scores while costing a fraction of the price.
Coding: SWE-bench and Terminal-Bench
On SWE-bench Verified, MiMo-V2-Pro scores 78.0% — surpassing Claude Sonnet 4.6 and approaching the higher-end Opus level. According to VentureBeat’s analysis, the model’s coding performance was one of the earliest signals that Hunter Alpha was not a mid-tier model masquerading under an anonymous label but a genuine frontier contender.
Terminal-Bench 2.0 results reinforce the coding story. The model handles complex multi-file refactoring, dependency resolution, and build-system debugging with consistency that developers in the OpenClaw community have described as “production-ready.” Xiaomi’s lighter sibling model, MiMo-V2-Flash, extends that story on open-source leaderboards (covered in detail below).
Agent Tasks: ClawEval and PinchBench
ClawEval measures end-to-end agent capability — the model’s ability to receive a high-level instruction, decompose it into sub-tasks, execute tool calls, handle errors, and deliver a completed result. MiMo-V2-Pro’s score of 61.5 trails Claude Opus 4.6 (66.3) but significantly outpaces other models in its price range.
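The shape of such an end-to-end task can be sketched as a minimal loop: walk a plan of tool calls, catch failures, and collect results. Everything here, including the tool names and the pre-computed plan, is a toy illustration, not part of any real benchmark harness.

```python
# Toy sketch of the end-to-end agent pattern that agent benchmarks measure:
# execute a plan of tool calls, handle errors, and deliver results.
# Tool names and the plan format are illustrative assumptions.
def run_agent(goal: str, tools: dict, plan: list) -> list:
    """plan: list of (tool_name, args) steps produced by a planner model."""
    results = []
    for tool_name, args in plan:
        fn = tools.get(tool_name)
        if fn is None:
            results.append(f"error: unknown tool {tool_name}")
            continue
        try:
            results.append(fn(**args))
        except Exception as exc:  # a real agent brain would re-plan here
            results.append(f"error: {exc}")
    return results

tools = {"add": lambda a, b: a + b, "upper": lambda s: s.upper()}
plan = [("add", {"a": 2, "b": 3}), ("upper", {"s": "done"})]
print(run_agent("demo goal", tools, plan))  # → [5, 'DONE']
```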
PinchBench focuses on narrower agent scenarios: single-tool-call accuracy, parameter extraction reliability, and response format compliance. At 84.0, MiMo-V2-Pro demonstrates the kind of tool-call stability that matters most in production agent pipelines, where a single malformed function call can cascade into complete task failure.
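A single-tool-call check of the kind such benchmarks reward can be approximated in a few lines. The tool schema and helper below are hypothetical; the point is that one malformed call gets caught before it propagates through a pipeline.

```python
import json

# Hedged sketch of single-tool-call validation: does a raw model response
# name a known tool and pass exactly the parameters its schema allows?
# The schema registry here is a made-up example, not a real API.
SCHEMAS = {"get_weather": {"required": {"city"}, "allowed": {"city", "units"}}}

def validate_tool_call(raw: str) -> bool:
    try:
        call = json.loads(raw)
    except json.JSONDecodeError:
        return False  # malformed JSON is the cascade failure described above
    schema = SCHEMAS.get(call.get("name"))
    if schema is None:
        return False  # hallucinated tool name
    params = set(call.get("arguments", {}))
    return schema["required"] <= params <= schema["allowed"]

print(validate_tool_call('{"name": "get_weather", "arguments": {"city": "Beijing"}}'))  # True
```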
| Benchmark | MiMo-V2-Pro | Claude Opus 4.6 | Claude Sonnet 4.6 |
|---|---|---|---|
| ClawEval (Agent) | 61.5 | 66.3 | — |
| PinchBench (Agent) | 84.0 | — | — |
| SWE-bench Verified | 78.0% | — | <78.0% |
| Artificial Analysis Index | 49 (#8 global, #2 China) | — | — |
Pricing, API Access, and Framework Partnerships
MiMo-V2-Pro is priced at $1 per million input tokens and $3 per million output tokens for contexts up to 256K tokens, rising to $2 and $6 respectively for the 256K-to-1M context tier. Xiaomi offered one week of free API access to developers through partnerships with five major agent frameworks at launch.
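A small calculator makes the tiering concrete. One assumption is flagged in the code: the announcement does not specify whether the tier is selected by total request context or by input length alone, so the sketch uses the combined token count.

```python
# Hedged cost sketch of the two-tier pricing described above.
# Rates are USD per million tokens: $1/$3 up to 256K context, $2/$6 beyond.
# ASSUMPTION: the tier is chosen by total (input + output) token count.
def request_cost_usd(input_tokens: int, output_tokens: int) -> float:
    long_context = input_tokens + output_tokens > 256_000
    in_rate, out_rate = (2.0, 6.0) if long_context else (1.0, 3.0)
    return input_tokens / 1e6 * in_rate + output_tokens / 1e6 * out_rate

print(request_cost_usd(100_000, 30_000))  # ≈ $0.19, standard tier
print(request_cost_usd(500_000, 20_000))  # ≈ $1.12, long-context tier
```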
Cost Comparison With Competing Models
According to Artificial Analysis, MiMo-V2-Pro ranks as the top cost-effective AI model in its performance tier. The pricing represents a 67% reduction versus Claude Sonnet 4.6 on input tokens and an 80% reduction on output tokens. For agentic workloads — where output token volume can be 3-5x input due to reasoning traces and tool calls — that output cost gap compounds rapidly.
KuCoin’s financial analysis highlighted that the model’s Intelligence Index score of 49 significantly exceeds the average for models in its price tier, creating what analysts have called an “efficiency frontier” — a cost-performance ratio that no other publicly available model currently matches.
| Model | Input / 1M tokens | Output / 1M tokens | Context Window | vs. MiMo-V2-Pro Cost |
|---|---|---|---|---|
| MiMo-V2-Pro (Hunter Alpha) | $1.00 | $3.00 | 1M tokens | Baseline |
| Claude Sonnet 4.6 | $3.00 | $15.00 | 200K tokens | +200% input, +400% output |
| Claude Opus 4.6 | $15.00 | $75.00 | 200K tokens | +1400% input, +2400% output |
| MiMo-V2-Flash | ~$0.10 | ~$0.50 | — | 90% cheaper, lighter model |
Supported Agent Frameworks
Xiaomi partnered with five agent frameworks for the launch: OpenClaw, OpenCode, KiloCode, Blackbox, and Cline. Each platform offered one week of free MiMo-V2-Pro access for developers building agent-based applications. The integrations were not superficial wrappers — MiMo-V2-Pro was tested and iterated within these frameworks during the Hunter Alpha stealth period, with real usage feedback shaping the model’s tool-call stability improvements before official release.
OpenRouter serves as the primary API gateway. The model identifier is xiaomi/mimo-v2-pro in the OpenRouter catalog. Existing OpenAI-compatible integrations require only a model name swap to route through MiMo-V2-Pro. One practical advantage of the million-token context window: a developer can feed an entire medium-sized codebase (roughly 50,000 lines of code) into a single prompt — no chunking, no retrieval-augmented generation overhead, no lost context between segments. For agent pipelines that need to reason across an entire repository to plan a refactor, that is a qualitative capability change, not just a quantitative one.
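In practice the swap looks like the sketch below. The model identifier and output cap come from this article; the helper only constructs the request payload so it runs without an API key, and the commented lines show the base-URL change for the official openai SDK.

```python
# Minimal sketch of routing an OpenAI-compatible chat request to MiMo-V2-Pro
# through OpenRouter. Only the payload is built here, so no key is needed.
OPENROUTER_BASE_URL = "https://openrouter.ai/api/v1"

def build_chat_request(prompt: str,
                       model: str = "xiaomi/mimo-v2-pro",
                       max_tokens: int = 32_000) -> dict:
    return {
        "model": model,
        "max_tokens": max_tokens,  # MiMo-V2-Pro's stated output cap
        "messages": [{"role": "user", "content": prompt}],
    }

# Sending it with the official openai SDK is a base-URL swap:
#   from openai import OpenAI
#   client = OpenAI(base_url=OPENROUTER_BASE_URL, api_key="<OPENROUTER_KEY>")
#   client.chat.completions.create(**build_chat_request("Plan the refactor"))
```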
The Broader MiMo-V2 Family: Flash, Omni, and TTS
Hunter Alpha grabbed the headlines, but it was only one piece. Xiaomi’s March 18 announcement revealed a full model family: MiMo-V2-Pro handles text reasoning, MiMo-V2-Omni covers multimodal input, MiMo-V2-Flash targets cost-sensitive coding workloads, and MiMo-V2-TTS generates speech. Four models, four layers of the agent stack, all launched simultaneously.
MiMo-V2-Omni (Healer Alpha)
A second anonymous model called Healer Alpha had been running on OpenRouter alongside Hunter Alpha. Xiaomi confirmed it as MiMo-V2-Omni — a multimodal model supporting text, image, audio, and video inputs with a 262,000-token context window. While Xiaomi has not issued a full technical announcement for MiMo-V2-Omni, the model’s presence during the stealth testing phase suggests it is nearing production readiness.
MiMo-V2-Flash: The Budget Coding Powerhouse
MiMo-V2-Flash deserves separate attention. It ranks as the number-one open-source model globally on both SWE-bench Verified and SWE-bench Multilingual — delivering performance comparable to Claude Sonnet 4.5 at approximately 3.5% of the cost. For teams that need strong coding assistance without the full agent-reasoning stack, Flash represents an extraordinary cost-performance tradeoff.
MiMo-V2-TTS: Voice for the Agent Stack
Rounding out the family, MiMo-V2-TTS handles text-to-speech synthesis. Xiaomi has released fewer technical details about this model, but its inclusion signals an intent to cover the full agent interaction loop — reasoning, vision, and voice — within a single model ecosystem. For agent pipelines that need to communicate results audibly (phone-based assistants, accessibility interfaces, voice-first applications), having TTS from the same provider reduces integration friction and latency.
What Hunter Alpha Means for the AI Industry
Six months ago, nobody expected Xiaomi to ship a frontier-class AI model. The gap between dedicated AI labs and large technology companies building their own foundation models is closing at a pace that has caught even industry insiders off guard. A smartphone company producing a model that nearly matches Claude Opus 4.6 on agent tasks forces a hard question: who else is building quietly?
Accelerating LLM Commoditization in China
According to CTOL Digital Solutions, MiMo-V2-Pro demonstrates that “China’s AI talent density is commoditizing LLMs faster than anyone expected.” The MiMo team’s lineage — Luo Fuli previously worked at DeepSeek — illustrates how researcher mobility between Chinese AI organizations is compressing the development timeline for frontier models. The result is downward pressure on API pricing globally, as providers must compete against models that deliver near-frontier performance at budget-tier costs.
From Chatbots to Agents
MiMo-V2-Pro’s design philosophy signals where the industry is heading. Optimizing for task completion rather than conversational quality reflects a bet that the next wave of AI value creation will come from autonomous agent systems — not better chat interfaces. The stealth launch strategy reinforced this: Hunter Alpha’s trillion-token adoption happened almost entirely through agent framework integrations, not consumer chat usage.
Frequently Asked Questions
What is Hunter Alpha?
Hunter Alpha is the anonymous codename under which Xiaomi deployed its MiMo-V2-Pro AI model on the OpenRouter platform starting March 11, 2026. The model ran without any developer attribution for one week before Xiaomi officially confirmed ownership on March 18, 2026. During that stealth period, it processed over one trillion tokens and topped OpenRouter’s daily usage charts.
Who created Hunter Alpha?
Xiaomi’s MiMo AI model team built Hunter Alpha. The team is led by Luo Fuli, a former DeepSeek researcher. Xiaomi confirmed the connection on March 18, 2026, after a week of widespread speculation that the model belonged to DeepSeek.
How does Hunter Alpha compare to Claude Opus 4.6?
On ClawEval, MiMo-V2-Pro (Hunter Alpha) scores 61.5 compared to Claude Opus 4.6’s 66.3 — a narrow gap. On coding tasks measured by SWE-bench Verified, MiMo-V2-Pro scores 78.0%, surpassing Claude Sonnet 4.6. The critical differentiator is price: MiMo-V2-Pro costs $1/$3 per million input/output tokens versus Claude Opus 4.6’s $15/$75, making it over 90% cheaper while approaching similar agent performance.
What does Hunter Alpha cost to use?
MiMo-V2-Pro is priced at $1 per million input tokens and $3 per million output tokens for contexts up to 256K tokens. For the extended 256K-to-1M context tier, pricing doubles to $2 and $6 respectively. Xiaomi offered one week of free API access at launch through partnerships with OpenClaw, OpenCode, KiloCode, Blackbox, and Cline.
Is Hunter Alpha / MiMo-V2-Pro open source?
Xiaomi has not released MiMo-V2-Pro’s weights as open source. The model is accessible exclusively through API access on OpenRouter and Xiaomi’s first-party platform. However, the related MiMo-V2-Flash model has open-source availability and ranks as the top open-source model globally on SWE-bench benchmarks.
What is Healer Alpha?
Healer Alpha is the anonymous codename for Xiaomi’s MiMo-V2-Omni, a multimodal AI model that supports text, image, audio, and video inputs with a 262,000-token context window. It appeared on OpenRouter alongside Hunter Alpha and was confirmed by Xiaomi on the same day.
What is Hunter Alpha’s context window?
MiMo-V2-Pro supports a context window of 1,048,576 tokens (approximately one million tokens) — one of the largest available among production AI models. The maximum output length is 32,000 tokens per response.
Where Hunter Alpha Goes From Here
Xiaomi’s MiMo-V2-Pro has already proven its market thesis. A week of anonymous deployment, a trillion tokens of real-world usage, and benchmark scores that rival models costing 10-25x more — that is a data point the industry cannot ignore. The Reuters report confirming Xiaomi’s ownership marked not just the end of a guessing game but the beginning of a new competitive dynamic in AI pricing.
The MiMo team has signaled that future development will focus on high-complexity reasoning and long-horizon task planning — the exact capabilities that enterprise agent deployments demand. With MiMo-V2-Flash already dominating open-source coding benchmarks and MiMo-V2-Omni waiting in the wings for multimodal tasks, Xiaomi is building a full-stack agent infrastructure, not a single model. Hunter Alpha’s stealth period proved something no marketing campaign ever could: the model works at scale, under pressure, with no brand reputation cushioning the results.