Chinese AI Model MiniMax-M1: Can It Really Beat Gemini 2.5 Pro at ¼ the Cost?

Recently, China’s AI scene has been buzzing. The name “MiniMax” suddenly shot to the center of discussion after the company announced the open-sourcing of its latest large language model—MiniMax-M1.

Officially, M1 is billed as the “world’s first open-source, large-scale hybrid-architecture reasoning model.” That sounds impressive on its own, but the real eye-catcher is its claimed “ultra-low training cost” and “ultra-long context reasoning capability.”

What is a context window, and why does it matter?

First, let’s explain the key term: context window.

In simple terms, this is the maximum amount of text, measured in tokens (word fragments), that the model can “read” and take into account in a single pass.

For example:

- OpenAI’s GPT-4o can handle about 128,000 tokens (roughly the size of a novel).

- Google’s Gemini 2.5 Pro already processes 1 million tokens, with rumored work on 2 million-token versions.

MiniMax-M1 claims it can also run with a 1 million-token context window, putting it among the world’s best in “long-text understanding.”

Imagine discussing the full details of an entire document collection in one conversation—without needing to split things up or lose context.
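To make the comparison concrete, here is a minimal Python sketch that estimates whether a document fits in each model's window. The ratio of 4 characters per token is a rough rule of thumb for English text and an assumption here, as are the helper names `estimate_tokens` and `models_that_fit`; real counts depend on each model's tokenizer. The window sizes are the figures quoted above.

```python
import math

# Context window sizes quoted in this article, in tokens.
CONTEXT_WINDOWS = {
    "GPT-4o": 128_000,
    "Gemini 2.5 Pro": 1_000_000,
    "MiniMax-M1": 1_000_000,
}

# Assumption: roughly 4 characters per token for English text.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Rough token estimate from character count (not a real tokenizer)."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def models_that_fit(text: str, reserve_for_output: int = 0) -> list[str]:
    """Return the models whose window holds the text plus reserved output tokens."""
    needed = estimate_tokens(text) + reserve_for_output
    return [name for name, window in CONTEXT_WINDOWS.items() if needed <= window]


# A 2,000,000-character document collection, used as a stand-in for real text.
document = "x" * 2_000_000
print(estimate_tokens(document))
print(models_that_fit(document, reserve_for_output=8_000))
```

By this estimate the collection comes to about 500,000 tokens, which would overflow a 128,000-token window but fit in a 1 million-token one.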

Training cost: genuinely cheap

Beyond performance, MiniMax’s biggest selling point is cost.

According to their official report, M1’s reinforcement learning (RL) stage used just 512 NVIDIA H800 GPUs over three weeks, at a rental cost of around $534,700 (about 3.8 million RMB).

For comparison:

- DeepSeek R1 reportedly cost $5–6 million.

- OpenAI’s GPT-4 is rumored to have exceeded $100 million.
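These figures can be sanity-checked with simple arithmetic. The sketch below assumes that "three weeks" means exactly 21 days of round-the-clock use on all 512 GPUs, and it rounds the reported RL cost to $0.54 million. The variable names are illustrative. The comparison figures measure different things (one RL stage versus reported or rumored full training runs), so the ratios are only indicative.

```python
# Figures from the article; the 21-day, 24-hours-a-day usage is an assumption.
NUM_GPUS = 512
TRAINING_DAYS = 21
M1_RL_COST_USD = 0.54e6  # reported RL-stage cost, rounded

gpu_hours = NUM_GPUS * TRAINING_DAYS * 24
implied_rate = M1_RL_COST_USD / gpu_hours  # USD per GPU-hour

# Comparison figures quoted above (reported or rumored, not verified here).
comparisons_usd = {
    "DeepSeek R1 (low)": 5e6,
    "DeepSeek R1 (high)": 6e6,
    "GPT-4 (rumored)": 100e6,
}
cost_ratios = {name: cost / M1_RL_COST_USD for name, cost in comparisons_usd.items()}

print(f"GPU-hours: {gpu_hours:,}")
print(f"Implied rate: ${implied_rate:.2f} per GPU-hour")
for name, ratio in cost_ratios.items():
    print(f"{name}: about {ratio:.0f}x the M1 RL cost")
```

Under these assumptions the run comes to 258,048 GPU-hours, or roughly $2 per GPU-hour.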

MiniMax itself says this cost is “an order of magnitude lower than originally expected.”

They credit a new RL algorithm called CISPO, which they claim is twice as fast as ByteDance’s DAPO algorithm and needs only half the training steps.

In actual text generation tasks, MiniMax-M1 also claims to beat competitors on efficiency:

- For generating 64K tokens, it uses only half the compute of DeepSeek R1.

- For 100K tokens, it uses only a quarter of the competitor’s compute.

So people aren’t just calling this “cheap”—they’re calling it an efficiency revolution. In AI, getting the same or better results with fewer GPUs is a serious competitive edge.

Strong benchmark results too

Of course, you might ask: “Cheap is good, but is it actually useful?”

In its published benchmarks, MiniMax reports that M1 outperformed Google’s Gemini 2.5 Pro on a test called TAU-bench, which is specifically designed to evaluate models on tool use, reasoning, and planning.

Internationally, it has also attracted real attention. On the Artificial Analysis Intelligence Index rankings, it placed second among all open-source models globally.

In other words, this isn’t just some China-only hype model—it’s actually making waves in the global open-source community.

How can you use it?

Even better, MiniMax-M1 is fully open source:

- You can download the weights from Hugging Face, with the accompanying code and documentation on GitHub, to deploy or customize the model yourself.

- Or you can use their web-based chatbot directly at: [https://chat.minimax.io/](https://chat.minimax.io/)

For developers and companies, this means extremely low barriers to entry—you don’t need a tech giant’s budget to get access to a top-tier foundation model.
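For readers who want to try the self-hosted route, the following is a minimal sketch using the `snapshot_download` function from the `huggingface_hub` library. The repository ID `MiniMaxAI/MiniMax-M1-80k` and the helper name `fetch_model_files` are assumptions not confirmed by this article, so check the model page on Hugging Face for the actual ID. Because the full weights of a model this size are likely to be very large, the sketch fetches only the small JSON config files by default.

```python
from pathlib import Path

# Assumption: repository ID on Hugging Face; verify it on the model page.
ASSUMED_REPO_ID = "MiniMaxAI/MiniMax-M1-80k"

# Fetch only small JSON files (configs) unless the caller asks for more.
CONFIG_ONLY_PATTERNS = ["*.json"]


def fetch_model_files(repo_id: str, local_dir: str, patterns=None) -> Path:
    """Download files matching `patterns` from a Hugging Face model repo.

    Pass patterns=["*"] to download everything, including the weights.
    """
    # Imported here so this module loads even without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    path = snapshot_download(
        repo_id=repo_id,
        local_dir=local_dir,
        allow_patterns=patterns or CONFIG_ONLY_PATTERNS,
    )
    return Path(path)
```

Calling `fetch_model_files(ASSUMED_REPO_ID, "./minimax-m1-config")` requires the `huggingface_hub` package and network access. Inspecting the config first is a cheap way to confirm the repository is the one you want before committing to the full download.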

Not just the language model: a whole suite of new products

MiniMax’s “release week” wasn’t just about M1. They also launched:

- Hailuo 02 video generation model
  - 3× the parameters of the previous generation, with 4× the training data
  - Supports native 1080p video generation at a lower cost than peers
  - Ranked #2 globally on the Artificial Analysis Video Arena “Image to Video” leaderboard

- Speech-02 speech model
  - Took first place on both the Artificial Analysis and Hugging Face TTS Arena rankings
  - Effectively a case of a Chinese company leapfrogging established rivals in speech generation

- MiniMax Agent
  - Their own concept of a “reliable human-like” AI agent
  - Capable of multi-step planning, decomposing complex tasks, and delivering expert-level solutions
  - Includes a specialized video-creation Agent that can analyze scripts, plan, and generate full videos automatically

The strategy behind it: cost competition via efficiency

Many analysts see MiniMax’s strategy as quite compelling.

In the past, AI competition often looked like a “money-burning contest”—for example, GPT-4’s rumored training costs were sky-high, effectively locking smaller players out.

MiniMax is trying to break that compute-capital barrier through technical innovation. From architecture design, to a new RL algorithm, to ultra-lean cost control, their approach is all about efficiency competition—being cheaper, faster, and still high-performing.

This approach could force other AI companies to compete on genuine technical innovation, rather than relying on “parameter bragging” or hyped valuations.

Back to the question: Can MiniMax-M1 really beat Gemini 2.5 Pro at ¼ the cost?

When it comes to context length, tool-use benchmarks, and cost-efficiency, it really does seem capable of matching or even outperforming Gemini 2.5 Pro in some critical use cases.

Especially for companies and researchers looking for open-source, low-cost deployment options, it’s an extremely appealing alternative.

Of course, whether it can beat the most expensive and powerful closed-source models in every scenario and benchmark still needs more testing. But there’s no doubt MiniMax-M1 has already claimed a very important spot in the global AI landscape.

Put simply: it’s a rare example of a Chinese AI company waging a “real technology-based price war.” It’s not just about being cheap—it’s about being efficient and capable, making global AI competition a lot more interesting to watch.
