
The large language models (LLMs) war is not only about brain power and sophistication. Speed and efficiency are equally important.
Following the arrival of ChatGPT, the AI landscape was disrupted as tech companies raced to develop ever more capable and intelligent language models. The rapid rise of OpenAI’s conversational AI put immense pressure on competitors to innovate, prompting Elon Musk to take action.
Driven by a desire to contribute to AI development while maintaining control over its ethical deployment, Musk founded xAI and set out to build Grok, which is a series of AI models designed to combine reasoning power with real-world tool integration, accessible to both consumers and enterprises.
The first iteration, Grok-1, was designed to establish a strong foundation for conversational reasoning. Grok-2 improved upon this, refining its contextual understanding and increasing the model’s ability to handle longer conversations while making incremental improvements in accuracy and coherence. Grok-3 marked a significant step forward, achieving much higher reasoning benchmarks and demonstrating the potential of reinforcement learning to optimize intelligence density, while remaining cost-efficient compared to larger models from competitors.
Grok-4 elevated the series to new heights, incorporating advanced reasoning abilities, a larger context window, and integrated tool use for web and X search. It demonstrated performance on par with larger models while keeping token costs under control, proving that a carefully engineered model could deliver frontier-level results without the massive compute requirements of other leading AI systems.
Grok 4 became the benchmark for xAI’s vision of accessible, powerful AI.
Now, xAI announces 'Grok-4 Fast,' which builds on all of these lessons to deliver a truly optimized experience.
Introducing Grok 4 Fast, a multimodal reasoning model with a 2M context window that sets a new standard for cost-efficient intelligence.
Available for free on https://t.co/AnXpIEOhOD, https://t.co/53pltypvkw, iOS and Android apps, and OpenRouter.https://t.co/3YZ1yVwueV— xAI (@xai) September 19, 2025
For starters, Grok-4 Fast preserves the reasoning power of Grok-4.
However, but it does so with exceptional token efficiency, achieving comparable performance using 40% fewer thinking tokens. Its unified architecture merges reasoning and non-reasoning modes, allowing for rapid responses to simple queries and extended reasoning for complex ones.
Through large-scale reinforcement learning, Grok-4 Fast is able to maximize its intelligence density, achieving performance comparable to Grok-4 while using 40% fewer thinking tokens on average. This efficiency translates to a substantial reduction in operational costs, making it 98% cheaper to reach the same benchmark performance.
This model achieves state-of-the-art cost efficiency, using only a 2M token context window.
In other words, Grok-4 Fast is designed to deliver frontier-level performance across both Enterprise and Consumer domains, by combining exceptional token efficiency with high-quality reasoning capabilities.
By pushing the boundaries of smaller and faster AI, Grok 4 Fast makes advanced reasoning more accessible to a broader range of users and developers, enabling high-performance AI applications without the typical costs associated with large-scale models.
This in turn represents the latest advancement in cost-efficient reasoning models.
Results are impressive.
Grok 4 Fast sets a new record on the Pareto Intelligence frontier as reported by @ArtificialAnlys. pic.twitter.com/zPJrmiKu4Y
— xAI (@xai) September 19, 2025
Across multiple reasoning benchmarks, Grok-4 Fast consistently outperforms previous models. On tests like GPQA Diamond, AIME 2025, and HMMT 2025, it demonstrates high accuracy while minimizing token usage. Evaluations show that its intelligence density allows it to deliver maximum performance at minimal cost, with token efficiency significantly higher than Grok 3 Mini and other models in its class.
Independent analyses also confirm that Grok-4 Fast offers a superior price-to-intelligence ratio, making it a leading option for cost-conscious AI deployments.
What's more, Grok-4 Fast also has agentic search abilities, which allow it to browse the web and X, ingesting media, hopping through links, and synthesizing data in real-time. Benchmark results for BrowseComp, SimpleQA, and internal X Browse evaluations highlight its ability to outperform peers in multi-hop search and real-time information retrieval.
We partnered with @lmarena_ai to evaluate Grok 4 Fast on both Search and Text Arena, achieving #1 and #8 respectively. pic.twitter.com/jFmGfJUKpZ
— xAI (@xai) September 19, 2025
In real-world scenarios, Grok 4 Fast demonstrates remarkable reasoning efficiency.
For instance, in gaming inquiries, it can investigate complex questions like the maximum experience points in Path of Exile 2, browsing multiple sources and calculating totals with precision. Its unified architecture enables it to handle both extended reasoning and rapid-response queries with the same model weights, reducing latency and token costs.
This approach ensures an optimal balance of speed and depth for various use cases, from casual questions to high-stakes analysis.
To achieve this feat, the AI model utilizes its native tool-use capabilities to further enhance its versatility.
Trained with reinforcement learning for tool invocation, Grok-4 Fast is able to effectively decide when to employ functions such as code execution or web browsing.
Available through grok.com and on iOS and Android apps, Grok-4 Fast delivers improved performance for search and information-seeking queries in Fast and Auto modes.
Grok 4 Fast is available now for all users in https://t.co/AnXpIEOhOD, https://t.co/53pltypvkw, iOS and Android apps in Fast and Auto modes.
All users, including free users, will have access to our latest model without restrictions, marking a significant step toward…— xAI (@xai) September 19, 2025
Developers can also access it via the xAI API as two distinct models: grok-4-fast-reasoning and grok-4-fast-non-reasoning, each supporting a 2M token context window. This flexibility allows fine-tuning of test-time compute to suit specific applications, making Grok-4 Fast ideal for both personal and professional use.
Continuous updates and enhancements, including multimodal capabilities and agentic features, promise to further expand its capabilities, ensuring that Grok-4 Fast remains at the forefront of efficient, high-performance AI.
With state-of-the-art cost efficiency, cutting-edge web and X search capabilities, and highly optimized intelligence density, Grok-4 Fast represents xAI’s vision of high-performance, accessible AI at a fraction of the cost, bringing advanced reasoning to a wider audience than ever before.
We’d love to hear your feedback on Grok 4 Fast. The team will be incorporating your responses into continuous model upgrades, so please be vocal on X.
If you want to push the limits of reasoning efficiency, please join us! https://t.co/Pn19i7sAOx— xAI (@xai) September 19, 2025