
The global AI race isn't just being run by Western giants like OpenAI, Google, or Meta. China, with its own tech powerhouses, is steadily catching up.
And among its most prominent contenders is Alibaba, the e-commerce titan turned cloud computing and AI juggernaut. Its 'Qwen' (Tongyi Qianwen) family of large language models (LLMs) was developed to compete with the likes of OpenAI's ChatGPT and other Western offerings.
This time, it has released 'Qwen3,' which signals the Chinese tech giant's ambition not just to participate — but to lead.
Qwen3 is the third major version of this initiative. As the successor to Qwen2.5, it arrives with notable upgrades across performance, multilingual capabilities, and open-source availability.
Soon after its release, experts were already calling it yet another breakthrough in China's booming open-source AI space.
Alibaba unveils #Qwen3, the latest in its open-sourced LLM family, featuring 6 dense models & 2 MoE models to power #AI innovation across industries. From mobile to robotics, the future is here.
Blog: https://t.co/dSAKhzcNho
GitHub: https://t.co/vGBlwSvTlO
Hugging Face:… pic.twitter.com/Uh04PrHrls
— Alibaba Group (@AlibabaGroup) April 29, 2025
In a blog post, Alibaba said:
Qwen3 comes with 2 MoE (mixture-of-experts) models: Qwen3-235B-A22B, a large model with 235 billion total parameters and 22 billion activated parameters, and Qwen3-30B-A3B, a smaller MoE model with 30 billion total parameters and 3 billion activated parameters.
The MoE approach, popularized among open LLMs by Mistral's Mixtral, combines several specialist subnetworks ('experts') into one model; for each input, a routing mechanism activates only the experts relevant to the task at hand, so just a fraction of the model's total weights (its parameters) are used at any one time.
Introducing Qwen3!
We release and open-weight Qwen3, our latest large language models, including 2 MoE models and 6 dense models, ranging from 0.6B to 235B. Our flagship model, Qwen3-235B-A22B, achieves competitive results in benchmark evaluations of coding, math, general… pic.twitter.com/JWZkJeHWhC
— Qwen (@Alibaba_Qwen) April 28, 2025
This architecture enables selective activation of model segments during inference, significantly cutting down deployment costs without compromising performance.
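To make the routing idea concrete, here is a minimal, illustrative sketch of top-k expert routing in PyTorch. This is not Alibaba's code: the layer sizes, expert count, and top-k value are invented for the example, and production MoE layers add load balancing and heavily optimized kernels on top of this basic pattern.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy mixture-of-experts layer (illustrative only, not Qwen3's implementation).
class ToyMoELayer(nn.Module):
    def __init__(self, d_model=64, n_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        # Gating network: scores each expert for every token.
        self.router = nn.Linear(d_model, n_experts)
        # Each "expert" is a small feed-forward subnetwork.
        self.experts = nn.ModuleList(
            nn.Sequential(
                nn.Linear(d_model, 4 * d_model),
                nn.GELU(),
                nn.Linear(4 * d_model, d_model),
            )
            for _ in range(n_experts)
        )

    def forward(self, x):  # x: (num_tokens, d_model)
        scores = self.router(x)                         # (num_tokens, n_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)  # keep top-k experts per token
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        # Only the selected experts run, so most parameters stay inactive.
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

layer = ToyMoELayer()
tokens = torch.randn(10, 64)
print(layer(tokens).shape)  # torch.Size([10, 64])
```

This is the sense in which a 235-billion-parameter model can answer with only 22 billion "activated" parameters: the router picks a handful of experts per token, and the rest of the network never runs.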
With Qwen3, Alibaba isn't just refreshing its LLM lineup; it is also debuting in the realm of so-called "hybrid reasoning models," which it says combine traditional LLM capabilities with "advanced, dynamic reasoning."
According to the company, these models are designed to intelligently switch between two modes: a "thinking mode" for solving complex tasks such as coding and a "non-thinking mode" for faster, general-purpose responses.
Qwen3 models are supporting 119 languages and dialects. This extensive multilingual capability opens up new possibilities for international applications, enabling users worldwide to benefit from the power of these models. pic.twitter.com/rwU9GWWP0K
— Qwen (@Alibaba_Qwen) April 28, 2025
With Qwen3, users can engage the more intensive thinking mode via the button marked as such on the Qwen Chat website, or by embedding the /think and /no_think tags in prompts when deploying the model locally or through the API.
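As a rough sketch of how those switches look in code, the snippet below follows the usage pattern Qwen documents in its Qwen3 model cards for Hugging Face Transformers. The enable_thinking flag and the Qwen/Qwen3-4B repo name are taken from those cards but should be verified against the checkpoint you actually deploy.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen3-4B"  # a small dense checkpoint, for illustration
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype="auto", device_map="auto"
)

# Hard switch: the chat template accepts an enable_thinking flag.
# Soft switch: appending /think or /no_think to a user message
# toggles the behavior per turn instead.
messages = [{"role": "user", "content": "How many prime numbers are below 100?"}]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True,  # set False for fast, non-thinking replies
)

inputs = tokenizer([text], return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=1024)
print(tokenizer.decode(
    output[0][inputs.input_ids.shape[-1]:], skip_special_tokens=True
))
```

In thinking mode, the model emits its intermediate reasoning inside <think>...</think> tags before the final answer; in non-thinking mode it responds directly.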
"Notably, the Qwen3-235B-A22B MoE model significantly lowers deployment costs compared to other state-of-the-art models, reinforcing Alibaba's commitment to accessible, high-performance AI," Alibaba said.
The flagship, Qwen3-235B-A22B, achieves competitive results in benchmark evaluations of coding, math, general capabilities, etc., when compared to other top-tier models such as DeepSeek's DeepSeek-R1, OpenAI o1, OpenAI o3-mini, xAI Grok-3, and Google Gemini-2.5-Pro.
Overall, the benchmark data positions Qwen3-235B-A22B as one of the most powerful publicly available models, achieving parity or superiority relative to major industry offerings.
Additionally, the small MoE model, Qwen3-30B-A3B, outcompetes QwQ-32B despite activating only about a tenth as many parameters, and even a tiny model like Qwen3-4B can rival the performance of Qwen2.5-72B-Instruct.
We have optimized the Qwen3 models for coding and agentic capabilities, and also we have strengthened the support of MCP as well. Below we provide examples to show how Qwen3 thinks and interacts with the environment. pic.twitter.com/7xFyJPp48g
— Qwen (@Alibaba_Qwen) April 28, 2025
Qwen3 also comes in 6 dense models, all open-weighted under the Apache 2.0 license: Qwen3-32B, Qwen3-14B, Qwen3-8B, Qwen3-4B, Qwen3-1.7B, and Qwen3-0.6B, ranging from 0.6 billion to 32 billion parameters.
These models vary in size and architecture, offering users options to fit diverse needs and computational budgets.
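Because the dense checkpoints share a single interface, moving between sizes is a one-line change. The sketch below assumes the Hugging Face repo IDs follow the naming above and simply loads the smallest model; confirm the exact IDs on Hugging Face before relying on them.

```python
from transformers import pipeline

# Assumed repo IDs, matching the naming pattern of the Qwen3 release.
DENSE_CHECKPOINTS = [
    "Qwen/Qwen3-0.6B", "Qwen/Qwen3-1.7B", "Qwen/Qwen3-4B",
    "Qwen/Qwen3-8B", "Qwen/Qwen3-14B", "Qwen/Qwen3-32B",
]

# Smallest model: suited to edge or mobile-class compute budgets.
generator = pipeline("text-generation", model=DENSE_CHECKPOINTS[0])
print(generator("Qwen3 is", max_new_tokens=20)[0]["generated_text"])
```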
Introducing Qwen3 and its variants is a clear signal that Alibaba plans on making high-performance AI more accessible, both in China and globally.
According to Alibaba, Qwen LLMs have become some of the world's most widely adopted open-source AI model series, attracting more than 300 million downloads worldwide and spawning more than 100,000 derivative models on Hugging Face.
"From mobile to robotics, the future is here," the team at Qwen said.