Alibaba Unveils Qwen3-Omni: A Native End-to-End Multimodal Foundation Model

2025-09-22

Alibaba has released Qwen3-Omni, a natively end-to-end, multilingual omni-modal foundation model. It processes text, images, audio, and video in real time and streams its responses as text or natural speech. Qwen3-Omni achieves state-of-the-art results on numerous benchmarks, supports multiple languages, and features a novel mixture-of-experts (MoE) architecture with flexible control. The model is open-sourced along with its toolkits, cookbooks, and demos, giving developers extensive resources to build on.

AI