Lumina-DiMOO: A Revolutionary Open-Source Multimodal Diffusion Model
2025-09-12
Lumina-DiMOO is an open-source foundation model for unified multimodal generation and understanding. Unlike earlier unified models, it applies fully discrete diffusion modeling to every input and output modality, which is reported to give higher sampling efficiency than autoregressive or hybrid models. It supports text-to-image generation, image-to-image generation (including editing, subject-driven generation, and inpainting), and image understanding, and is reported to achieve state-of-the-art results on multiple benchmarks. The code and checkpoints are publicly available to support further research on multimodal and discrete diffusion modeling.
AI