
The AI industry is getting more and more crowded, as more entities enter the field and race towards supremacy.
Following OpenAI's unveiling of ChatGPT, tech companies large and small have all begun competing in an arms race to develop increasingly powerful generative AI models. AI startup Black Forest Labs is just one of the many.
The company has announced the launch of its first suite of text-to-image AI models, called FLUX.1.
The Germany-based company, founded by researchers who developed the technology behind Stable Diffusion and invented the latent diffusion technique, aims to create advanced generative AI for images and videos.
The launch of FLUX.1 came about seven weeks after Stability AI's troubled release of Stable Diffusion 3 Medium in mid-June.
That release drew widespread criticism, with many users complaining about its poor performance, particularly its frequent failure to render human anatomy correctly.
That problematic launch followed the earlier departure of three key engineers from Stability AI (Robin Rombach, Andreas Blattmann, and Dominik Lorenz), who went on to found Black Forest Labs along with latent diffusion co-developer Patrick Esser and others.
At Black Forest Labs, the developers learned from their previous mistakes and teamed up to create three FLUX.1 text-to-image models.
They include the high-end commercial "pro" version, a mid-range "dev" version with open weights for non-commercial use, and a faster open-weights "schnell" version ("schnell" means quick or fast in German).
Black Forest Labs claims that its models outperform existing options like OpenAI's DALL·E 3 in areas such as image quality and adherence to text prompts.
To do this, Black Forest Labs created the FLUX.1 AI models using what the company calls a "hybrid architecture," which combines transformer and diffusion techniques, scaled up to 12 billion parameters.
Black Forest Labs said that FLUX.1 improves on previous diffusion models by incorporating flow matching and other optimizations.
As a result, FLUX.1 is better at generating human hands and limbs, which were a weak spot in earlier versions of Stable Diffusion due to a lack of training images that focused on hands.
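The announcement does not describe how flow matching is implemented in FLUX.1, so the following is only a generic, minimal sketch of the idea in its simplest form: a straight-line path from noise to data, where a model is trained to predict the constant velocity along that path. It is not the FLUX.1 training code. All function names (`interpolate`, `target_velocity`, `flow_matching_loss`, `euler_sample`) are hypothetical and chosen for illustration, and the "model" is any callable that takes a point and a time.

```python
import random

def interpolate(x0, x1, t):
    """Point on the straight path from noise x0 (t=0) to data x1 (t=1)."""
    return [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]

def target_velocity(x0, x1):
    """Velocity of the straight path; it does not depend on t."""
    return [b - a for a, b in zip(x0, x1)]

def flow_matching_loss(predict_velocity, x1, rng):
    """Single-sample squared error between predicted and target velocity.

    predict_velocity is a hypothetical stand-in for a neural network:
    a callable (x_t, t) -> velocity estimate.
    """
    x0 = [rng.gauss(0.0, 1.0) for _ in x1]   # draw a noise sample
    t = rng.random()                         # draw a time in [0, 1)
    xt = interpolate(x0, x1, t)
    v_pred = predict_velocity(xt, t)
    v_true = target_velocity(x0, x1)
    return sum((p - q) ** 2 for p, q in zip(v_pred, v_true)) / len(x1)

def euler_sample(predict_velocity, x0, steps=10):
    """Generate a sample by following the predicted velocity from t=0 to t=1."""
    x = list(x0)
    dt = 1.0 / steps
    for i in range(steps):
        t = i * dt
        v = predict_velocity(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x
```

In a real system the velocity predictor would be a large network conditioned on the text prompt, and the data points would be image latents rather than short lists of numbers. The sketch only shows the shape of the objective and of the sampling loop.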

"We believe that generative AI will be a fundamental building block of all future technologies," the company stated in its announcement. "By making our models available to a wide audience, we want to bring its benefits to everyone, educate the public and enhance trust in the safety of these models."
After capability, the team aimed for speed.
Here, Black Forest Labs utilizes Runware AI's platform.
Using Runware's custom-designed Sonic Inference Engine, FLUX.1 can be served with almost no delay.
According to Runware's documentation page, FLUX.1 leverages the Sonic Inference Engine to deliver high-quality media at sub-second speeds.
"We have built this unique platform from scratch, hosted on our own infrastructure that is powered by green energy. We have also optimized the Stable Diffusion stack from the OS level upwards, achieving exceptional speeds and cost efficiency that we pass on to you," said Runware.
While text-to-image generation is Black Forest Labs's current focus, the company plans to expand into video generation next.
The company said that FLUX.1 will soon serve as the foundation of a new text-to-video model in development, which will compete with OpenAI's Sora, Runway's Gen-3 Alpha, and Kuaishou's Kling in the race to create realistic media on demand.
"Our video models will unlock precise creation and editing at high definition and unprecedented speed," the Black Forest announcement claims.
It's worth noting, though, that while FLUX.1 may be better than Stable Diffusion at generating hands and limbs, it's still not perfect.