Meta Introduces 'LLaMA 3.1' To Show That 'Open Source Is Leading The Way'

Thanks to the viral success of OpenAI's talkative chatbot, generative AI has become an arms race.

The AI field used to be dull and quiet. The buzz it created mostly stayed within the field and rarely reached far beyond its own audience. But when OpenAI introduced ChatGPT as an AI chatbot tool, the internet was quickly captivated.

Meta responded with LLaMA, and later, with LLaMA 2.

Following that, Meta released LLaMA 3, which it said was trained with a "large, high-quality training dataset" of over 15 trillion tokens, 7x larger than the one used for LLaMA 2 and containing 4x more code.

This time, Meta has announced LLaMA 3.1.

Starting today, open source is leading the way. Introducing Llama 3.1: Our most capable models yet.

Today we’re releasing a collection of new Llama 3.1 models including our long awaited 405B. These models deliver improved reasoning capabilities, a larger 128K token context… pic.twitter.com/1iKpBJuReD
— AI at Meta (@AIatMeta) July 23, 2024

In a blog post, Meta said that it's committed to creating openly accessible AI, because the company believes that open source can bring open intelligence to everyone.

LLaMA 3.1 405B, with its 405 billion parameters, is a milestone in its own right, considering that LLaMA 3.1 is an open-source AI model.

"LLaMA 3.1 405B is in a class of its own, with unmatched flexibility, control, and state-of-the-art capabilities that rival the best closed source models," Meta said.

"Our new model will enable the community to unlock new workflows, such as synthetic data generation and model distillation."

In the blog post, Meta said its experimental evaluation suggests that the flagship model is competitive with leading foundation models, including GPT-4, GPT-4o, and Claude 3.5 Sonnet, across a range of tasks.

"Additionally, our smaller models are competitive with closed and open models that have a similar number of parameters," Meta said.

With Llama 3.1, we evaluated performance on >150 benchmark datasets spanning a wide range of languages — in addition to extensive human evaluations in real-world scenarios. These results show that the 405B competes with leading closed source models like GPT-4, Claude 2 and Gemini… pic.twitter.com/bc0lSNJfQo
— AI at Meta (@AIatMeta) July 23, 2024

The open source initiative is in line with CEO Mark Zuckerberg's vision.

Zuckerberg published an open letter detailing why open source is good for developers, good for Meta, and good for the world.

"With past LLaMA models, Meta developed them for ourselves and then released them, but didn’t focus much on building a broader ecosystem. We’re taking a different approach with this release. We’re building teams internally to enable as many developers and partners as possible to use Llama, and we’re actively building partnerships so that more companies in the ecosystem can offer unique functionality to their customers as well," the CEO said.

"I believe the Llama 3.1 release will be an inflection point in the industry where most developers begin to primarily use open source, and I expect that approach to only grow from here. I hope you’ll join us on this journey to bring the benefits of AI to everyone in the world."

As Mark Zuckerberg shared in an open letter this morning: we believe that open source will ensure that more people around the world have access to the benefits and opportunities of AI, that power isn't concentrated in the hands of a small few, and that the technology can be…
— AI at Meta (@AIatMeta) July 23, 2024

LLaMA 3.1 405B is Meta's first openly available model developed to directly rival the top AI models in state-of-the-art capabilities such as general knowledge, steerability, math, tool use, and multilingual translation.

With the release of the 405B model, Meta is poised to supercharge innovation. The company believes the model can ignite new applications and modeling paradigms, including synthetic data generation to improve and train smaller models, as well as model distillation.

According to Meta, this is a capability that "has never been achieved at this scale in open source."

Besides the 405B model, Meta is also introducing upgraded versions of the 8B and 70B models, which have their context length expanded to 128K tokens. The models also support eight languages.

"This enables our latest models to support advanced use cases, such as long-form text summarization, multilingual conversational agents, and coding assistants," said Meta.

Following the announcement, Meta is making these models available to the community for download on llama.meta.com and Hugging Face, and for immediate development on its broad ecosystem of partner platforms.

Published: 25/07/2024
