Nvidia Quietly Announces 'Nemotron', A LLaMA-Based AI That Surpasses OpenAI GPT-4

LLaMA, Nvidia

During the boom of Large Language Models, tech companies are racing to create better and better AI models.

In the frenzy that followed OpenAI's unveiling of ChatGPT, many have entered the fierce battle, where they compete for supremacy. They develop products as fast as they can, trying to capitalize on the trend.

And Nvidia is apparently one of the competitors.

But unlike most others, the GPU-maker is a shy one.

Back in June, the company released the Nemotron-4 340B model, which lets developers generate rich synthetic data. This time, the company reveals the Nemotron 70B model.

At this time, the market is saturated.

There are plenty of LLMs that can do pretty much the same thing. And as a result, Nvidia is competing in a fierce battle, where many are already neck-and-neck.

But here, Nvidia knows that current generative AI models face challenges related to robustness, accuracy, efficiency, cost, and handling nuanced human-like responses. Despite AI models becoming increasingly capable, more scalable and efficient solutions that can deliver precise outputs while remaining practical for diverse AI applications are still required.

To compete, Nvidia, regarded as one of the biggest players in the field in terms of influence in the tech world, is moving fast.

Here, Nvidia introduces Nemotron 70B as a model built to set a new benchmark in the realm of LLMs.

Nemotron 70B was fine-tuned from the LLaMA 3.1 70B model using reinforcement learning from human feedback (RLHF), with a focus on integrating state-of-the-art architectural improvements to outperform competitors in processing speed, training efficiency, and output accuracy.

The method is a reward-based training that utilizes the REINFORCE algorithm, a policy-gradient approach that updates the model's parameters based on feedback from human evaluators.

This method allows the model to learn from its mistakes and improve over time by maximizing the expected reward from its outputs.
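
To get a feel for what that means in practice, below is a minimal toy sketch of a REINFORCE-style policy-gradient update in PyTorch. The ToyPolicy network and toy_reward function are illustrative stand-ins, not Nvidia's actual training pipeline or reward model; the point is simply to show how the log-probabilities of sampled outputs are weighted by a reward signal when the policy is updated.

```python
# Minimal, illustrative REINFORCE sketch. ToyPolicy and toy_reward are
# hypothetical stand-ins, not Nvidia's training code or reward model.
import torch
import torch.nn as nn
from torch.distributions import Categorical

VOCAB = 16     # toy vocabulary size
SEQ_LEN = 8    # length of each sampled "response"

class ToyPolicy(nn.Module):
    """A tiny autoregressive policy that emits logits over a toy vocabulary."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, 32)
        self.rnn = nn.GRU(32, 32, batch_first=True)
        self.head = nn.Linear(32, VOCAB)

    def forward(self, tokens):
        h, _ = self.rnn(self.embed(tokens))
        return self.head(h)  # (batch, seq, vocab) logits

def toy_reward(samples):
    """Stand-in for a learned reward model: rewards sequences with many even tokens."""
    return (samples % 2 == 0).float().mean(dim=1)

policy = ToyPolicy()
opt = torch.optim.Adam(policy.parameters(), lr=1e-3)

for step in range(100):
    # 1. Sample responses from the current policy, token by token.
    tokens = torch.zeros(32, 1, dtype=torch.long)   # start token for a batch of 32
    log_probs = []
    for _ in range(SEQ_LEN):
        logits = policy(tokens)[:, -1]               # next-token logits
        dist = Categorical(logits=logits)
        next_tok = dist.sample()
        log_probs.append(dist.log_prob(next_tok))
        tokens = torch.cat([tokens, next_tok.unsqueeze(1)], dim=1)

    # 2. Score each full response with the (stand-in) reward function.
    rewards = toy_reward(tokens[:, 1:])

    # 3. REINFORCE: scale summed log-probs by the reward advantage
    #    (reward minus batch mean, a common variance-reduction baseline).
    advantage = rewards - rewards.mean()
    loss = -(torch.stack(log_probs, dim=1).sum(dim=1) * advantage).mean()

    opt.zero_grad()
    loss.backward()
    opt.step()
```

In an actual RLHF setup, the stand-in reward function would be replaced by a learned reward model trained on human preference data, so the policy is steered toward responses that human evaluators rate highly.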

Developed from the Meta LLaMA 3.1 family, the AI is relatively small compared to the family's flagship 405B model, but Nvidia tweaked it to be on par with, and at some points even surpass, OpenAI's GPT-4 and GPT-4o, as well as Anthropic's Claude 3.5 Sonnet.

And to compete in a realm where developers are increasingly guarding their secrets, Nvidia released Nemotron 70B as an open-source project, meant to give other developers and enterprises access to increasingly capable AI abilities.

Read: Meta Introduces 'LLaMA 3.1' To Show That 'Open Source Is Leading The Way'

According to Nvidia on a dedicated web page, Nemotron 70B is designed "to improve the helpfulness of LLM generated responses to user queries."

On top of that, Nvidia also leveraged enhanced multi-query attention and an optimized transformer design that ensures faster computation without compromising accuracy. Compared to earlier models, the LLaMA 3.1 iteration features more advanced learning mechanisms, allowing Nemotron 70B to achieve improved results with fewer resources.
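
For context, the sketch below is a minimal, assumed PyTorch illustration of the general idea behind multi-query attention, not Nvidia's implementation: all query heads share a single key/value projection, which shrinks the key/value cache and speeds up generation without changing the attention math.

```python
# Illustrative multi-query attention sketch; class name and dimensions are
# assumptions for demonstration, not Nvidia's or Meta's actual code.
import torch
import torch.nn as nn

class MultiQueryAttention(nn.Module):
    """Multi-query attention: many query heads, one shared key/value head."""
    def __init__(self, d_model=512, n_heads=8):
        super().__init__()
        self.n_heads = n_heads
        self.d_head = d_model // n_heads
        self.q_proj = nn.Linear(d_model, d_model)            # per-head queries
        self.kv_proj = nn.Linear(d_model, 2 * self.d_head)   # single shared K and V
        self.out_proj = nn.Linear(d_model, d_model)

    def forward(self, x):
        b, t, _ = x.shape
        q = self.q_proj(x).view(b, t, self.n_heads, self.d_head).transpose(1, 2)
        k, v = self.kv_proj(x).split(self.d_head, dim=-1)
        k, v = k.unsqueeze(1), v.unsqueeze(1)                 # (b, 1, t, d_head), broadcast over heads
        scores = (q @ k.transpose(-2, -1)) / self.d_head ** 0.5
        causal = torch.triu(torch.ones(t, t, dtype=torch.bool, device=x.device), 1)
        attn = scores.masked_fill(causal, float("-inf")).softmax(dim=-1) @ v
        return self.out_proj(attn.transpose(1, 2).reshape(b, t, -1))

x = torch.randn(2, 16, 512)
print(MultiQueryAttention()(x).shape)  # torch.Size([2, 16, 512])
```

Grouped-query attention, the variant used by the LLaMA family, sits between this and full multi-head attention: a small number of key/value heads is shared among groups of query heads, trading a little memory for extra quality.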

Not to mention, Nvidia, which also utilized its vast resources to make this model happen, shows that it can push the LLaMA 3.1 70B model beyond its initial limits.

Because the base model lends itself well to fine-tuning, Nvidia has been able to customize Nemotron 70B extensively, making it highly versatile.

This has allowed the creation of an AI version that is not only more powerful but also more “useful” from a practical point of view.

Published: 
18/10/2024