Background

With 'TRIBE v2,' Meta Wants To Better Understand The Human Brain In Order To Create Smarter AIs Of The Future

Meta TRIBE v2

The line between understanding humans and building better machines has always been blurry.

For decades, AI has improved by learning patterns from massive datasets (text, images, behavior logs) without truly "thinking" the way humans do. It recognizes, predicts, and generates, but it doesn't inherently understand context the way a person does. That's where brain-inspired models come in. By studying how humans interpret meaning, react emotionally, and make decisions, researchers hope to build systems that come closer to that kind of understanding.

Meta announced TRIBE v2 to help make that happen.

At its core, TRIBE v2 is designed to model how the human brain responds to the world, like how humans process images, language, emotions, and experiences.

On the surface, that sounds like a neuroscience breakthrough, a tool for decoding the complexity of human thought. If an AI can approximate how the human brain responds to something, it can deliver outputs that feel more intuitive, more natural, and ultimately more useful.

But look a little closer, and it becomes clear that this kind of system is just as much about the future of artificial intelligence as it is about understanding ourselves.

According to the research paper:

"Cognitive neuroscience is fragmented into specialized models, each tailored to specific experimental paradigms, hence preventing a unified model of cognition in the human brain. Here, we introduce TRIBE v2, a tri-modal (video, audio and language) foundation model capable of predicting human brain activity in a variety of naturalistic and experimental conditions."

"Leveraging a unified dataset of over 1,000 hours of fMRI across 720 subjects, we demonstrate that our model accurately predicts high-resolution brain responses for novel stimuli, tasks and subjects, superseding traditional linear encoding models, delivering several-fold improvements in accuracy."

Mimicking the human brain isn't about copying biology neuron for neuron.

It's more about capturing the principles behind how we learn and respond. Humans are incredibly efficient learners: we can generalize from very little data, adapt quickly to new environments, and interpret nuance in ways machines still struggle with. Traditional AI often requires enormous amounts of data to achieve what a human can do with just a few examples.

TRIBE v2 models brain activity using a three-step process:

  1. In the tri-modal encoding stage, it leverages pretrained audio, video, and text embeddings to capture patterns that are shared between AI systems and human neural responses.
  2. Next comes universal integration, where these embeddings are fed into a transformer that learns generalized representations across different types of stimuli, tasks, and individuals.
  3. Finally, in the brain mapping stage, a subject-specific layer translates those representations into individual fMRI voxels (three-dimensional units that reflect brain activity through changes in blood flow and oxygen levels).
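
The three steps above can be sketched in a few lines of plain Python. This is only an illustrative toy, not Meta's implementation: random vectors stand in for the pretrained video, audio, and text embeddings, a single self-attention layer stands in for the transformer, and every name here (encode_trimodal, integrate, map_to_voxels, the subject IDs, and all sizes) is a hypothetical chosen for the example.

```python
import math
import random

def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def random_matrix(rows, cols, rng):
    """Random weights; a real model would learn these from fMRI data."""
    scale = 1.0 / math.sqrt(cols)
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]

def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def encode_trimodal(video, audio, text):
    """Step 1: join the per-timestep embeddings of the three modalities."""
    return [v + a + t for v, a, t in zip(video, audio, text)]

def integrate(tokens, Wq, Wk, Wv):
    """Step 2: one self-attention layer, shared across all subjects."""
    Q = [matvec(Wq, x) for x in tokens]
    K = [matvec(Wk, x) for x in tokens]
    V = [matvec(Wv, x) for x in tokens]
    d = len(Q[0])
    hidden = []
    for q in Q:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in K]
        weights = softmax(scores)
        hidden.append([sum(w * v[j] for w, v in zip(weights, V)) for j in range(d)])
    return hidden

def map_to_voxels(hidden, subject_head):
    """Step 3: subject-specific linear layer, one output per voxel."""
    return [matvec(subject_head, h) for h in hidden]

def predict_response(video, audio, text, shared, subject_head):
    tokens = encode_trimodal(video, audio, text)
    hidden = integrate(tokens, *shared)
    return map_to_voxels(hidden, subject_head)

# Toy sizes (assumptions): 4 timesteps, 3 features per modality,
# 5 hidden units, 8 voxels.
T, D_MOD, D_HID, N_VOX = 4, 3, 5, 8
rng = random.Random(0)

def fake_embeddings():
    # Placeholder for the output of a pretrained encoder.
    return [[rng.random() for _ in range(D_MOD)] for _ in range(T)]

video, audio, text = fake_embeddings(), fake_embeddings(), fake_embeddings()
shared = tuple(random_matrix(D_HID, 3 * D_MOD, rng) for _ in range(3))
heads = {
    "sub-01": random_matrix(N_VOX, D_HID, rng),
    "sub-02": random_matrix(N_VOX, D_HID, rng),
}

pred_a = predict_response(video, audio, text, shared, heads["sub-01"])
pred_b = predict_response(video, audio, text, shared, heads["sub-02"])
print(len(pred_a), "timesteps x", len(pred_a[0]), "voxels")
```

The design point the sketch tries to show is the split between shared and per-subject parameters: both subjects pass through the same integration weights, and only the final mapping layer differs, which is why the same stimulus yields different voxel predictions for each person.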

By modeling human cognition, developers hope to build systems that learn faster, adapt better, and make more context-aware decisions.

Then, there's also the question of alignment.

One of the biggest challenges in AI today isn't just making systems more powerful, but making sure they behave in ways that make sense to people.

Misalignment happens when an AI technically completes a task but does so in a way that feels wrong, confusing, or even harmful from a human perspective. If an AI understands how humans are likely to perceive or react to its outputs, it can adjust accordingly. In that sense, understanding the human brain becomes a shortcut to making AI more trustworthy and usable.

At the same time, this approach reveals something deeper about the direction of technology.

The LLM war is no longer just about building increasingly powerful tools that operate alongside us. Instead, the players are also building systems that are "agentic" and increasingly shaped by us.

Human behavior, preferences, and even vulnerabilities are becoming training data.

TRIBE v2 represents a shift toward AI that doesn't just process information, but actively models the people using it. That creates opportunities for more personalized, responsive technology, but it also raises important questions about privacy and how deeply our inner lives should be mapped and replicated.

There’s a certain irony in all of this.

In trying to build smarter machines, we’re forced to confront how little we fully understand about ourselves. The human brain is still one of the most complex systems known, and every attempt to model it reveals new layers of complexity. Yet that challenge is exactly what makes this direction so compelling.

Each step toward mimicking human cognition doesn’t just improve AI—it also feeds back into science, offering new ways to study perception, behavior, and consciousness.

Ultimately, TRIBE v2 sits at the intersection of two goals that are no longer separate: understanding humans and improving AI. One feeds the other. The more accurately we can model how people think and feel, the closer AI gets to behaving in ways that align with real human experience.

Published: 30/03/2026