When Chatbots Meet Kids: OpenAI And Anthropic Want To Predict When Users Are Underage

Major AI companies are stepping up efforts to safeguard younger users amid growing concerns about the mental health impacts of chatbots.

Both OpenAI and Anthropic have begun moving toward systems that can automatically estimate a user's age, aiming to tailor responses, restrict certain features, or block access altogether for underage users. The push comes as researchers, parents, and regulators warn that highly conversational AI tools may encourage emotional dependency, blur boundaries between humans and machines, or expose children to content they are not equipped to process.

These age-detection efforts are part of a broader shift toward "safety-by-design" in consumer AI.

Rather than relying solely on self-reported ages, which are easy to falsify, these two companies are exploring signals such as language patterns, usage behavior, and interaction history to infer whether a user may be a minor.

When a system flags a younger user, it can respond more cautiously, avoid sensitive topics, and discourage emotional reliance on the chatbot.

OpenAI announced an update to its Model Spec, the guidelines shaping how ChatGPT behaves, to incorporate four new principles specifically for users aged 13 to 17.

These Under-18 (U18) Principles aim to ensure a safer, more age-appropriate experience. Anthropic, for its part, has joined the Family Online Safety Institute to advance broader industry standards for child safety.

The first of OpenAI's U18 principles reads: "Put teen safety first, even when it may conflict with other goals."

In practice, this means OpenAI is prioritizing protection over objectives like maximum intellectual freedom when conflicts arise.

Another directs ChatGPT to "promote real-world support" by encouraging offline relationships and trusted resources. The guidelines also emphasize treating teens like teens, offering "warmth and respect" rather than condescending tones or adult-level responses. Finally, the model should "be transparent by setting clear expectations" about interactions.

These updates build on existing safeguards, such as content restrictions around self-harm, romantic roleplay, and explicit material, and they respond to heightened scrutiny from lawmakers and ongoing legal challenges.

OpenAI currently faces a lawsuit alleging that ChatGPT provided instructions for self-harm and suicide to a teen who took his own life.

The company has denied liability, citing "misuse" of ChatGPT, and has rolled out parental controls while stating that ChatGPT will no longer talk about suicide with teens. In response to such pressures, OpenAI is in the early stages of deploying an age prediction model that analyzes conversational signals and usage patterns to estimate if a user might be under 18.

If the model estimates that a user is under 18, teen safeguards activate automatically, and the system defaults to the protective mode when its confidence is low. Adults who are mistakenly flagged can verify their age, often through secure third-party processes like ID submission.
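
The decision flow described above can be illustrated with a minimal Python sketch. This is not OpenAI's implementation, which has not been published in this form: the names `AgeEstimate` and `choose_experience`, the two threshold values, and the "teen"/"adult" labels are all hypothetical, chosen only to show how "default to protective when unsure" and "verified adults are exempt" might fit together.

```python
from dataclasses import dataclass

# Hypothetical thresholds, chosen only for illustration.
UNDER_18_THRESHOLD = 0.5  # estimated probability above which a user is treated as a minor
MIN_CONFIDENCE = 0.8      # below this, the estimate is considered too uncertain to trust


@dataclass
class AgeEstimate:
    prob_under_18: float          # assumed output of an age prediction model
    confidence: float             # assumed measure of how reliable that output is
    verified_adult: bool = False  # set after a successful age check, e.g. ID submission


def choose_experience(est: AgeEstimate) -> str:
    """Return "teen" (safeguards on) or "adult" for a given estimate."""
    # A completed age verification overrides the model's guess.
    if est.verified_adult:
        return "adult"
    # When the system is unsure, fall back to the protective mode.
    if est.confidence < MIN_CONFIDENCE:
        return "teen"
    # Otherwise follow the model's estimate.
    return "teen" if est.prob_under_18 >= UNDER_18_THRESHOLD else "adult"
```

Under these assumptions, an unverified user with a low-confidence estimate lands in the teen experience even if the model leans adult, which is the trade-off the article describes: some adults are inconvenienced so that fewer minors slip through.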

[Image, credit Anthropic: A simulated prompt and response that causes the crisis banner to appear. Such prompts often involve direct or implied expressions of suicidal ideation or self-harm, which trigger the AI's built-in classifiers in real time.]

Meanwhile, Anthropic, the maker of Claude, maintains a strict 18+ policy and prohibits users under 18 from accessing the chatbot.

The company is developing a new classifier to detect "subtle conversational signs that a user might be underage," building on its current system that flags and disables accounts when users self-identify as minors.

In addition to age detection, Anthropic details its approach to sensitive topics like suicide and self-harm.

Claude uses classifiers to scan conversations for risk indicators, directing users to professional resources through partnerships with networks covering over 170 countries. The company trains its models with system prompts, reinforcement learning, and expert input to respond empathetically while emphasizing AI limitations and urging human support.

Anthropic reports progress in reducing sycophancy, the tendency to overly affirm user views, even harmful ones, which can exacerbate issues. Its latest models show significant improvements, with Haiku 4.5 correcting sycophantic behavior 37% of the time in prefill stress-tests.

"On face value, this evaluation shows there is significant room for improvement for all of our models," Anthropic says. "We think the results reflect a trade-off between model warmth or friendliness on the one hand, and sycophancy on the other."

These initiatives reflect a broader industry shift toward proactive teen protections, driven by regulatory pressure, tragic incidents, and the recognition that AI interactions can profoundly influence young minds. While challenges like accuracy in age prediction and balancing safety with usability remain, both OpenAI and Anthropic emphasize ongoing refinement through expert collaboration and iterative improvements.

At the same time, the approach raises difficult questions about privacy and accuracy.

Inferring age from behavior is imperfect, and critics worry about false positives, data misuse, or the normalization of deeper user profiling. AI companies insist they are balancing these risks carefully, emphasizing that age estimation systems are designed to minimize data collection and prioritize user safety.

As chatbots become more integrated into daily life, used for homework help, emotional support, and casual conversation, the challenge of protecting younger users is becoming unavoidable.

For OpenAI, Anthropic, and their peers, age-aware AI may be an early test of whether the industry can meaningfully address harm before regulation forces its hand.

Published: 06/01/2026