From Distorted Text To Hand Gestures, Google Tests New Kind Of reCAPTCHA Method

CAPTCHA has evolved significantly over the years, and Google is exploring what its next generation might look like through its reCAPTCHA initiative.

The company has introduced a new verification method within its reCAPTCHA system called hand gesture verification. This approach is being tested and made available as part of reCAPTCHA Enterprise and Google Cloud Fraud Defense tools for websites seeking to protect against automated abuse. It activates when the system identifies a need for stronger confirmation that the visitor is a live human rather than a bot or scripted program.

The process starts when a protected action on a website triggers a higher-risk assessment.

The browser then displays a prompt requesting permission to use the device's camera.

Once the user grants this permission, on-screen instructions appear guiding the person to perform one or more simple hand gestures or movements directly in front of the camera.

These gestures are typically short and straightforward, such as waving or holding the hand in specific positions.

A brief video clip of the hand performing the requested actions is captured.

The system processes this video to detect and track the hand using machine learning models.

Specifically, it identifies 21 key landmark points on the hand, corresponding to the positions of joints and knuckles. These coordinates allow the system to analyze the sequence, timing, and natural dynamics of the movements. The verification succeeds when the detected landmarks confirm that the gestures match the instructions and exhibit characteristics consistent with real-time, live human motion.
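
As a rough illustration of how a sequence of landmark coordinates could be checked against an instructed gesture, the sketch below tests whether a hand moved side to side like a wave. It is not Google's implementation: the only detail taken from the description above is the count of 21 landmarks per frame. The normalized coordinates, the function names, the jitter threshold, and the minimum number of reversals are all assumptions made for the example.

```python
from typing import List, Tuple

Point = Tuple[float, float]   # assumed: normalized (x, y) in the range [0, 1]
Frame = List[Point]           # one video frame = 21 landmark points

NUM_LANDMARKS = 21            # count from the article; everything else is assumed


def make_frame(center_x: float, center_y: float = 0.5) -> Frame:
    """Build a synthetic frame with all landmarks at one position (test data only)."""
    return [(center_x, center_y)] * NUM_LANDMARKS


def hand_center_x(frame: Frame) -> float:
    """Return the average x position of the landmarks in one frame."""
    if len(frame) != NUM_LANDMARKS:
        raise ValueError(f"expected {NUM_LANDMARKS} landmarks, got {len(frame)}")
    return sum(x for x, _ in frame) / NUM_LANDMARKS


def count_direction_changes(frames: List[Frame], min_step: float = 0.02) -> int:
    """Count left/right reversals of the hand centre, ignoring small jitter."""
    if not frames:
        return 0
    changes = 0
    last_direction = 0                 # 0 = no movement seen yet
    anchor = hand_center_x(frames[0])  # last position that counted as a move
    for frame in frames[1:]:
        x = hand_center_x(frame)
        delta = x - anchor
        if abs(delta) < min_step:
            continue                   # too small to count as deliberate motion
        direction = 1 if delta > 0 else -1
        if last_direction != 0 and direction != last_direction:
            changes += 1
        last_direction = direction
        anchor = x
    return changes


def looks_like_wave(frames: List[Frame], min_changes: int = 2) -> bool:
    """Hypothetical rule: a wave is at least `min_changes` side-to-side reversals."""
    return count_direction_changes(frames) >= min_changes


if __name__ == "__main__":
    wave = [make_frame(x) for x in (0.3, 0.5, 0.7, 0.5, 0.3, 0.5, 0.7)]
    still = [make_frame(0.5) for _ in range(10)]
    print(looks_like_wave(wave), looks_like_wave(still))
```

A production liveness check would presumably also weigh timing and the natural variation of the motion, as the description above suggests. This sketch covers only the gesture-matching part.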

According to the documentation for this feature, all video data is automatically deleted once the verification process finishes, the recorded videos are never associated with any user's personal identity, and no audio is ever captured.

The landmark data extracted during processing is used exclusively for the security check and is not retained or repurposed.

Camera permissions can be managed or revoked at any time through standard browser settings, and no video or permission data is transferred to third parties.

This method functions as an optional layer that site owners can enable alongside or instead of other challenges.

It provides liveness detection by requiring physical, real-time interaction with the camera. Users who lack a camera, choose not to grant access, or encounter difficulties due to physical limitations are directed to alternative verification options. These include traditional visual image selection puzzles and audio-based challenges, which remain available.
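
The fallback behaviour described above can be sketched as a small selection function. This is an illustrative model only: the `Challenge` and `VisitorContext` types, their field names, and the `choose_challenge` function are hypothetical and do not correspond to any reCAPTCHA API.

```python
from dataclasses import dataclass
from enum import Enum


class Challenge(Enum):
    HAND_GESTURE = "hand_gesture"
    IMAGE_PUZZLE = "image_puzzle"
    AUDIO = "audio"


@dataclass
class VisitorContext:
    """Hypothetical description of what the visitor's device and choices allow."""
    has_camera: bool
    camera_permission_granted: bool
    requested_alternative: bool = False  # e.g. the gesture is physically difficult
    prefers_audio: bool = False


def choose_challenge(ctx: VisitorContext, gesture_enabled: bool = True) -> Challenge:
    """Offer the gesture check only when it is usable; otherwise fall back."""
    gesture_possible = (
        gesture_enabled
        and ctx.has_camera
        and ctx.camera_permission_granted
        and not ctx.requested_alternative
    )
    if gesture_possible:
        return Challenge.HAND_GESTURE
    # Traditional options remain available as alternatives.
    return Challenge.AUDIO if ctx.prefers_audio else Challenge.IMAGE_PUZZLE


if __name__ == "__main__":
    no_camera = VisitorContext(has_camera=False, camera_permission_granted=False)
    print(choose_challenge(no_camera).value)
```

The point of the design is that the gesture check is never the only path: any missing capability or explicit opt-out routes the visitor to an image or audio challenge.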

Development of additional accessible alternatives continues.

The hand gesture system builds on the broader reCAPTCHA framework, which evaluates risk signals before deciding whether an explicit challenge is necessary.

When enabled, the gesture prompt appears only for selected interactions or higher-risk traffic.

The idea of using automated tests to separate humans from machines dates back to the late 1990s and early 2000s.

Researchers at Carnegie Mellon University, including Luis von Ahn, developed some of the first widely deployed systems in response to bots creating fake email accounts and other online abuse. They coined the term CAPTCHA, which stands for Completely Automated Public Turing test to tell Computers and Humans Apart.

Early versions presented users with strings of distorted, warped, or noisy text. Humans could usually recognize the characters through visual perception, while contemporary computer programs struggled because optical character recognition technology was not yet advanced enough to handle the distortions reliably.

A major evolution occurred with reCAPTCHA, introduced around 2007.

This version retained the verification purpose but added a secondary benefit by using the human effort to transcribe text from scanned books and newspapers.

Words that automated systems could not accurately read were presented to users as part of the challenge. Correct transcriptions helped digitize large volumes of historical material, contributing to projects that made old books and archives searchable. The service was acquired by Google in 2009 and continued to operate at scale.

Evolution of reCAPTCHA

As computer vision and machine learning progressed, text-based challenges became easier for automated systems to solve.

Providers shifted toward image-based tasks in which users select all squares containing particular objects, such as traffic lights, vehicles, or storefronts.

These puzzles leveraged human strengths in scene understanding that were harder for algorithms at the time.

Google's reCAPTCHA v2 combined a simple checkbox with image challenges when additional verification was required, while v3 moved toward largely invisible operation. Version 3 collects behavioral signals such as mouse movements, typing patterns, and browsing context to assign a risk score without always interrupting the user.
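
To make the score-based approach concrete, here is a minimal sketch of how a site might act on such a score. In Google's published v3 documentation the score runs from 0.0 to 1.0, with higher values indicating a more likely legitimate interaction. The `decide_action` function, its threshold values, and the three outcome labels are assumptions chosen for illustration, not recommended settings.

```python
def decide_action(
    score: float,
    challenge_below: float = 0.5,  # assumed threshold, tune per site
    block_below: float = 0.1,      # assumed threshold, tune per site
) -> str:
    """Map a 0.0-1.0 score (higher = more likely human) to a site response."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0.0 and 1.0")
    if score < block_below:
        return "block"       # very likely automated
    if score < challenge_below:
        return "challenge"   # uncertain: ask for an explicit check
    return "allow"           # likely human: do not interrupt


if __name__ == "__main__":
    for s in (0.9, 0.3, 0.05):
        print(s, decide_action(s))
```

Under this kind of policy most visitors pass without seeing a challenge, and an explicit check such as an image puzzle or a gesture prompt is reserved for the uncertain middle band.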

Today, advanced AI models can solve many traditional image and text challenges with increasing reliability. This has prompted further adaptations in verification systems.

Current approaches combine risk scoring based on user behavior, policy-driven challenges, and new liveness techniques that require real-time physical interaction. The hand gesture verification method is one example of this ongoing adaptation, focusing on dynamic hand movements captured live through the camera to confirm human presence.

Traditional image and audio options continue to serve users and sites that prefer or require them. The result is a layered set of tools that evolves alongside changes in both human capabilities and automated technologies.

Published: 22/06/2026