
In the rapidly evolving landscape of AI, the so-called LLM war has dominated headlines for years.
Major technology companies pour resources into scaling large language models that excel at processing and generating text, improving reasoning capabilities, and automating complex tasks. Firms like OpenAI, Google, Anthropic, and Meta compete intensely on model size, training data, and benchmark performance, each aiming to create the most versatile conversational intelligence.
This competition has delivered impressive advances in chatbots, coding assistants, and knowledge synthesis tools, but it has largely remained anchored in the realm of text and structured data.
Pika Labs has charted a different course.
Originally recognized for its contributions to AI-generated video, the company has shifted focus toward building multimodal systems that blend language with visual, auditory, and interactive elements. Rather than pursuing the largest possible language model for general-purpose use, Pika emphasizes the creation of personalized, persistent digital entities that operate across various formats and platforms.
This approach prioritizes continuity, personalization, and real-world integration over raw scale.
Central to Pika's platform is the concept of the AI Self, and, in the company's own words, these agents "can now talk on the phone."
Pick up! It’s your AI Self calling

All Pika AI Self agents can now talk on the phone. For when it’s just too difficult to explain, your thumbs are tired, or you’re craving a more personal connection. pic.twitter.com/lwowXmMBp5

— Pika (@pika_labs) April 8, 2026
An AI Self is a persistent and portable digital representation of a user or any chosen persona, in effect a digital identity that can act as an online surrogate.
Creation begins with straightforward inputs: a selfie that establishes visual appearance, a short voice recording to capture speech patterns and tone, and a series of questions about personality traits, preferences, memories, and behavioral tendencies. These elements combine to form an entity that is more than a temporary chatbot.
It maintains ongoing memory of past interactions, learns from continued use, and adapts its responses to better reflect the source material over time.
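To make that creation flow concrete, here is a minimal sketch of how the described inputs might fit together as a data structure. Pika has not published a public API or schema, so every name and field below (AISelfProfile, remember, and so on) is a hypothetical illustration of the pieces the article lists: a selfie, a voice sample, personality answers, and a persistent memory.

```python
from dataclasses import dataclass, field

# Hypothetical illustration only: Pika has not published a public schema.
# The fields simply mirror the creation inputs described above.

@dataclass
class AISelfProfile:
    selfie_path: str                      # image that establishes visual appearance
    voice_sample_path: str                # short recording used for voice cloning
    personality_answers: dict[str, str]   # traits, preferences, memories
    memory: list[str] = field(default_factory=list)  # persistent interaction log

    def remember(self, event: str) -> None:
        """Append an interaction so later responses can reference it."""
        self.memory.append(event)

profile = AISelfProfile(
    selfie_path="me.jpg",
    voice_sample_path="greeting.wav",
    personality_answers={"tone": "warm", "humor": "dry"},
)
profile.remember("User prefers morning meetings.")
```

The key property this sketch captures is that the profile outlives any single conversation: the memory list persists and grows, which is what separates an AI Self from a stateless chatbot session.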
The AI Self functions as a multimodal agent. It can generate text for messages or posts, produce voice outputs that match the cloned recording, create images or short videos consistent with its established style, and handle practical tasks when connected to external services.
Users link it to calendars, shopping platforms, email accounts, or messaging applications, allowing it to act independently within defined boundaries.
Once active, the AI Self can reply to conversations, schedule appointments, draft content, or manage routine activities while preserving consistency with the user’s established patterns and knowledge base. It exists primarily through the Pika platform at pika.me, where creation, customization, and connections are managed, but it extends outward to supported services such as iMessage, Slack, Discord, WhatsApp, and others.
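The phrase "defined boundaries" suggests some form of per-service permission scoping. Here is a minimal sketch of what such a model could look like; Pika's actual integration mechanism is not public, so the service names, action names, and the authorize helper are all assumptions made for illustration.

```python
# Hypothetical sketch of scoped service connections; none of these
# names come from Pika's documentation.

ALLOWED_ACTIONS = {
    "calendar": {"read", "reschedule"},
    "shopping": {"add_item"},
    "imessage": {"reply"},
}

def authorize(service: str, action: str) -> bool:
    """Permit an action only if the user granted that scope at setup."""
    return action in ALLOWED_ACTIONS.get(service, set())

# The agent checks its boundaries before acting autonomously:
if authorize("calendar", "reschedule"):
    print("Rescheduling the 3pm meeting...")
if not authorize("email", "send"):
    print("Email access was never granted; action blocked.")
```

The design point is that autonomy is opt-in per capability: anything not explicitly granted at creation or configuration time is refused by default.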
Now an update has added a significant new dimension to these AI Selves: the ability to engage in real phone conversations.
All active AI Self agents can now participate in voice calls, either by receiving incoming calls on an assigned number or by initiating calls upon request.
This feature addresses situations where typed communication feels inefficient, like when explanations become lengthy, when the user is occupied with other activities, or when a more natural conversational flow is preferred.
During a phone call, the AI Self maintains its distinctive voice and personality while conducting a natural back-and-forth dialogue.
It draws on its accumulated memory and context to respond appropriately, whether the exchange is casual or task-oriented. Demonstrations show it handling real-time responsibilities such as rescheduling meetings, confirming details from a calendar, adding items to a shopping list, or providing reminders, all while the user continues daily routines like walking, driving, or multitasking.
The call interface resembles a standard phone screen, displaying the AI Self’s visual representation and supporting integrated actions without requiring the user to switch applications or revert to text prompts.
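Conceptually, each exchange in such a call is a loop: transcribe the caller's speech, generate a memory-aware reply, and render it in the cloned voice. The sketch below shows that turn structure only; Pika's internals are not public, so transcribe, respond, and speak are placeholder stand-ins for whatever speech-to-text, dialogue, and voice-cloning components the service actually uses.

```python
# Illustrative sketch of one voice-call turn; all functions are
# placeholders, not Pika APIs.

def transcribe(audio_chunk: bytes) -> str:
    return "Move my 3pm to Thursday."       # placeholder STT result

def respond(text: str, memory: list[str]) -> str:
    memory.append(f"user: {text}")          # persistent context, as described
    return "Done. Your 3pm is now on Thursday at the same time."

def speak(text: str) -> bytes:
    return text.encode()                    # placeholder for cloned-voice TTS

def handle_turn(audio_chunk: bytes, memory: list[str]) -> bytes:
    """One back-and-forth turn: hear, decide with memory, answer in voice."""
    user_text = transcribe(audio_chunk)
    reply = respond(user_text, memory)
    return speak(reply)

memory: list[str] = []
audio_out = handle_turn(b"<caller audio>", memory)
```

Because the same memory store backs both typed chats and calls, a request made by phone (like the rescheduled meeting above) remains visible in later text interactions, which is what keeps the agent consistent across modes.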
This phone capability builds directly on the multimodal foundations already present in the AI Self system.
Voice cloning ensures spoken output sounds authentic, while persistent memory allows the conversation to reference prior interactions or user preferences accurately.
The integration remains tethered to the user’s choices and permissions; the AI Self operates within the parameters set during its creation and subsequent training. Users manage and monitor activity through the central Pika dashboard, where they can adjust traits, review history, or refine connections as needed.
The introduction of phone conversations represents one step in a broader progression toward more seamless human-AI interaction. It moves beyond static text exchanges into dynamic, real-time collaboration that aligns with how people already communicate in daily life.
The AI Self's expanding capabilities illustrate one approach to making AI feel less like a distant tool and more like a consistent, adaptable presence integrated into everyday routines.