Anthropic Upgrades Claude 3.5 Sonnet So It Can Control Users' Screen, Keyboard And Cursor


The field was fairly quiet and peaceful until a war broke out.

After OpenAI introduced ChatGPT and sparked huge astonishment and demand, others who realized how lucrative the field had become started building their own products to match or surpass it. Anthropic is one of the leading contenders.

The company has launched Claude 3.5 Sonnet, which it dubbed its “most intelligent model yet.”

The AI model is reportedly twice as fast as Claude 3 Opus, the company’s previous best-in-class AI, and five times cheaper to run. Whether or not it outperforms OpenAI's GPT-4o, Anthropic has made the model free for all users on the web and mobile, and has also made it available to developers.

Following that milestone, Anthropic is now upgrading Claude 3.5 Sonnet.

One of the additions is the 'computer use' feature, which at this time is unique to Anthropic.

In an announcement, Anthropic said:

"Available today on the API, developers can direct Claude to use computers the way people do—by looking at a screen, moving a cursor, clicking buttons, and typing text. Claude 3.5 Sonnet is the first frontier AI model to offer computer use in public beta. At this stage, it is still experimental—at times cumbersome and error-prone. We're releasing computer use early for feedback from developers, and expect the capability to improve rapidly over time."

In other words, the Amazon-backed AI startup founded by former OpenAI research executives has created AI agents that can use a computer to complete complex tasks like a human would.

The tool can "use computers in basically the same way that we do," Jared Kaplan, Anthropic's chief science officer, said in an interview, adding that it can do tasks with "tens or even hundreds of steps."
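The loop described in the announcement (look at the screen, then move the cursor, click, or type) can be sketched roughly as follows. This is a minimal simulation and not Anthropic's API: the `Action` type, the `FakeScreen` class, and the `scripted_policy` function are all hypothetical stand-ins, with the scripted policy playing the role of the model. The step cap reflects the article's point that tasks may run to tens or hundreds of steps.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class Action:
    """Hypothetical action format; the real API's schema may differ."""
    kind: str  # "move", "click", "type", or "done"
    xy: Optional[Tuple[int, int]] = None
    text: str = ""


@dataclass
class FakeScreen:
    """In-memory stand-in for a real desktop, so the sketch runs anywhere."""
    cursor: Tuple[int, int] = (0, 0)
    typed: str = ""
    clicks: List[Tuple[int, int]] = field(default_factory=list)

    def screenshot(self) -> str:
        # A real agent would capture pixels; here the "screenshot" is text.
        return f"cursor={self.cursor} typed={self.typed!r}"

    def apply(self, action: Action) -> None:
        if action.kind == "move" and action.xy is not None:
            self.cursor = action.xy
        elif action.kind == "click":
            self.clicks.append(self.cursor)
        elif action.kind == "type":
            self.typed += action.text
        else:
            raise ValueError(f"unsupported action: {action!r}")


def run_agent(choose_action: Callable[[str, int], Action],
              screen: FakeScreen,
              max_steps: int = 100) -> int:
    """Observe, pick an action, apply it; stop on "done" or at the step cap.

    Returns the number of actions applied to the screen.
    """
    for step in range(max_steps):
        action = choose_action(screen.screenshot(), step)
        if action.kind == "done":
            return step
        screen.apply(action)
    return max_steps


def scripted_policy(observation: str, step: int) -> Action:
    """Stands in for the model: a fixed three-step script, then "done"."""
    script = [
        Action("move", xy=(200, 150)),
        Action("click"),
        Action("type", text="hello"),
    ]
    return script[step] if step < len(script) else Action("done")


if __name__ == "__main__":
    screen = FakeScreen()
    steps = run_agent(scripted_policy, screen)
    print(steps, screen.screenshot(), screen.clicks)
```

In a real integration the policy would be a call to the model and `FakeScreen` would be replaced by something that drives an actual display, ideally inside an isolated environment.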

Early testers include Asana, Canva, Cognition, DoorDash, Replit, and The Browser Company, which have already begun to explore these possibilities.

They have started carrying out tasks that require dozens, and sometimes even hundreds, of steps to complete.

One example is Replit, which uses the upgraded Claude 3.5 Sonnet's computer use and UI navigation capabilities to develop a key feature that evaluates apps as they’re being built for its Replit Agent product.

Anthropic said that future consumer applications built on this kind of technology could include booking flights, scheduling appointments, filling out forms, conducting online research and filing expense reports.

"We want Claude to be able to actually assist people with all sorts of different kinds of work, and we think the chatbot setup is fairly limited because you can ask a question and [get] context but it stops there," Kaplan said.

The technology, however, comes with its own limitations.

On a dedicated web page, Anthropic cautioned that the computer use ability may struggle to operate applications on screens with resolutions higher than XGA (1024×768) or WXGA (1280×800) due to issues with image scaling.
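One way to work within that limit is to downscale screenshots before they reach the model and to map any coordinates the model returns back to the real screen. The sketch below is only an illustration of that arithmetic: the function names `fit_within` and `to_screen_coords` are hypothetical, and treating WXGA as the bounding box is an assumption rather than something Anthropic prescribes.

```python
from typing import Tuple

# Resolutions named in the article. Using them as bounding boxes is an
# assumption made for this sketch.
XGA: Tuple[int, int] = (1024, 768)
WXGA: Tuple[int, int] = (1280, 800)


def fit_within(width: int, height: int,
               max_w: int, max_h: int) -> Tuple[int, int]:
    """Shrink (width, height) to fit inside (max_w, max_h), keeping the
    aspect ratio. Sizes that already fit are returned unchanged."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    scale = min(max_w / width, max_h / height, 1.0)
    return max(1, round(width * scale)), max(1, round(height * scale))


def to_screen_coords(x: int, y: int,
                     scaled: Tuple[int, int],
                     screen: Tuple[int, int]) -> Tuple[int, int]:
    """Map a point in the downscaled image back to real screen pixels."""
    return (round(x * screen[0] / scaled[0]),
            round(y * screen[1] / scaled[1]))


if __name__ == "__main__":
    real = (1920, 1080)
    scaled = fit_within(*real, *WXGA)
    print(scaled)
    # A click at the centre of the scaled image, mapped back to the screen.
    print(to_screen_coords(scaled[0] // 2, scaled[1] // 2, scaled, real))
```

Keeping the aspect ratio matters here because stretching the image would make the model's click coordinates land in the wrong place once they are mapped back.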

The company also warned users of the risk of prompt-injection attacks.

For example, if Claude navigates to a webpage with images or text containing instructions, these "may override user instructions or cause Claude to make mistakes," the company explained.

To reduce such risks, Anthropic recommends that users restrict Claude 3.5 Sonnet’s internet access to approved domains only, lowering its exposure to malicious content.

The company also urges users not to give the AI model access to sensitive data such as account login information, to prevent information theft, and to run it in a dedicated virtual machine or container with minimal privileges, to prevent direct system attacks or accidents.
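The approved-domains recommendation can be illustrated with a small allowlist check. This is a sketch under stated assumptions, not Anthropic's implementation: the domain names and the `is_allowed` function are hypothetical, and a production setup would more likely enforce the rule at the network layer (a proxy or firewall) than in application code.

```python
from typing import Iterable
from urllib.parse import urlsplit

# Hypothetical allowlist; replace with the domains a given task needs.
ALLOWED_DOMAINS = frozenset({"example.com", "docs.example.org"})


def is_allowed(url: str,
               allowed: Iterable[str] = ALLOWED_DOMAINS) -> bool:
    """Return True only for http(s) URLs whose host is an approved
    domain or a subdomain of one."""
    try:
        parts = urlsplit(url)
        host = (parts.hostname or "").lower().rstrip(".")
    except ValueError:
        # Malformed URLs are rejected rather than guessed at.
        return False
    if parts.scheme not in ("http", "https") or not host:
        return False
    # Match on a label boundary so "evil-example.com" does not pass
    # as "example.com".
    return any(host == d or host.endswith("." + d) for d in allowed)


if __name__ == "__main__":
    for candidate in (
        "https://example.com/pricing",
        "https://api.docs.example.org/v1",
        "https://evil-example.com/",
        "https://example.com.evil.net/",
        "ftp://example.com/file",
    ):
        print(candidate, is_allowed(candidate))
```

The check compares the parsed hostname rather than searching the raw URL string, since a substring match would accept look-alike hosts.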

Anthropic also said that it has released a new model called Claude 3.5 Haiku.

Touted as "state-of-the-art meets affordability and speed," it is now the company's fastest model.

"For the same cost and similar speed to Claude 3 Haiku, Claude 3.5 Haiku improves across every skill set and surpasses even Claude 3 Opus, the largest model in our previous generation, on many intelligence benchmarks. Claude 3.5 Haiku is particularly strong on coding tasks. For example, it scores 40.6% on SWE-bench Verified, outperforming many agents using publicly available state-of-the-art models—including the original Claude 3.5 Sonnet and GPT-4o," the company said.

Claude 3.5 Haiku combines low latency with improved instruction following and more accurate tool use.

According to Anthropic, Claude 3.5 Haiku is suited for user-facing products, specialized sub-agent tasks, and generating personalized experiences from huge volumes of data—like purchase history, pricing, or inventory records.

Published: 22/10/2024