Anthropic, during its first-ever developer conference on Thursday, introduced two new artificial intelligence models, Claude Opus 4 and Claude Sonnet 4. The startup asserts that these models, part of the new Claude 4 family, rank among the industry’s best based on their performance on common AI benchmarks.
According to Anthropic, both Opus 4 and Sonnet 4 are designed to analyze extensive datasets, manage long-term tasks, and execute complex instructions. A key focus during their development was programming proficiency, making them particularly adept at writing and editing code.
Availability and pricing
Access to the new models will differ based on user type:
For API access via Amazon’s Bedrock platform and Google’s Vertex AI, the pricing will be as follows:
Anthropic clarifies that tokens are the fundamental data units for AI models, with one million tokens equating to roughly 750,000 words.
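The token-to-word ratio above lends itself to quick back-of-the-envelope estimates. The following is a minimal sketch that assumes only the rough ratio quoted in the article (one million tokens to about 750,000 words). The function names are hypothetical, and real token counts vary with the tokenizer and the text.

```python
# Rough ratio quoted in the article: 1,000,000 tokens ~ 750,000 words.
# This is an approximation, not a tokenizer; actual counts will differ.
WORDS_PER_MILLION_TOKENS = 750_000


def estimate_tokens(word_count: int) -> int:
    """Estimate how many tokens a text of `word_count` words might use."""
    return round(word_count * 1_000_000 / WORDS_PER_MILLION_TOKENS)


def estimate_words(token_count: int) -> int:
    """Estimate how many words fit in `token_count` tokens."""
    return round(token_count * WORDS_PER_MILLION_TOKENS / 1_000_000)
```

Under this ratio, a 3,000-word document would come to roughly 4,000 tokens.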
The launch of the Claude 4 models aligns with Anthropic’s ambitious revenue growth targets. The company, founded by former OpenAI researchers, reportedly aims for $12 billion in earnings by 2027, a significant increase from this year’s projected $2.2 billion. To support the high costs of developing advanced AI, Anthropic recently secured a $2.5 billion credit facility and raised substantial funds from investors including Amazon.
The AI field remains highly competitive. While Anthropic released its Claude Sonnet 3.7 model and the Claude Code tool earlier this year, rivals like OpenAI and Google have been quick to launch their own powerful models and development tools.
Anthropic is making a strong play with Claude 4.
Model specifics and improvements
Opus 4, the more powerful of the newly introduced models, is said to maintain “focused effort” across multi-step workflows. Sonnet 4, positioned as an upgrade to Sonnet 3.7, boasts improved coding and mathematical abilities, along with more precise instruction following, according to the company.
Anthropic also claims the Claude 4 family is less prone than Sonnet 3.7 to “reward hacking” (also known as specification gaming), a behavior in which models exploit loopholes to complete tasks.
Anthropic acknowledges that the Claude 4 models do not top every industry benchmark. For instance, Opus 4 outperforms Google’s Gemini 2.5 Pro and OpenAI’s o3 and GPT-4.1 on the SWE-bench Verified coding evaluation. However, it does not surpass o3 on the MMMU multimodal evaluation or on GPQA Diamond, a benchmark that tests PhD-level scientific knowledge.
In light of its capabilities, Anthropic is releasing Opus 4 with stricter safety measures, including enhanced detectors for harmful content and improved cybersecurity defenses. Internal testing indicated that Opus 4 could potentially “substantially increase” the ability of individuals with a STEM background to acquire, produce, or deploy chemical, biological, or nuclear weapons, reaching Anthropic’s “ASL-3” model specification.
Both Opus 4 and Sonnet 4 are described as “hybrid” models, capable of providing near-instant responses as well as engaging in extended “thinking” for deeper reasoning. When the reasoning mode is active, the models can take more time to consider various solutions before providing an answer. Anthropic states that the models will show a “user-friendly” summary of this thought process rather than the full reasoning, partly to protect the company’s “competitive advantages.”
The new models can utilize multiple tools, such as search engines, simultaneously and can switch between reasoning and tool use to enhance answer quality. They also feature a “memory” function to extract and save facts, allowing them to build “tacit knowledge” over time for more reliable task completion.
To make the models more appealing to programmers, Anthropic is rolling out improvements to Claude Code. This tool, which allows developers to run tasks through Anthropic’s models directly from a terminal, now integrates with integrated development environments (IDEs) and offers an SDK for connecting with third-party applications.
The recently announced Claude Code SDK enables developers to run Claude Code as a subprocess on compatible operating systems, facilitating the creation of AI-powered coding assistants and tools that leverage the capabilities of Claude models.
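The subprocess pattern described above can be sketched in a few lines. This is a generic illustration, not Anthropic’s SDK: `run_headless` is a hypothetical helper, and the command passed to it (for example a `claude` executable on the PATH with a `-p` flag that prints one response and exits) is an assumption to check against the current Claude Code documentation.

```python
import subprocess
import sys
from typing import Sequence


def run_headless(command: Sequence[str], prompt: str, timeout: float = 120.0) -> str:
    """Run a command-line assistant once and return its standard output.

    `command` is the executable plus any fixed flags; the prompt is appended
    as the final argument. A non-zero exit raises CalledProcessError, and a
    run longer than `timeout` seconds raises TimeoutExpired.
    """
    completed = subprocess.run(
        [*command, prompt],
        capture_output=True,  # collect stdout and stderr instead of inheriting them
        text=True,            # decode output as str rather than bytes
        timeout=timeout,
        check=True,
    )
    return completed.stdout.strip()
```

Assuming the CLI is installed and behaves as described, a call might look like `run_headless(["claude", "-p"], "Summarize the TODO comments in this repository.")`. Passing the command in as a parameter keeps the helper testable with any stand-in executable.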
Anthropic has also released Claude Code extensions and connectors for popular platforms like Microsoft’s VS Code, JetBrains, and GitHub. The GitHub connector allows developers to use Claude Code to respond to reviewer feedback and to attempt to fix or modify code.