Anthropic Introduces a New AI Model Capable of Extended Reasoning for as Long as Needed

Anthropic is introducing Claude 3.7 Sonnet, a next-generation AI model designed to “think” about questions for as long as users prefer.
Described as the industry’s first “hybrid AI reasoning model,” Claude 3.7 Sonnet can provide both instant responses and more in-depth, deliberative answers. Users have the option to enable its reasoning mode, allowing the AI to process questions for a shorter or longer duration.
This model aligns with Anthropic’s goal of simplifying AI interactions. Many current AI chatbots require users to choose between multiple models with varying costs and capabilities. Anthropic aims to streamline this by offering a single model that handles both quick and complex reasoning tasks.
Claude 3.7 Sonnet is launching on Monday for all users and developers. However, only subscribers to Anthropic’s premium Claude plans will gain access to its reasoning features. Free users will receive a standard version without advanced reasoning, though Anthropic claims it still surpasses the previous flagship model, Claude 3.5 Sonnet. (The company notably skipped a version number.)
Pricing and Comparison
Pricing for Claude 3.7 Sonnet is set at $3 per million input tokens—equivalent to around 750,000 words, more than the entire Lord of the Rings trilogy—and $15 per million output tokens. While this makes it pricier than OpenAI’s o3-mini ($1.10 per million input tokens/$4.40 per million output tokens) and DeepSeek’s R1 (55 cents per million input tokens/$2.19 per million output tokens), those models specialize in reasoning alone, whereas Claude 3.7 Sonnet integrates both real-time and extended reasoning capabilities.

Claude 3.7 Sonnet is Anthropic’s first AI model designed for “reasoning,” a technique increasingly adopted by AI labs as traditional performance improvements slow down.
Models like o3-mini, R1, Google’s Gemini 2.0 Flash Thinking, and xAI’s Grok 3 (Think) take more time and computing power before generating responses. By breaking down problems into smaller steps, these models typically enhance accuracy. While they don’t think or reason like humans, their approach is inspired by deductive processes.
Future Automation of AI Reasoning
Anthropic aims for future versions of Claude to determine on their own how long to “think” about questions, eliminating the need for users to make that choice manually, according to Dianne Penn, the company’s product and research lead, in an interview with TechCrunch.
In a blog post shared with TechCrunch, Anthropic compared this approach to human cognition: “Just as people don’t have separate brains for immediate answers versus deep thinking, we believe reasoning should be a seamless capability within a frontier model rather than a feature confined to a separate system.”
To enhance transparency, Claude 3.7 Sonnet includes a “visible scratch pad” that reveals its internal planning process. Penn noted that while users will be able to see most of the AI’s reasoning, certain parts may be redacted for trust and safety reasons.

Anthropic has fine-tuned Claude’s reasoning modes for practical applications, such as solving complex coding challenges and handling autonomous tasks. Developers using Anthropic’s API can adjust the model’s “thinking budget,” balancing speed and cost against answer quality.
In real-world coding evaluations, Claude 3.7 Sonnet demonstrated strong performance. On SWE-Bench, a benchmark for coding tasks, it achieved 62.3% accuracy, outperforming OpenAI’s o3-mini, which scored 49.3%. In TAU-Bench, a test assessing AI interaction with simulated users and external APIs in a retail environment, Claude 3.7 Sonnet scored 81.2%, surpassing OpenAI’s o1 model at 73.5%.
Improved Response Flexibility
Anthropic also claims that Claude 3.7 Sonnet is less likely to refuse valid prompts than previous versions. The model is designed to better distinguish between harmful and benign requests, reducing unnecessary refusals by 45% compared to Claude 3.5 Sonnet. This shift comes as some AI labs reconsider their approach to content restrictions.
Alongside Claude 3.7 Sonnet, Anthropic is introducing Claude Code, an agentic coding tool launching as a research preview. This tool allows developers to execute tasks directly from their terminal. In a demo, Anthropic employees showcased how a simple command like “Explain this project structure” enables Claude Code to analyze a codebase. Developers can modify code using plain English, while the tool explains its edits, tests for errors, and even pushes updates to GitHub.
Claude Code will initially be available to a limited number of users on a first come, first serve basis, according to an Anthropic spokesperson.
Anthropic is launching Claude 3.7 Sonnet at a time when AI labs are rapidly releasing new models. The company has traditionally taken a cautious, safety-focused approach, but with this release, it aims to set the pace. However, competition looms—OpenAI’s CEO, Sam Altman, has hinted that OpenAI may introduce its own hybrid AI model within months.
Read the original article on: TechCrunch
Read more: Meta AI Expands to the Middle East and Africa with Arabic Language Support
Leave a Reply