Post by Rahul Patil

CTO at Anthropic

Claude Opus 4.7 is out today. There are some significant behavioural changes. It reasons more, reaches for tools less, follows instructions more literally, and checks its own work before reporting back. That shifts how you prompt and how much oversight the work actually needs.

This is the next upgrade in the Opus line, and it also plays a specific role on the path to Mythos. Last week we announced Project Glasswing, our work on cyber safeguards. 4.7 is the first model shipping with those safeguards in production: its cyber capabilities are deliberately less advanced than Mythos Preview, and what we learn from deploying these protections at scale is how we get to a broad Mythos release responsibly.

The eval movement with Opus 4.7 on coding is significant. SWE-bench Pro jumps from 53.4 to 64.2. Terminal-Bench 2 from 65.4 to 69.4. In production, Rakuten is putting 3x more tasks through it, and Cursor's internal bench cleared 70% versus 58% on Opus 4.6.

On the tool use shift: in most cases more reasoning and fewer tool calls is a better outcome. When you do want heavier tool use, two things help. Raise the effort setting, since high and the new xhigh level show substantially more tool calls in agentic search and coding. And be explicit in your prompt about when and why to use a given tool, including telling the model to err on the side of using it more.

Same pricing as 4.6. Available today in Claude Code, on the API, Bedrock, Vertex AI, and Microsoft Foundry.

https://lnkd.in/g_km9d_8
