Saturday, January 10, 2026
Claude Code Performance & Usage
The tweets discuss the benchmark gap between Claude Code and other coding agents such as Droid running GPT 5.2, the importance of pairing Opus 4.5 with Claude Code, practical usage tips for Claude Code, and the value of different approaches to agent tasks.
Droid is currently rank 1 on the terminal-bench 2.0 benchmark using GPT 5.2. Claude Code with Opus 4.5 is rank 20. It's hard to believe Droid can be that much better, but today feels like a good day to try it out!
The success of Claude for coding depends on both Opus 4.5 and Claude Code. You can easily use Opus 4.5 Thinking inside Google’s Antigravity (for free!), and the results are far inferior to what the same model produces inside Claude Code. Unless DeepSeek also has a killer coding