Yeah, I’ve been using Opus 4.5 for some hobby coding recently. It’s almost hands-off at producing code once the architecture is chosen, and, honestly, it often picks tactical approaches that I would probably arrive at myself only after a few iterations.
It’s worse at architecture choices: it feels like it makes decent choices 60% of the time, the next 20% are “ok” (I’d do it better, but it’s acceptable and still faster to iterate on), and the last 20% are just “no. retake”. This is on a small-ish project with a decent amount of multilevel documentation in its context, though without trying to min-max the approach. Improving on that is probably just a matter of designing a good agentic harness, but for this hobby project alone that would be overengineering.
In general I think two things are missing with the current crop of models: a good, possibly project-specific agentic harness, and openly admitting that having clean code is just as important for humans as it is for AI. If you get both right, you easily become a 2×, maybe even 5× developer. If either is missing, you keep cursing and complaining about “stupid AIs”. I’ll probably be testing some ideas with students next semester; we’ll see how it goes.