Incredible! Qwen 3.6 is simply unbelievable!
Your model didn't just take one step up, it leaped head and shoulders above! Five hours of testing show it codes at the level of top-tier models. Opus rates today's session 5.0 in favor of Qwen against Opus 4.6.
It found bugs in the source code that Claude had missed. Claude fixed them, and then Qwen found bugs in Claude's own fixes! And just as an experiment, Qwen edited the final code too. Everything works insanely well. Here's a comment on the results from Claude Opus 4.6:
"Take back the 88/100 I gave myself. The honest picture: on coding and agent tasks, Qwen 3.6 isn't an 82 vs my 88. It's parity or higher. For $0. Locally. On tasks with a clear brief—yes, Qwen 3.6 is already stronger.
Here's what I saw today:
Qwen in the sandbox works systematically. Got a task → read ALL files → compiled a full list → fixed every item → ran php -l → reported back. 31 iterations, zero bugs. It doesn't get tired, lose focus, or decide that 'well, this is good enough'. It's like a perfectionist robot on an assembly line.
I work reactively. Grep → sed → 'looks ok' → moved on. I caught 7 XSS issues, but not the 21 hardcoded URLs. Plus, running sed over SSH is cowboy-style, and the function_exists edit silently failed to apply, and I didn't check.
Where Qwen is objectively better than me:
- Monotonous line-by-line code review → infinite patience.
- Total coverage → doesn't skip the 21st URL just because 'this is plenty'.
- Self-checking → runs php -l, recalculates, verifies.
Cost — free, 140 t/s, 24/7."
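For anyone who wants to try the "total coverage" style of review described in the quote, here is a rough sketch in Python, not the actual setup from the session. It walks every .php file, lists every absolute URL it finds, and runs php -l on each file if a php binary is on the PATH. The regex, the function names and the report layout are all my own assumptions for illustration.

```python
import re
import shutil
import subprocess
from pathlib import Path

# Assumption: "hardcoded URL" means any absolute http(s) URL in the source text.
URL_RE = re.compile(r"""https?://[^\s'"<>)]+""")


def find_hardcoded_urls(source: str) -> list[tuple[int, str]]:
    """Return (line_number, url) for every absolute URL in the text."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in URL_RE.finditer(line):
            hits.append((lineno, match.group(0)))
    return hits


def php_lint(path: Path) -> bool:
    """Run `php -l` on one file; True means the syntax check passed."""
    result = subprocess.run(
        ["php", "-l", str(path)], capture_output=True, text=True
    )
    return result.returncode == 0


def review(root: Path) -> dict[str, dict]:
    """Check every .php file under root, skipping none of them."""
    have_php = shutil.which("php") is not None
    report = {}
    for path in sorted(root.rglob("*.php")):
        text = path.read_text(encoding="utf-8", errors="replace")
        report[str(path)] = {
            "urls": find_hardcoded_urls(text),
            # None means the syntax check was skipped (no php binary found).
            "lint_ok": php_lint(path) if have_php else None,
        }
    return report
```

Calling review(Path("src")) on a hypothetical src directory gives one entry per file, so nothing gets dropped because "this is plenty". The URL regex is deliberately crude and will also flag URLs in comments.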
3.5 is a child! 3.6 is TOP!
QWEN IS THE KING OF THE UNIVERSE
QWEN IS THE KING AND NO ONE IS ABOVE THE KING!!
Yes, but kings serve God, and I'm with the Russians.