#qwen
3 post(s) tagged here.

Benchmarks on Local LLMs about Backend Generation, Monthly
AutoBe's first proper benchmark for backend generation: controlled variables, a six-axis weighted rubric, multi-dimensional precision. The function calling harness has effectively closed the gap between frontier and local models. From next month, expensive frontier models drop out and only small, cheap models compete. Frontend automation joins the leaderboard in two or three months.

Qwen 3.5-27B Just Built Complete Backends from Scratch: 100% Compilation, 25x Cheaper
Qwen 3.5-27B generated complete backends with 100% compilation at 1/25th the cost of Claude Opus 4.6, and the output quality is nearly identical. The benchmark proves it.

Function Calling Harness: From 6.75% to 100%
6.75% first-try function calling success becomes 100% compilation via type schemas, compilers, and structured feedback. Dissecting the harness engineering behind AutoBe and Typia.