
1. Overview

Wrtn Technologies is hosting the 1st AutoBE Hackathon.

Hackathon Information

Event Details

  • Participants: Maximum 40 people
  • Registration Period: September 5 - 10, 2025
  • Registration Form: Google Forms
  • Event Schedule: September 12 - 14, 2025 (64 hours)
    • Start: September 12, 08:00:00 (PDT, UTC-7)
    • End: September 14, 23:59:59 (PDT, UTC-7)
  • Winners Announcement: September 17, 2025
  • Total Prize Pool: $6,400
    • Grand Prize (1 person): $2,000
    • Excellence Award (1 person): $1,000
    • Participation Prize: $50 for all who submit detailed reviews for both models
  • NO API COST BARRIERS TO PARTICIPATION: Each participant will receive token usage credits worth $350
Backend Without Humans, Closer Than You Think?

AutoBE is a no-code AI platform that turns natural language into backend applications. It analyzes requirements, designs schemas and APIs, writes tests, and implements code.

This Hackathon challenges experienced backend developers to evaluate whether AutoBE’s output is truly production-ready. Assess its code quality, scalability, and performance, compare it with your own practices, and suggest improvements.

Your insights will be essential in proving whether AutoBE is a genuinely useful tool!

2. What is AutoBE?

AutoBE is an AI-based no-code platform for generating production-grade backend applications from natural language.

Key Innovation

AutoBE uses a Compiler-in-the-Loop approach to ensure generated code compiles and runs, addressing limitations of existing AI tools.

Core Achievement

Achieves a 100% build success rate (based on OpenAI GPT-4.1) for backend applications.

2.1. How It Works

AutoBE follows a 5-stage process with specialized AI agents and real-time compiler validation:

  1. Analyze Agent: Interprets natural language requirements, defines user roles, and clarifies ambiguities.
  2. Database Agent: Designs type-safe database schemas using Prisma ORM, validated by the Prisma compiler.
  3. Interface Agent: Creates RESTful APIs with OpenAPI 3.1 documentation, validated by an AutoBE-specific compiler.
  4. Test Agent: Writes E2E test code for normal, edge, and error cases, validated by the test runner.
  5. Realize Agent: Implements NestJS-based backend code with features like dependency injection, validated by TypeScript and NestJS compilers.
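The loop that ties these agents together can be sketched as follows. This is a minimal illustrative sketch of the Compiler-in-the-Loop idea, not AutoBE's actual API: all names (compile, regenerate, compilerInTheLoop) and the toy "compiler" logic are hypothetical.

```typescript
// Hypothetical sketch of a compiler-in-the-loop cycle: generated code is
// validated by a compiler, and diagnostics are fed back to the agent until
// the code builds. The "compiler" and "agent" below are toys.

interface Diagnostic {
  message: string;
}

interface CompileResult {
  success: boolean;
  diagnostics: Diagnostic[];
}

// Toy "compiler": rejects code containing a known flaw.
function compile(code: string): CompileResult {
  if (code.includes("unknownType")) {
    return {
      success: false,
      diagnostics: [{ message: "unknownType is not defined" }],
    };
  }
  return { success: true, diagnostics: [] };
}

// Toy "agent": repairs code based on compiler feedback.
function regenerate(code: string, _diagnostics: Diagnostic[]): string {
  return code.replace("unknownType", "string");
}

// The loop: generate, validate, feed diagnostics back until it compiles.
function compilerInTheLoop(initial: string, maxRounds = 5): string {
  let code = initial;
  for (let round = 0; round < maxRounds; round++) {
    const result = compile(code);
    if (result.success) return code;
    code = regenerate(code, result.diagnostics);
  }
  throw new Error("failed to converge");
}

console.log(compilerInTheLoop('const name: unknownType = "todo";'));
// const name: string = "todo";
```

In the real system each stage has its own compiler (Prisma, the AutoBE OpenAPI compiler, TypeScript/NestJS), but the retry-with-diagnostics shape is the same.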

2.2. Technical Features

AutoBE’s AI-specific compilers validate syntax, logic, and functionality in real-time, providing detailed feedback to AI for code correction. These compilers are optimized for Prisma, OpenAPI, and test domains, ensuring consistency via structured AST-based code generation. The tech stack includes TypeScript, NestJS, Prisma ORM, and PostgreSQL/SQLite.
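To make the "structured AST-based code generation" idea concrete, here is a hedged sketch of the approach: instead of emitting schema text directly, the agent fills a structured object that is printed deterministically. The types (PrismaField, PrismaModel) and printer are illustrative inventions, not AutoBE's actual AST.

```typescript
// Illustrative AST-based generation: the model produces structured data,
// and a deterministic printer renders it, so output format never drifts.

interface PrismaField {
  name: string;
  type: string;
  optional?: boolean;
  attributes?: string[];
}

interface PrismaModel {
  name: string;
  fields: PrismaField[];
}

// Deterministic printer: the only place schema text is produced.
function printModel(model: PrismaModel): string {
  const lines = model.fields.map((f) => {
    const opt = f.optional ? "?" : "";
    const attrs = f.attributes?.length ? " " + f.attributes.join(" ") : "";
    return `  ${f.name} ${f.type}${opt}${attrs}`;
  });
  return `model ${model.name} {\n${lines.join("\n")}\n}`;
}

const article: PrismaModel = {
  name: "Article",
  fields: [
    { name: "id", type: "String", attributes: ["@id", "@default(uuid())"] },
    { name: "title", type: "String" },
    { name: "deletedAt", type: "DateTime", optional: true },
  ],
};

console.log(printModel(article));
```

Because the AI fills typed structures rather than free-form text, the compiler can validate each node before any code is rendered.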

You can check each compiler’s AST structure in the AutoBE GitHub repository.

2.3. Live Demonstration

See AutoBE in action with fully functional backend applications:


Example Applications

How Simple Is It?

Create a discussion board with five natural language commands, generating a deployable backend in ~70 minutes.

[!TIP] For Hackathon Participants: provide detailed requirements for better results. Avoid vague prompts like “do everything.”

3. Eligibility

Who We’re Looking For

  • Experience: Working developers, or students majoring in a related field.
  • Tech Stack: Experience with Node.js, Java, Python, or similar frameworks.
  • Database Skills: Relational database design beyond CRUD.
  • API Knowledge: RESTful API design experience.
  • English Proficiency: Conversational and technical reading skills.
  • Technical Setup: Laptop with Node.js, Git, and a code editor.

4. How to Participate

4.1. Registration

Apply via Google Forms. Registration is limited to 70 participants on a first-come, first-served basis and closes September 10, 2025.

4.2. Account Issuance

On September 12, participants will receive AutoBE access credentials and usage guides via email.

4.3. Hackathon Process

During the hackathon on September 12–14, participants log into the AutoBE platform with their provided accounts and generate two backend applications on different themes: one using openai/gpt-4.1-mini and one using openai/gpt-4.1. Record your conversations, results, and any issues you encounter.

4.4. Submission

Submit two separate reviews, one per application, to GitHub Discussions by September 14, 2025. Provide detailed, specific feedback; further details will be provided via email.

5. Provided AI Models

5.1. openai/gpt-4.1-mini


Todo (openai/gpt-4.1-mini)

  • Analyze (actors: 2, documents: 11): tokens 457.8K (in 390.1K / out 67.8K); function calls 23 / 23 (100.0%)
  • Database (namespaces: 3, models: 14): tokens 1.81M (in 1.77M / out 38.8K); function calls 43 / 54 (79.6%)
  • Interface (operations: 35, schemas: 38): tokens 39.47M (in 39.25M / out 219.0K); function calls 339 / 483 (70.2%)
  • Test (functions: 22): tokens 2.92M (in 2.90M / out 28.0K); function calls 34 / 64 (53.1%)
  • Realize (functions: 50, errors: 2): tokens 7.16M (in 7.01M / out 149.6K); function calls 158 / 174 (90.8%)
  • Function Calling Success Rate: 72.86%
  • Elapsed Time: 2h 30m 51s
  • Total Tokens: 96.48M (in 95.63M, 2.53M cached / out 856.8K)

Bbs (openai/gpt-4.1-mini)

  • Analyze (actors: 3, documents: 11): tokens 644.3K (in 561.7K / out 82.6K); function calls 23 / 29 (79.3%)
  • Database (namespaces: 4, models: 8): tokens 266.6K (in 259.9K / out 6.8K); function calls 9 / 9 (100.0%)
  • Interface (operations: 48, schemas: 64): tokens 20.65M (in 20.50M / out 148.1K); function calls 264 / 346 (76.3%)
  • Test (functions: 52): tokens 7.04M (in 6.89M / out 147.4K); function calls 83 / 95 (87.4%)
  • Realize (functions: 74): tokens 16.37M (in 16.11M / out 258.4K); function calls 306 / 360 (85.0%)
  • Function Calling Success Rate: 81.64%
  • Elapsed Time: 1h 44m 39s
  • Total Tokens: 44.97M (in 44.33M, 488.4K cached / out 643.3K)

Reddit (openai/gpt-4.1-mini)

  • Analyze (actors: 4, documents: 12): tokens 568.0K (in 511.5K / out 56.5K); function calls 25 / 25 (100.0%)
  • Database (namespaces: 5, models: 17): tokens 497.7K (in 482.8K / out 14.9K); function calls 11 / 14 (78.6%)
  • Interface (operations: 105, schemas: 118): tokens 38.20M (in 37.81M / out 388.2K); function calls 486 / 632 (76.9%)
  • Test (functions: 94): tokens 13.66M (in 13.33M / out 326.8K); function calls 161 / 181 (89.0%)
  • Realize (functions: 152): tokens 36.34M (in 35.75M / out 590.0K); function calls 705 / 793 (88.9%)
  • Function Calling Success Rate: 84.38%
  • Elapsed Time: 2h 41m 22s
  • Total Tokens: 89.27M (in 87.90M, 865.8K cached / out 1.38M)

Shopping (openai/gpt-4.1-mini)

  • Analyze (actors: 4, documents: 12): tokens 628.5K (in 539.9K / out 88.6K); function calls 25 / 25 (100.0%)
  • Database (namespaces: 10, models: 40): tokens 791.0K (in 762.6K / out 28.4K); function calls 22 / 24 (91.7%)
  • Interface (operations: 211, schemas: 248): tokens 90.15M (in 89.09M / out 1.06M); function calls 1114 / 1390 (80.1%)
  • Test (functions: 177): tokens 27.96M (in 27.44M / out 512.3K); function calls 326 / 369 (88.3%)
  • Realize (functions: 323): tokens 61.71M (in 60.63M / out 1.08M); function calls 1199 / 1353 (88.6%)
  • Function Calling Success Rate: 84.97%
  • Elapsed Time: 3h 11m 14s
  • Total Tokens: 181.24M (in 178.46M, 904.1K cached / out 2.78M)

This model offers a cost-effective balance for generating small to medium backend applications (~20 tables, 150 API endpoints). It performs well for web services like community boards, blogs, or project management tools, supporting CRUD operations, user authentication, permission management, and file uploads. Its strengths are in requirements analysis and API design, producing clear specifications and clean, RESTful API structures, making it ideal for project initialization.

However, it may produce logical errors in complex business logic or fail to fully resolve compilation issues in E2E test code due to its lightweight design. We provide this model first to demonstrate the role of model capacity in code generation and to manage hackathon costs, as more powerful models are expensive. Developers often use it for initial setups, refining output with tools like Claude Code or GitHub Copilot for a cost-efficient workflow.

5.2. openai/gpt-4.1


Todo (openai/gpt-4.1)

  • Analyze (actors: 1, documents: 11): tokens 453.1K (in 409.3K / out 43.8K); function calls 24 / 25 (96.0%)
  • Database (namespaces: 3, models: 4): tokens 266.5K (in 260.4K / out 6.2K); function calls 7 / 8 (87.5%)
  • Interface (operations: 15, schemas: 21): tokens 4.79M (in 4.75M / out 44.2K); function calls 78 / 89 (87.6%)
  • Test (functions: 20): tokens 2.15M (in 2.11M / out 37.1K); function calls 30 / 30 (100.0%)
  • Realize (functions: 23): tokens 1.85M (in 1.82M / out 28.5K); function calls 48 / 49 (98.0%)
  • Function Calling Success Rate: 93.03%
  • Elapsed Time: 49m 36s
  • Total Tokens: 9.51M (in 9.35M, 63.2K cached / out 159.8K)

Bbs (openai/gpt-4.1)

  • Analyze (actors: 2, documents: 11): tokens 537.0K (in 460.1K / out 76.9K); function calls 23 / 27 (85.2%)
  • Database (namespaces: 6, models: 12): tokens 477.0K (in 462.9K / out 14.1K); function calls 13 / 14 (92.9%)
  • Interface (operations: 59, schemas: 63): tokens 17.12M (in 16.84M / out 281.2K); function calls 260 / 294 (88.4%)
  • Test (functions: 93): tokens 9.83M (in 9.67M / out 162.3K); function calls 127 / 131 (96.9%)
  • Realize (functions: 82): tokens 7.16M (in 7.03M / out 133.9K); function calls 164 / 175 (93.7%)
  • Function Calling Success Rate: 91.58%
  • Elapsed Time: 1h 26m 29s
  • Total Tokens: 35.13M (in 34.46M, 237.7K cached / out 668.4K)

Reddit (openai/gpt-4.1)

  • Analyze (actors: 3, documents: 12): tokens 664.6K (in 601.7K / out 62.9K); function calls 25 / 25 (100.0%)
  • Database (namespaces: 10, models: 56): tokens 1.28M (in 1.20M / out 75.2K); function calls 23 / 31 (74.2%)
  • Interface (operations: 245, schemas: 285): tokens 87.77M (in 86.56M / out 1.21M); function calls 1108 / 1365 (81.2%)
  • Test (functions: 257): tokens 30.59M (in 30.05M / out 532.6K); function calls 395 / 401 (98.5%)
  • Realize (functions: 369): tokens 37.20M (in 36.39M / out 810.1K); function calls 743 / 801 (92.8%)
  • Function Calling Success Rate: 87.46%
  • Elapsed Time: 3h 21m 12s
  • Total Tokens: 157.50M (in 154.80M, 519.0K cached / out 2.69M)

Shopping (openai/gpt-4.1)

  • Analyze (actors: 3, documents: 12): tokens 807.0K (in 735.1K / out 71.9K); function calls 25 / 28 (89.3%)
  • Database (namespaces: 10, models: 46): tokens 1.13M (in 1.06M / out 68.1K); function calls 23 / 28 (82.1%)
  • Interface (operations: 278, schemas: 255): tokens 83.01M (in 81.73M / out 1.28M); function calls 1050 / 1304 (80.5%)
  • Test (functions: 286): tokens 35.19M (in 34.55M / out 642.6K); function calls 448 / 452 (99.1%)
  • Realize (functions: 390): tokens 47.06M (in 46.00M / out 1.06M); function calls 885 / 966 (91.6%)
  • Function Calling Success Rate: 87.51%
  • Elapsed Time: 3h 39m 17s
  • Total Tokens: 167.20M (in 164.07M, 316.2K cached / out 3.12M)

Note: this model becomes available after you complete your openai/gpt-4.1-mini review.

This is the most advanced model, optimized for enterprise-grade backend applications (>500 APIs, 1,000 test scenarios). It excels at understanding complex requirements, inferring implicit needs, and implementing advanced features like real-time notifications, complex permissions, transaction processing, and caching. AutoBE achieves a 100% build success rate with this model, producing production-ready code with no compilation errors.

Generating an e-commerce platform costs ~$300–400 (150M tokens), so access is restricted to manage expenses. Completing the gpt-4.1-mini review unlocks free access, providing insight into how model capacity impacts code quality. This ensures participants can explore its full potential without cost concerns.

5.3. qwen/qwen3-next-80b-a3b


Todo (qwen/qwen3-next-80b-a3b-instruct)

  • Analyze (actors: 3, documents: 12): tokens 1.09M (in 930.7K / out 154.3K); function calls 31 / 42 (73.8%)
  • Database (namespaces: 3, models: 13): tokens 1.57M (in 1.54M / out 38.0K); function calls 40 / 43 (93.0%)
  • Interface (operations: 18, schemas: 19): tokens 28.01M (in 27.86M / out 146.6K); function calls 159 / 271 (58.7%)
  • Test (functions: 19): tokens 2.61M (in 2.58M / out 26.4K); function calls 40 / 52 (76.9%)
  • Realize (functions: 22, errors: 3): tokens 10.24M (in 10.06M / out 171.7K); function calls 135 / 187 (72.2%)
  • Function Calling Success Rate: 67.30%
  • Elapsed Time: 1h 47m 45s
  • Total Tokens: 76.79M (in 75.88M, 16.4K cached / out 902.4K)

Bbs (qwen/qwen3-next-80b-a3b-instruct)

  • Analyze (actors: 2, documents: 11): tokens 1.05M (in 924.7K / out 126.4K); function calls 26 / 42 (61.9%)
  • Database (namespaces: 9, models: 53): tokens 6.42M (in 6.26M / out 155.0K); function calls 156 / 167 (93.4%)
  • Interface (operations: 293, schemas: 297): tokens 352.91M (in 349.25M / out 3.65M); function calls 2786 / 4027 (69.2%)
  • Test (functions: 169): tokens 138.02M (in 136.58M / out 1.44M); function calls 574 / 2217 (25.9%)
  • Realize (functions: 110, errors: 30): tokens 60.27M (in 58.67M / out 1.60M); function calls 413 / 911 (45.3%)
  • Function Calling Success Rate: 53.71%
  • Elapsed Time: 6h 51m 42s
  • Total Tokens: 558.67M (in 551.70M, 16.4K cached / out 6.97M)

Reddit (qwen/qwen3-next-80b-a3b-instruct)

  • Analyze (actors: 3, documents: 11): tokens 1.46M (in 1.31M / out 150.8K); function calls 27 / 54 (50.0%)
  • Database (namespaces: 9, models: 90): tokens 10.70M (in 10.36M / out 344.5K); function calls 228 / 265 (86.0%)
  • Interface (operations: 507, schemas: 515): tokens 741.29M (in 734.04M / out 7.25M); function calls 4992 / 7522 (66.4%)
  • Test (functions: 781): tokens 435.87M (in 423.32M / out 12.55M); function calls 4343 / 4736 (91.7%)
  • Realize: tokens 60.87M (in 59.74M / out 1.13M); function calls 486 / 1037 (46.9%)
  • Function Calling Success Rate: 74.01%
  • Elapsed Time: 6h 12m 6s
  • Total Tokens: 1250.18M (in 1228.76M, 0 cached / out 21.43M)

Shopping (qwen/qwen3-next-80b-a3b-instruct)

  • Analyze (actors: 3, documents: 12): tokens 1.81M (in 1.55M / out 263.2K); function calls 33 / 54 (61.1%)
  • Database (namespaces: 11, models: 100): tokens 11.77M (in 11.42M / out 349.7K); function calls 253 / 269 (94.1%)
  • Interface (operations: 560, schemas: 641): tokens 1001.00M (in 992.30M / out 8.70M); function calls 5840 / 9409 (62.1%)
  • Test (functions: 557): tokens 495.09M (in 486.86M / out 8.23M); function calls 3032 / 5141 (59.0%)
  • Realize (functions: 241, errors: 94): tokens 134.35M (in 131.08M / out 3.27M); function calls 957 / 1987 (48.2%)
  • Function Calling Success Rate: 59.99%
  • Elapsed Time: 14h 55m 1s
  • Total Tokens: 1644.03M (in 1623.21M, 99.2K cached / out 20.82M)

Optional - Just for Fun! This model is NOT required for the hackathon; it’s included purely for those curious about local LLM performance.

This lightweight, open-source model runs on laptop-level resources and is included to explore local LLM performance. It’s suitable for small apps (5–10 tables, 20 APIs) like todo lists or simple accounting tools, handling basic CRUD operations and straightforward logic. However, it struggles with complex requirements and often fails to resolve compilation errors, leading to process interruptions. This model offers a fun way to compare open-source and commercial models, highlighting their performance gap.

6. Evaluation Criteria

6.1. Requirements Analysis

  • Accuracy: Are requirements clearly understood and prioritized?
  • User Personas: Are roles and permissions logical?
  • Non-functional Needs: Are performance, security, and scalability covered?
  • Document Quality: Is it clear and detailed for development?

6.2. Database Design

  • Production-Readiness: Are table relationships logical, without issues?
  • Normalization: Is it balanced for integrity and performance?
  • Keys & Indexing: Are keys and indexes set for efficiency?
  • Details: Are naming, data types, and scalability appropriate?

6.3. API Design

  • RESTful Compliance: Are methods, URIs, and status codes correct?
  • Consistency: Are endpoints and formats unified?
  • Documentation: Are OpenAPI specs clear with examples?
  • Security: Is authentication and data protection adequate?

6.4. Test Code

  • Validation: Does it test business logic effectively?
  • Completeness: Are normal, edge, and exception cases included?
  • Quality: Are tests clear, independent, and easy to debug?
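To illustrate what "normal, edge, and exception cases" means in practice, here is a hedged sketch of the test shape reviewers should look for. The API under test (createPost) and its 100-character limit are toy stand-ins, not AutoBE output.

```typescript
// Toy API under test: a create operation with simple validation rules.
interface Post {
  title: string;
  body: string;
}

function createPost(input: Post): Post {
  if (input.title.length === 0) throw new Error("title required");
  if (input.title.length > 100) throw new Error("title too long");
  return { ...input };
}

// Normal case: a valid post is accepted.
const ok = createPost({ title: "Hello", body: "world" });
console.assert(ok.title === "Hello");

// Edge case: a title at exactly the 100-character limit is still valid.
const edge = createPost({ title: "x".repeat(100), body: "" });
console.assert(edge.title.length === 100);

// Exception case: an empty title must be rejected.
let rejected = false;
try {
  createPost({ title: "", body: "no title" });
} catch {
  rejected = true;
}
console.assert(rejected);
```

A review can score test quality by checking that every endpoint has all three kinds of cases, that each test sets up its own data (independence), and that failures point clearly at the violated rule.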

6.5. Implementation Code

  • Quality: Is it readable, modular, and SOLID-compliant?
  • Architecture: Is it extensible with clear layer separation?
  • Performance: Are queries efficient, avoiding N+1 issues?
  • Security & Types: Are vulnerabilities absent and types used well?
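The N+1 issue mentioned above can be demonstrated with an in-memory "database" that counts queries; the helpers (findAuthor, findAuthorsBatch) are hypothetical, and real Prisma code would use `include` or an `in` filter to achieve the same batching.

```typescript
// In-memory tables and a query counter to make the N+1 pattern visible.
const authors = new Map<number, string>([[1, "alice"], [2, "bob"]]);
const posts = [
  { id: 1, authorId: 1 },
  { id: 2, authorId: 2 },
  { id: 3, authorId: 1 },
];

let queryCount = 0;

// One query per author lookup.
function findAuthor(id: number): string {
  queryCount++;
  return authors.get(id)!;
}

// One query resolves the whole batch of ids.
function findAuthorsBatch(ids: number[]): Map<number, string> {
  queryCount++;
  return new Map(ids.map((id) => [id, authors.get(id)!]));
}

// N+1 shape: after fetching the posts, each post triggers its own
// author query, so cost grows with result size.
queryCount = 0;
posts.forEach((p) => findAuthor(p.authorId));
console.log(queryCount); // 3

// Batched shape: deduplicate the ids and resolve them in one query.
queryCount = 0;
const batch = findAuthorsBatch([...new Set(posts.map((p) => p.authorId))]);
console.log(queryCount); // 1
console.log(batch.get(1)); // alice
```

When reviewing generated Realize code, look for per-row lookups inside loops; a constant query count regardless of result size is the property to verify.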

6.6. Overall Review Writing Guide

  • AutoBE Assessment: Strengths, weaknesses, and suitable projects?
  • Impact: Saves time? Code quality level? Maintainable?
  • Improvements: Specific areas and priorities for enhancement.
  • Further instructions regarding the Review Writing Guide will be provided via email.

7. Prizes and Benefits

  • Grand Prize (1 person): $2,000 for the best review.
  • Excellence Award (1 person): $1,000 for the second-best review.
  • Participation Prize: $50 for all who submit detailed reviews for both models.
  • Exclusions: AI-generated, perfunctory, or plagiarized reviews.
  • Judging: By AutoBE team and experts, announced September 17, 2025.

8. Disclaimer

8.1. Beta Limitations

AutoBE is in beta and may exhibit inefficiencies or errors. Treat these as expected limitations of its current development stage rather than unexpected defects.

8.2. Code Usage

Generated code is not recommended for production use without review and a security audit. Wrtn Technologies is not liable for issues arising from its use.

8.3. Open Source

Reviews and generated code are public on GitHub Discussions. Avoid sensitive information in conversations or projects.
