Altbot Magic Quadrant

LLM Intelligence & Value Assessment — Powered by AWG Singularity Framework

Model Comparison

Model	Provider	Capability	Efficiency	Vending-Bench (final balance)	Cost / 1M tok (input/output, USD)	Altbot Role	Status
🤖 Claude Opus 4.6	Anthropic	95	55	$8,017	$15.00/$75.00	Heavy reasoning, planning	ACTIVE
🎵 Claude Sonnet 4.6	Anthropic	88	75	$6,200	$3.00/$15.00	Balanced coding/reasoning	ACTIVE
📝 Claude Haiku 4.5	Anthropic	72	90	N/A	$0.80/$4.00	Fast lightweight tasks	ACTIVE
⚡ Gemini 2.0 Flash	Google	82	92	$4,800	$0.10/$0.40	Arena agents, swarm backbone	ACTIVE
💡 Gemini 3.1 Flash-Lite	Google	68	97	N/A	$0.025/$0.10	High-volume cheap calls	ACTIVE
🔬 Gemini 1.5 Pro	Google	85	60	$5,100	$1.25/$5.00	Long context analysis	ACTIVE
🌐 GPT-5.4	OpenAI	90	50	$5,800	$10.00/$30.00	Reference only	REFERENCE
🧪 Grok 4.20	xAI	86	70	$5,500	$3.00/$15.00	Reference only	REFERENCE
🏠 DeepSeek-R1-14B	DeepSeek	65	95	N/A	Free (local)	Local Jetson inference	ACTIVE
🌙 Kimi K2.5	Moonshot	70	78	N/A	$1.00/$4.00	Agent swarm (future)	REFERENCE
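
The Capability and Efficiency columns are the two axes of the quadrant. The page does not state where the quadrant boundaries fall, so the following is only a minimal sketch that assumes a cut-off of 75 on both axes. The `THRESHOLD` value, the quadrant labels, and the function names are illustrative assumptions, not Altbot's actual placement logic.

```python
# Capability and Efficiency scores copied from the Model Comparison table.
SCORES = {
    "Claude Opus 4.6": (95, 55),
    "Claude Sonnet 4.6": (88, 75),
    "Claude Haiku 4.5": (72, 90),
    "Gemini 2.0 Flash": (82, 92),
    "Gemini 3.1 Flash-Lite": (68, 97),
    "Gemini 1.5 Pro": (85, 60),
    "GPT-5.4": (90, 50),
    "Grok 4.20": (86, 70),
    "DeepSeek-R1-14B": (65, 95),
    "Kimi K2.5": (70, 78),
}

# Assumption: the page gives no boundary, so 75 is a hypothetical cut-off.
THRESHOLD = 75


def quadrant(capability, efficiency, threshold=THRESHOLD):
    """Return a quadrant label for one (capability, efficiency) pair."""
    high_cap = capability >= threshold
    high_eff = efficiency >= threshold
    if high_cap and high_eff:
        return "high capability, high efficiency"
    if high_cap:
        return "high capability, low efficiency"
    if high_eff:
        return "low capability, high efficiency"
    return "low capability, low efficiency"


def place_all(scores=SCORES):
    """Map every model name to its quadrant label."""
    return {name: quadrant(cap, eff) for name, (cap, eff) in scores.items()}


if __name__ == "__main__":
    for name, label in place_all().items():
        print(f"{name}: {label}")
```

Under this assumed cut-off, a model that sits exactly on the boundary (such as an Efficiency score of 75) counts as "high"; a different threshold or a strict comparison would move borderline models.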

Methodology & Sources

🏪 Vending-Bench 2

Andon Labs benchmark simulating a vending machine business over 382 days. Score = final bank balance. Tests operational reasoning, accounting, inventory management, and long-horizon planning.

🚀 YC-Bench

Simulated Y Combinator startup run for 1 year. Score = company valuation at exit. Evaluates strategic thinking, fundraising, hiring, product-market fit, and pivot decisions.

🏢 TheAgentCompany

175 real-world office tasks spanning coding, email drafting, HR processes, and sales workflows. Measures practical agentic capability in corporate environments.

🎮 BALROG

Agentic reasoning benchmark across challenging video games. Tests spatial reasoning, long-term strategy, exploration, and adaptive decision-making under uncertainty.

AWG Singularity Alignment

Altbot uses the AWG Singularity Test framework (Dr. Alexander Wissner-Gross) to evaluate models across six dimensions: Maturation Level, Targeting System, Positive-Sum, Composability, Abundance Flywheel, and Compute-Bound Path. Models that score higher on agentic benchmarks tend to align with the framework's emphasis on compute-bound problem solving.

Transparency Note

This quadrant reflects Altbot's real production usage as of April 2026. Models with REFERENCE status are included for competitive context but are not actively deployed in the Altbot swarm. Pricing reflects public list rates per 1M tokens; actual costs may vary with caching, batching, and volume discounts.
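
As a rough illustration of how the list rates translate into spend, the sketch below estimates the cost of a workload from a price cell in the table. It assumes each cell reads as input price / output price per 1M tokens, and it ignores caching, batching, and volume discounts. The helper names `parse_price` and `estimate_cost` are hypothetical.

```python
def parse_price(cell):
    """Parse a table price cell into (input_usd, output_usd) per 1M tokens.

    Assumption: cells read "input/output", e.g. "$3.00/$15.00".
    "Free (local)" is treated as zero marginal token cost.
    """
    text = cell.strip()
    if text.lower().startswith("free"):
        return (0.0, 0.0)
    input_part, output_part = text.split("/")
    return (
        float(input_part.strip().lstrip("$")),
        float(output_part.strip().lstrip("$")),
    )


def estimate_cost(cell, input_tokens, output_tokens):
    """Estimate list-rate cost in USD for a given token volume."""
    input_rate, output_rate = parse_price(cell)
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


if __name__ == "__main__":
    # Example workload: 2M input tokens and 500K output tokens.
    for model, cell in [
        ("Claude Sonnet 4.6", "$3.00/$15.00"),
        ("Gemini 2.0 Flash", "$0.10/$0.40"),
        ("DeepSeek-R1-14B", "Free (local)"),
    ]:
        print(f"{model}: ${estimate_cost(cell, 2_000_000, 500_000):.2f}")
```

The local model shows as zero here only because the table lists no per-token price for it; hardware and power costs are outside this estimate.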