Open-Source AI Will Break Big Tech's Grip on Intelligence
Will open-weight models consistently match proprietary frontier performance — making AI effectively free — by end of 2027?
If intelligence becomes free, every AI business model built on charging for it needs to be rethought — including the ones your retirement portfolio is betting on.
Scenarios
Current value: The MMLU gap is 0.3pp (effectively zero), and DeepSeek V3.2 costs $0.14/M tokens. The FrontierMath gap is still large (GPT-5.4 at 47.6%, open-source much lower). On ARC-AGI-2, the best open model scores ~40% vs. ~83% for the best proprietary system, a gap of roughly 43pp.
S-curve position: Mid-curve on easy benchmarks (saturated), early curve on hard benchmarks (still large gap)
Persistent 20%+ gap on hard tasks (proprietary compute advantage at frontier is too large for open-source to match)
Parity on most practical tasks, with a 10-15% gap remaining on frontier reasoning, though that remaining gap is irrelevant for 90% of use cases
Full parity by mid-2027 (Chinese investment + Llama 5 + DeepSeek V5 close the gap on hard tasks)
How We'll Know
- What we measure: Whether open-weight models match proprietary frontier models on hard benchmarks (FrontierMath, ARC-AGI-2, SWE-bench) while costing 10x+ less
- Confirmed if: The top open-weight model scores within 5% of the best proprietary model on 3+ hard benchmarks (FrontierMath, ARC-AGI-2, SWE-bench) at less than 10% of the cost
- Refuted if: Proprietary models maintain a lead of more than 15% on hard benchmarks through the end of 2027, or the gap widens
- Data sources: LMSYS Chatbot Arena rankings; FrontierMath leaderboard; ARC-AGI-2 results; SWE-bench Verified; model pricing databases (Artificial Analysis)
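The resolution criteria above can be sketched as a small check. This is only an illustration under stated assumptions: "within 5%" and ">15% lead" are read as percentage points, the refutation rule is read as a lead above 15pp on every benchmark supplied, the "gap widens" clause is left out because it needs a score history, and the names `BenchmarkResult`, `gap_pp`, and `snapshot_status` are hypothetical. It evaluates one snapshot of scores, so for the refutation rule it would be applied to end-of-2027 data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BenchmarkResult:
    """One hard benchmark; scores are percentages (hypothetical structure)."""
    name: str
    open_score: float         # best open-weight model
    proprietary_score: float  # best proprietary model

    @property
    def gap_pp(self) -> float:
        """Proprietary lead in percentage points."""
        return self.proprietary_score - self.open_score


def snapshot_status(results, cost_ratio, parity_pp=5.0, lead_pp=15.0,
                    min_benchmarks=3):
    """Classify a snapshot as 'confirmed', 'refuted', or 'open'.

    cost_ratio is open-weight cost divided by proprietary cost.
    Assumption: thresholds are percentage points, not relative percent.
    """
    at_parity = [r for r in results if r.gap_pp <= parity_pp]
    if len(at_parity) >= min_benchmarks and cost_ratio < 0.10:
        return "confirmed"
    # Assumption: refutation means the lead exceeds lead_pp on every
    # benchmark supplied; the "gap widens" clause is not modeled here.
    if results and all(r.gap_pp > lead_pp for r in results):
        return "refuted"
    return "open"


# ARC-AGI-2 figures as reported on this page (~40% open vs ~83% proprietary).
arc = BenchmarkResult("ARC-AGI-2", open_score=40.0, proprietary_score=83.0)
print(arc.gap_pp)  # 43.0
```

On the page's own numbers, the cost condition is already met (27-35x cheaper is a cost ratio of roughly 0.03 to 0.04), so under this reading the outcome depends on the hard-benchmark gaps.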
Evidence Trail
Evidence For
- Mar 9, 2026
DeepSeek V3.2: $0.14/M tokens (27-35x cheaper). The MMLU gap narrowed from 17.5pp to 0.3pp in one year. Chinese open-source models account for 30% of global AI downloads (US: 15.7%). RAND: Chinese models run at 1/6 to 1/4 of the cost. Llama 4 Scout: 10M token context (open weights). 81% of enterprises use 3+ model families, so a multi-model reality is already established. → Probability: 55%
- Mar 9, 2026
Gemini 3.1 Pro is priced at $2/$12 per M tokens (the lowest price for frontier reasoning). DeepSeek V4 is expected to have 1T parameters with 32B active, implying near-zero marginal cost. Scores on traditional benchmarks (MMLU, GSM8K) are now 90%+ for all major models. Open-source models can be run on-premise, avoiding API costs entirely. The economic case for proprietary models is eroding for all but the hardest tasks. → Probability: 60%
Evidence Against
- Mar 9, 2026
FrontierMath: GPT-5.4 at 47.6%, open-source far behind. ARC-AGI-2: the best systems reach 83% but require massive compute. Proprietary models maintain their lead on the hardest reasoning tasks. Enterprise buyers prefer proprietary models for compliance, support, and liability reasons. The 0.3pp MMLU gap is on a saturated benchmark; hard tasks show much larger gaps.
What Experts Say
Penn Wharton Budget Model
Nonpartisan economic research, University of Pennsylvania
“AI will increase US GDP by approximately 1.5% by 2035 and roughly 3% by 2055”
Mark Zuckerberg
CEO, Meta
“AI agents will write most of Meta's code in the near future”