Your Company's Next Senior Engineer Won't Be Human
Will AI autonomously ship production features — from spec to deployment — without human code review at major tech companies by end of 2027?
If you're a software engineer, this determines whether AI is your best tool or your replacement.
Scenarios
Current value: 46% AI-generated code, SWE-bench 80%, Boris Cherny 100% AI code at 22-27 PRs/day, Cursor $2B ARR / 12 people
S-curve position: Steep mid-curve — code generation nearly solved, autonomous shipping emerging rapidly
Still requires human review for all production code (liability, quality, cultural resistance)
Routine and internal features shipped autonomously at 5+ major tech companies; critical features still human-reviewed
Standard practice by Q3 2027 (agent frameworks mature, CI/CD integration, liability frameworks emerge)
How We'll Know
- What we measure
- Whether AI systems autonomously ship production features (spec → code → test → deploy) without human code review at companies with 1000+ engineers
- Confirmed if
- At least 3 major tech companies publicly confirm AI autonomously ships production features without human code review as standard practice
- Refuted if
- All major AI coding tools still require human review for production deployment by end 2027
- Data sources
- GitHub Copilot / Agent HQ metrics
- Company engineering blog posts
- SWE-bench / METR evaluations
- Developer surveys (Stack Overflow, JetBrains)
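The resolution rule above can be sketched in a few lines. This is only an illustration: the names `ResolutionInputs` and `resolve` are hypothetical, the "unresolved" fallback is an assumption (the page defines only the confirmed and refuted outcomes), and checking confirmation first is likewise an assumed ordering.

```python
from dataclasses import dataclass


@dataclass
class ResolutionInputs:
    # Number of major tech companies publicly confirming autonomous
    # shipping without human code review as standard practice.
    confirming_companies: int
    # True if every major AI coding tool still requires human review
    # for production deployment at the end of 2027.
    all_tools_require_review: bool


def resolve(inputs: ResolutionInputs, threshold: int = 3) -> str:
    """Apply the stated criteria; threshold=3 comes from 'Confirmed if'."""
    if inputs.confirming_companies >= threshold:
        return "confirmed"
    if inputs.all_tools_require_review:
        return "refuted"
    # Assumption: anything in between is left unresolved.
    return "unresolved"


print(resolve(ResolutionInputs(confirming_companies=3, all_tools_require_review=False)))
print(resolve(ResolutionInputs(confirming_companies=0, all_tools_require_review=True)))
print(resolve(ResolutionInputs(confirming_companies=2, all_tools_require_review=False)))
```

The middle case matters: one or two confirming companies would meet neither criterion as written.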
Evidence Trail
Evidence For
- Mar 7, 2026
GitHub Copilot writes 46% of code for active users. Claude Sonnet 4.6 at 80.8% SWE-bench Verified. METR task horizon at 14.5 hours. 57% of enterprises running multi-step agent workflows. GitHub Agent HQ runs multiple AI models on same codebase. → Probability: 35%
- Mar 7, 2026
Boris Cherny: 100% AI code, 22-27 PRs/day (practitioner proof). Cursor: $2B ARR with 12 employees (market proof). SWE-bench: 4%→80% in 26 months (benchmark proof). Dev freelance rates: -36% (price proof). Power-law: top 10% engineers are 10x productive with AI. Solo founders building complex products. Inference cost decline 200x/year makes AI coding nearly free. → Probability: 55%
Evidence Against
- Mar 7, 2026
SWE-bench measures isolated tasks, not production complexity. Enterprise legacy codebases resist AI. Code review exists for liability and quality, not just correctness. Architectural decisions, cross-team coordination remain human. No major company has publicly shipped production features without human review yet.
- Mar 8, 2026
Benchmark-to-production gap remains large: Opus 4.5 scores 80% on SWE-bench Verified but only 18% on private codebases (SWE-bench Pro). Sonar (Feb 2026): Opus 4.6 has 21% MORE issue density and 55% MORE vulnerability density than Opus 4.5 — smarter model ≠ safer code. Code review bottleneck: 21% more tasks completed but review times up 91% (Faros AI, Jan 2026). Cursor BugBot: only 35% of AI-generated fixes merge unmodified. Technical debt accumulating: 14 different DB connection patterns in one AI-generated codebase, code duplication up 4x. The quality gap has shifted from 'can it write code' to 'can you trust the code it writes.'
- Mar 13, 2026
METR Opus 4.6: 719min task horizon (50%) / 70min (80%) — 2.45x jump in ~3 months. Cursor Automations launch: event-triggered agents (Slack, Linear, GitHub, PagerDuty) shift from prompt-based to always-on autonomous coding. Cursor approaching $50B valuation. Anthropic Jobs Report: 75% programmer task coverage in practice. SWE-bench Pro at 57.5% with optimized scaffold (WarpGrep v2).
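To put the METR figure in perspective, a 2.45x jump in roughly 3 months can be converted into an implied doubling time. The sketch below assumes smooth exponential growth over the interval and treats "~3 months" as exactly 3.0, both simplifications; the function names are illustrative.

```python
import math


def doubling_time_months(growth_factor: float, elapsed_months: float) -> float:
    """Doubling time implied by a growth factor, assuming exponential growth."""
    return elapsed_months * math.log(2) / math.log(growth_factor)


def implied_previous_horizon(current_minutes: float, growth_factor: float) -> float:
    """Task horizon before the jump, back-computed from the current value."""
    return current_minutes / growth_factor


doubling = doubling_time_months(growth_factor=2.45, elapsed_months=3.0)
previous = implied_previous_horizon(current_minutes=719, growth_factor=2.45)

print(f"Implied doubling time: {doubling:.2f} months")
print(f"Implied previous 50% horizon: {previous:.0f} minutes")
```

Under those assumptions the doubling time comes out at a little over two months. It is a single data point, so it says little about whether the pace will hold.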
How Our View Evolved
- Mar 13, 2026: 55% ↑ 60%
METR Opus 4.6 at 719min/70min task horizon (2.45x jump). Cursor Automations launch (event-triggered agents). 75% programmer task coverage (Anthropic). SWE-bench Pro at 57.5% with scaffold. Cursor $50B valuation.
- Mar 8, 2026: Initial assessment: 55%
Baseline — initial published assessment
What Experts Say
Dario Amodei
CEO, Anthropic
“AI models will handle most aspects of software engineering tasks from start to finish within 6-12 months”
Andrej Karpathy
AI Researcher, former Tesla AI Director, educator
“Agentic engineering (AI agents writing 99% of code, humans as oversight) becomes the default professional workflow”
Mustafa Suleyman
CEO of Microsoft AI
“Most professional tasks involving sitting at a computer will be fully automated by AI within 12-18 months”
Boris Cherny
Head of Claude Code, Anthropic
“AI can already write 100% of production code; top engineers using AI are 10x more productive”
Cursor (Anysphere)
AI Code Editor ($2B ARR, 12 employees)
“AI-native companies can achieve billion-dollar revenue with teams of <20 people”