Your Company's Next Senior Engineer Won't Be Human
Will AI autonomously ship production features — from spec to deployment — without human code review at major tech companies by end of 2027?
If you're a software engineer, the answer determines whether AI is your best tool or your replacement.
Scenarios
Current value: 46% of code AI-generated (GitHub Copilot active users), 80% on SWE-bench Verified, Boris Cherny at 100% AI code and 22-27 PRs/day, Cursor at $2B ARR with 12 employees
S-curve position: steep mid-curve; code generation is nearly solved, and autonomous shipping is emerging rapidly (a logistic sketch follows the scenarios below)
- Status quo: all production code still requires human review (liability, quality, cultural resistance)
- Partial adoption: routine and internal features ship autonomously at 5+ major tech companies, while critical features remain human-reviewed
- Full adoption: autonomous shipping is standard practice by Q3 2027 (agent frameworks mature, CI/CD integration lands, liability frameworks emerge)
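To make "steep mid-curve" concrete, here is a minimal logistic-adoption sketch. The ceiling, midpoint, and steepness below are illustrative assumptions, not values fitted to the data above; the point is only that an S-curve grows fastest near its midpoint.

```python
import math

def logistic_adoption(year, ceiling=1.0, midpoint=2027.0, steepness=1.5):
    """Share of major companies shipping AI code autonomously by `year`.

    All three parameters are illustrative assumptions, not fits.
    """
    return ceiling / (1.0 + math.exp(-steepness * (year - midpoint)))

for year in range(2025, 2030):
    print(f"{year}: {logistic_adoption(year):.0%} adoption (sketch)")
```

At the midpoint the slope peaks at ceiling * steepness / 4 per year; that maximal growth rate is what the "steep mid-curve" claim asserts about the current position.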
How We'll Know
- What we measure: whether AI systems autonomously ship production features (spec → code → test → deploy) without human code review at companies with 1,000+ engineers (a sketch of such a pipeline gate follows this list)
- Confirmed if: at least 3 major tech companies publicly confirm that AI autonomously ships production features without human code review as standard practice
- Refuted if: all major AI coding tools still require human review for production deployment by the end of 2027
- Data sources:
  - GitHub Copilot / Agent HQ metrics
  - Company engineering blog posts
  - SWE-bench / METR evaluations
  - Developer surveys (Stack Overflow, JetBrains)
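For concreteness, the criterion above describes a pipeline in which an agent-authored pull request reaches production on green checks alone, with no human approval step. Below is a minimal sketch of such a gate using GitHub's REST API; the repository, agent account name, and token variable are hypothetical placeholders, and this illustrates the policy being measured, not any company's actual setup.

```python
import os
import requests

API = "https://api.github.com"
REPO = "example-org/example-service"   # hypothetical repository
AGENT = "ai-agent[bot]"                # hypothetical agent account
HEADERS = {
    "Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}",  # placeholder token
    "Accept": "application/vnd.github+json",
}

def merge_if_green(pr_number: int) -> bool:
    """Merge an agent-authored PR once every CI check has passed.

    There is no human review step: tests and scanners are the only
    gate, which is exactly the practice the question asks about.
    """
    pr = requests.get(f"{API}/repos/{REPO}/pulls/{pr_number}",
                      headers=HEADERS).json()
    if pr["user"]["login"] != AGENT:
        return False  # policy applies only to agent-authored PRs

    sha = pr["head"]["sha"]
    checks = requests.get(f"{API}/repos/{REPO}/commits/{sha}/check-runs",
                          headers=HEADERS).json()["check_runs"]
    if not checks or any(c["conclusion"] != "success" for c in checks):
        return False  # any failed or still-pending check blocks the merge

    resp = requests.put(f"{API}/repos/{REPO}/pulls/{pr_number}/merge",
                        headers=HEADERS, json={"merge_method": "squash"})
    return resp.status_code == 200
```

The question resolves on whether gates like this (plus automated deployment) become standard practice at major companies, not on whether they are technically possible today.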
Evidence Trail
Evidence For
- Mar 7, 2026
GitHub Copilot writes 46% of code for active users. Claude Sonnet 4.6 scores 80.8% on SWE-bench Verified. METR task horizon reaches 14.5 hours. 57% of enterprises run multi-step agent workflows. GitHub Agent HQ runs multiple AI models on the same codebase. → Probability: 35%
- Mar 7, 2026
Boris Cherny: 100% AI code at 22-27 PRs/day (practitioner proof). Cursor: $2B ARR with 12 employees (market proof). SWE-bench: 4% → 80% in 26 months (benchmark proof). Developer freelance rates: down 36% (price proof). Power-law distribution: the top 10% of engineers are 10x as productive with AI. Solo founders are building complex products. Inference costs are declining roughly 200x per year, making AI coding nearly free (see the worked projection below). → Probability: 55%
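As a worked projection of that last claim, assuming the cited 200x/year decline simply continues (an extrapolation, not a guarantee):

```python
# Cost of a coding task whose inference costs $1.00 today,
# if the cited ~200x/year price decline holds (assumption).
cost = 1.00
for year in range(1, 4):
    cost /= 200
    print(f"Year {year}: ${cost:.2e}")
# Year 1: $5.00e-03  (half a cent)
# Year 2: $2.50e-05
# Year 3: $1.25e-07
```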
Evidence Against
- Mar 7, 2026
SWE-bench measures isolated tasks, not production complexity. Enterprise legacy codebases resist AI. Code review exists for liability and quality, not just correctness. Architectural decisions and cross-team coordination remain human. No major company has yet publicly shipped production features without human review.
- Mar 8, 2026
The benchmark-to-production gap remains large: Opus 4.5 scores 80% on SWE-bench Verified but only 18% on private codebases (SWE-bench Pro). Sonar (Feb 2026) finds Opus 4.6 has 21% higher issue density and 55% higher vulnerability density than Opus 4.5; a smarter model is not a safer coder. Code review is now the bottleneck: 21% more tasks completed, but review times up 91% (Faros AI, Jan 2026). Cursor BugBot: only 35% of AI-generated fixes merge unmodified. Technical debt is accumulating: one AI-generated codebase contained 14 different database connection patterns, and code duplication is up 4x. The quality gap has shifted from 'can it write code' to 'can you trust the code it writes.'
What Experts Say
Dario Amodei
CEO, Anthropic
“AI models will handle most aspects of software engineering tasks from start to finish within 6-12 months”
Andrej Karpathy
AI Researcher, former Tesla AI Director, educator
“Agentic engineering (AI agents writing 99% of code, humans as oversight) becomes the default professional workflow”
Mustafa Suleyman
CEO of Microsoft AI
“Most professional tasks involving sitting at a computer will be fully automated by AI within 12-18 months”
Boris Cherny
Head of Claude Code, Anthropic
“AI can already write 100% of production code; top engineers using AI are 10x more productive”
Cursor (Anysphere)
AI Code Editor ($2B ARR, 12 employees)
“AI-native companies can achieve billion-dollar revenue with teams of <20 people”