Training Data Drive
20T tokens
from real work.
Toy benchmarks are not enough. To train useful coding agents, we need real sessions: tasks, mistakes, tool calls, edits, reviews, dead ends, fixes, and the human intent around them.
Why sessions matter
- They contain full problem-solving trajectories, not isolated snippets.Trajectory
- They show how agents use files, terminals, browsers, tests, and git.Tool use
- They preserve failures and recoveries, which are the parts models need most.Repair