Amazon's coding agents just don't quit ↗
Dec. 3, 2025
First, by learning what the agents were and weren’t good at, the team could switch from babysitting every small task to directing agents toward broad, goal-driven outcomes. Second, the velocity of our teams was tied to how many agentic tasks they could run simultaneously. Third, the longer the agents could operate on their own, the better.
When did the amount of time a coding agent can run without intervention become a priority metric? I understand how it can be useful, and that we need agents with the ability to overcome obstacles, but it feels like a lot of the labs & agent makers out there are focusing on it as its own metric, which I don't think benefits users nearly as much as it benefits those labs & agent makers. And I think this quote kind of exposes that. They have a rationale for points one and two, and then agents running for longer is just... 'better.' Feels like a setup for these agents one day taking 10 hours and 1 million tokens just to come back and say that they updated a single CSS class and now your problem is fixed.