AI Won't Save You From Bad Process — It Will Just Accelerate It

Artificial intelligence was supposed to solve the people problem in software development. By deploying tireless, ultra-fast AI agents, organizations hoped to bypass the bottlenecks and human errors that have historically plagued complex projects. A wave of new research, however, delivers a surprising twist: even with humans removed from the equation, AI-driven projects are failing in eerily familiar ways. The culprit, it turns out, was never the people. It was always a structural force known as Batch Size Gravity.

The Unseen Force of Batch Size Gravity

In a report published on The New Stack, author Steve Fenton argues that software delivery has a form of gravity that grows with batch size [1]. The larger the chunk of work, the more it struggles against its own complexity, and the harder it becomes to integrate, test, and ship successfully. This isn't a human failing; it's closer to a law of organizational physics. Fenton's report posits that this law applies just as unforgivingly to AI agents as it does to their human counterparts.

"The reality is that software delivery has a form of gravity that increases relative to batch size," Fenton writes. "This is why swarms of AI agents fail in all the same ways teams do when you give it large and complex tasks."

This insight reframes the entire conversation around AI in software engineering. It suggests that simply throwing more AI at a project won't fix underlying process issues; in fact, it will only amplify the resulting dysfunction.
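
One way to build intuition for why larger batches are harder to ship is a toy model. The sketch below is not taken from Fenton's report. It assumes that every change in a batch can interact with every other change, and that each change independently carries a fixed chance of introducing a defect. The 5% defect rate and the function names are invented for illustration.

```python
def pairwise_interactions(batch_size: int) -> int:
    """Number of distinct pairs of changes that could conflict in one batch."""
    return batch_size * (batch_size - 1) // 2


def clean_batch_probability(batch_size: int, defect_rate: float) -> float:
    """Chance that no change in the batch introduces a defect.

    Assumes defects are independent, which real changes rarely are.
    """
    return (1.0 - defect_rate) ** batch_size


# Hypothetical defect rate per change, chosen only to make the trend visible.
DEFECT_RATE = 0.05

for size in (1, 5, 20, 50):
    pairs = pairwise_interactions(size)
    clean = clean_batch_probability(size, DEFECT_RATE)
    print(f"batch of {size:>2}: {pairs:>4} possible interactions, "
          f"{clean:.0%} chance of shipping clean")
```

Under these assumptions, a batch of 20 changes has 190 possible pairwise interactions and only about a 36% chance of containing no defect at all, while a single change ships clean 95% of the time. The numbers are illustrative, but the shape of the curve is the point: the cost of a batch grows faster than its size.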

The Experiment That Proves the Point

The theory of Batch Size Gravity is supported by a controlled experiment conducted by Jeremy McEntire, Head of Engineering at Wander. His team tasked four different AI architectures with the same software engineering goal, providing each with an identical compute budget. The results for two of them, summarized below, were stark.

AI Architecture	Tasks Completed	Budget Consumed	Outcome
Single Agent	28 out of 28	Partial	Success
Multi-Agent Pipeline	0 out of 28	Full ($50)	Failure

The multi-agent pipeline, a structure that mirrors many corporate hierarchies, produced no deployable code. It burned its entire budget on what could only be described as bureaucratic overhead: planning, reviewing, and rejecting 87% of its own work. The agents, McEntire notes, effectively held meetings and shipped nothing.

"It failed because it was subject to the same structural forces that make human bureaucracies dysfunctional," McEntire concludes. "We’ve spent decades building organizations that talk about work rather than do it. When we trained AI on the outputs of those organizations, it learned to do the same."

By removing humans entirely, McEntire showed that these failures are inherent to the system's design, not to the agents operating within it.

Why Faster Coding Doesn't Mean Faster Delivery

While AI tools like GitHub Copilot have been shown to increase individual coding speed by as much as 55%, these task-level productivity gains often fail to translate into faster end-to-end delivery [3]. An analysis by First Line Software highlights that the true bottlenecks in software delivery are not typing speed but systemic issues like review cycles, rework, testing, and integration. Because most AI tools operate outside of this core delivery pipeline, they cannot influence the system-level metrics that truly matter, such as those identified by the DevOps Research and Assessment (DORA) program.
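
The gap between task-level and end-to-end gains can be sketched with Amdahl's-law-style arithmetic: speeding up one stage only helps in proportion to that stage's share of the total. The example below is a rough illustration, not a measurement. It treats the 55% figure as a 1.55x speedup on coding time, and the 20% share of lead time spent coding is a hypothetical value, not one reported by First Line Software or DORA.

```python
def end_to_end_speedup(coding_share: float, coding_speedup: float) -> float:
    """Overall delivery speedup when only the coding stage gets faster.

    coding_share:   fraction of total lead time spent writing code (0..1)
    coding_speedup: factor by which coding alone is accelerated (e.g. 1.55)
    """
    if not 0.0 <= coding_share <= 1.0:
        raise ValueError("coding_share must be between 0 and 1")
    if coding_speedup <= 0:
        raise ValueError("coding_speedup must be positive")
    # Time outside coding (review, rework, testing, integration) is unchanged.
    new_total = (1.0 - coding_share) + coding_share / coding_speedup
    return 1.0 / new_total


# Hypothetical: coding is 20% of lead time, and AI makes it 1.55x faster.
overall = end_to_end_speedup(coding_share=0.20, coding_speedup=1.55)
print(f"end-to-end speedup: {overall:.2f}x")
```

With these assumed inputs, a 1.55x coding speedup produces only about a 1.08x end-to-end speedup, because the remaining 80% of lead time is untouched.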

The solution is not to make individuals faster, but to make the entire system smoother by integrating AI directly into delivery pipelines, not just alongside them.

The Fix Is Already Here

The answer to overcoming Batch Size Gravity is not a futuristic AI breakthrough. It is the same set of principles that the Agile and DevOps movements have championed for years: Continuous Delivery and small batch sizes. The DORA research has consistently shown that elite-performing teams are defined by their ability to deploy small changes frequently and reliably.
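
To make "small changes, frequently and reliably" measurable, a team can compute a few numbers from its own deployment log. The sketch below is a minimal illustration rather than DORA's official methodology: it derives deployment frequency and change failure rate, two of the metrics DORA tracks, plus average batch size. The `Deployment` record, its fields, and the sample log are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Deployment:
    day: date      # when the deployment reached production
    changes: int   # how many changes it bundled (its batch size)
    failed: bool   # whether it caused a failure that needed remediation


def deployments_per_week(deploys: list[Deployment], period_days: int) -> float:
    """Average number of deployments per 7 days over the observed period."""
    return len(deploys) * 7 / period_days


def change_failure_rate(deploys: list[Deployment]) -> float:
    """Fraction of deployments that caused a failure."""
    return sum(d.failed for d in deploys) / len(deploys)


def mean_batch_size(deploys: list[Deployment]) -> float:
    """Average number of changes bundled into each deployment."""
    return sum(d.changes for d in deploys) / len(deploys)


# Hypothetical two-week log, invented for illustration.
log = [
    Deployment(date(2025, 1, 6), changes=2, failed=False),
    Deployment(date(2025, 1, 8), changes=1, failed=False),
    Deployment(date(2025, 1, 10), changes=3, failed=False),
    Deployment(date(2025, 1, 13), changes=2, failed=True),
    Deployment(date(2025, 1, 15), changes=1, failed=False),
    Deployment(date(2025, 1, 17), changes=3, failed=False),
]

print(f"deployments per week: {deployments_per_week(log, period_days=14):.1f}")
print(f"change failure rate:  {change_failure_rate(log):.0%}")
print(f"mean batch size:      {mean_batch_size(log):.1f} changes")
```

Tracking these numbers before and after introducing AI agents gives a system-level view of whether delivery actually improved, which task-level coding speed alone cannot show.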

To succeed, organizations must treat AI agents not as a silver bullet, but as high-velocity workers that require well-structured, small-batch delivery pipelines. The same process discipline that enables human teams to thrive is the essential ingredient for making AI effective at scale.

The recent wave of AI project failures, with some studies reporting failure rates as high as 80 to 95%, is not an indictment of the technology itself. It is a clear signal that AI cannot fix a broken process. It only makes that process fail faster and more spectacularly. As we stand on the cusp of a new technological era, the most pressing question is not whether AI is powerful enough, but whether our organizations are disciplined enough to wield it.