Engineering · Dec 29, 2025 · 7 min read

Your build is part of your product

Nobody ships the build, but everybody pays for it, which is why neglected tooling is the most expensive thing on a team.

Most engineers can name the cost of slow software. Almost nobody on a team can name the cost of a slow build, a flaky test suite, or a deploy that takes four manual steps and a prayer. That cost is real, it compounds daily, and it's hidden because nobody invoices you for it. Your build, your test runner, your local dev setup, your CI pipeline — they never reach the customer, so they never reach the roadmap. They just quietly tax every change you make.

That's the trap. The build is part of your product, because the build is the rate at which your product can change. A feature that takes a day to write and three hours to ship is not a one-day feature. The slowest, most unreliable step in your pipeline is the actual speed of your team, and you don't get to opt out of it by ignoring it.

The tax is paid in attention, not minutes

A six-minute build doesn't cost six minutes. It costs the engineer's place in the problem. Somewhere past the two-minute mark, you tab away — Slack, email, the other PR — and when the build finishes you have to climb back into the context you just abandoned. The thirty seconds of reading and the four minutes of drifting are not the same kind of time. One is work; the other is a small eviction from your own head.

Flaky tests are worse, because they corrode judgment. The first time a test fails for no reason, you re-run it. The tenth time, you stop reading failures at all — you just hit retry. Now the suite that was supposed to catch real regressions has trained your whole team to ignore it. A test you don't trust is not a slower test. It is a deleted test that still costs money to run.

→A build slow enough to context-switch on is slow enough to lose the thread.
→A test that fails randomly teaches people to ignore failures that aren't random.
→A deploy with manual steps means the steps get skipped under pressure, which is exactly when they matter.

A test you don't trust isn't a slower test. It's a deleted test that still bills you for the privilege.
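One way to put a number on distrust is to count how often the same commit produced both a pass and a failure for a test. The sketch below assumes you can export run history as (test, commit, passed) records; the `Run` type, the `flaky_tests` function, and the sample data are hypothetical, not taken from any particular CI system.

```python
from collections import defaultdict
from typing import Iterable, NamedTuple


class Run(NamedTuple):
    test: str
    commit: str
    passed: bool


def flaky_tests(runs: Iterable[Run]) -> dict:
    """Return {test name: share of its commits with mixed outcomes}.

    A commit counts as flaky for a test when the same code produced
    both a pass and a failure. Tests with no mixed commits are omitted.
    """
    # test name -> commit -> set of outcomes seen on that commit
    outcomes = defaultdict(lambda: defaultdict(set))
    for run in runs:
        outcomes[run.test][run.commit].add(run.passed)

    rates = {}
    for test, by_commit in outcomes.items():
        mixed = sum(1 for seen in by_commit.values() if len(seen) == 2)
        if mixed:
            rates[test] = mixed / len(by_commit)
    return rates


# Hypothetical history, for illustration only.
history = [
    Run("test_login", "a1", True),
    Run("test_login", "a1", False),     # same commit, different outcome
    Run("test_login", "b2", True),
    Run("test_checkout", "a1", False),
    Run("test_checkout", "a1", False),  # fails every time: a real bug, not a flake
    Run("test_checkout", "b2", True),
]

print(flaky_tests(history))  # {'test_login': 0.5}
```

The distinction in the sample data is the useful part: `test_checkout` fails consistently on one commit, so it is a regression worth reading, while `test_login` disagrees with itself on identical code, which is the behavior that trains people to hit retry.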

It compounds, which is why it's invisible

Tooling decay never arrives as a crisis. No build goes from fast to slow overnight; it adds two seconds a sprint. No test suite goes from solid to useless in a day; it adds one flaky test a month. Each increment is too small to justify stopping for, and that is precisely the mechanism. The cost is structured so that it is never anyone's turn to fix it.
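A small sketch shows why the increments never trip anyone's alarm. Only the two-seconds-a-sprint figure comes from the paragraph above; the 360-second starting point, the 15-second tolerance, and the `first_breach` function are assumptions chosen for illustration.

```python
from typing import List, Optional


def first_breach(durations: List[int], tolerance_s: int, *, fixed_baseline: bool) -> Optional[int]:
    """Return the index of the first sample that exceeds its reference
    by more than tolerance_s seconds, or None if no sample does.

    fixed_baseline=True compares every sample with the first one;
    fixed_baseline=False compares each sample with the one before it.
    """
    for i in range(1, len(durations)):
        reference = durations[0] if fixed_baseline else durations[i - 1]
        if durations[i] - reference > tolerance_s:
            return i
    return None


# Hypothetical: a six-minute build that gains two seconds every sprint.
sprints = [360 + 2 * n for n in range(27)]

print(first_breach(sprints, 15, fixed_baseline=False))  # None
print(first_breach(sprints, 15, fixed_baseline=True))   # 8
```

Compared with last sprint, the build is never more than two seconds worse, so the check stays silent forever. Compared with a fixed baseline, the same data trips the alarm at the eighth sprint.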

I've watched a team treat a forty-minute CI run as a fact of nature. It wasn't. It was four years of "we'll deal with it later," each individual decision defensible, the sum of them a team that shipped on Thursday what it could have shipped on Tuesday. When we finally profiled it, half the time was one cache that had silently stopped working eighteen months earlier. Nobody owned it, so nobody saw it.

The reason this happens is organizational, not technical. Tooling work has no demo. You can't show a faster build to a customer, you can't put a green deploy pipeline in a release note, and you can't easily attribute next quarter's velocity to the week you spent fixing it. So it loses every prioritization fight to the feature with a visible owner and a visible date — until the slowness gets bad enough to feel like weather, at which point everyone assumes it can't be changed.

Treat it like the product it is

The fix isn't a heroic tooling sprint. It's refusing to let the build live outside the product. Put a number on it, watch the number, and treat a regression in that number as a bug — because it is one.

→Measure the loop you actually run all day: save to feedback, push to green CI, merge to deployed. Those numbers are your real velocity.
→Set a ceiling and defend it. When the build crosses it, that's a P2, not a someday.
→Give the pipeline an owner, the same way every other part of the product has one. Unowned systems rot by default.
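Measuring those loops does not need much machinery to start. The following is a minimal sketch, assuming you can wrap each loop in a timer and keep the samples somewhere; the `timed` helper, the in-memory `samples` store, and the ceiling values are hypothetical, and a real setup would persist the samples rather than hold them in a dict.

```python
import time
from contextlib import contextmanager
from statistics import median

# loop name -> list of observed durations in seconds
samples: dict = {}


@contextmanager
def timed(loop: str):
    """Record how long the wrapped block took under the given loop name."""
    start = time.perf_counter()
    try:
        yield
    finally:
        samples.setdefault(loop, []).append(time.perf_counter() - start)


def over_ceiling(ceilings: dict) -> dict:
    """Return {loop: median seconds} for loops whose median exceeds its ceiling."""
    breaches = {}
    for loop, limit in ceilings.items():
        if loop in samples:
            typical = median(samples[loop])
            if typical > limit:
                breaches[loop] = typical
    return breaches


# Stand-in for a real save-to-feedback cycle.
with timed("save_to_feedback"):
    time.sleep(0.01)

# Hypothetical ceilings, in seconds.
print(over_ceiling({"save_to_feedback": 5.0, "push_to_green_ci": 480.0}))
```

The median is used here so that one unusually slow run does not open a bug by itself; whether that is the right statistic for your team is a judgment call.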

ci-budget.yml


budgets:
  ci_total: 8m      # hard ceiling, fail the run if exceeded
  unit_tests: 90s
  flaky_rate: 0.5%  # quarantine above this, do not retry blindly

A build that breaches its time budget should fail loudly, like any other regression.
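As a sketch of what enforcing that budget could look like, the script below checks measured values against the same three limits. The budget file is this article's own illustration rather than the format of any existing tool, so the values are copied into a dict instead of being parsed from YAML; the `parse_budget` and `breaches` functions and the measured numbers are hypothetical.

```python
import re

# Same limits as the budget file above, kept as strings.
BUDGETS = {"ci_total": "8m", "unit_tests": "90s", "flaky_rate": "0.5%"}

# Multipliers: durations become seconds, percentages become fractions.
_UNITS = {"s": 1.0, "m": 60.0, "%": 0.01}


def parse_budget(text: str) -> float:
    """Convert '8m' to 480.0 seconds, '90s' to 90.0, '0.5%' to 0.005."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)([sm%])", text.strip())
    if match is None:
        raise ValueError(f"unrecognised budget value: {text!r}")
    return float(match.group(1)) * _UNITS[match.group(2)]


def breaches(measured: dict) -> list:
    """Return one message per measured value that exceeds its budget."""
    problems = []
    for name, raw in BUDGETS.items():
        value = measured.get(name)
        if value is not None and value > parse_budget(raw):
            problems.append(f"{name}: {value:g} exceeds budget {raw}")
    return problems


# Hypothetical measurements from one CI run: seconds, seconds, fraction.
measured = {"ci_total": 512.0, "unit_tests": 74.0, "flaky_rate": 0.002}
problems = breaches(measured)
for line in problems:
    print(line)  # ci_total: 512 exceeds budget 8m

# In a real CI step, a non-empty list would end with sys.exit(1)
# so the run fails instead of logging a warning nobody reads.
```

Failing the run is the design choice that matters. A warning is one more thing to tab past; a red build is a bug with an owner.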

None of this is exotic. It's the same discipline you already apply to the parts of the product users can see: you decide what good looks like, you measure it, and you don't let it quietly get worse. The only difference with the build is that the person who suffers when it rots is you — which should make it easier to justify, not harder.

Your customers will never see your build. Your team lives in it every single working hour. Decide which of those two facts should set its priority.

#Engineering #Tooling #DX

→ / AUTHOR

Ionut Dumitru

Full-stack engineer and product designer. Writes about building products where the engineering and the design are the same job.
