I’m back from Citcon, so here are a few more notes.
Things people have shown
The worst build I ever saw…
We had a hallway discussion about some of the difficult build environments we’ve worked on over the years. A bad build can be really unpleasant to work with and a blocker to progress. One project I worked on burned out three developers in a row trying to get a messy build under control.
The discussion reminded me of something I had always half-known but only properly figured out recently: a complicated build is often a symptom of design weaknesses. So when I’m thinking about adding another little tweak to the build to fix a problem, I should first take a look at the code to see if there’s a root cause that I should address instead. For me, the classic example of fixing the wrong problem is a build that changes the code to set parameters, which means I need to build separate artefacts for each configuration. Usually this requires lots of copying stuff around, which takes time and makes it harder to track what was deployed where. The real answer is to have clean artefacts that can be deployed anywhere and to separate out the per-environment settings.
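To make the "clean artefact" idea concrete, here is a minimal sketch in Python of one way to do it: the same code ships everywhere and picks up its per-environment values at start-up. The names `Settings`, `load_settings`, `APP_DB_URL` and `APP_LOG_LEVEL` are made up for illustration, and reading from environment variables is just one option among several (a config file outside the artefact would do as well).

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Per-environment values, kept out of the built artefact (hypothetical fields)."""
    db_url: str
    log_level: str


def load_settings(env=None):
    """Read settings at start-up instead of having the build rewrite the code.

    `env` defaults to the process environment; passing a dict makes it easy to test.
    The variable names and defaults here are assumptions, not a convention.
    """
    env = os.environ if env is None else env
    return Settings(
        db_url=env.get("APP_DB_URL", "sqlite:///local.db"),
        log_level=env.get("APP_LOG_LEVEL", "INFO"),
    )
```

With something like this, the build produces one artefact, and moving it from test to production means changing the environment it runs in, not rebuilding it.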
Concurrent builds
As often happens, the most interesting snippet for me was right at the end. Jeffrey Fredrick talked about how his group has an optimistic, rather than pessimistic, approach to running multiple builds. They run all their builds in parallel, rather than having a pipeline of increasingly complicated tests, and people can check in provided they pass the fast check-in build that catches the obvious errors. The corollary is that people can check in even when there are broken secondary builds, which is a bit shocking to the hard core. Usually, any failures settle down as check-ins ease off towards the end of the day.
The idea is to get feedback as soon as possible, and to avoid the problem that some teams have where it’s hard to get a check-in window because it takes too long to confirm the last one. Of course, they have a culture that makes this work: they’re doing shrink-wrap so their release cycle is longer, they have enough hardware to run in parallel, and I assume that people have the initiative to pick up failed builds and fix them.
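As I understood it, the difference between the two approaches comes down to which builds are allowed to block a check-in. Here is a rough sketch in Python of that rule. The function names, the `Status` type and the example build names are my own invention, not anything from Jeffrey's setup, and the "pessimistic" version is my reading of the stricter alternative, where everything has to be green first.

```python
from enum import Enum


class Status(Enum):
    PASSING = "passing"
    FAILING = "failing"


def can_check_in_optimistic(fast_build, secondary_builds):
    """Only the fast check-in build gates commits; secondary builds are ignored."""
    return fast_build is Status.PASSING


def can_check_in_pessimistic(fast_build, secondary_builds):
    """Stricter rule (my assumption): every build must be green before a check-in."""
    return fast_build is Status.PASSING and all(
        status is Status.PASSING for status in secondary_builds.values()
    )


# Hypothetical state: the fast build is green, one slower build is broken.
secondary = {"integration": Status.FAILING, "performance": Status.PASSING}
optimistic = can_check_in_optimistic(Status.PASSING, secondary)
pessimistic = can_check_in_pessimistic(Status.PASSING, secondary)
```

In that state the optimistic rule lets the check-in through and the pessimistic one blocks it, which is exactly the trade-off: faster feedback and no queue for a check-in window, at the cost of relying on people to fix the broken secondary builds.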