Financial analogies for software

[2022-12-10 Sat] diatone.net

Earlier this year I reread The Mythical Man-Month.

The namesake concept of the book, the mythical man-month, is derived from an understanding of the nature of software project estimation, and how it goes awry. There are four simple tenets to this understanding.

First: software is an extension of the human mind. From Brooks:

"The programmer, like the poet, works only slightly removed from pure thought-stuff. He builds his castles in the air, from air, creating by exertion of the imagination. Few media of creation are so flexible, so easy to polish and rework, so readily capable of realizing grand conceptual structures."

This is foundational. The fundamental unit of software is thinking. In a manner of speaking, and this is the first financial analogy, thoughts are the currency of software. It takes thoughts to arrange information and communication structures, and not all thoughts are alike. Clarity is key. Or, incidental clarity, at least. Contrast this with bricklaying: it's physically intuitive how long it takes to lay bricks; the constraints are physically visible. But with software, they simply can't be seen, so unless you want to pay your way through a project with weak currency, you have to force yourself to achieve clarity. This is a continuous process.

But it goes further. The foundational primitives of software live in the same realm as metaphors, jokes, and that cool thing you remembered in the shower yesterday. So in principle they're easy to rearrange. But it's a double-edged sword. Because software is an extension of the mind, and it's so tantalisingly simple to rearrange your thoughts, it seems it should be trivial to rearrange software too. But that ignores the fog of software development: complexity.

When a skyscraper is built, the opposing forces constraining its development are physics and economics. It's physically intuitive to understand what projects are viable, and what projects are obviously not. The tension between these opposing forces gives clarity to a project, and actually makes it easier to proceed. But software (humour me and let's handwave away the laws of physics for the majority of projects that aren't doing cool, hard tech stuff) rarely has physics as an opposing force to economics. So to guide our estimation, we lean on complexity.

Complexity is the glue that binds work together. Complexity is yak shaving. Complexity is the groan you let out when you realise your changes affect three other teams, and you need to loop them in. Complexity is when you cross an interface boundary. Complexity is accessibility considerations, too. Complexity is the drag force on your brain when you're trying to understand how to be tax compliant in your largest markets, without sacrificing conceptual integrity and tight abstraction in your billing system. Complexity is all the things you didn't expect to happen, all happening at once, in a dynamic equilibrium you never could have predicted.

To illustrate, consider a task in a project as a random variable \(T\) that takes the value \(t = 1\) if the task is completed within its estimated length of time, which happens with probability \(p = 0.95\), and \(0\) otherwise:

\begin{equation} E[T] = tp = 1 \times 0.95 \end{equation}

That other \(0.05\) represents all the things you weren't expecting to appear when you were doing \(T\). The compiler threw you in the deep end. Your computer ran out of battery. Your coworker, the only one who can explain the system you need to understand, yeah that one, they got sick. You found an edge case on prod that didn't match your assumptions about the system. You didn't find a way to hack around the existing hack that was put in place when leadership changed direction on a dime two years back, and now you need to get creative. Your kid got sick, and parenting comes first. An incident happened, and you spent your day being sucked into firefighting, and you will spend the next day reviewing the incident to try to avoid it ever happening again. It's performance review time. It's team celebration time. tl;dr: shit happens.

Here's the thing: a project is a composition of tasks. If the tasks are independent, the chance that all \(N\) of them land on time is the product:

\begin{equation} E[P] = E[T_1] \times E[T_2] \times \ldots \times E[T_N] = 0.95^N \end{equation}

So the more work required, the lower the overall chance of completion on time. Earlier I said that complexity is the fog of software development. The hard truth is that once a project becomes sufficiently complex, \(N\) itself becomes a stochastic thing. Unlike that skyscraper, software estimation has no clear ceiling that can be analytically deduced. Instead, your estimate is a thing with a horizon that drifts off into infinity if you're not careful.
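To put rough numbers on that decay, here is a minimal Python sketch. It assumes every task is independent and has the same 0.95 chance of landing on time, which is the simplification used in the product above; the function names are made up for illustration.

```python
def on_time_probability(p: float, n: int) -> float:
    """Chance that all n independent tasks finish on time,
    when each one does so with probability p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability")
    if n < 0:
        raise ValueError("n must be non-negative")
    return p ** n


def tasks_until_below(p: float, threshold: float) -> int:
    """Smallest number of tasks at which the on-time chance
    drops below the given threshold."""
    if not 0.0 < p < 1.0 or not 0.0 < threshold <= 1.0:
        raise ValueError("need 0 < p < 1 and 0 < threshold <= 1")
    n = 0
    while on_time_probability(p, n) >= threshold:
        n += 1
    return n


if __name__ == "__main__":
    # Print the on-time chance for a few project sizes.
    for n in (1, 10, 20, 50):
        print(f"{n:>3} tasks: {on_time_probability(0.95, n):.3f}")
    # Project size at which "on time" becomes worse than a coin flip.
    print(tasks_until_below(0.95, 0.5))
```

Under these assumptions, a project of only 14 such tasks is already less likely than a coin flip to finish on time. Real tasks are neither independent nor equally risky, so treat this as an illustration of the shape of the curve rather than an estimate.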

Finance and mathematics have tools to deal with horizon estimates. But they don't have any tools to deal with the basic parameter of the horizon: software complexity. This is the other massive component behind why estimating projects is so challenging, and is the core of the second financial analogy, technical debt.

Also, this is a big reason why as a field we lean on artisanal expertise and context to estimate software projects. No context? Expect schedule slippage; the map is not the territory; etc. This leads us to the third financial analogy: context as free cash flow. An abundance of context makes paying down technical debt (tackling complexity) much easier. And, paired with a high-value currency (bear with me), namely clarity of thought, you're in a much stronger position than when you have neither context nor clarity of thought.

But there is one more analogy I'd like to share with you.

Some fields of economic activity have undifferentiated labour, or close to it. For example: anyone with arms can place a brick. Maybe one is stronger, or faster, or slightly more accurate at bricklaying. But there aren't any fundamental barriers to one person placing a brick any more efficiently than another beyond tragic accidents. Because of this, it's parallelisable. Cost goes up linearly with labour, but time to completion goes down.

But software is about thoughts, complexity, and context. These are fundamental barriers to the interchangeability of labour. If the currency of software development is thinking, and the debt is complexity, and the free cash flow is an abundance of context, then the operational expense is communication overhead. And there are certainly barriers to communication, e.g. language. From Brooks:

"Since software construction is inherently a systems effort — an exercise in complex interrelationships — communication effort is great, and it quickly dominates the decrease in individual task time brought about by partitioning. Adding men lengthens, not shortens, the schedule."

If your project needs a lot of communication, your opex is running high. Almost by definition, you are spending time and energy trying not to make dumb mistakes, instead of building great software. So if you can, find a way to build the same thing, with less opex. Conway's Law corroborates this: when an organisation builds software along its de facto comms lines, unnecessary coordination friction is minimised, and in a manner of speaking the org can build with an optimal opex rate. Please note that this isn't a scalar thing; the nature of the communication matters too. Blindly cutting away comms probably does you more harm than good.

Because of this dynamic (differentiated labour) a project isn't easy to parallelise. In fact, under certain conditions, a project is impossible to parallelise. Cost goes up with labour, but time to completion might go down or it might go up. Which way it goes, and how far per unit labour, is a question of thoughts, complexity, context, and communication overhead.
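A toy model makes the "might go down, might go up" behaviour concrete. The pairwise channel count \(n(n-1)/2\) is the figure Brooks gives for work where every part must coordinate with every other part. Everything else in this Python sketch is an assumption made for illustration: the additive cost model, the function names, and the numbers (12 months of work, 0.1 months of schedule lost per communicating pair).

```python
def communication_pairs(n: int) -> int:
    """Number of distinct pairs among n people: n(n-1)/2."""
    return n * (n - 1) // 2


def schedule_months(work_months: float, n: int, overhead_per_pair: float) -> float:
    """Hypothetical schedule: perfectly divisible work split across n people,
    plus a fixed communication cost for every pair of people."""
    if n < 1:
        raise ValueError("need at least one person")
    return work_months / n + overhead_per_pair * communication_pairs(n)


def best_team_size(work_months: float, overhead_per_pair: float, max_people: int) -> int:
    """Team size (1..max_people) with the shortest hypothetical schedule."""
    return min(
        range(1, max_people + 1),
        key=lambda n: schedule_months(work_months, n, overhead_per_pair),
    )


if __name__ == "__main__":
    # Print the modelled schedule for each team size from 1 to 10.
    for n in range(1, 11):
        print(f"{n:>2} people: {schedule_months(12.0, n, 0.1):.2f} months")
    print("shortest schedule at", best_team_size(12.0, 0.1, 10), "people")
```

With zero overhead per pair this reduces to the bricklaying case: every extra person shortens the schedule. With any positive overhead the curve bottoms out and then climbs (at five people for the made-up numbers here), which is the shape Brooks describes when he says adding people lengthens the schedule.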

Let me repeat: software project timelines are informed by:

  1. thoughts — currency — roughly: what is the project?
  2. complexity — debt — how puzzling is the project?
  3. context — cash — what is needed to navigate the project?
  4. comms overhead — opex — how much toil is spent discussing the project, versus building*?

* "building" here is assumed to be "… the right thing": we're discussing efficiency, not effectiveness

This is how Brooks gets the title of the book. The concept of a "man-month" is at best useless in software estimation. At worst, it's flat out wrong — it's mythical.