(I originally posted this on my MSDN blog.)
My previous post talked about how software development can’t be modeled by any process that’s significantly less complex than the development process itself. I’d like to expand on that a bit more.
I think people are attracted to modeling and detailed design documents because they’re overwhelmed by the amount of complexity they’re facing in their project, and they hope that the models will be significantly less complex than the software they’re meant to model. That may be true, but you can’t shed substantial amounts of complexity without also losing important detail and nuance, and it’s the details that have impacts on the project out of all proportion to their size.
For models to be able to completely express a program, they have to be approximately as complex as the program they’re expressing. If they’re significantly less complex then they’re not adequate to fully express a working program and you’ve still got a lot more work to do after you finish your models, and that work is likely to invalidate the model you built in pretty short order. As Bertrand Meyer famously said, “Bubbles don’t crash.”
A useful analogy might be that of compressing data. Most raw data can be compressed to some extent. But the pigeonhole principle tells us that any general-purpose lossless compression algorithm that makes at least one input file smaller will make some other input file larger. In other words, there’s a fundamental limit to the amount of lossless compression that can be applied to any data set; if that weren’t true, you could recursively compress any data set to zero bytes.
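You can see that limit directly with any off-the-shelf compressor. Here’s a small sketch using Python’s standard-library zlib: highly repetitive data shrinks enormously, while random data (which is already as “dense” as it can get) can’t shrink, and typically comes out slightly larger because the compressed stream has to carry its own format overhead.

```python
import os
import zlib

# Highly repetitive data compresses dramatically: a few dozen bytes
# of output can describe thousands of bytes of input.
repetitive = b"abc" * 10_000
compressed_rep = zlib.compress(repetitive)

# Random bytes are effectively incompressible. zlib falls back to
# storing the data nearly verbatim, plus header/checksum overhead,
# so the "compressed" output is typically a little LARGER.
random_data = os.urandom(10_000)
compressed_rand = zlib.compress(random_data)

print(f"repetitive: {len(repetitive)} -> {len(compressed_rep)} bytes")
print(f"random:     {len(random_data)} -> {len(compressed_rand)} bytes")
```

Run it a few times and the random input essentially never gets smaller; by the pigeonhole principle, the bytes saved on compressible inputs have to be paid for somewhere.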
If you close one eye and squint real hard, you could view the history of programming languages to this point as an exercise in data compression. We went from assembly to C to C++ to C#, and hundreds of other languages, and at each step we figured out how to make the languages more succinctly expressive of what we want to do. That’s great! But at some point we’re going to run into that fundamental data compression limit, where a more abstract language actually makes the amount of effort larger, not smaller. (Some would argue that Cobol managed to hit that limit a long time ago.)
I suspect that’s what happens when people try to extensively “model” software in documentation or any artifact other than code. It seems like it ought to be simpler but it turns out to be more complex than just writing the code in the first place.
Planning for Battle
That’s not to say that design artifacts are useless. They’re great for thinking in broad terms and making rough plans. Just don’t rely on them for more than they’re capable of doing. As Eisenhower said, “In preparing for battle I have always found that plans are useless, but planning is indispensable.”