TL;DR: the worst bugs involve state. Prefer functional languages, and you’re far ahead of the curve.

This weekend, I went to visit a good friend in NYC. Unfortunately, there was a lingering bug in a part of Boundless Learning’s code that was gnawing at the back of my mind. Some of our data collection was corrupt, and for the life of me, I couldn’t figure out why.

Like all epic bugs, it escalated from major issue to full-stop critical. I spent my sunny NY Saturday in front of a laptop, followed it with four fitful hours of sleep, and awoke from a nightmare finally capable of solving the issue.

And it all came down to Python’s woeful implementation of default function arguments.

In Scala, Ruby, and perhaps a hundred other languages, you can provide default values for function arguments. If the caller doesn’t supply a value for an argument, the function uses the default instead. Simple, convenient.

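In Python, however, the “default value” expression is not evaluated at call time, but once, at the time of function definition. If that default is a mutable object such as a list or a dict, every call that omits the argument shares the same object, and any mutation persists into later calls.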

That’s ridiculous & wrong.
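To make that concrete, here is a minimal sketch of the pitfall and the usual workaround. It is not the actual Boundless code; `record_event` and `record_event_fixed` are hypothetical names invented for illustration.

```python
def record_event(event, log=[]):
    # The [] default is evaluated once, when `def` runs, so every call
    # that omits `log` shares the same list object.
    log.append(event)
    return log


def record_event_fixed(event, log=None):
    # Idiomatic workaround: use None as a sentinel and build a fresh
    # list inside the body, i.e. at call time.
    if log is None:
        log = []
    log.append(event)
    return log


first = record_event("a")
second = record_event("b")
print(second)           # ['a', 'b']
print(first is second)  # True

print(record_event_fixed("a"))  # ['a']
print(record_event_fixed("b"))  # ['b']
```

Put the first version behind a multi-threaded server, and unrelated requests quietly write into one another's data.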

I’ve recommended Python to a fair number of people. In terms of semantics, syntax, general productivity, and maintainability, it far outstrips a language like PHP. However, spending your weekend solving a subtle multi-threaded data/state issue that exists because of poor language-design choices is a reminder that tools AND abstractions matter. Abstractions shape your productivity, and they shape your thinking.

At Boundless, we aren’t afraid of using some seriously good abstractions, and while part of our server infrastructure is Python-based, the other 97% is written in Scala. Beyond having sane default-argument semantics, Scala code skirts this class of problem entirely: you simply don’t use mutable data structures. When your sessions are client-side, your server stateless, and your code stateless, it frees up developers to just look at the single function they are in, and know that everything needed to understand it is laid out lexically in front of them.

Let me repeat that, and then contrast to any non-functional/state-ful language:

In a functional language, to understand the realm of possibility for how a function will evaluate, you need only look at the value bindings in scope (typically your arguments), and the other (side-effect free) functions you call. You can make very real, valid assumptions, and understand your invariants. You needn’t consider what the rest of the application is doing, because none of it is allowed by the compiler to alter the value of what is in scope. “What you see is what you get” in functional languages. 
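The same discipline can be approximated in Python, with the caveat that it is enforced at run time rather than by a compiler. A minimal sketch, assuming a hypothetical `Session` record and `with_page_view` function (neither comes from our code base):

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class Session:
    # Hypothetical record, used only for illustration.
    user_id: int
    page_views: int


def with_page_view(session: Session) -> Session:
    # Pure: the result depends only on the argument, and the argument
    # is left untouched; a new value is returned instead.
    return replace(session, page_views=session.page_views + 1)


before = Session(user_id=1, page_views=0)
after = with_page_view(before)
print(before.page_views, after.page_views)  # 0 1

try:
    before.page_views = 99
except FrozenInstanceError:
    print("mutation rejected")
```

Everything `with_page_view` can do is visible in its three lines, and nothing elsewhere in the program can change `before` behind its back.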

Contrasting that: in an imperative/state-ful language, you CANNOT make assumptions, and there are FEW invariants. You MUST take the rest of your code-base/execution model into account to understand any given line of code, because your assumptions are at the mercy of how well the system & your team-mates happen to respect them. You might not worry all the time, and if you code idiomatically, generally you’ll avoid problems, but you can never be certain, and once in a while, you’ll lose a weekend to the most subtle of issues that is ultimately caused by using mutable data structures.
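A small, hypothetical illustration of an assumption being broken from a distance (the function names are invented for this sketch): the caller believes its list still holds scores in arrival order, and a helper written by someone else quietly reorders it.

```python
def top_scores(scores, n):
    # Looks harmless, but list.sort() reorders the caller's list in place.
    scores.sort(reverse=True)
    return scores[:n]


def top_scores_pure(scores, n):
    # sorted() builds a new list and leaves its input alone.
    return sorted(scores, reverse=True)[:n]


arrival_order = [3, 9, 1]
top_scores(arrival_order, 2)
print(arrival_order)  # [9, 3, 1]: the caller's ordering is gone

arrival_order_2 = [3, 9, 1]
top_scores_pure(arrival_order_2, 2)
print(arrival_order_2)  # [3, 9, 1]
```

Nothing at the call site hints at the difference; you only find out by reading the helper.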

It can’t be stressed enough that by greatly restricting the realm of what’s allowed, you make more expressive what’s written (hence the name of this blog). The more guarantees you have (lack of mutability, adherence to known types, code invariants, etc), the more can be said about your code.

It’s late 2011, and if I deal with another state-based bug this year, it’s too soon.


  1. For those of you who would poo-poo functional/typed programming languages, I wish you well; the rest of us will be producing skyscrapers, and you can continue repairing brick huts.
  2. Yes, Scala has mutable data structures & variables, which are avoided by idiom. On balance, this is more of an issue in theory than it is in practice. Defaults matter, clearly, if the bug in question is any indication. There is always a need for state-ful programming, because sometimes it IS the best abstraction. The point of this post is that it is not the best DEFAULT abstraction for the vast majority of programming problems & day-to-day code.