Writing Unit Tests That People Can Read

(I originally posted this on my MSDN blog.)

There’s a great quote from Refactoring: Improving the Design Of Existing Code:

“Any fool can write code that a computer can understand. Good programmers write code that humans can understand.” — Martin Fowler

Recently, I’ve discovered that it applies to more than just code. It also applies to unit tests.

I’ve been writing unit tests and using TDD for a while now, and while it worked OK, I wasn’t really happy with the aesthetics of my unit tests. They were technically correct but were difficult to write correctly the first time, even more difficult for me to revisit and modify later, and nigh impossible for anyone not already fluent in the quirky idioms to read and understand. That’s not a great recipe for success.

I went searching for ways to make my unit tests easier to understand and discovered three techniques that improve things quite a bit:

Structuring tests in the Behavior-Driven Development style
Using the new Arrange-Act-Assert syntax in Rhino Mocks 3.5
Using specification extension methods à la SpecUnit.net

The feedback from my team members is that unit tests written in the new style are significantly easier to read, understand, and modify and they do a better job of documenting the intended behavior of the code.

Behavior-Driven Development

Behavior-driven development, or BDD, can be hard to grasp if you just do web searches and read the literature. At least, it was for me. I kept getting lost in what seemed to be wandering philosophical debates about “executable specifications” and customer-oriented languages. It didn’t help that there are apparently two flavors of BDD, each with their own esoterica.

I finally found an excellent article in CoDe Magazine, written by Scott Bellware, that brought it down to the practical level of improving the way unit tests are written, with code samples that made sense to me. I’m sure there’s a lot of subtle philosophy in the article that went over my head (I’m a barbarian, sorry), but I immediately seized on the Context/Specification format for my tests. (BDD adherents prefer to call them “specs”, for very good reasons, but to avoid confusion my team has continued to call them tests for now.)

I won’t give a thorough definition of Context/Specification or of the terms it uses here; you can refer to Bellware’s article for that. However, the elements of the Context/Specification approach I found most helpful are:

Use language that focuses on behaviors, particularly user-oriented behaviors, rather than implementation details.
Emphasize the distinction between the initial conditions in your tests, the actions that your tests take to change those initial conditions, and the expectations you have about the results.
Group your tests together according to context.

Focusing on behaviors rather than implementation details helps to focus my mind on the “design” aspect of TDD rather than jumping ahead to making assumptions about how I’m going to implement it. It often leads me to a design that better reflects the domain in which I’m working.

Separating the setup, action, and observation parts of my tests makes it much easier to scan them quickly and find the particular test I’m interested in at the moment. Or if I’m looking at unfamiliar code for the first time, it’s easy to get a high-level look at what the class’s responsibilities are and what circumstances it’s equipped to handle without getting bogged down in test implementation details.

Grouping tests according to context helps me to think carefully about what I need to consider and makes it easier to see when there are behaviors that I’ve overlooked.

I don’t doubt that executable specifications and ubiquitous language and all that stuff have a lot of value, and I’m sure I’ll learn more about those concepts in the future, but for now I’m content to use the BDD idioms merely as a better way to drive the design of my code.

Arrange-Act-Assert Syntax

I settled on Oren Eini’s Rhino Mocks pretty early on for my interface mocking needs and I was relatively happy with it. The only problem was that the record-playback pattern was counter-intuitive and was difficult to use in a clearly-understandable way. Fortunately, version 3.5 was released a few months ago with a new arrange-act-assert syntax that makes tests much easier to understand.

With the AAA syntax, there’s a much better separation between the code that sets up the mocks and the code that verifies your expectations. The flow reads naturally from beginning to end which is a huge help.

One minor drawback to the new AAA syntax is its reliance on lambda expressions, which can be somewhat daunting for people who aren’t familiar with them. However, the syntax makes sense once you understand what the lambdas mean, and familiarity with lambdas is an important skill to build in any case, so it’s not wasted effort.
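
To show how those lambdas read, here is a minimal sketch of an AAA-style test. The IFolder interface, the FolderPreparer class, and the test names are hypothetical, invented only for this illustration and not taken from the project described in this post. The Rhino Mocks calls it relies on (MockRepository.GenerateStub, Stub, Return, AssertWasCalled, AssertWasNotCalled) are from the 3.5 AAA syntax.

```csharp
using Microsoft.VisualStudio.TestTools.UnitTesting;
using Rhino.Mocks;

// Hypothetical interface, invented for this sketch.
public interface IFolder
{
    bool Exists { get; }
    void Create();
}

// Hypothetical class under test: creates the folder only if it is missing.
public class FolderPreparer
{
    public void Prepare(IFolder folder)
    {
        if (!folder.Exists)
        {
            folder.Create();
        }
    }
}

[TestClass]
public class FolderPreparerTests
{
    [TestMethod]
    public void creates_the_folder_when_it_does_not_exist()
    {
        // Arrange: "x => x.Exists" names the member to configure on the stub.
        var folder = MockRepository.GenerateStub<IFolder>();
        folder.Stub(x => x.Exists).Return(false);

        // Act
        new FolderPreparer().Prepare(folder);

        // Assert: "x => x.Create()" names the call we expect to have happened.
        folder.AssertWasCalled(x => x.Create());
    }

    [TestMethod]
    public void leaves_an_existing_folder_alone()
    {
        // Arrange
        var folder = MockRepository.GenerateStub<IFolder>();
        folder.Stub(x => x.Exists).Return(true);

        // Act
        new FolderPreparer().Prepare(folder);

        // Assert
        folder.AssertWasNotCalled(x => x.Create());
    }
}
```

In both places the lambda is never executed as ordinary code against a real object; it is only a way of pointing at a member of the stub, which is why the same "x => ..." shape shows up in both the arrange and the assert steps.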

BDD and AAA Example

Let’s try out an example. In one of my projects, I have a class that watches a folder on disk for new files to be created. When a new file appears, the upload file watcher will make sure it’s one we recognize and are interested in, and if so, will invoke an action.

In this interaction-based test, I was trying to express the idea that if you ask it to watch a folder that doesn’t exist, it should create it for you. This is the test as I originally wrote it:

[TestMethod()]
public void CreatesFolderIfNonexistent()
{
    var mockTimer = this.mocks.Stub();
    var mockPath = this.mocks.DynamicMock();
    var stubFileSystemWatcher = new FileSystemWatcherStub();
    var mockLogger = this.mocks.Stub();
    using (this.mocks.Record())
    {
        SetupResult.For(mockPath.GetFiles())
            .Return(new List<IFileInfo>());
        SetupResult.For(mockPath.Exists).Return(false);
        // This recorded call is the expectation; it is verified when the playback block ends.
        mockPath.Create();
    }
    using (this.mocks.Playback())
    {
        var watcher = new UploadFileWatcher(mockPath, mockLogger, stubFileSystemWatcher, mockTimer);
        watcher.Start();
    }
}

I was already trying to orient my tests around class behaviors, but it’s difficult to absorb all the implications of this test name in a quick scan of the code, especially when it’s mixed in with a whole bunch of equally-terse names.

The record-playback mocking syntax is all jumbled up, with setup and expectations mixed together and the actual action coming last. There’s no clearly-labeled expectation at all – you just have to know that the call to Create() inside the record block is the expectation.

Now here’s the test after I rewrote it to use both the Context/Specification style and the Rhino Mocks AAA syntax:

[TestClass]
public class when_the_watcher_is_started_and_the_watch_folder_does_not_exist : UploadFileWatcherContext
{
    protected override void Context()
    {
        base.Context();
        this.stubPathToWatch.Stub(x => x.Exists).Return(false);
        this.stubPathToWatch.Stub(x => x.GetFiles()).Return(new List<IFileInfo>());
    }
    protected override void BecauseOf()
    {
        this.watcher.Start(this.stubUploadFileSource);
    }
    [TestMethod]
    public void should_create_the_watched_folder()
    {
        this.stubPathToWatch.AssertWasCalled(x => x.Create());
    }
    [TestMethod]
    public void should_start_watching_the_watch_folder()
    {
        this.stubFileSystemWatcher.AssertWasCalled(x => x.StartWatchingFolderForNewFiles(Arg.Is(this.stubPathToWatch), Arg<Action<IFileInfo>>.Is.Anything));
    }
}
 

(Sorry about the line wrapping: I’m still figuring out the best way to include code snippets.)

The first thing to notice is that there’s an entire test class defined with a really verbose name. The test class represents a particular set of circumstances that we want to think about. The class name is written out in proper English so it’s very explicit, eliminates guesswork, and is easy to scan.

Next, there is a method called Context() devoted strictly to setting up the initial conditions for this test. Well, actually two of them: there’s now a base class called UploadFileWatcherContext that holds the common setup code used by all tests. It just creates the bare stubbed interfaces that we need and injects them into the class under test – I don’t do anything fancy in the base class. In this context, we set up a couple of behaviors on the stubbed interfaces.

There’s also a method called BecauseOf() devoted strictly to performing the action that causes the results we’re going to look for.

Finally, there are two actual tests that verify our expectations. As with the context class, these tests are named in a verbose style that a) is explicit and easy to scan and b) uses behavior-oriented language and avoids implementation jargon. Each test clearly verifies that a certain method was invoked on one of our stubbed interfaces, which is how we determine whether the expected behavior occurred.

Think of it as a state machine. Context() defines state A, BecauseOf() defines the transition, and the tests collectively determine whether we arrived at state B.
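
To make that flow concrete, here is a minimal sketch of how a Context()/BecauseOf() base class could be wired into the Visual Studio test framework. It is an assumption about one possible arrangement, not the actual glue code from this project: the ContextSpecification class name, the EstablishContextAndAct method, and the stack example are all hypothetical. It relies on the framework creating a fresh test class instance and running the [TestInitialize] method before every test method.

```csharp
using System.Collections.Generic;
using Microsoft.VisualStudio.TestTools.UnitTesting;

// Hypothetical base class: establishes the context, then performs the
// action, before each test method runs.
public abstract class ContextSpecification
{
    [TestInitialize]
    public void EstablishContextAndAct()
    {
        Context();    // state A: the initial conditions
        BecauseOf();  // the transition: the action under test
    }

    protected virtual void Context() { }

    protected virtual void BecauseOf() { }
}

// Hypothetical example context, using a plain Stack<string> as the subject.
[TestClass]
public class when_an_item_is_pushed_onto_an_empty_stack : ContextSpecification
{
    private Stack<string> stack;

    protected override void Context()
    {
        this.stack = new Stack<string>();
    }

    protected override void BecauseOf()
    {
        this.stack.Push("first");
    }

    // Each test method checks one aspect of state B.
    [TestMethod]
    public void should_contain_exactly_one_item()
    {
        Assert.AreEqual(1, this.stack.Count);
    }

    [TestMethod]
    public void should_return_that_item_when_peeked()
    {
        Assert.AreEqual("first", this.stack.Peek());
    }
}
```

Because the setup and the action both run in the initializer, each test method is left with nothing to do but state a single observation.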

Although the second version of this unit test has many more parts than the first, there are several important benefits. One, if you’re scanning the tests and don’t particularly care how they’re implemented, it’s easy to skip the Context() and BecauseOf() methods entirely. They’re not important in understanding what the tests are expressing. All you really need to understand are the class name and the test method names and you can optionally go deeper if you want more details on a particular test.

Two, when you first state a particular circumstance and then think about everything you expect to happen in that circumstance, it’s easier to come up with a complete list. The original test didn’t actually verify that the newly-created folder would be watched after it was created, but that omission wasn’t at all obvious. When I refactored the test to the new format it was immediately obvious that there were two expected behaviors in this situation, not just one.

Three, separating setup, acting, and observing helps to drive a better design. In the old unit test, I was supplying the path to watch to the constructor rather than to the Start() method. Once I separated setup from acting and moved the object construction to a base context class, it became obvious that the path properly belonged to the action phase, not the setup phase. Once I moved the path parameter to Start(), the design of other classes that use the upload file watcher magically became a lot simpler.

Specification Extensions

The third technique for bringing clarity to unit tests is to use extension methods to make the tests read more naturally, à la SpecUnit.net. (SpecUnit.net is written for xUnit.net, and right now I’m using the default Visual Studio test environment, so I just implemented my own.)

The standard Assert class gets the job done just fine, but it reads like code, not like English. In unit tests, we want to optimize for scanning and instant comprehension. One easy way to do that is to implement extension methods on types that express the intent of the tests as clearly and succinctly as possible. For instance, consider this test, written two different ways:

[TestClass]
public class when_empty_encrypted_text_is_submitted_for_decryption : EncryptionServiceContext
{
    protected override void BecauseOf()
    {
        this.plainText = this.encryptionService.Decrypt(string.Empty);
    }
    [TestMethod]
    public void the_plaintext_should_be_empty_V1()
    {
        Assert.AreEqual(string.Empty, this.plainText);
    }
    [TestMethod]
    public void the_plaintext_should_be_empty_V2()
    {
        this.plainText.ShouldBeEmpty();
    }
}
 

In the first version of the test, we use Assert.AreEqual, which works but requires a bit of mental translation to comprehend. Sure, experienced programmers can do it very quickly, but it’s still an overhead cost that can be reduced.

In the second version, we’ve defined an extension method on the string type that simply performs the assert on our behalf. Same functionality, but much simpler to scan and comprehend.

We started with a few basic methods in our extension library and every time we want to express an observation that doesn’t read naturally, we just add a new extension method to the library. This way, even very complex observations can be encapsulated in one line that succinctly states the intent of the observation and hides the implementation details.
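
As a rough idea of what such a library can look like, here is a minimal sketch built on the standard Visual Studio Assert class. Only ShouldBeEmpty() appears in the test above; the SpecificationExtensions class name and the ShouldEqual and ShouldBeTrue methods are assumptions added for illustration, not a listing of the actual library.

```csharp
using Microsoft.VisualStudio.TestTools.UnitTesting;

// Hypothetical extension library: each method wraps a single Assert call
// behind a name that reads like a plain statement of intent.
public static class SpecificationExtensions
{
    // Fails the test unless the string is exactly the empty string.
    public static void ShouldBeEmpty(this string actual)
    {
        Assert.AreEqual(string.Empty, actual);
    }

    // Fails the test unless the two values are equal.
    public static void ShouldEqual<T>(this T actual, T expected)
    {
        Assert.AreEqual<T>(expected, actual);
    }

    // Fails the test unless the condition is true.
    public static void ShouldBeTrue(this bool condition)
    {
        Assert.IsTrue(condition);
    }
}
```

With these in scope, an observation such as this.plainText.ShouldBeEmpty() or result.ShouldEqual(42) reads left to right, with the value under test first and the expectation second.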

More To Come

There are a few interesting details still to cover, but this is enough for one post. In the future I’ll cover handling expected exceptions and how to write BDD-style tests in the default Visual Studio test runner. (It requires just a tiny bit of glue code.)
