
In praise of code coverage

I’ve found measuring code coverage (the coverage of code by automated tests) to be well worth my while, ranking not too far below test-driven development and continuous deployment as a practice that a software project should have in place to make me and my team happy and productive. But that’s not a universal opinion. Many software engineers, even diligent automated testers and continuous deployers, are quick to dismiss coverage, and it’s probably the thing that I’ve had to set up myself most often when I join a project. So let’s go over why it’s a valuable practice.

First, test well

First, let’s agree on table stakes. The most effective way to keep productivity high and avoid regressions in a nontrivial software project is behavior-driven development (BDD): capture the software’s behavior from the user’s point of view in tests. Write high-level acceptance tests that express the user’s interaction with the software, and lower-level tests (unit tests, and perhaps other flavors depending on architecture) that express detailed requirements, still in as user-centric a way as the code under test allows. BDD encompasses TDD, so write tests first, then implement, then refactor. Among its other benefits, doing this means the test suite breaks when someone changes code in a way that breaks how the software is intended to work, so we can add new code and refactor old and new code confidently. In short: we need to write meaningful, expressive, user-centric tests, and we need to do that before we even begin to implement, never mind worry about code coverage.

One sometimes hears that it’s bad to write meaningless tests to increase coverage, which is true as far as it goes. See for example Dave Farley’s video “Don’t chase test coverage!” (part of an excellent series of videos in which Dave almost always preaches to my choir), in which he argues that focusing on increasing coverage distracts from writing good tests. Well, I have more faith in you than that. In fact, we paired on a little BDD exercise when you interviewed, so I know you can and do write good tests. Now, how can we up our game?

Then, test enough

Code coverage gives great insight into your existing codebase and changes to it:

  • If existing code isn’t covered, it isn’t tested. Either write more tests to cover that code, or delete the feature if it isn’t needed. The team can get by for a while thinking that some big uncovered part of an app is brochureware that doesn’t really need to keep working, and manually smoke-testing every now and then, and then an engineer doesn’t smoke-test when they should have and a salesperson clicks there during a demo and gets a 500 error, and… just write a test or delete the code.
  • If a code change introduces uncovered code, either old code became untested or new code was added without being tested. Sometimes a change orphans a method or a class that can now be deleted. Sometimes it adds a conditional or an error handling block that isn’t tested. Sometimes changing tests to accommodate new behavior accidentally leaves old behavior untested.

Untested code is a quality concern that you’d fix if you found it reading old code, or comment on if you found it when reviewing new code. So let your coverage tool find it for you.
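To make that concrete, here is a minimal sketch of the bookkeeping a line-coverage tool does: record which lines actually ran, then compare against which executable lines exist. It uses only the Python standard library; `classify` and its deliberately one-sided “test suite” are invented for the example, and real tools like coverage.py do the same thing with far more machinery.

```python
import dis
import sys

def classify(n):
    """Toy function whose negative branch the "suite" below never exercises."""
    if n < 0:
        return "negative"  # uncovered: no test passes a negative number
    return "non-negative"

# Every executable line in classify, read from the bytecode's line table.
all_lines = {lineno for _, lineno in dis.findlinestarts(classify.__code__)
             if lineno is not None}

executed = set()

def record_lines(frame, event, arg):
    # Record each line executed inside classify, as a coverage tool would.
    if event == "line" and frame.f_code is classify.__code__:
        executed.add(frame.f_lineno)
    return record_lines

sys.settrace(record_lines)
classify(5)          # the entire "test suite": only the non-negative path
sys.settrace(None)

uncovered = sorted(all_lines - executed)
print("uncovered lines:", uncovered)  # includes the 'return "negative"' line
```

The report is the point: the untested branch shows up as a concrete line number you can either test or delete.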

How much coverage is enough?

On a green-field project, set up your build to break when coverage is less than 100%. Yes, really! If you do it from the beginning, it’s easy to keep up, just like breaking the build when a test fails and not deploying when the build fails. Only very rarely do I need to write an extra test (usually of a test helper method with a branch that’s only executed when a test fails) to keep coverage up. (Or you could selectively exclude code which doesn’t need to be covered. Francesco Borzì and Adrien Guéret wrote about that approach.) And it’s far easier to manage 100% coverage than less-than-100% coverage. Long-lived uncovered code is noise that masks new gaps.
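What “break the build under 100%” looks like depends on your stack. As one sketch, with Python’s coverage.py both the threshold and the selective-exclusion escape hatch are a few lines of configuration (SimpleCov, Istanbul, and JaCoCo have equivalents):

```ini
# .coveragerc
[report]
# Exit nonzero, and so fail the build, when total coverage is below 100%.
fail_under = 100
# List the uncovered line numbers in the terminal report.
show_missing = True
# Lines matching these regexes are excluded from measurement; mark a line
# with "pragma: no cover" to exclude it deliberately.
exclude_lines =
    pragma: no cover
```

With this in place, a coverage run in CI fails the build just like a failing test does.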

On a project that didn’t measure code coverage until it had already accumulated a lot of uncovered code, it’s never worth anyone’s time to backfill coverage to 100%. It’s harder to test old code than it is to test-drive new code, and it’s often not clear what an untested block of old code is supposed to do. But coverage still identifies untested features that need to be tested or removed, or handled with care until that becomes too much of a pain and you bite the bullet and test or remove them. And watching for new uncovered lines introduced by code changes is just as useful with a less-than-100% baseline as it is with a 100% baseline. (Note that, given a background of uncovered code, you can’t just watch the overall percentage; you have to watch new uncovered lines. Coverage tools typically report new uncovered lines separately.)
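One sketch of watching new uncovered lines, again assuming a Python project: the diff-cover tool compares a coverage report against a git diff and fails only on uncovered lines that the change introduced (the branch name is an assumption; adjust for your repo):

```shell
# Run the suite under measurement and emit a machine-readable report.
coverage run -m pytest
coverage xml
# Fail only if lines changed relative to main are uncovered.
diff-cover coverage.xml --compare-branch origin/main --fail-under 100
```

This gives a legacy codebase the same ratchet a green-field project gets from a 100% threshold: old gaps are tolerated, new gaps break the build.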

Try it

If your project doesn’t measure code coverage yet, add it and browse the report. Set up your editor to display coverage, or make reviewing the coverage part of your coding routine. If coverage is already set up but the team’s ignoring it, put the report up front in your build artifacts or integrate it into your pull request process. All these things are easy and safe to do, and I guarantee you’ll learn something and add a valuable tool to your toolbox.
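Getting that first report is typically a few commands. For a Python project with coverage.py it looks like this (every major ecosystem has an equivalent: SimpleCov, Istanbul/nyc, JaCoCo, and so on):

```shell
pip install coverage
coverage run -m pytest   # run the existing suite under measurement
coverage report          # per-file summary with missing line numbers
coverage html            # browsable annotated source in htmlcov/
```

Open the HTML report and look for the red: that’s your untested code.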

Written by dschweisguth

January 24, 2024 at 15:42

Posted in Uncategorized

