
The Test Pyramid Part 3: Unit testing in the test pyramid

November 11, 2019
Bill Hodghead

This post is part of a series on functional testing using the test pyramid.

Purposes of unit testing

Documentation: Defines the behavior of the code as a working specification that is kept up to date. Reduces the need for a development specification.
Maintainability: Code that is testable tends to be maintainable, because making it testable forces practices such as low coupling, low complexity and dependency injection.
Quality: Helps maintain quality by checking that the code does what the developer intended at the most basic level.

With good unit tests, the code is much easier to modify. Unit tests help the most on the second and subsequent times you change the code, because you can find out within seconds whether you have broken basic functionality.

Who defines the test?

The “test definition” is typically just the title of the unit test.
Any developer or architect can define the tests. It doesn’t have to be the person who writes the test code, though it usually is.
Someone who is NOT the developer of the code can sometimes describe the behavior needed more clearly, especially if you are using Test Driven Development (TDD) and pairing.

Who codes the test automation?

The developer who is writing the code or someone paired with that developer.

What to test?

The smallest unit of functionality — often a single path through a function or class.
Simple getters and setters don’t need tests of their own, but they are often exercised anyway when a test uses them to check values changed by other functions.
Simple positive, negative and boundary conditions are mostly covered by unit tests or in API compatibility tests.
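
As a rough illustration of the positive, negative and boundary conditions mentioned above, here is a minimal sketch using Python's standard unittest module. The function parse_percentage and its 0 to 100 range are hypothetical, invented only for this example.

```python
import unittest


def parse_percentage(text: str) -> int:
    """Hypothetical function under test: parse '0'..'100' into an int."""
    value = int(text)  # raises ValueError for non-numeric input
    if not 0 <= value <= 100:
        raise ValueError(f"percentage out of range: {value}")
    return value


class ParsePercentageTests(unittest.TestCase):
    def test_positive_typical_value_is_parsed(self):
        self.assertEqual(parse_percentage("42"), 42)

    def test_negative_non_numeric_text_is_rejected(self):
        with self.assertRaises(ValueError):
            parse_percentage("abc")

    def test_boundary_lowest_and_highest_values_are_accepted(self):
        self.assertEqual(parse_percentage("0"), 0)
        self.assertEqual(parse_percentage("100"), 100)

    def test_boundary_values_just_outside_range_are_rejected(self):
        for text in ("-1", "101"):
            with self.subTest(text=text):
                with self.assertRaises(ValueError):
                    parse_percentage(text)
```

Saved to a file, the tests can be run with `python -m unittest <file>`. Each test follows a single path through the function, which keeps a failure easy to interpret.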

Best practices

TDD (Overview, how-to, measures of effectiveness)
A standard format such as Arrange-Act-Assert (AAA)
A naming convention. I like “XXX__ShouldDo_YYY_When_ZZZ” but pick something that tells you exactly what the test does so you don’t have to read the test code.
Drive tests from data tables.
Use safe refactoring to make unit testing possible, especially on older code.
Publish code quality metrics by component (see measures below).
A unit test should always be able to fail. Using TDD, it starts as failed and then is made to pass.
A unit test should fail for one and only one reason.
No other unit test should fail for that reason.
Properties of a good unit test: self-explanatory, self-evaluating, portable, small, isolated, readable, automated, fast.
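
Several of the practices above (the AAA format, a descriptive naming convention and data tables) can be sketched together. This is only one possible shape, written with Python's unittest. The function shipping_cost and its pricing rule are hypothetical, and the "test_" prefix on the method names is an addition to the naming convention because unittest needs it to discover tests.

```python
import unittest


def shipping_cost(weight_kg: float) -> float:
    """Hypothetical function under test: flat rate up to 1 kg, then per-kg pricing."""
    if weight_kg <= 0:
        raise ValueError("weight must be positive")
    if weight_kg <= 1:
        return 5.0
    return 5.0 + 2.0 * (weight_kg - 1)


class ShippingCostTests(unittest.TestCase):
    def test_ShippingCost__ShouldCharge_FlatRate_When_WeightIsAtMostOneKg(self):
        # Arrange
        weight_kg = 0.5
        # Act
        cost = shipping_cost(weight_kg)
        # Assert
        self.assertEqual(cost, 5.0)

    def test_ShippingCost__ShouldCharge_PerKgRate_When_WeightIsOverOneKg(self):
        # Data table: (weight in kg, expected cost)
        cases = [
            (2.0, 7.0),
            (3.5, 10.0),
            (11.0, 25.0),
        ]
        for weight_kg, expected in cases:
            # subTest reports each failing row of the table separately
            with self.subTest(weight_kg=weight_kg):
                self.assertEqual(shipping_cost(weight_kg), expected)
```

Each test covers one behavior, so a failure in the flat-rate rule and a failure in the per-kg rule show up in different tests, and the test name alone says which rule broke.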

Measures of success

Code coverage in the 75% to 85% range. High coverage forces good maintainability practices, including low coupling, low complexity and use of dependency injection. Coverage above 85% has diminishing returns but is often achievable for small components.
All unit tests for a component run in a few seconds. If unit tests take longer, they are not isolated enough and not mocking external components as they should.
Use other code quality measures, such as complexity (<= 10 per function) and coupling, or number of dependencies (<= 7 per function).
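
To show how dependency injection and mocking keep a unit test isolated and fast, here is a minimal sketch using unittest.mock from the Python standard library. PriceService, its rate provider and the get_rate method are hypothetical names made up for this example. In a real system the provider might call a remote service.

```python
import unittest
from unittest.mock import Mock


class PriceService:
    """Hypothetical component that depends on an external rate provider."""

    def __init__(self, rate_provider):
        # The dependency is injected, so a test can pass in a stand-in.
        self._rate_provider = rate_provider

    def price_in(self, currency: str, amount_usd: float) -> float:
        rate = self._rate_provider.get_rate("USD", currency)
        return round(amount_usd * rate, 2)


class PriceServiceTests(unittest.TestCase):
    def test_price_in_should_convert_amount_when_rate_is_available(self):
        # Arrange: the mock replaces the slow external call.
        rate_provider = Mock()
        rate_provider.get_rate.return_value = 0.5
        service = PriceService(rate_provider)
        # Act
        price = service.price_in("EUR", 10.0)
        # Assert
        self.assertEqual(price, 5.0)
        rate_provider.get_rate.assert_called_once_with("USD", "EUR")
```

Because nothing in the test touches a network or database, it runs in milliseconds and fails only when the conversion logic itself changes.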

Maturity levels

There are several maturity models for unit testing already published. For some reason, they all put measuring the coverage at a high level. This is ridiculous, as you have to measure to get anywhere. Here’s a simple stab at a model:

None: Unit tests not done. Code quality not measured.
Initial: Baseline measured but coverage < 25%. Code quality not regularly measured.
Definition: Coverage measured manually, though < 50%. Goals set, and process improvement defined. Group may struggle to achieve goals on new or changed code.
Integration: Unit tests and coverage tools automated as part of merges to master. New or changed code always meets coverage goals.
Management and measurement: Dashboards show code quality status for all components. Check-ins always meet code quality goals.
Optimization: Using TDD to define unit tests before coding. Running unit tests continuously during development.

Experiences

There are whole books on unit testing, and we’re not going to cover everything here. However, here are some direct experiences with applying the unit testing guidance above.

Developers of legacy code in Company A had the motivation and time to add unit tests, but they didn’t have the skills. The legacy code was never written to be testable, and they needed training in safe refactoring techniques to be able to change the code to make it testable and maintainable. Once that happened, they made great progress in cleaning up the old code as they added features.
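
One common kind of safe refactoring is to turn a hidden dependency into a parameter with a default value, so existing callers keep working while tests gain control. The sketch below is a hypothetical example of that idea, not code from Company A. The greeting function and its use of the system clock are assumptions made for illustration.

```python
import unittest
from datetime import datetime


# Before (hard to test, because the result depends on the real clock):
#
#     def greeting():
#         hour = datetime.now().hour
#         return "Good morning" if hour < 12 else "Good afternoon"


def greeting(now=None):
    """After: the clock is a parameter with a default, so existing callers still work."""
    current = now if now is not None else datetime.now()
    return "Good morning" if current.hour < 12 else "Good afternoon"


class GreetingTests(unittest.TestCase):
    def test_greeting_should_say_morning_when_before_noon(self):
        self.assertEqual(greeting(datetime(2019, 11, 11, 9, 0)), "Good morning")

    def test_greeting_should_say_afternoon_when_noon_or_later(self):
        self.assertEqual(greeting(datetime(2019, 11, 11, 12, 0)), "Good afternoon")
```

The change is small and mechanical, which is what makes it reasonably safe to apply to code that has no tests yet.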

The team of developers in Company B that used TDD had much higher coverage numbers than other teams in the company, often around 100%. That code was easier to change. Code with coverage below 70% was hard for developers other than its authors to change. Some of the low-coverage code was eventually thrown out and rewritten.

Multiple companies found that unit testing the user interface (UI) was pointless when it had no business logic, just bindings to underlying functionality. When there was a small bit of functional logic in the UI (common with NodeJS or Razor C# pages for example), a few direct unit tests could cover it (not UI driven).
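
A direct, non-UI-driven test of a small bit of presentation logic might look like the sketch below. It is written in Python for consistency with the other examples, though the same idea applies to NodeJS or Razor pages. The function submit_button_label is hypothetical and stands in for logic that has been pulled out of a page into a plain function.

```python
import unittest


def submit_button_label(item_count: int) -> str:
    """Hypothetical presentation logic extracted from a page template."""
    if item_count == 0:
        return "Cart is empty"
    if item_count == 1:
        return "Buy 1 item"
    return f"Buy {item_count} items"


class SubmitButtonLabelTests(unittest.TestCase):
    def test_label_should_say_empty_when_no_items(self):
        self.assertEqual(submit_button_label(0), "Cart is empty")

    def test_label_should_use_singular_when_one_item(self):
        self.assertEqual(submit_button_label(1), "Buy 1 item")

    def test_label_should_use_plural_when_several_items(self):
        self.assertEqual(submit_button_label(3), "Buy 3 items")
```

The tests call the function directly and never render a page or drive a browser. The bindings that connect it to the UI are left to higher levels of the pyramid.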

Likewise, in most microservices, there’s a small bit of code that is not the logic of the service but is just a connector to an external component. This code is tested through API tests or integration tests, not unit tests. For these reasons, it’s a good idea to use the code coverage numbers as a tool but allow teams to bypass the coverage rule on check-in if multiple reviewers agree it makes sense. Example: one developer added logging statements throughout lots of code that didn’t have unit tests yet, but she otherwise made no changes to the functionality. Because the code was not functionally changed, it didn’t make sense to force her to add all the unit tests.
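
The bypass rule described above could be expressed as a small policy function in a merge gate. This is only a sketch under stated assumptions: the CheckIn type, the 75% goal (taken from the low end of the range given earlier) and the choice of two approvals to stand for "multiple reviewers" are all hypothetical and not taken from any particular tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CheckIn:
    """Hypothetical summary of a check-in, as a merge gate might see it."""
    changed_line_coverage: float  # percent of new or changed lines covered by unit tests
    bypass_approvals: int         # reviewers who agreed the coverage rule should not apply


def passes_coverage_gate(check_in: CheckIn,
                         coverage_goal: float = 75.0,
                         approvals_needed: int = 2) -> bool:
    """Allow the check-in if it meets the goal or enough reviewers approved a bypass."""
    if check_in.changed_line_coverage >= coverage_goal:
        return True
    return check_in.bypass_approvals >= approvals_needed
```

In the logging example, the check-in would have low coverage on its changed lines but could still pass once two reviewers approved the bypass.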

Multiple companies found that unit tests were much better for productivity than for quality. The functionality they test is small and, after you’ve made it testable, that functionality is pretty easy to understand. Unit tests on simple, maintainable code rarely fail. And if they do fail after a change, great! You can get the result in seconds and can fix it by hitting Undo a few times. The main benefit of the unit tests was getting the team to write simple, maintainable code in the first place and then documenting what that code should do.