The Test Pyramid Part 6: End-to-end testing

December 16, 2019
Bill Hodghead

This post is part of a series on functional testing using the test pyramid.

End-to-end testing is also called “system” testing.

Purposes

Verifies the key customer experiences of the software.

Who defines the test?

Defined by product management (PM), often with contribution from quality engineers (QE).

Who codes the test automation?

System automation can be created by a development team or a separate QE team (sometimes called the system team) that specializes in system-wide automation and non-functional monitoring and testing.

What to test?

Test customer experiences and business measures specific to the business and product.
The test may be automation or a monitor of real customer experience.
Don’t cover boundary conditions, negative cases or path variations; these belong in the lower layers of testing.
Amusingly, “system measures” (also called operational measures) like CPU, disk usage, memory usage and so on are NOT the primary measures of “system tests.”

Best practices

Limit to the primary valuable customer experience (don’t do full BDD to find all customer use cases at this level).
Don’t automate end-to-end tests through UI or command line interface (CLI). UI and CLI are best tested through unit, component, and integration testing.
Keep the number of system tests low. Having fewer than 100 tests is a good guideline, but for a real number, check your ROI: the cost to maintain the tests vs. the costs they prevent. Push testing lower on the pyramid as much as possible, as end-to-end tests require more effort to maintain.
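The ROI check mentioned above can be sketched as simple arithmetic. This is one possible way to frame it, not a formula from the series: the field names, the per-quarter time window and the sample figures are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class SuiteCosts:
    """Hypothetical inputs for one quarter of running an end-to-end suite."""
    tests: int
    maintenance_hours_per_test: float
    hourly_rate: float
    defects_prevented: int
    cost_per_escaped_defect: float


def maintenance_cost(c: SuiteCosts) -> float:
    return c.tests * c.maintenance_hours_per_test * c.hourly_rate


def costs_prevented(c: SuiteCosts) -> float:
    return c.defects_prevented * c.cost_per_escaped_defect


def roi(c: SuiteCosts) -> float:
    """Net benefit divided by the cost to maintain; above 0 means the suite pays for itself."""
    spent = maintenance_cost(c)
    return (costs_prevented(c) - spent) / spent


# Made-up sample figures, for illustration only.
sample = SuiteCosts(
    tests=80,
    maintenance_hours_per_test=5,
    hourly_rate=100,
    defects_prevented=6,
    cost_per_escaped_defect=10_000,
)
print(f"maintain: {maintenance_cost(sample):.0f}, prevented: {costs_prevented(sample):.0f}, ROI: {roi(sample):.2f}")
```

If adding tests raises the maintenance cost faster than the costs prevented, the ROI falls, which is a signal to push the new coverage lower on the pyramid.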

Measures of success

Experience outcomes (measurable customer experiences)
Leading indicators
Experiences often have non-functional measures that can only be assessed through monitoring, for example: how long does it take real customers to complete a transaction?
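A monitored measure such as transaction time might be summarized like this. It is a minimal sketch: the sample durations, the 95th-percentile choice and the 120-second target are assumptions, and a real monitor would read the durations from production telemetry.

```python
import statistics


def nearest_rank_percentile(values, pct):
    """Nearest-rank percentile: the smallest value with at least pct% of samples at or below it."""
    ordered = sorted(values)
    # Integer ceiling of pct * n / 100, clamped to at least 1.
    rank = max(1, (pct * len(ordered) + 99) // 100)
    return ordered[rank - 1]


def summarize(durations_s, target_p95_s):
    """Summarize transaction durations (seconds) and compare the p95 to a target."""
    p95 = nearest_rank_percentile(durations_s, 95)
    return {
        "median_s": statistics.median(durations_s),
        "p95_s": p95,
        "meets_target": p95 <= target_p95_s,
    }


# Hypothetical seconds-to-complete for ten real customer transactions.
durations = [12, 15, 11, 40, 14, 13, 90, 16, 12, 15]
summary = summarize(durations, target_p95_s=120)
print(summary)
```

Tracking a high percentile alongside the median helps, because a few very slow transactions can hide behind a healthy-looking typical value.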

Maturity levels

None: No key customer experiences defined or automated.
Initial: Some customer experiences identified and automated through the UI and kicked off as part of code merges to master.
Definition: All primary product experiences have key measures identified. Most are automated and run regularly with code changes or on a schedule.
Integration: All primary end-to-end experiences are automated. End-to-end automation runs in under two hours and is automatically run on every change.
Management and measurement: Management focuses on measures and experimentation instead of features. Experiences are tied to personas and used by marketing.
Optimization: Leading indicators regularly used and tracked. Experiences are evaluated and updated with major releases.

Experiences

Engineers at each company asked, “Why so few end-to-end tests?” If you have automated a couple of these tests, you’ll see that they are really valuable. In fact, they find more bugs per test than most other types of testing. Why not concentrate on end-to-end testing? Here are some reasons:

Avoiding blocked testing. If you have a bug in the login routine, then most of your subsequent end-to-end tests can’t run and you aren’t getting data from them. Integration and component level tests can still run.
Reducing time to data. End-to-end tests typically take hours to run. Compare that to the seconds of unit tests or minutes for component tests. It’s important to give your developers feedback as quickly as you can.
Reducing test maintenance costs. End-to-end tests hit many areas that are changing, so they must be constantly maintained. The more tests there are, the higher the maintenance cost. While it’s possible to abstract some parts of them to minimize change, it doesn’t always make sense, such as when using calls to a published interface.
Reducing debugging time. It’s rare that you can quickly debug and fix an end-to-end test. We found that it took an average of 30 to 45 minutes of developer time. Failures in lower-level tests show exactly what the issue is and can be debugged more quickly (e.g., in seconds for a unit test).
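The blocked-testing point can be illustrated with a small model. This is a sketch under simplifying assumptions: each test is reduced to an ordered list of the steps it depends on, and the suite contents and step names are invented for the example.

```python
def first_blocker(steps, broken):
    """Return the first broken step a test would hit, or None if it can run to completion."""
    for step in steps:
        if step in broken:
            return step
    return None


def blocked_tests(suite, broken):
    """Map each test that cannot complete to the step that stops it."""
    blocked = {}
    for name, steps in suite.items():
        blocker = first_blocker(steps, broken)
        if blocker is not None:
            blocked[name] = blocker
    return blocked


# Hypothetical suites: every end-to-end test starts by logging in,
# while each component test exercises a single area.
end_to_end = {
    "buy_item": ["login", "search", "checkout"],
    "update_profile": ["login", "edit_profile"],
    "view_history": ["login", "order_history"],
}
component = {
    "login": ["login"],
    "search": ["search"],
    "checkout": ["checkout"],
    "edit_profile": ["edit_profile"],
}

broken = {"login"}
print(len(blocked_tests(end_to_end, broken)), "of", len(end_to_end), "end-to-end tests blocked")
print(len(blocked_tests(component, broken)), "of", len(component), "component tests blocked")
```

With a single login bug, every end-to-end test in this model stops at its first step and reports nothing about search or checkout, while the component suite still returns data for every other area.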