Using a test oracle

Oracles solve “This thing needs more tests than I can list!”

We use data-driven testing to collapse many tests into a few rows of data. What happens when the number of test rows itself gets too large?

Example: Let’s say we have a date control that is composed of three integer inputs: one for year, one for month and one for day. If we write out all the equivalence classes we get something like in this article, which shows we have ~150 test rows. If we did boundary value analysis, it would be even more tests. It’s a lot of work to find all those tests and it’s also a lot of work to review them.

There’s an easier way. Instead of using tests as rows of data, tests become rules. A rule might be that we will accept only years between 1812 and the current year, inclusive. Another might be that months can only be from 1 to 12.

An oracle is a system that can take those rules and run them and get the same output as the system being tested. We use it like this:

[Figure: diagram of how to use an oracle system to reduce the number of tests]

We feed the same data into both our system under test and our oracle. Then we compare the results, which should be the same.
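The comparison step can be sketched in a few lines of Python. Everything named here is hypothetical: `find_mismatches` is an illustrative helper, and the two toy functions only stand in for a real system and its oracle. Note that the toy oracle reaches its answer by a different route than the toy system.

```python
from typing import Any, Callable, Iterable


def find_mismatches(
    system_under_test: Callable[..., Any],
    oracle: Callable[..., Any],
    inputs: Iterable[tuple],
) -> list:
    """Feed every input to both functions and collect the disagreements."""
    mismatches = []
    for args in inputs:
        actual = system_under_test(*args)
        expected = oracle(*args)
        if actual != expected:
            mismatches.append((args, actual, expected))
    return mismatches


# Toy stand-ins, assumed for illustration only.
def absolute_value(x: int) -> int:
    """Pretend this is the system under test."""
    return x if x >= 0 else -x


def absolute_value_oracle(x: int) -> int:
    """A different way to get the same answer."""
    return max(x, -x)


print(find_mismatches(absolute_value, absolute_value_oracle, [(-3,), (0,), (7,)]))
```

An empty list means the two agreed on every input. Each entry in a non-empty list holds the input, the actual result and the oracle's result.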

Never use the same algorithm in the oracle that you are using for the system under test. If the test and code being tested are the same you are just checking that they match, not that they are right.

The oracle must be simple. Ideal oracles are very easy to understand, maintain and add to. They might not test every value, but they test the ones they know.

A sampling oracle

Imagine we must test the values in a complex tabular report.

We could try to test it by reproducing all the logic to test every value in the report, but that violates our rule against using the same algorithm for testing as the thing under test.

Instead, we try to come up with simple questions about parts of our data that we know how to answer. For example, our oracle can ask, “Is the number of rows equal to N?”, “Is the key value always increasing?”, “Are there duplicate key values?”, “Does every row have a value for the required column #2?” and so on.

Each of these rules is a test that samples the data output and compares it to an expected state. For the sampling oracle, we don’t check every possible value in the report, but we do check enough to feel confident that the report is good.
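As a rough sketch, the four questions above could look like this in Python. The report layout is an assumption: each row is a dictionary with a hypothetical `key` column and a hypothetical required `name` column.

```python
def sample_report(rows: list, expected_row_count: int) -> list:
    """Return the names of the sampling rules that the report fails."""
    keys = [row["key"] for row in rows]
    failures = []
    # Is the number of rows equal to N?
    if len(rows) != expected_row_count:
        failures.append("row count")
    # Is the key value always (strictly) increasing?
    if any(later <= earlier for earlier, later in zip(keys, keys[1:])):
        failures.append("keys increasing")
    # Are there duplicate key values?
    if len(set(keys)) != len(keys):
        failures.append("duplicate keys")
    # Does every row have a value for the required column?
    if any(row.get("name") in (None, "") for row in rows):
        failures.append("required column")
    return failures


report = [
    {"key": 1, "name": "Ann"},
    {"key": 2, "name": "Bob"},
    {"key": 2, "name": ""},
]
print(sample_report(report, expected_row_count=3))
```

None of these checks recomputes a report value. Each one only samples a property that a good report must have.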

Using chain of responsibility as an oracle

A chain of responsibility is perhaps the easiest and most powerful oracle to use for most data-driven testing problems.

Reminder: What’s a chain of responsibility?

For those that have forgotten, or never knew, a chain of responsibility can be thought of as a series of if / then statements. Each statement, if true, stops further processing. It’s also sometimes called a guard pattern.

Let’s say we’re testing a function that adds two numbers. Some simple rules to check our function might look like this pseudocode example:

If (parameter 1 == null OR parameter 2 == null) then return Error 1
If (parameter 1 == blank OR parameter 2 == blank) then return Error 2
If (parameter 1 + parameter 2 > MaxInt) then return Error 3
Else return Success

Typically, the chain of responsibility lists the rules in order from the most common case to the rarest, or from the worst negative case to the least bad one. The final rule is always “else return success”, meaning that the oracle passes.
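Here is one way the adder rules could be written in Python, as a minimal sketch. The numeric result codes and the `MAX_INT` limit are assumptions made for illustration, and a blank input is modeled as an empty string.

```python
MAX_INT = 2**31 - 1  # assumed upper limit of the adder under test

# Assumed result codes; the pseudocode only names them Error 1..3 and Success.
SUCCESS, ERROR_NULL, ERROR_BLANK, ERROR_OVERFLOW = 0, 1, 2, 3


def add_oracle(a, b) -> int:
    """Return the result code the adder is expected to produce."""
    # Each rule that matches returns at once, so later rules never run.
    if a is None or b is None:
        return ERROR_NULL
    if a == "" or b == "":
        return ERROR_BLANK
    if a + b > MAX_INT:
        return ERROR_OVERFLOW
    return SUCCESS


print(add_oracle(None, ""))  # both the null and blank rules apply; the null rule is first
```

Because the first matching rule wins, an input that breaks several rules is reported under whichever rule appears earliest in the chain.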

Using a chain of responsibility to test our date control

Imagine you have a function you need to test that looks like:

int isValidDate(int year, int month, int day)

We assume the function returns 0 if the date is valid and returns one of many possible numbers for each kind of failure. We restrict the valid years to be from 1812 to the current year.

Our oracle pseudocode would start like this:

If (m < 1 or m > 12) return Error-month
If (d < 1 or d > 31) return Error-day
If (y < 1812 or y > CurrentYear()) return Error-year
Else return success

Where the errors are integer number constants.

Great, but we know a date is more complex than that. We know that some months have only 30 days, so we add:

If (m in [9,4,6,11] and d > 30) return Error-monthday

Also, February is a special case, and has only 28 days, so

If (m = 2 and d > 28) return Error-February

Now, our oracle looks like:

If (m < 1 or m > 12) return Error-month
If (d < 1 or d > 31) return Error-day
If (y < 1812 or y > CurrentYear()) return Error-year
If (m in [9,4,6,11] and d > 30) return Error-monthday
If (m = 2 and d > 28) return Error-February
Else return success

Notice how it was easy to add new rules as we came up with them without changing the existing rules. We only need to make sure we get the order right when the rules have overlapping conditions.

Rule order matters when we remember that February also has leap years. A leap year can be handled like this:

If (m = 2 and (y mod 4) = 0 and d <= 29) return success

This rule says that February can have 29 days every 4th year. Unlike the earlier rules it returns success, because February 29 is a valid date in those years. We must put this new rule BEFORE our other February rule; otherwise the other rule would reject February 29 first and this one would never get called. February 30 in a leap year still falls through to the other February rule and fails there.

Are we done? Nope. If you remember the details of leap years you’ll know that:

“Every year that is exactly divisible by four is a leap year, except for years that are exactly divisible by 100, but these centurial years are leap years if they are exactly divisible by 400. For example, the years 1700, 1800, and 1900 are not leap years, but the years 1600 and 2000 are.”

Wikipedia: Leap Year

Thus, we get our last two rules:

If (m = 2 and (y mod 400) = 0 and d <= 29) return success
If (m = 2 and (y mod 100) = 0 and d > 28) return Error-leapyear

The first accepts February 29 in years divisible by 400. The second rejects it in every other century year. Both must come before the every-4th-year rule, because century years are also divisible by 4.

Inserting those correctly, we get the final oracle:

If (m < 1 or m > 12) return Error-month
If (d < 1 or d > 31) return Error-day
If (y < 1812 or y > CurrentYear()) return Error-year
If (m in [9,4,6,11] and d > 30) return Error-monthday
If (m = 2 and (y mod 400) = 0 and d <= 29) return success
If (m = 2 and (y mod 100) = 0 and d > 28) return Error-leapyear
If (m = 2 and (y mod 4) = 0 and d <= 29) return success
If (m = 2 and d > 28) return Error-February
Else return success

There we are. All the complexity of a date function described in 9 rules, or roughly 9 lines of code!
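For readers who want to run it, here is a sketch of a nine-rule date oracle in Python. Several things are assumptions of this sketch rather than part of the article: the numeric result codes, the `current_year` parameter (fixed at 2019 instead of reading the clock), and the choice to let the leap-year rules return "valid" early so that February 29 is accepted in leap years. The spot check at the end compares the oracle with Python's own `datetime.date`, which arrives at its answer independently.

```python
import datetime

# Hypothetical result codes; the article only says they are integer constants.
SUCCESS, ERROR_MONTH, ERROR_DAY, ERROR_YEAR = 0, 1, 2, 3
ERROR_MONTHDAY, ERROR_LEAPYEAR, ERROR_FEBRUARY = 4, 5, 6


def date_oracle(y: int, m: int, d: int, current_year: int = 2019) -> int:
    """Ordered rule chain: the first rule that matches decides the result."""
    if m < 1 or m > 12:
        return ERROR_MONTH
    if d < 1 or d > 31:
        return ERROR_DAY
    if y < 1812 or y > current_year:
        return ERROR_YEAR
    if m in (4, 6, 9, 11) and d > 30:
        return ERROR_MONTHDAY
    if m == 2 and y % 400 == 0 and d <= 29:   # 1600, 2000: leap years
        return SUCCESS
    if m == 2 and y % 100 == 0 and d > 28:    # 1900: not a leap year
        return ERROR_LEAPYEAR
    if m == 2 and y % 4 == 0 and d <= 29:     # ordinary leap years
        return SUCCESS
    if m == 2 and d > 28:
        return ERROR_FEBRUARY
    return SUCCESS


def is_real_date(y: int, m: int, d: int) -> bool:
    """Independent check: let the standard library decide."""
    try:
        datetime.date(y, m, d)
        return True
    except ValueError:
        return False


# Spot-check the February 29 cases, where rule order matters most.
for y, m, d in [(2000, 2, 29), (1900, 2, 29), (2004, 2, 29), (2001, 2, 29)]:
    assert (date_oracle(y, m, d) == SUCCESS) == is_real_date(y, m, d)
```

Swapping any two of the four February rules changes the result for at least one of the spot-check dates, which is a quick way to confirm that the order is doing real work.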

Coding the date oracle

To test the date function, we still need some good test values. I’ll use equivalence classes here to get the following table. It would be a little bigger if I used boundary value analysis, but not much. This assumes 2019 is the current year.

Year (y)    Month (m)    Day (d)
1811        0            0
1900        1            6
2000        2            28
2001        12           29
2020        13           30
                         31
                         32

Then, the test code looks like:

[Test, Combinatorial]
public void Test_Is_Valid_Date_With_Oracle(
	[Values(1811, 1900, 2000, 2001, 2020)] int year,
	[Values(0, 1, 2, 12, 13)] int month,
	[Values(0, 6, 28, 29, 30, 31, 32)] int day)
{
	// The system under test and the oracle must agree for every combination.
	Assert.That(isValidDate(year, month, day),
		Is.EqualTo(dateOracle(year, month, day)));
}

I’m using the Combinatorial attribute in NUnit to indicate that I want to run this for all combinations of the 3 inputs. NUnit will turn this single assertion into 5 x 5 x 7 = 175 tests.

This example shows how you can do 175 tests using 10 lines of code (9 as rules in the oracle and 1 in the test) and make them easy to review and add to.
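The same all-combinations idea can be sketched outside NUnit. The Python version below uses `itertools.product` to build every (year, month, day) triple from the value lists in the table. The helper names are hypothetical, and the two functions to compare are passed in as parameters, so nothing is assumed about how either one is implemented.

```python
import itertools

YEARS = [1811, 1900, 2000, 2001, 2020]
MONTHS = [0, 1, 2, 12, 13]
DAYS = [0, 6, 28, 29, 30, 31, 32]


def all_combinations() -> list:
    """Every (year, month, day) triple, one per generated test."""
    return list(itertools.product(YEARS, MONTHS, DAYS))


def check_against_oracle(is_valid_date, date_oracle) -> list:
    """Return the triples on which the two functions disagree."""
    return [
        (y, m, d)
        for y, m, d in all_combinations()
        if is_valid_date(y, m, d) != date_oracle(y, m, d)
    ]


print(len(all_combinations()))  # number of generated test cases
```

Adding one value to any of the three lists multiplies the number of cases, while the checking code stays the same size.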

Oracles let you reduce your tests by converting many rows into rules, but what happens when you still have too many tests? This can happen when you have a combination of parameters. Please see my earlier article about using pairwise testing to reduce the number of tests when we have combinations.

Copyright © 2019, All Rights Reserved by Bill Hodghead, shared under creative commons license 4.0