Test Oracles

Oracles are interesting (terminology) because the original oracles were from Greece and were mythological. In software testing, an oracle is a tool that helps you decide whether the program passed your test. Oracles are heuristic - they're fallible.

A List of Partial Oracles

This list can be used in place of the tables from the AST-BBST Foundations Lectures.

  • Constraint oracle: We use the constraint oracle to check for impossible values or impossible relationships. For example an American ZIP code must be 5 or 9 digits. If you see something that is non-numeric or some other number of digits, it cannot be a ZIP code. A program that produces such a thing as ZIP code has a bug.

  • Regression oracle: We use the regression oracle to check results of the current test against results of execution of the same test on a previous version of the product.

  • Self-verifying data: We use self-verifying data as an oracle. In this case, we embed the correct answer in the test data. For example, if a protocol specifies that when a program sends a message to another program, the other one will return a specific response (or one of a few possible responses), the test could include the acceptable responses. An automated test would generate the message, then check whether the response was in the list or was the specific one in the list that is expected for this message under this circumstance.

  • Physical model: We use a physical model as an oracle when we test a software simulation of a physical process. For example, does the movement of a character or object in a game violate the laws of gravity?

  • Business model: We use a business model the same way we use a physical model. If we have a model of a system, we can make predictions about what will happen when event X takes place. The model makes predictions. If the software emulates the business process as we intend, it should give us behavior that is consistent with those predictions. Of course, as with all heuristics, if the program "fails" the test, it might be the model that is wrong.

  • Statistical model: We use a statistical model to tell us that a certain behavior or sequence of behaviors is very unlikely, or very unlikely in response to a specific action. The behavior is not impossible, but it is suspicious. We can test whether the actual behavior in the test is within the tolerance limits predicted by the model. This is often useful for looking for patterns in larger sets of data (longer sequences of tests). For example, suppose we expect an eCommerce website to get 80% of its customers from the local area, but in beta trials of its customer-analysis software, the software reports that 70% of the transactions that day were from far away. Maybe this was a special day, but probably this software has a bug. If we can predict a statistical pattern (correlations among variables, for example), we can check for it.

  • State model: Another type of statistical oracle starts with an input stream that has known statistical characteristics and then check the output stream to see if it has the same characteristics. For example, send a stream of random packets, compute statistics of the set, and then have the target system send back the statistics of the data it received. If this is a large data set, this can save a lot of transmission time. Testing transmission using checksums is an example of this approach. (Of course, if a message has a checksum built into the message, that is self-verifying data.)

  • Interaction model: We use a state model to specify what the program does in response to an input that happens when it is in a known state. A full state model specifies, for every state the program can be in, how the program will response (what state it will transition to) for every input.

  • Calculation oracle: We use calculation oracles to check the calculations of a program. For example, if the program adds 5 numbers, we can use some other program to add the 5 numbers and see what we get. Or we can add the numbers and then successively subtract one at a time to see if we get a zero.

  • Inverse oracle: The inverse oracles is often a special case of a calculation oracle (the square of the square root of 2 should be 2) but not always. For example, imagine taking a list that is sorted low to high, sorting it high to low and then sorting it low to high. Do we get back the same list?

  • Reference program: The reference program generates the same responses to a set of inputs as the software under test. Of course, the behavior of the reference program will differ from the software under test in some ways (they would be identical in all ways only if they were the same program). For example, the time it takes to add 1000 numbers might be different in the reference program versus the software under test, but if they ultimately yield the same sum, we can say that the software under test passed the test.

Note: This list is incomplete and additional oracles (specific enough to support automation) should be added to this list.

Last updated