Software QA FYI - SQAFYI

Testing Monte Carlo Algorithmic Systems

By: Frank Erdman

A Monte Carlo system uses non-deterministic algorithms to approximate a solution; generally, the more iterations of the algorithm that are run, the better the approximation becomes. A classic simple example of this type of program is one that estimates PI. Picture a unit circle (radius r = 1) inscribed in a square of side 2r. The area of the circle is PI*r^2 and the area of the square is (2r)^2, or 4. The area of the circle divided by the area of the square therefore comes to (PI*r^2) / (2r)^2, or PI / 4. If this ratio can be computed, multiplying it by 4 gives PI. A "Monte Carlo" approach to this is to randomly select points within the square. Each point that lies within the unit circle counts as a "hit", so the equation would be:

Pi ~ 4 * (Number of Hits) / (Number of Iterations).
As an example, say we ran the program with 10 iterations and got 8 "hits", i.e. 8 of the points (x, y) lay within the unit circle (that is, they satisfy the condition x^2 + y^2 <= 1). Then the equation would be:
Pi ~ (4 * 8) / 10 = 32 / 10 = 3.2

The more iterations the program runs, the closer the output gets to the true value of PI (approximately 3.1415…).
Example source code for an implementation of such a program in C can be found at:
http://www.dartmouth.edu/~rc/classes/soft_dev/C_simple_ex.html
Now, many types of solutions in enterprise software use this basic strategy, the so-called "Monte Carlo" strategy. The name comes from the famous casino at Monte Carlo: just as casino games turn on chance, the approach relies on random numbers — here, to generate the selected points within the square containing the unit circle. A "Monte Carlo" type of solution will have two essential characteristics:

1) Each input/output pair is predicated on random or non-deterministic processes.
2) Increasing the number of input/output cycles for the same inputs during program execution should lead to better (more accurate) output.

How would one go about testing this sort of application? One would have to take two factors into account: the non-determinism associated with the outputs, and the expected behavior that, despite this non-determinism, more accurate outputs should be obtained by increasing the number of iterations. Here, for example, we could have three test scenarios: one with 10 iterations (Test Case 1), one with 100 iterations (Test Case 2), and one with 1000 iterations (Test Case 3). We could run each of these scenarios, say, 100 times, and put the outputs of the 100 runs of each test case into its own spreadsheet.

Looking at, say, the spreadsheet for the results of Test Case 1, we can expect that each run will probably have a different output value, and the same holds for the other two test cases, due to the non-deterministic nature of the algorithm under test. However, from the spreadsheet of results for a test case we can calculate the standard deviation from what we know (from talking to development or the requirements team) is a good result. For example, in preparing this article, I implemented a C version of the Monte Carlo example and found that with 10 iterations the output could vary widely from 3.14, the well-known approximation of PI. Say, for the sake of argument, that results should deviate by no more than plus or minus 0.5 from 3.14, so anywhere from 2.64 to 3.64 would be an expected result for Test Case 1 (10 iterations). Test Case 2 (100 iterations) should have a smaller deviation, say, around 0.25, so results could fall anywhere from 2.89 to 3.39. Test Case 3 (1000 iterations) should have a still smaller deviation, say, 0.12, so expected results could fall anywhere from 3.02 to 3.26.
(Incidentally, these numbers are generally in the ballpark of what I observed testing my own C implementation of the "Monte Carlo PI" problem.)

To test these types of applications, a test engineer needs to obtain the expected standard deviation for a given number of iterations, and then test against that expectation at that number of iterations. The test engineer further needs to obtain the expected reduction in the standard deviation of results when the number of iterations increases by some reasonable amount (in this case, by powers of ten). The tester's expected result is then that, for a given test case (i.e. number of iterations), the standard deviation across multiple runs of that test case is no greater than the one expected.