Copyright 2004 by Tejas Software Consulting and Christopher J. Meisenzahl - All rights reserved.
Overview -- Observations -- Maturity -- Project activity -- Platforms -- Support -- Documentation -- Installation -- Implementation -- Performance -- Similar tools -- Limitations -- Appendix A: Additional examples -- Appendix B: Dealing with impossible pairs -- Appendix C: Processing jenny's output
Reviewer: Christopher J. Meisenzahl
Date reviewed: 2004-03-17
Version reviewed: 2003-09-14
Maintainer: Bob Jenkins
URL: http://burtleburtle.net/bob/math/jenny.html
Testingfaqs.org category: Test Design Tools
License: Public Domain
User interface: Command Line
Jenny is a tool used primarily to generate test case combinations based on a specified set of inputs. Let's say that you have a set of variables representing configurations that need to be tested, such as combinations of browsers (IE, Netscape, Mozilla, Opera, Safari), operating systems (Win95, Win98, Win2k, Mac OS X, Linux), JavaScript configurations (on, off), and SSL options (enabled, disabled). To exhaustively test every combination of these values, you would need to test 100 configuration combinations: 5 x 5 x 2 x 2 = 100.
You could instead decide to test only the unique pairs of variable states (known as pair-wise testing). This has been proven empirically to be an effective method to reach satisfactory test coverage (treatments of the effectiveness of pair-wise testing are handled elsewhere, see Appendix D). By feeding the “5 x 5 x 2 x 2” example into a tool like jenny, you can whittle the 100 combinations down to only 25, and still have full pair-wise coverage. The bottom line is that tools that generate pair-wise test cases have the potential to greatly reduce the time and cost associated with testing. See the documentation section for references to additional information about this technique.
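The arithmetic behind these figures can be sketched in a few lines of shell. This is only an illustration, assuming a POSIX shell; the variable names are made up for the example. It computes the exhaustive count and the number of distinct value pairs that a pair-wise test set has to cover. Note that 25 is also the fewest tests possible here, because the two five-valued variables alone contribute 5 x 5 = 25 pairs and a single test can cover only one of them.

```shell
# Sizes of the four variables from the example above:
# browsers, operating systems, JavaScript settings, SSL options.
sizes="5 5 2 2"

# Exhaustive testing needs the product of all sizes.
exhaustive=1
for n in $sizes; do
  exhaustive=$((exhaustive * n))
done

# Pair-wise testing must cover every value pair drawn from two different
# variables, so add up size_i * size_j over all pairs of variables.
pairs=0
set -- $sizes
while [ $# -gt 1 ]; do
  first=$1
  shift
  for other in "$@"; do
    pairs=$((pairs + first * other))
  done
done

echo "exhaustive combinations: $exhaustive"   # prints 100
echo "value pairs to cover:    $pairs"        # prints 69
```

Each test case with four variables covers six pairs at once, which is why 69 pairs fit into far fewer than 69 tests.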
Here's an example of jenny in action on a smaller problem.
I like this tool quite a bit and find it both useful and powerful. But it's not for non-technical users or those who don't understand pair-wise testing. It's important when using a tool like this that you understand what it is that you've asked the tool to generate, and its limitations. Pair-wise testing is not a substitute for risk analysis, Human-Computer Interaction testing, performance testing, etc.
Also, the fact that it must be compiled and only has a command line interface will be daunting and possibly overwhelming for many non-technical users. But this facet of jenny is also one of its strongest assets. Being a command line application gives it broad appeal: it can be used natively on Win32, Linux, most Unix variants, and even Mac OS X.
By default, jenny will generate all pair-wise combinations of inputs. But it is flexible and permits n-tuple combinations to be generated (e.g. 3-tuple, 4-tuple, etc.). I don't see higher-order tuples being needed very often, but it's good to know it's there.
In his notes, the tool's author goes into a good amount of detail about some relatively advanced combinatorics that frankly is over my head. While most users will want to generate all-pairs combinations of test cases, triples, quadruples, and more may be generated. It's comforting to know that the tool is capable of more in case I need it.
I find the user interface fairly straightforward, given that it's a command line application. While it certainly doesn't hold your hand, one or two simple examples make it clear how the tool functions in its basic use. Compiling and anything beyond the most basic use of the tool requires some familiarity with a command line interface; Unix familiarity is a plus. Some of the tools I reference in this review (specifically wc and sed) are not natively available in Microsoft operating systems. I found Cygwin to be an excellent compromise as it provides a Unix-like environment under Windows.
4 - Beta (on a scale of 1-5)
I would consider this tool to be at the beta maturity level, with a couple of caveats. The developer created this utility and placed it in the public domain. As of this writing, the developer has created four releases since August 2003 but there is no formal schedule for future releases.
I encountered no crashes or serious errors when using the tool. Key features have been implemented, but it is not particularly polished. However, there is ample documentation, including detailed info available by running "jenny -h".
3 - Stable (on a scale of 1-5)
I would rate the level of project activity as stable. Jenny is not two years old yet, but it has seen a couple of well-documented revisions. There is no formal schedule for future updates.
The program consists of a single C file. It appears to be ANSI C compliant. The author's web site includes a link to a pre-compiled Windows binary. I was able to compile the program with no problems using gcc on Windows 2000 with Cygwin, and on Red Hat Linux 9. It is likely to work on any system with a C compiler and a shell interface.
I found no mention of formal support. However, the author's web site does include his email address and he was quite helpful to me, graciously answering several of my questions.
No publicly accessible version control system exists, though old versions of the code are available. The code is well-commented but does not include a version number. A change log is available on the web page. The tool does not have an explicit diagnostic mode.
I think the documentation for this product is well done. The documentation consists of a single web page with the following categories:
The author does a good job describing the point of a tool like jenny, and follows up with basic syntax and a treatment of each of the command line options. Also covered are some tips for making jenny more useful (including using tools like wc and sed to filter and manipulate output), and a description of possible future enhancements. The author even lists a few competitive tools, both freeware and commercial.
For more information about all-pairs, see these papers: "Pairwise Testing" by Michael Bolton and "Efficient Testing Using the Pairwise Approach" by Bernie Berger.
There is no installer. The application consists of one C file, and after compilation there is just one executable file.
The tool is implemented in C. I built it very easily using Cygwin/gcc on a Windows 2000 PC. The compiler reported no errors or warnings with default settings; when I added the -Wall option, it issued several warnings. There are about 1300 non-comment source lines, and 320 lines of comments. The code is very well-formed and commented.
As with most any command-line application built with C using modern hardware, it's quite fast. I gave it some pretty large problem sets, most of them unreasonably large. Most were completed in under a second, with all finishing in under one minute. The bottom line is that I don't see performance being a limiting factor for anyone using the tool. Memory use by the tool is negligible in the basic examples.
I executed a few tests to see what kind of performance can be expected. All tests were done with the -n2 option (generate pair-wise tests) and run on an IBM ThinkPad with a 1.5 GHz Centrino and 768 MB RAM.
Jenny, run under Windows 2000 with Cygwin, compiled locally with gcc. The test runs used an equal number of dimensions and features within each dimension (for example, "9 x 9" means 9 dimensions with 9 features each).
| Problem size | Time | Generated Tests |
| 3 x 3 | 0m0.016s | 9 |
| 6 x 6 | 0m0.021s | 50 |
| 9 x 9 | 0m0.086s | 122 |
| 12 x 12 | 0m0.452s | 232 |
| 15 x 15 | 0m1.865s | 380 |
| 18 x 18 | 0m6.201s | 563 |
| 21 x 21 | 0m17.217s | 791 |
| 24 x 24 | 0m42.423s | 1058 |
I think that in practical use, 9 x 9, or 12 x 12, would be considered large problem sets (both examples completed in less than one second). If I ran into anything larger than that I would probably work to logically break the problem down into more manageable components. “9 x 9” results in 122 test cases and “12 x 12” results in 232 test cases.
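For readers who want to try similar timing runs, the sketch below builds the argument list for an "N x N" problem. It is an assumption about how such a run could be set up, not a record of the exact commands used for the tables; `square_args` is a hypothetical helper. The command is printed rather than executed, so the snippet works even where jenny is not installed.

```shell
# Build the arguments for an "N x N" run: the -n2 option followed by
# N dimensions that each have N features.
square_args() {
  n=$1
  args=""
  i=0
  while [ "$i" -lt "$n" ]; do
    args="$args $n"
    i=$((i + 1))
  done
  echo "-n2$args"
}

# Show the command that would time a 9 x 9 run and count its test cases.
echo "time jenny $(square_args 9) | wc -l"
```

Pasting the printed command into a shell where jenny is on the PATH should give a timing and a test count comparable to the "9 x 9" row, though results will vary with hardware and jenny version.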
ALLPAIRS, executed on Windows 2000, using the precompiled allpairs.exe:
| Problem size | Time | Generated Tests |
| 3 x 3 | 0.039 s | 10 |
| 6 x 6 | 0.379 s | 54 |
| 9 x 9 | 3.657 s | 139 |
| 12 x 12 | 19.963 s | 272 |
| 15 x 15 | 1m13.291 s | 447 |
| 18 x 18 | 3m39.952 s | 676 |
| 21 x 21 | 9m15.540 s | 949 |
| 24 x 24 | 21m21.560 s | 1296 |
A glaring shortcoming of ALLPAIRS is the inability to specify impossible or nonsensical pairings. Users must either split their problem up into multiple problems, or rework the resultant test cases by hand. Jenny is slightly more daunting for a new user, and in some cases requires a more technical audience, but it does provide more flexibility. Jenny offers a few features that ALLPAIRS does not, but by far the most valuable is the ability to specify impossible or nonsensical pairings. Most of the time when I need to generate pair-wise test cases, I need to exclude some pairing that my application won't logically permit. While I prefer the self-verifying output of ALLPAIRS, I will most often give the nod to jenny based on the ability to rule out impossible pairings. Another factor in jenny's favor is that its solutions are notably better optimized than those of ALLPAIRS: in the runs above it produced fewer test cases at every problem size.
Jenny's output is just a list of combinations of integers (representing different variables) and alpha characters (possible states for each variable); each row represents one test case.
Let's say your problem domain looks like this:
| OS | Browser | JavaScript |
| Mac OS X | Mozilla | Enabled |
| Win 2000 | Netscape | Disabled |
| Win XP | | |
You would feed this info into jenny using a command like:
$ jenny 3 2 2
The output of jenny would look like this:
1a 2b 3a
1b 2a 3b
1c 2a 3a
1b 2b 3a
1a 2b 3b
1c 2b 3b
1a 2a 3a
This is not the most useful output. The user is left to translate the output into their specific problem domain. In our example, “1a” maps to “Mac OS X”, “2b” maps to “Netscape” and so on.
See Appendix C for a more in-depth treatment of this issue.
Jenny does not generate any kind of debugging information to verify that it actually created all of the required pairings. In simple cases you can determine this manually, but this approach quickly becomes unreasonable in even moderately complicated scenarios. The adventurous can always walk through the code to verify the algorithm.
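Because jenny does not report coverage itself, one rough cross-check is to count the distinct pairs that appear in its output and compare that with the number expected. The following is a minimal sketch, assuming whitespace-separated output in the format shown in this review and no excluded pairings; `count_pairs` is a hypothetical helper, not part of jenny. The sample data is the seven-line output for "jenny 3 2 2" that appears in this review.

```shell
# Stand-in for live output; with jenny installed you could instead use:
#   tests=$(jenny 3 2 2)
tests='1a 2b 3a
1b 2a 3b
1c 2a 3a
1b 2b 3a
1a 2b 3b
1c 2b 3b
1a 2a 3a'

# Read test cases on stdin and print how many distinct pairs of values
# (taken from two different columns) occur in at least one test case.
count_pairs() {
  awk '{
    for (i = 1; i < NF; i++)
      for (j = i + 1; j <= NF; j++)
        seen[$i " " $j] = 1
  }
  END {
    n = 0
    for (p in seen) n++
    print n
  }'
}

covered=$(printf '%s\n' "$tests" | count_pairs)

# For dimensions 3, 2, 2 the pairs to cover are 3*2 + 3*2 + 2*2.
expected=$((3 * 2 + 3 * 2 + 2 * 2))

if [ "$covered" -eq "$expected" ]; then
  echo "all $expected pairs covered"
else
  echo "only $covered of $expected pairs covered"
fi
```

A matching count shows that every pair is present only because each token names its own column and value; if pairings are excluded, the expected figure has to be reduced accordingly.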
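Error messages are not always on target. For example, when I ran jenny with an -n option but no dimensions:
$ jenny -n33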
This is displayed:
jenny: -n says all n-tuples should be covered.
A message explaining that I provided no dimensions would have been more appropriate. There are several similar examples of misdirected error messages, but I think that this limitation is somewhat mitigated by the good documentation and the ability to run “jenny -h” to get detailed usage instructions.
$ jenny 2 2 2
The output will look like this, 5 total test cases. ‘a’ is the first possible state of a variable, and ‘b’ the second possible state. If there were more states, the progression would continue through the alphabet.
1a 2b 3a
1b 2a 3b
1a 2a 3b
1b 2b 3b
1b 2a 3a
A slightly more complicated example: here we have 3 variables, 2 with 3 possible states, and 1 with 2 possible states:
$ jenny 3 3 2
The output will look like this, 9 total test cases:
1a 2c 3a
1b 2a 3b
1c 2b 3b
1b 2b 3a
1a 2a 3b
1c 2a 3a
1b 2c 3b
1a 2b 3b
1c 2c 3b
$ jenny 3 2 2
The output would look like this:
1a 2b 3a
1b 2a 3b
1c 2a 3a
1b 2b 3a
1a 2b 3b
1c 2b 3b
1a 2a 3a
Now, we know that “1a” represents Safari, and “2b” represents Windows 2000. That combination appears in test cases 1 and 5. But this is a problem, as Safari is not available on that platform, so what can we do? Well, we can't just ignore those test cases because we would miss other valid pairings as well. We can manually dissect those test cases and reconstruct replacements by hand. However, that can be tedious and error prone, especially in a non-trivial example. Fortunately jenny allows us to specify nonsensical pairings using the “-w” flag.
We can now try this. It instructs jenny to create all pair-wise test cases but to avoid pairing “1a” with “2b”:
$ jenny -w1a2b 3 2 2
The output would look like this. Note that we have covered all pair-wise test cases without ever pairing “1a” with “2b”:
1a 2a 3a
1b 2b 3b
1c 2b 3a
1b 2a 3a
1a 2a 3b
1c 2a 3b
Not only did jenny deal with our restriction, it even did so with one fewer test case than before. This is an important feature with plenty of real-world use potential. In my daily test case creation I run into scenarios like this regularly, and I suspect others do too.
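A quick way to confirm that a restriction held is to search the output for the excluded pairing. The sketch below does this with grep on the six test cases listed above, used as a stand-in for live jenny output. It assumes the two excluded values sit in adjacent columns separated by a single space, as they do in this example; a pairing between non-adjacent columns would need a looser pattern.

```shell
# Output of "jenny -w1a2b 3 2 2" as listed above.
tests='1a 2a 3a
1b 2b 3b
1c 2b 3a
1b 2a 3a
1a 2a 3b
1c 2a 3b'

# grep -c prints the number of matching lines; 0 means the excluded pair
# never occurs. "|| true" keeps the zero-match exit status from being
# treated as a failure by a calling script.
forbidden=$(printf '%s\n' "$tests" | grep -c '1a 2b' || true)

# Count the test cases (awk avoids the padding some wc versions add).
total=$(printf '%s\n' "$tests" | awk 'END { print NR }')

echo "test cases: $total, containing the excluded pair: $forbidden"
```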
1a 2b 3a
1b 2a 3b
1a 2a 3b
1b 2b 3b
1b 2a 3a
This is not particularly helpful when your problem domain looks like the following. You would be left to map the values by hand.
| OS | Browser | JavaScript |
| Mac OS X | Mozilla | Enabled |
| Win 2000 | Netscape | Disabled |
| Win XP | | |
One of jenny's weaknesses is also a strength. Jenny works from the command line and writes to the standard output stream. This means that the command line parameters can be scripted, but more importantly, results from jenny can be piped into something else. If you want a quick check of how many test cases jenny will generate for a given input, see the following two examples. Jenny's output is piped into the “wc” (word count) program; with the -l option, wc reports the number of lines it receives, which in our case is the number of unique test cases.
In this case, jenny generated 7 unique test cases:
$ jenny 3 2 2 | wc -l
7
In this case, jenny generated 34 unique test cases:
$ jenny 7 4 4 4 3 2 | wc -l
34
Another powerful example of jenny's strength with respect to standard output is using another tool like “sed” to automatically translate the results into test cases we can use.
This command:
$ jenny 3 2 2 | sed -e 's/1a/MacOS_X/g' -e 's/1b/Win_2000/g' -e 's/1c/WinXP/g'
Will produce:
MacOS_X 2b 3a
Win_2000 2a 3b
WinXP 2a 3a
Win_2000 2b 3a
MacOS_X 2b 3b
WinXP 2b 3b
MacOS_X 2a 3a
The data can also be sorted in any manner you prefer, like this:
$ jenny 3 2 2 | sed -e 's/1a/MacOS_X/g' -e 's/1b/Win_2000/g' -e 's/1c/WinXP/g' | sort
MacOS_X 2a 3a
MacOS_X 2b 3a
MacOS_X 2b 3b
WinXP 2a 3a
WinXP 2b 3b
Win_2000 2a 3b
Win_2000 2b 3a
Note that for the sake of brevity and simplicity I have only translated the first column (the “OS” variable) of data. In practice you would translate all columns. As the sed commands get more verbose and wrap on the command line, it makes sense to place them all in a sed script and refer to the script. For example, if all of your sed commands are in a file named “sed_commands.sed”:
$ jenny 3 2 2 | sed -f sed_commands.sed | sort
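As an illustration of what such a script file might contain, the sketch below writes one substitution per line for all three columns of the example problem domain. The file contents are hypothetical: the replacement names follow the table above, with underscores chosen so that each value stays a single word. Three lines of sample output stand in for live jenny output so that the snippet runs without jenny installed.

```shell
# Create the sed script: one substitution per jenny token.
cat > sed_commands.sed <<'EOF'
s/1a/MacOS_X/g
s/1b/Win_2000/g
s/1c/WinXP/g
s/2a/Mozilla/g
s/2b/Netscape/g
s/3a/Enabled/g
s/3b/Disabled/g
EOF

# Translate a few sample test cases (stand-ins for "jenny 3 2 2" output).
translated=$(printf '%s\n' '1a 2b 3a' '1b 2a 3b' '1c 2a 3a' | sed -f sed_commands.sed)
echo "$translated"
```

One thing to watch for is a replacement name that happens to contain a later token (for example, a value named "v2a" would be rewritten by the "2a" rule), so it is worth scanning the translated output after adding new values.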