Testing Interactive Programs
Software QA Magazine, Vol. 3, No. 1, February 29, 1996 Tester's Toolbox column
by Danny Faught
Copyright 1996, Danny Faught
Many people are surprised by the challenges presented by testing interactive programs. In this column I'll discuss some of the issues involved with testing interactive tty-based programs under Unix. I'll start with a simple example, then I'll walk through an "Expect" script that solves a more complicated problem. I'll wrap up with some other challenges you might run into, and then a list of resources to get you started with "Expect".
A simple interaction
There are some cases where automation is easy. Since I'm most familiar with testing Unix itself, I'll use some common Unix utilities as examples. Let's say we need to test the "rm" utility, specifically the "-i" option, which specifies that it should interactively prompt the user to confirm each file that it tries to delete. It might be used like this:
    % rm -i foo
    rm: remove foo? y
    %
This is a very simple interaction, and it would be reasonable to use a pipe in a Bourne shell script to automate the "y" answer. Here's a sample test for "rm -i":
    #!/bin/sh
    touch foo
    echo y | rm -i foo
"Rm" happily reads the "y" from its standard input, not caring whether it comes from a user or a pipe. (I left out a check for the return code and output, and a check that the file was actually deleted.)
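The checks left out of the sample test could be filled in along the following lines. This is only a sketch: the function name, the scratch directory, and the PASS/FAIL strings are assumptions made for illustration, and because the wording of the prompt differs between "rm" implementations, the sketch only checks that something was printed.

```shell
#!/bin/sh
# Sketch of a fuller "rm -i" test. The function name, scratch directory,
# and PASS/FAIL strings are illustrative assumptions, not from the article.

test_rm_i() {
    dir=$(mktemp -d) || return 2
    touch "$dir/foo"

    # Answer the prompt through a pipe. Capture stdout and stderr together,
    # since the prompt is normally written to stderr.
    output=$(echo y | rm -i "$dir/foo" 2>&1)
    status=$?

    result=PASS
    # 1. rm should report success.
    [ "$status" -eq 0 ] || result=FAIL
    # 2. rm should have printed a prompt (wording varies by implementation).
    [ -n "$output" ] || result=FAIL
    # 3. The file should really be gone.
    [ ! -e "$dir/foo" ] || result=FAIL

    rm -rf "$dir"
    echo "$result"
}

test_rm_i
```

A stricter test would compare the prompt text against the wording documented for the "rm" under test.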
Bringing out the big guns
Unfortunately, testing interactive programs is usually not as easy as cobbling together some pipes in a shell script. Let's say our task is testing the "crypt" utility. "Crypt" is a simple filter that reads standard input and sends the encrypted output to standard output. We can send the encryption key on the command line, or we can type it interactively. To make things simple, most "crypt" tests would probably put the key on the command line. But we do need at least one test of "crypt"'s interactive input code, and that's what we'll do here.
If we try to send the key to "crypt" via the standard input, it just becomes part of the data. "Crypt" still demands a key from our terminal. "Crypt" and many other utilities bypass standard input by reading from the special file "/dev/tty". This device file provides a simple mechanism to read from the current terminal without having to figure out exactly what terminal it is. We can't fool "crypt" with standard input tricks. We need a real terminal session so the kernel will make /dev/tty valid.
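To see why piping the key fails, consider a small stand-in for a program that reads the way "crypt" is described as reading. The sketch below is not "crypt" itself: the function name is hypothetical, and the optional argument that replaces /dev/tty is an assumption added only so the behavior can be tried without a real terminal.

```shell
#!/bin/sh
# Hypothetical stand-in for a program that, like "crypt", takes its data
# from standard input but its key from a terminal device. The optional
# argument (default /dev/tty) is an assumption added for experimenting.
read_key_and_data() {
    tty_dev=${1:-/dev/tty}
    # The key comes from the terminal device, never from standard input.
    IFS= read -r key < "$tty_dev" || return 1
    # Everything arriving on standard input is treated as data.
    data=$(cat)
    printf 'key=[%s]\ndata=[%s]\n' "$key" "$data"
}
```

Run at a terminal as `printf 'this_is_the_key\nsome text\n' | read_key_and_data`, the function still waits for a key to be typed, and the piped "this_is_the_key" line ends up inside the data.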
Dealing with terminals can be treacherous. Fortunately, this wheel has already been invented in a public domain tool called "Expect". "Expect" is a scripting language based on "Tcl" (Tool Command Language). This means that an Expect script can use all of the base Tcl commands, plus the specialized features that Expect adds. It is the most powerful tool I've found for automating interactive programs under Unix (sorry, it only works on Unix).
"Expect" provides simple mechanisms for automating interactive processes, which is exactly what we're doing when writing automated tests for interactive programs. "Expect" takes advantage of pseudo-terminals, or ptys, which are similar to hardware terminals except that both ends are handled by software. We use a pty any time we run an xterm or do a remote login.
Here's an "Expect" script that exercises "crypt"'s input routine (with line numbers added for clarity):
     1  spawn -nottycopy /bin/sh -c "crypt < crypt.in > crypt.out"
     2  expect {
     3      timeout { send_error "TIMEOUT!\n"; exit -1 }
     4      eof { send_error "EOF!\n"; exit -1 }
     5      "Enter key:"
     6  }
     7  send "this_is_the_key\r"
     8  expect {
     9      timeout { send_error "TIMEOUT!\n"; exit -1 }
    10      eof
    11  }
    12  set status [wait]
    13  exit [lindex $status 3]
The "spawn" command on line 1 runs a program with its input and output connected to a pty. I used the "-nottycopy" option so that the pty won't inherit any of the settings of my own terminal when I run the script. The pty is initialized only with whatever default the kernel provides, and the default terminal settings that "Expect" provides. Without this option, the script might run fine normally, but act strangely when we run it under cron or an automated driver. This way we'll find these problems sooner. And we don't want different users to get different test results because of their terminal setup. If the program is dependent on any particular terminal settings, these should be set explicitly in the script.
Instead of running "crypt" directly, I ran "/bin/sh". This allowed me to use the shell's ability to redirect the output, taking input from a previously prepared "crypt.in" file, and capturing the output in "crypt.out". While I could have used some of "Expect"'s built-in commands to shuttle this data around, it would have been much more difficult. Using the shell this way, we're also testing "crypt" in the same manner that we expect the user to use it.
Once "crypt" is running, the spawn falls through to the first "Expect" command starting on line 2. The program is running independent of the "Expect" script at this point, at least until it asks for input. We know we want to send it the password, but it's wise to get synchronized with it first, since the user would normally wait for the prompt before typing. So I use the "Expect" command to wait for "crypt" to print the text "Enter key:" with the pattern on line 5. Note that we can also use patterns to match output when we can't predict exactly what it will look like. There is no action listed after the "Enter key:" pattern, so the script falls through to the next command when it finds this text.
It's important to consider what else might happen. Maybe the "crypt" developer misspelled one of the words, or left out the prompt entirely. "Expect" would never match its pattern. The default timeout is 10 seconds, so we'd see a 10-second delay and then our timeout action would fire, causing the test to fail. If I hadn't indicated a timeout action, the script would have simply continued, and it might not have caught the bug at all. So it's extremely important to provide a timeout action like the one on line 3. Similarly, I have an "eof" action on line 4 to catch the case where the program exits early, which causes "Expect" to read an eof (end of file) from the program's output. If we have several "Expect" commands, we should use "expect_after" to set a timeout and eof action for all of them at once.
If all goes well, we continue to the send command on line 7. I send the encryption key to the program, followed by "\r", which is the code generated by the return key. We can't leave out details like the return key! Since the input is going through a pty, the program sees its input even if it reads from the /dev/tty device.
On line 8, another "Expect" command traps a timeout, which could indicate that the program is hung in an infinite loop, running too slowly, or waiting for more input that we didn't anticipate. Or, it might mean the program simply needs more time and we should increase the default timeout.
This time, we expect to get an eof since the program should be finished. So I put eof as the last pattern, and don't provide an action. I could have added an action to print out "done" or something, if I wanted more feedback from the script.
Finally, we need to get the exit code from "crypt". I use "wait" to grab the status list on line 12, and then use "lindex" to pull out the return code (the fourth element, at index 3) and pass it on as the "Expect" script's status on line 13. If "crypt" dies from a signal, the shell converts it to a normal (but non-zero) exit code. If I had run "crypt" directly, we would have to check for a signal as well as a return code. Note that we can't tell whether a -1 return value is from the script itself or from "crypt", unless we examine the output.
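The way a shell reports a child killed by a signal can be observed directly. In this sketch the inner "sh -c" is only a stand-in for a program that dies from SIGTERM, and the variable name is an assumption for illustration.

```shell
#!/bin/sh
# Sketch: how a shell reports a child that was killed by a signal.
# The inner "sh -c" is a stand-in for a program that dies from SIGTERM.
status=0
sh -c 'kill -TERM $$' || status=$?

# POSIX only promises a value greater than 128; many shells use 128 plus
# the signal number, which would be 143 for SIGTERM.
echo "status seen by the shell: $status"
```

A shell that runs a command as a child and then exits with that command's status passes this number along as an ordinary exit code, which is why the script only has to look at the return code.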
All that's left at this point is to verify the output of the script to look for stray error messages, and to verify that "crypt.out" contains the right data.
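These final checks might be sketched as a small shell function run by the test driver. The function name and the file names other than "crypt.out" are assumptions for illustration, and the sketch presumes the output of the "Expect" script (including stderr, where "send_error" writes) was captured in a log file.

```shell
#!/bin/sh
# Sketch of the final checks. File names other than crypt.out, and the
# function name, are illustrative assumptions.
#   $1 = file with the expected encrypted data (prepared in advance)
#   $2 = file produced by the test, such as crypt.out
#   $3 = captured output (stdout and stderr) of the Expect script
verify_results() {
    expected=$1 actual=$2 log=$3

    # The encrypted data must match byte for byte.
    cmp -s "$expected" "$actual" || { echo "FAIL: data mismatch"; return 1; }

    # A TIMEOUT! or EOF! message from the script means the dialogue
    # did not go as planned.
    if grep -E 'TIMEOUT!|EOF!' "$log" >/dev/null; then
        echo "FAIL: error message in script output"
        return 1
    fi
    echo PASS
}
```

It could be called as `verify_results crypt.expected crypt.out crypt.log`, where "crypt.expected" and "crypt.log" are hypothetical names for a known-good output file and the captured script output.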
Other tidbits
While "Expect" takes us a long way toward providing an environment similar to the user's actual login session, there are still some gotchas. One is terminal protocols. The terminal driver doesn't have any knowledge of terminal protocols; these are handled by the terminal sitting on the user's desk, or an emulator such as xterm or a modem communication program. Let's say we want to test the "resize" utility. Resize attempts to send control characters to the terminal to determine its size. If there's no terminal, then resize won't behave as expected. I was able to spoof it by using "expect"/"send" sequences to respond to the few vt100 control codes that resize needs. For examples of more complete terminal emulators, see the sections on "term_expect" and "tkterm" in _Exploring Expect_.
We might also run into situations where we need an entry in the "utmp" database. This is the database that utilities like "finger" and "who" use to get their information. "Expect" does not provide an automated way to generate a utmp entry, because writing to the file requires superuser privileges. If a test needs a utmp entry, it's best to use a separate setuid program to generate it. An example of such a beast is "sessreg" which is part of the xdm distribution. I ran into this issue when testing "who am i". I decided to deal with it by building a separate utmp file and providing it on the who command line so I wouldn't have to deal with the system utmp file.
Automating an interactive program is a unique skill, and it can be frustrating. You're taking a program that was designed to deal with human typing speed and driving it as fast as "Expect" can interpret your script. Sometimes you'll greatly accelerate the detection of timing bugs, and other times you'll run into undocumented subtleties of the program that you'll have to work around. You should plan to be frustrated a few times before getting an "Expect" script to work.
One thing that can help get you started on a test is the "autoexpect" script in the example directory of the "Expect" sources. Autoexpect is a capture/playback tool that will record a session and then generate an "Expect" script to play it back. You may have to do some editing of the results or give it some hints along the way, but it can be a great help.
Deja Gnu
An article on testing and "Expect" wouldn't be complete without mentioning DejaGnu. DejaGnu is a testing framework implemented using "Expect", and distributed under the GNU General Public License. Use archie to find the DejaGnu archive nearest you.
Resources
You can obtain sources for "Expect" at ftp://ftp.cme.nist.gov/pub/expect/expect.tar.Z. You'll also need to get and build Tcl, which is available in the same directory. The "Expect" sources include the "Expect" FAQ, several example scripts, and the man page. You'll also need the Tcl man page, since the basic Tcl functions are not documented in the "Expect" man page.
If you're going to spend any amount of time with "Expect", invest in a copy of the "rhesus book":
    Don Libes, _Exploring Expect: A Tcl-Based Toolkit for Automating Interactive Programs_, O'Reilly & Associates, 1995. ISBN 1-56592-090-2.
For some somewhat dated but free background information, there are several useful articles about "Expect" available in the ftp archive containing the "Expect" sources. I recommend starting with:
    Don Libes, "Regression Testing and Conformance Testing Interactive Programs", Proceedings of the Summer 1992 USENIX Conference, San Antonio, Texas, June 8-June 12, 1992. (Available via ftp as regress.ps.Z, with slides in regress-talk.ps.Z.)
    Don Libes, "expect: Curing Those Uncontrollable Fits of Interaction", Proceedings of the Summer 1990 USENIX Conference, Anaheim, California, June 1990. (seminal.ps.Z and seminal-talk.ps.Z)
You can go to the Usenet newsgroup comp.lang.tcl to ask "Expect" questions, or send email to the author, Don Libes, at libes@nist.gov. See also the Tcl FAQ for general Tcl information.
I'd like to sincerely thank Don Libes for convincing me to use "Expect", and for providing great support along the way.