Artifact 1975c96ae9eb1e43fc7f7c0211531c0b3ba1d98c:
- File doc/os8-progtest.md, part of check-in [0c003f5ef9] at 2020-12-05 19:50:49 on branch trunk: Add documentation of the --exitfirst option and the work-around for long output. (user: poetnerd size: 10643)
os8-progtest: Perform Tests on a Program Under OS/8
This program uses the Python pexpect library to work through tests of a program running under OS/8. The test cases and expected output are expressed in YAML format and read with the pyyaml library.
It is found in the `tools` directory of the source tree.
Usage
os8-progtest [options] <prog_spec>
The `prog_spec` is the program to test, optionally followed by a subset of tests. For example:
$ tools/os8-progtest cc8
…runs all CC8 tests, while:
$ tools/os8-progtest cc8:ps,fib
…runs just the two CC8 tests on `ps.c` and `fib.c`.
More than one `prog_spec` can appear on the command line to test more than one program at a time.
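The `prog_spec` syntax described above can be illustrated with a minimal sketch. The helper name `parse_prog_spec` is hypothetical and is not taken from the `os8-progtest` source; it only shows one plausible way to split a spec such as `cc8:ps,fib` into a program name and a list of test names.

```python
def parse_prog_spec(spec):
    """Split 'cc8:ps,fib' into ('cc8', ['ps', 'fib']).

    Hypothetical helper for illustration only. An empty list is used
    here to mean "run every test defined for this program".
    """
    prog, sep, tests = spec.partition(":")
    names = [t for t in tests.split(",") if t] if sep else []
    return prog, names


# Several prog_specs may be given at once, as on the command line.
for spec in ["cc8", "cc8:ps,fib"]:
    print(parse_prog_spec(spec))
```

Treating a missing test list as "all tests" mirrors the two command-line examples above; how the real tool represents that case internally is an assumption.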
Because this test system is based on the pexpect library, success is determined by seeing a sequence of expected outputs come from the test program. If one or more of these fail to occur, the test will appear to hang until pexpect times out while waiting.
Options
| Argument | Meaning |
|---|---|
| `--help`, `-h` | show this help message and exit |
| `--verbose`, `-v` | increase output verbosity |
| `-d DEBUG` | set debug level; 0-14 |
| `--destdir DESTDIR` | destination directory for output files |
| `--srcdir SRCDIR` | source directory for test `.yml` files |
| `--target TARGET` | target image file |
| `--dry-run`, `-n` | dry run: only print what would happen |
| `--exitfirst`, `-x` | exit on first failure |
When `--srcdir` is not given, `os8-progtest` looks for YAML files in `scripts/os8-progtest` relative to the PiDP-8/I source tree root.
The default debug level of 0 suppresses all debug output, while 14 makes it quite noisy.
The `--exitfirst` option makes `os8-progtest` exit with a nonzero status on the very first failure. The usual behavior is to perform all tests, reporting failures along the way, and to treat that as a successful run.
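The difference between the two modes can be sketched as follows. This is a hedged illustration, not the tool's code: `run_tests` and the demo test list are hypothetical, and the exit status of 0 in the default mode is an assumption based on the description of a normal run above.

```python
def run_tests(tests, exit_first=False):
    """Run (name, check) pairs; each check returns True on a pass.

    Returns (names run, names failed, exit status). Hypothetical
    sketch: the status is nonzero only when exit_first stops the run.
    """
    ran, failed = [], []
    for name, check in tests:
        ran.append(name)
        if not check():
            failed.append(name)
            print(f"FAILED: {name}")
            if exit_first:
                return ran, failed, 1  # stop at the first failure
    return ran, failed, 0  # all tests were performed


# Made-up results: the second test fails.
demo = [("ps", lambda: True), ("fib", lambda: False), ("basic", lambda: True)]
print(run_tests(demo, exit_first=True))
print(run_tests(demo))
```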
The Test Definition File
Each `.yml` file defines a series of tests, each of which is a state machine. Each state in the machine consists of a state name, the text string to send, and an array of possible responses, each paired with the state to go to if that response is received.
In the abstract, each test name starts at the beginning of a line, and all the associated states are indented one tab stop.
Each state consists of a name followed by an array specification, where the first element is the test text string to send, and the second element is an array of 1 or more `[response, newstate]` pairs.
Every state machine must have a “`start`” state. Every state machine should have at least one state that names “`success`” or “`failure`” as its termination state.
The `-n`, `--dry-run` option should be used at development time to confirm that the test state machine will terminate with success if all tests come back successful.
The easiest way to understand how to define the state machine is to study an example:
    'ps':
        'start': ["EXE CCR\r", [["PROGRAMME\\s+>", 'progname']]]
        'progname': ["ps.c\r", [
              [ ".*924.*COMPLETED\r\n\r\n#END BATCH\r\n\r\n.$",
                'success'
              ]
            ]
          ]
    'fib':
        'start': ["EXE CCR\r", [["PROGRAMME\\s+>", 'progname']]]
        'progname': ["fib.c\r", [
              [ "OVERFLOW AT #18 = 2584\r\n\r\n#END BATCH\r\n\r\n.$",
                'success'
              ]
            ]
          ]
This file defines two tests for the `cc8` package, `ps` and `fib`, which follow a very similar structure:
- Both have the requisite `start` state and terminating `success` condition.
- Both run the OS/8 command `EXE CCR`, which runs `DSK:CCR.BI`, part of the CC8 package.
- Both react to the program name input prompt by linking to the `progname` state.
- Each test answers the `progname` state by sending the name of a C program installed with CC8, after which the test was named.
- Each test looks for expected output from the test program followed by the common `#END BATCH` after a completed `CCI.BI` run. If found, it sends the test state machine to the `success` condition.
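The structural rules above (a `start` state, transitions that lead somewhere, a reachable `success`) can be checked mechanically. The following is a minimal sketch under stated assumptions: the test is held in a plain Python dictionary shaped like the YAML, and `check_machine` is a hypothetical helper, not a description of what `--dry-run` actually does.

```python
# Same shape as the YAML: state -> [text_to_send, [[response, next_state], ...]]
ps_test = {
    "start": ["EXE CCR\r", [["PROGRAMME\\s+>", "progname"]]],
    "progname": ["ps.c\r", [
        [".*924.*COMPLETED\r\n\r\n#END BATCH\r\n\r\n.$", "success"],
    ]],
}

TERMINAL = {"success", "failure"}


def check_machine(states):
    """Return a list of structural problems; empty means none found."""
    if "start" not in states:
        return ["no 'start' state"]
    problems = []
    # Every transition must name a defined state or a terminal state.
    for name, (_send, replies) in states.items():
        for _pattern, target in replies:
            if target not in states and target not in TERMINAL:
                problems.append(f"{name}: unknown next state {target!r}")
    # Walk the transitions from 'start' to see what can be reached.
    reachable, todo = set(), ["start"]
    while todo:
        state = todo.pop()
        if state in reachable:
            continue
        reachable.add(state)
        if state in states:
            todo.extend(target for _pattern, target in states[state][1])
    if "success" not in reachable:
        problems.append("'success' is not reachable from 'start'")
    return problems


print(check_machine(ps_test))
```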
Because the second element of each test is an array of pairs, you can have the test check for multiple expected possible answers, sending the state machine to potentially different conditions depending on which one comes back. This gives the system a simple sort of conditional logic:
        'advent': ["ADVENT\e", [
              [ "LOCATION OF TEXT DATABASE\\s+\\(\\S+\\).*", 'database' ],
              [ "WELCOME TO ADVENTURE!!", 'instructions' ]
            ]
          ]
You can see the rest of the test here, but this shows the essential elements: two possible responses, sending the test to one of two states, `database` or `instructions`, depending on whether Adventure has been run on the boot media used by the test before. Without this logic, we’d have to either rebuild the test media each time we ran the test or roll back all state changes made to it.
Notice the use of `\e` to start `ADVENT`, terminating the OS/8 Command Decoder input with an ASCII escape character.
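The branching on multiple responses can be sketched with Python's standard `re` module. This is a simplification, not the tool's implementation: it tries the patterns in list order against a complete piece of output, whereas pexpect matches against output as it arrives. The helper name `next_state` and the sample outputs are made up for illustration.

```python
import re

# Reply table of the 'advent' state: [regex, next_state] pairs.
replies = [
    ["LOCATION OF TEXT DATABASE\\s+\\(\\S+\\).*", "database"],
    ["WELCOME TO ADVENTURE!!", "instructions"],
]


def next_state(output, replies):
    """Return the state paired with the first pattern found in output."""
    for pattern, target in replies:
        if re.search(pattern, output, re.DOTALL):
            return target
    return None  # nothing matched; the real tool would wait and time out


# Hypothetical program output for the two cases.
print(next_state("LOCATION OF TEXT DATABASE (DSK:)", replies))
print(next_state("WELCOME TO ADVENTURE!!\r\n", replies))
```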
Additional Syntax Information
A line beginning with a `#` is ignored as a comment.
Crafting New State Machines
The YAML quoting conventions are carefully chosen!
- Surround state names with single quotes. Otherwise state names like `yes` get evaluated and turned into something else. (`yes` becomes `True`.)
- Surround send and reply strings with double quotes. The YAML evaluation of double-quoted strings is the simplest to understand as it gets translated into Python pexpect regular expressions.
- OS/8 commands end with a carriage return denoted by `\r` or escape denoted by `\e`.
- For programs needing Ctrl-C to get out of some loop, use the YAML hex code `\x03`.
- `pexpect` translates all TTY output you see from running the simulator to upper case, so take anything you see and translate it to upper case in the reply string.
- It is important to escape characters that normally have regex meaning: `.`, `+`, `*`, `$`, `(`, `)`, and `\`. To do this in YAML, preface each occurrence with two backslashes. So for example, to pass in a literal question mark, replace `?` with `\\?`.
- To pass in regex escapes that have a backslash (for example `\s` for whitespace), double the backslash, so `\s` becomes `\\s`.
- Use of regex’s end-of-string match (`$`) can often improve reliability, because it ensures the state machine doesn’t proceed before OS/8 or the program running under it is ready. Keep in mind that `os8-progtest` runs in your host machine’s context, and while the program under test is running unthrottled on the PDP-8 simulator, it’s still likely a program from the 1960s or 1970s expecting to run on a machine capable of only a few hundred thousand instructions per second, being fed interactive input by a 110 bps teletype. Some programs can get spammed if you don’t wait out the full reply line before sending the next bit of input.
- Sometimes guessing the exact whitespace is difficult. The `\\s+` construct to match on one or more whitespace characters is often helpful.
- Helpful match strings:

| String | Meaning |
|---|---|
| `"\n\\.$"` | OS/8 Monitor prompt. Always look for this at the end. |
| `"\n\\*$"` | OS/8 Command decoder prompt. Often the first step in running programs. |
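The two layers of escaping can be worked through with Python's standard `re` module. This is a sketch only: `yaml_reply_string` is a hypothetical helper that handles printable text without double quotes or control characters, and the prompt checks use Python string literals that hold the same characters a YAML double-quoted string would produce.

```python
import re


def yaml_reply_string(literal_text):
    """Return literal_text as a YAML double-quoted regex matching it exactly.

    Hypothetical helper: suitable only for printable text without
    double quotes; control characters such as CR are not handled.
    """
    regex = re.escape(literal_text)                 # WHAT? -> WHAT\?
    return '"' + regex.replace("\\", "\\\\") + '"'  # WHAT\? -> "WHAT\\?"


print(yaml_reply_string("WHAT?"))

# After YAML decoding, "\n\\.$" is the regex: newline, literal dot, end.
monitor_prompt = "\n\\.$"
decoder_prompt = "\n\\*$"
print(bool(re.search(monitor_prompt, "#END BATCH\r\n\r\n.")))
print(bool(re.search(decoder_prompt, "\r\n*")))
```

The first layer (`re.escape`) makes the character literal for the regex engine; the second layer (doubling the backslash) survives YAML's own processing of double-quoted strings.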
Problems matching against long strings of output characters.
Python pexpect has been observed to misbehave on long strings of output, for example when trying out BASIC games that print typewriter art. The match times out and fails, and the `before` match string is only a partial read of the whole output.
Neither enlarging the pexpect `maxread` option for the spawned sub-process nor setting a sleep between tests helped. However, there is a work-around:
Perform another write/expect cycle. Doing this is challenging, because ideally you want to send input that won't mess up the output. Under BASIC, an attempt was made to send `XOFF` ('0x011') but sometimes, instead of sending `XOFF` that would be ignored, OS/8 would echo "X011". The work-around that actually worked was to:
1. Detect stalled output with careful crafting of additional state matches, and add a new state.
2. Prevent false positive tests of stalled output by putting the longest, definitive match first in the list.
3. Send a newline.
4. Retest.
5. If necessary, loop a couple of times.
6. Make sure tests after the kick handle the additional newlines gracefully.
Example: The playboy bunny typewriter art kept hanging at random points. The old version:
    'bunny':
        'start': ["R BASIC\r", [["NEW OR OLD--$", 'old']]]
        'old': ["OLD\r", [["FILE NAME--$", 'name']]]
        'name': ["BUNNY.BA\r", [["READY\r\n$", 'run']]]
        'run': ["RUN\r", [[".*BUNNY.*\r\nREADY\r\n$", 'success']]]
        'quit': ["\x03", [["\n\\.$", 'success']]]
becomes:
    'bunny':
        'start': ["R BASIC\r", [["NEW OR OLD--$", 'old']]]
        'old': ["OLD\r", [["FILE NAME--$", 'name']]]
        'name': ["BUNNY.BA\r", [["READY\r\n$", 'run']]]
        'run': ["RUN\r", [
              [".*BUNNY.*\r\nREADY\r\n$", 'quit'],
              ["^RUN.*", 'kick']
            ]
          ]
        'kick': ["\r", [
              [".*\r\nREADY\r\n", 'quit'],
              [".*BUNNY.*", 'kick']
            ]
          ]
        'quit': ["\x03", [["\n\\.$", 'success']]]
A few subtle aspects:
- The `kick` state can loop until it gets the "READY" signifying the end of the run.
- The test for "READY\r\n" in the `run` state contains the detection of end of string (`$`) but the test in the `kick` state does not, because there will be extra newlines.
- The work-around here relies on the hope that there'll be a string with the whole word "BUNNY" in it with the partial read.
- Note that the wild card matches (`.*`) are non-greedy, and match on the smallest success. That's why the longest match for the successful run is the first test to apply.
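The flow through the `run`, `kick`, and `quit` states can be traced with a small simulation. This is a sketch under loud assumptions: `run_machine` is a hypothetical driver, the output chunks are made up, and each chunk is matched on its own with the standard `re` module, whereas pexpect accumulates a buffer from the live simulator.

```python
import re

# The last three states of the revised 'bunny' test.
machine = {
    "run": ["RUN\r", [
        [".*BUNNY.*\r\nREADY\r\n$", "quit"],   # longest, definitive match first
        ["^RUN.*", "kick"],                    # stalled: only a partial read
    ]],
    "kick": ["\r", [
        [".*\r\nREADY\r\n", "quit"],
        [".*BUNNY.*", "kick"],                 # still stalled: kick again
    ]],
    "quit": ["\x03", [["\n\\.$", "success"]]],
}


def run_machine(states, outputs, first="start", max_steps=10):
    """Feed one canned output chunk per send; return the states visited."""
    state, path, chunks = first, [], iter(outputs)
    while state not in ("success", "failure") and len(path) < max_steps:
        path.append(state)
        _send, replies = states[state]
        output = next(chunks, "")  # empty string once the script runs out
        state = next((target for pattern, target in replies
                      if re.search(pattern, output, re.DOTALL)), "failure")
    if state not in ("success", "failure"):
        state = "failure"  # gave up after max_steps
    return path + [state]


# Hypothetical transcript: output stalls twice before BASIC prints READY.
stalled = ["RUN\r\n BUNNY BUNNY", "\r\n BUNNY BUN",
           "NY\r\nREADY\r\n\r\n", "^C\r\n."]
print(run_machine(machine, stalled, first="run"))
```

The `max_steps` cap is a design choice of this sketch so that a `kick` loop that never sees "READY" ends in `failure` instead of spinning forever.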
License
Copyright © 2020 by Bill Cattey. Licensed under the terms of the SIMH license.