In the Beginning...

The original version of the PiDP-8/I software by Oscar Vermeulen simply turned the front panel LEDs on and off rapidly in a continuous update scan,¹ with each row of LEDs updated every 0.3 ms, so that with 8 logical rows of LEDs, the panel update loop would get back to a given row roughly every 1/400 of a second. At the instant of the update, the original code would look at the current processor state and decide how to strobe that row's LEDs to show the values set by the processor core.

Even on the slowest Pi we support (the Raspberry Pi model A+), the simulated PDP-8 processor core changes state about a million times per second, but because of the persistence of vision (PoV) limit of the human eye, 400 Hz is plenty fast. (Indeed, it is excessive, as we will see below.)

A typical LED can go from "off" to full brightness in under a microsecond, but when you turn one on and off at "only" 400 Hz, it looks to the human eye as though the LED simply isn't being driven to its full brightness. The eye doesn't see the rapid blinking; it sees a steady low brightness. A high-speed motion picture camera would see the LEDs turning on nearly instantaneously, staying at full brightness for a time, and then turning back off nearly instantaneously. Because the rows are scanned in turn, the LEDs in any given row are lit at most 1/8 of the time; at any instant, the LEDs in the other seven rows are off!
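The scan arithmetic above can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the names are illustrative rather than taken from the PiDP-8/I source, and it assumes the 0.3 ms row period applies to the 8 LED rows only, ignoring switch rows and per-row overhead.

```python
# Back-of-the-envelope check of the original scan timing.
# All names are illustrative; none come from the PiDP-8/I source.
ROW_PERIOD_S = 0.0003   # each LED row is serviced every 0.3 ms
LED_ROWS = 8            # logical LED rows in the matrix


def panel_refresh_hz(row_period_s: float = ROW_PERIOD_S,
                     rows: int = LED_ROWS) -> float:
    """Rate at which the scan loop returns to any given row."""
    return 1.0 / (row_period_s * rows)


def max_duty_cycle(rows: int = LED_ROWS) -> float:
    """Largest fraction of the time any one row's LEDs can be lit."""
    return 1.0 / rows


if __name__ == "__main__":
    print(f"refresh rate: {panel_refresh_hz():.0f} Hz")
    print(f"max duty cycle: {max_duty_cycle():.3f}")
```

The refresh rate comes out a little above 400 Hz, which is why the text calls the figure "roughly" 1/400 of a second.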

The Problem

The PiDP-8/I project is all about trying to mimic a piece of hardware produced in the late 1960s. LEDs existed at the time the PDP-8/I was being designed, but they cost about $200 each and visible light LEDs were only available in red at the time. Those were both non-starters for the PDP-8/I, so the DEC designers chose to use an amber incandescent light bulb.

Unlike LEDs, incandescent lamps cannot turn rapidly on and off. Even a slow human eye can see an incandescent lamp's warm-up and cool-down behavior, because it takes time to heat a metal wire up to its incandescence point and even more time for it to cool back down and drop below this point.

When you run a program on a PDP-8/I that turns one of its indicator lamps on and off rapidly, this thermal inertia produces an even more pronounced "blurring" effect than the one described above with rapidly-scanned LEDs. There is a range of change rates that would blur together into a soft constant glow on a real PDP-8/I but which a human would perceive as flickery on a PiDP-8/I with the original lamp driving code. The result is that programs running on a real PDP-8/I that give a nice soft glowing effect would appear flickery when run on a PiDP-8/I.

Schofield's Incandescent Lamp Simulator

Ian Schofield fixed this problem with his incandescent lamp simulator patch.²

It works by maintaining a set of brightness values, one per LED, continually updated each time through the display update loop. He cut the delay values to 1/60th the values chosen by Oscar Vermeulen, which allowed him to get 32 distinct brightness levels per LED while still achieving a ~400 Hz display update rate.

Schofield's ILS updates these per-LED brightness values once every 1049 PDP-8 CPU instructions executed by the simulator. Since the simulator runs at around 7 MIPS on a Raspberry Pi 2 or 3, this means you get several thousand brightness level updates per second. With 32 brightness levels, that is fast enough to feed the 400 Hz display update thread with continuous changes. More updates per second would only slow the simulator down without materially improving the appearance of the display.
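As a rough check of those numbers, the sketch below divides an instruction rate by the 1049 divisor. The function names are mine, and the 7 MIPS and 400 Hz inputs are simply the approximate figures quoted above, not measured values.

```python
# Rough arithmetic for the fixed-divisor update scheme.
# Function names are illustrative, not from the simulator source.
DIVISOR = 1049  # instructions executed between brightness updates


def brightness_updates_per_second(ips: float, divisor: int = DIVISOR) -> float:
    """Brightness updates per second at a given instruction rate."""
    return ips / divisor


def updates_per_panel_refresh(ips: float, refresh_hz: float = 400.0,
                              divisor: int = DIVISOR) -> float:
    """How many brightness updates land between two panel refreshes."""
    return brightness_updates_per_second(ips, divisor) / refresh_hz


if __name__ == "__main__":
    for ips in (7_000_000, 30_000):
        print(ips, round(brightness_updates_per_second(ips), 1),
              round(updates_per_panel_refresh(ips), 2))
```

At 7 MIPS this gives several thousand updates per second, or well over ten per panel refresh. The same function shows how sensitive the scheme is to simulator speed: at a throttled 30 kIPS, the rate used later in this article, it yields fewer than 30 updates per second.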

A key feature of Schofield's ILS is that it modifies the lamp brightness nonlinearly as a function of its present brightness:

[Figure: Schofield ILS brightness curves]

That is, if an LED is off (brightness 0) and you turn it on, the first time through this update loop its new brightness level takes a rapid jump upwards. (Green curve above.) As long as the desired state of that LED remains "on," the lamp simulator keeps increasing its brightness level, but the rate slows as it approaches full brightness. This simulates the rapid rise towards incandescence of a real light bulb and its asymptotic approach towards thermal stability.

When the desired lamp state goes to zero, the brightness level falls nonlinearly as before, asymptotically approaching 0 brightness. (Red curve above.) Since brightness is quantized to 32 levels, it falls through the zero-brightness threshold after about 350 time steps.

The functions used are:

Falling: b = b - b × 0.01
Rising: b = b + (32 - b) × 0.01

The falling function is the easiest one to understand: it removes 1% of the current brightness level at each time step. Since each subtraction depends on its current value, the amount of brightness dropped at each time step decreases.

The rising function is conceptually the same, except that we increase the brightness by 1% of the difference between the current brightness level and the full-brightness level, 32.

This nonlinear behavior is why it takes 350 time steps to drop from 100% brightness to "off" by 1% steps, rather than 100 time steps. The early drops are fast, but because the steps decrease in size continuously, it ends up taking 3.5x longer than a linear 1% per time step algorithm would.

Incidentally, the two functions above are actually the same function. The falling function can also be written b = b + (0 - b) × 0.01, showing that the only thing varying is the target value: 0 for falling, 32 for rising.
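The single-function view can be sketched in Python as follows. This is a minimal model, not Schofield's code: the names are hypothetical, and it assumes that "off" means the brightness truncates to integer level 0.

```python
# Minimal model of the rise/fall update as one function.
# Names are hypothetical; quantizing by truncation is an assumption.
MAX_BRIGHTNESS = 32   # number of brightness levels
RATE = 0.01           # fraction of the remaining gap closed per step


def step(b: float, target: float, rate: float = RATE) -> float:
    """Move b toward target by a fixed fraction of the remaining gap."""
    return b + (target - b) * rate


def steps_until_off(start: float = MAX_BRIGHTNESS) -> int:
    """Count time steps until the truncated brightness level reaches 0."""
    b, n = float(start), 0
    while int(b) > 0:
        b = step(b, 0.0)   # falling: target is 0
        n += 1
    return n


if __name__ == "__main__":
    print("first rising step from 0:", step(0.0, MAX_BRIGHTNESS))
    print("steps to fall from full brightness to off:", steps_until_off())
```

Under these assumptions, `steps_until_off()` returns 345, in line with the "about 350 time steps" figure given above.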

Problems with Schofield's ILS

Although this scheme solved the original problem, and solved it well, it has a number of imperfections:

1. The hard-coded divisor means that it only works properly when the PDP-8 simulator is running flat-out on a Raspberry Pi 2 or 3. It is the correct value for that case only.

If the Raspberry Pi Foundation comes out with a new Pi that's twice as fast as the old ones, the simulator would run twice as fast, so the brightness updates would come twice as often, changing the perceived brightness curves.

More critically for our current purposes, if you give the SET THROTTLE command to the SIMH PDP-8 simulator to make it run slower, the jumps between brightness steps become visible to the human eye because they aren't happening fast enough.

We can't simply update the LED values used by the GPIO thread once per simulated PDP-8 instruction.³ What we need here is a dynamic function that modifies the update rate according to the instructions per second (IPS) rate the processor is currently running at.

2. Vermeulen's original 400 Hz update rate is wasteful, and Schofield's choice to maintain it isn't justified by human psychovisual science. The human PoV limit means we really only need an update rate of about 100 Hz to fool the eye into seeing a continuously-updated picture. Since the display update rate is the single largest factor affecting CPU usage of the GPIO thread, lowering this to 100 Hz saves quite a bit of CPU power.

3. While the PDP-8 simulator is running at a steady rate, the fixed delays and fixed iteration values mean it updates the display at a fairly predictable rate, with low jitter.⁴ Unfortunately, this desirable-sounding characteristic gives rise to beat frequencies due to aliasing created by the display sampling rate as it interacts with the CPU state update rates of the running PDP-8 program.

The Schofield simulator attempts to solve this with its 1049 divisor, claiming that because it is prime, it causes all CPU states to be sampled evenly. This is wrong-headed. Prime numbers (or at least, coprime numbers) can indeed avoid aliasing problems in sampling systems,⁵ but only if they are on the other side of the division sign: prime numbers are indivisible, but you can evenly divide a prime number into infinitely many other larger numbers.

To see the problem, consider what happens when the CPU state happens to be updating at 401 × 1049 × N times per second, where N is an integer constant such as 17, which comes to about 7.15 MIPS, a perfectly plausible running rate on a Pi 3. The first number is the display update rate (remember, the ~400 Hz display update rate is approximate) and the second is Schofield's brightness state update divisor. We have three prime numbers here, but we still have a potential aliasing problem because the simulated CPU is running a program with its own update rate, which may interact with this composite 7.15 MIPS update rate.

Use of a sufficiently large prime number as the denominator instead of the numerator does have some small benefit: 1049 is an even divisor of far fewer of the possible IPS rates than something objectively bad like 60. Nevertheless, it is no guarantee of even sampling.

4. Above, I observed that if you turn an incandescent lamp on and off, it appears to come on much faster than it turns off. This is because turning on a lamp involves driving it with an electrical current until it incandesces, whereas turning it off is a passive process, requiring the lamp to cool down through conduction and radiation. Thermal inertia slows the latter process. Schofield's ILS doesn't try to model this.
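The sampling argument in problem 3 can be illustrated with a toy model. In this sketch, which is my own construction and not code from either ILS, a hypothetical PDP-8 program runs a loop of `loop_len` instructions and holds one LED on for the first half of each pass, so its true duty cycle is about 50%. We then look at the LED once every `divisor` instructions and see what duty cycle the sampler believes it has.

```python
# Toy model of fixed-interval sampling of a periodic LED state.
# The "program" and all names here are hypothetical.
def sampled_duty(loop_len: int, divisor: int, samples: int = 10_000) -> float:
    """Fraction of samples that find the LED on.

    The LED is on while the loop phase is in the first half of the loop.
    One sample is taken every `divisor` instructions.
    """
    on = 0
    for k in range(samples):
        phase = (k * divisor) % loop_len
        if phase < loop_len // 2:
            on += 1
    return on / samples


if __name__ == "__main__":
    # Loop length coprime to the divisor: every phase gets sampled.
    print("loop of 1000:", sampled_duty(1000, 1049))
    # Loop length equal to the divisor: every sample hits phase 0.
    print("loop of 1049:", sampled_duty(1049, 1049))
```

With a 1000-instruction loop the sampler sees the expected 50% duty cycle, but with a 1049-instruction loop every sample lands on the same phase and the LED appears to be on 100% of the time. The primality of 1049 does nothing to prevent this, because the program's period is a multiple of the sampling interval.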

First Steps Toward a New ILS

In mid January 2017, I (Warren Young) began work on modifications to the ILS code.

My first goal was to fix problem #1 above. Specifically, I wanted to run ac-mq-blinker.pal at 30 kIPS so the individual AC and MQ register state changes would be distinctly visible to a human watching it. I wanted the nice LED turn on-and-off behavior of Schofield's ILS even at such low speeds.⁶

While making that improvement, I got clever. Too clever, as it turns out.

I've known about the human PoV limit for quite a long time, so I thought, why do we need to update the brightness values thousands of times per second? I understand that 400 Hz × 32 brightness steps means we need thousands of updates per second to feed the ILS version of the GPIO panel update loop, but why is this happening in the middle of the CPU instruction loop in the first place? What if we just "publish" the current on and off state for each LED at more like 100 Hz, and let the GPIO thread update its own brightness values based on those current target states?

That is, if the simulator starts out with an LED turned off, and the simulator says it's supposed to be turned on for 50 or so instructions, this new ILS code would ramp that LED up to 100% brightness, according to the formulas and graphs above. Perfectly fine, right?

After the above changes landed as v20170123, a number of people pointed out that the display wasn't updating as smoothly as before. Why not?

With the original Schofield ILS algorithm, the brightness values were updated roughly 8000 times a second, which means that any interaction between that update rate and the repaint rate of the panel was likely beyond human perception. Oh, there was still the potential for bad interactions per problem #3 above, but it would take a pretty specific set of circumstances to make them visible on the front panel. Also, the fact that this brightness level update happened inside the simulator's CPU instruction dispatch loop meant the updates were synchronous with respect to CPU instruction updates, so we only had two update rates that could interact, one in each program thread.

When I moved the brightness update code from the CPU thread to the GPIO thread, I broke that synchrony, and my decoupling of the target LED state values from the LED brightness update loop and update rate change compounded the problem. I had three very different update rates all going on at the same time,⁷ plus timing jitter from the host OS's process scheduler. This vastly magnified the potential for producing the unwanted visual artifacts listed in problem #3 above.

I'd made it worse, not better.

The New ILS

The solution to these newly-created problems was to change from publishing a single target value (on or off) for each LED to aggregating the "on" times for each LED during the time between display updates.

That is, if we have a PDP-8 program producing a 50% duty cycle on one of the LEDs, we want the display update code to be told that this LED was turned on half the time since the last time it updated the display. Instead of applying Schofield's nonlinear math to produce the new brightness value from a simple on/off value, we now apply it to the new ideal brightness value. The equations change a bit:

Falling: b = b - b × 0.012
Rising: b = b + (t - b) × 0.005

You will notice that the multipliers are now different for each case, which fixes problem #4 above. I changed both values rather than one because I wanted the turn-on time to be a bit slower than with Schofield's version. I have no idea if this is more accurate with respect to the actual brightness curves of the lamps used in a PDP-8/I, but it looks nice, and that's good enough for now.⁸

The other change is that the rising-brightness equation now uses a value t instead of 32. This is the target brightness level calculated by the code that accumulates LED "on" time for the current full-display update. We no longer calculate as if ramping from the current brightness toward 100% brightness, but instead toward an ideal target value based on what happened in the CPU since the last update.
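Put together, the aggregation and the two equations might look like the sketch below. The names are hypothetical, and two details are my assumptions rather than anything stated above: the target t is taken to be the "on" fraction scaled to 32 levels, and the rising equation is applied whenever t exceeds the current brightness, with the falling equation used otherwise.

```python
# Sketch of the aggregated-target update. Names are hypothetical.
# Assumptions: t is the "on" fraction scaled to 32 levels, and the
# rising equation applies only while t is above the current brightness.
MAX_BRIGHTNESS = 32.0
RISE_RATE = 0.005   # multiplier from the rising equation
FALL_RATE = 0.012   # multiplier from the falling equation


def target_brightness(on_count: int, total_count: int) -> float:
    """Ideal brightness t from the LED's 'on' time since the last update."""
    if total_count <= 0:
        return 0.0
    return MAX_BRIGHTNESS * on_count / total_count


def update_brightness(b: float, t: float) -> float:
    """Apply one time step of the rising or falling equation."""
    if t > b:
        return b + (t - b) * RISE_RATE   # rising: approach the target t
    return b - b * FALL_RATE             # falling: decay toward zero


if __name__ == "__main__":
    t = target_brightness(on_count=50, total_count=100)  # 50% duty cycle
    b = 0.0
    for _ in range(5):
        b = update_brightness(b, t)
    print(f"target {t}, brightness after 5 steps {b:.4f}")
```

A 50% duty cycle gives a target of 16, and the brightness creeps toward it at 0.5% of the remaining gap per step.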

Those changes were insufficient, however. We still had three distinct update frequencies, two of which were fairly stable, and the third could be stable, meaning we hadn't really solved problem #3 above. The solution was to add some random timing jitter into the process that updates the LED "on" time statistics. This is a form of dithering, and it serves to break up any aliasing produced by the interaction of the various fixed update frequencies.
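The effect of the dithering can be seen in a toy model. This is a sketch of the general technique, not the actual ILS code: a hypothetical program holds an LED on for the first half of a 1049-instruction loop, and we sample it at a nominal interval of 1049 instructions, with and without random jitter added to each interval. The jitter range and seed are arbitrary choices for the demonstration.

```python
import random

# Toy demonstration of dithering the sampling interval.
# The program model, names, jitter range, and seed are all illustrative.
def dithered_duty(loop_len: int, interval: int, jitter: int,
                  samples: int = 20_000, seed: int = 1) -> float:
    """Fraction of samples that find the LED on.

    The LED is on during the first half of each loop pass. Each sample
    is taken `interval` instructions after the last one, plus a random
    offset in the range [-jitter, +jitter].
    """
    rng = random.Random(seed)   # fixed seed keeps the demo repeatable
    position = 0
    on = 0
    for _ in range(samples):
        if position % loop_len < loop_len // 2:
            on += 1
        position += interval + rng.randint(-jitter, jitter)
    return on / samples


if __name__ == "__main__":
    print("no jitter:  ", dithered_duty(1049, 1049, jitter=0))
    print("with jitter:", dithered_duty(1049, 1049, jitter=500))
```

Without jitter, every sample lands on the same phase of the loop and the LED looks permanently on. With jitter, the samples spread across the loop and the measured duty cycle comes out close to the true value of about 50%.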

CPU Savings

The display update rate is the largest single factor in the amount of CPU power required by the ILS. Cutting the whole-panel update rate from ~400 Hz to 100 Hz did not drop the CPU usage by 4×, however. Much of the savings went to pay for the increased calculation complexity added to achieve the other benefits above, so that in the end, the GPIO thread now takes a bit over half as much CPU power as before.

Sadly, this is not enough savings to allow the ILS to run on a single-core Raspberry Pi, even if you throttle the PDP-8 simulator down to PDP-8/S levels. We'll need more savings to get the ILS to run on a Pi Zero at PDP-8/I equivalent speed, and more still to get it to run on the still-slower Pi 1 Models A+ and B+.

I also want to note that because the ILS code is quite dependent on getting regular state updates from the CPU thread, the ILS is utterly incompatible with SIMH's PDP-8 CPU idle detection feature (i.e. set cpu idle). If you try it, the panel LEDs run fine for a few seconds after the simulator starts, then they all shut off when the simulator's CPU detects an idle condition, and then they flutter on and off semi-randomly as long as the simulator remains in this idle state. The NLS code is also somewhat affected by this, but not to nearly as great a degree. Because of this, the PDP-8 simulator configuration files all tell SIMH to run in "no idle" mode.

Other Benefits

In addition to running faster than before, the new incandescent lamp simulator is now more or less immune from aliasing.

Additionally, the new code is now largely independent of the CPU instruction timing in the simulator. The current implementation does take a second or so to achieve timing regulation at slow SIMH throttle rates, but this minor flaw produces a "power brownout" effect that feels period-correct, if it isn't actually correct.

The new ILS also feels "smoother" to me. I believe this is due to the stochastic sampling and statistical aggregation of LED "on" times: any human-perceptible rapid updates to the display state are now certain to reflect an actual sharp change in the state of the simulated PDP-8/I, rather than be an artifact of the simulator itself.

Acknowledgements

Although I've catalogued the many limitations and flaws in the display update mechanisms shipped previously, saying, "This implementation sucks!" while pointing at existing code is easy. It is far more difficult and uncommon to say, "The complete lack of this functionality sucks!" while trying to imagine how it should be implemented de novo.

Therefore, I hereby thank Oscar Vermeulen and Ian Schofield for writing the code I built my ILS atop. It simply wouldn't exist without your prior art. Thank you, thank you.

Footnotes

¹ Why scan? Because of the 40 GPIO pins exposed on a Raspberry Pi A+/B+ or newer, twelve are power or ground pins, which we can't use for driving LEDs or detecting switch closures, and a few more are unusable for some other reason, leaving only about 25 pins for our use. The PiDP-8/I PCB has 89 LEDs and 26 switches, so we can't even come close to giving each LED and switch its own GPIO pin. They're arranged on the PCB in a matrix with 8 logical LED rows and 3 logical switch rows.

Each row is turned on separately. If it's an LED row, we briefly turn the LEDs on that need to be on at that instant, and if it's a switch row, we check the switches in that row for closure. The software's GPIO thread goes through these 11 rows in turn on each iteration of its main loop.

(Realize that the "rows" I'm talking about here are logical rows in this electrical matrix. They only have a loose mapping to the rows and columns of the LEDs and switches as they're arranged on the PCB.)

² We abbreviate this ILS, and contrast it to the original scheme by Oscar Vermeulen, which we call no-lamp-simulator, abbreviated NLS.

³ Well, we can, and we briefly did at one point during development. This cut the simulator's speed from about 8 MIPS to about 1.5 MIPS!

⁴ Jitter is a technical term, and its meaning matters for this current discussion, so I have purposely avoided using it in its colloquial sense above.

⁵ This is the principle behind coherent sampling in FFT-based systems.

⁶ The ac-mq-blinker.pal example becomes 5.script when you build the software, so it is what runs when the simulator starts or restarts with IF=5.

Running this example at 30 kIPS means it runs at less than 1/10 the speed of a PDP-8/I, and about 1/250 the speed of the simulator when run un-throttled on a Raspberry Pi 3. This demo has a 4096-step delay loop built into it which takes 2 PDP-8 instructions per step to execute, so you actually have to divide the processor IPS rate by about 8000 to account for that delay, cutting the effective execution rate for the demo to about 4 IPS. The rest of this demo program has eight PDP-8 instructions, so it updates the display about twice a second.

⁷ These were the 100 Hz "human PoV" update rate for the published LED states, the ~400 Hz full-panel update rate, and the iteration frequency of whatever PDP-8 program you happened to be running at the time, such as a ~3.7 MHz OS/8 keyboard input wait loop on a Pi 3. The new 100 Hz update rate replaces the ~8 kHz update rate of Schofield's ILS.

⁸ On the PiDP-8/I user mailing list, alank2 posted measured brightness curves of incandescent lamps turning on and off, but as of this writing, no work has yet gone into modeling these curves in the way we drive the LEDs.
