PiDP-8/I Software: Changes To PDP-8 Memory Addressing

Changes to "PDP-8 Memory Addressing" between 2017-04-03 05:08:49 and 2017-04-03 05:19:12

# Notation

We must first settle on a bit of notation.

PDP-8 addresses are traditionally written as [octal](https://en.wikipedia.org/wiki/Octal) numbers. That means when we write 123 in this context, we aren't saying "one hundred and twenty-three," we're talking about the quantity (1 &times; 8²) + (2 &times; 8¹) + (3 &times; 8⁰) = 83 in ordinary decimal notation. When I write an octal number below, I will write it as 123₈ to make this clear, the subscript meaning "base 8," also called "octal." If you see a multi-digit number without the base-8 subscript, it's a decimal number.
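If you want to check that arithmetic yourself, here is a small Python sketch. Nothing in it is PDP-8 specific; it only relies on Python's built-in base conversion and its `0o` octal literal prefix.

```python
# Parse the digits "123" as base 8, the way this article writes PDP-8 addresses.
value = int("123", 8)

# The same quantity, expanded digit by digit.
expanded = 1 * 8**2 + 2 * 8**1 + 3 * 8**0

print(value, expanded)      # both are the decimal quantity 83
print(0o123 == value)       # Python's 0o prefix plays the role of the subscript 8
```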

We use octal when talking about PDP-8 addresses and memory values because the [PDP-8's major registers](http://homepage.cs.uiowa.edu/~jones/pdp8/man/registers.html) are all multiples of 3 bits in size.¹ Since an octal digit encodes 3 [bits](https://en.wikipedia.org/wiki/Bit), that makes octal the most convenient way to write PDP-8 addresses. The more common ways to write computer numbers are inconvenient: [binary](https://en.wikipedia.org/wiki/Binary_number) takes too many digits even with the tiny PDP-8 memories, and [hexadecimal](https://en.wikipedia.org/wiki/Hexadecimal) digits encode 4 bits each, so they line up with 3-bit boundaries only at multiples of 12 bits. That covers a 12-bit word, but not the 3-bit field registers or the 15-bit extended addresses described below.
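To see why 3-bit digits are such a good fit, here is a rough Python sketch that splits a value into octal digits by plain shifting and masking. The helper name `octal_digits` is my own invention for this illustration.

```python
def octal_digits(value: int, bits: int = 12) -> list[int]:
    """Split a value into 3-bit groups, most significant group first."""
    if bits % 3 != 0:
        raise ValueError("register width must be a multiple of 3 bits")
    return [(value >> shift) & 0o7 for shift in range(bits - 3, -1, -3)]

word = 0o7600
print(octal_digits(word))            # one list entry per octal digit
print(f"{word:012b}")                # the same word as 12 binary digits
print(octal_digits(0o10234, 15))     # a 15-bit value is exactly 5 octal digits
```

Each entry in the result corresponds to exactly one written octal digit, whether the register is 3, 12, or 15 bits wide.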

Another bit of notation we need to establish here is our unit suffixes. In this article...
Another bit of notation unique to this article, but which I hope will spread, is the unit **kiW**, meaning 1024 words of PDP-8 memory. The unit is named by analogy to [kibibytes](https://en.wikipedia.org/wiki/Kibibyte), abbreviated **kiB**. This new unit does two things for us:

* ...the unit modifier **k** is the computer-centric 1024 multiplier, not the more correct [SI](https://en.wikipedia.org/wiki/International_System_of_Units) 1000 multiplier. Thus, 6 kB is six [kibibytes](https://en.wikipedia.org/wiki/Kibibyte): 6144 bytes, not 6000 bytes; and
1. It distinguishes between the proper [SI](https://en.wikipedia.org/wiki/International_System_of_Units) definition of **k** and the "computer memory" meaning of **k**: 1000 vs 1024. DEC manuals will often just use **k**, and you're expected to understand that it means 1024 words, not kibibytes or kilobytes.

* ...the unit kW means kilowords of PDP-8 memory, not "kilowatts." Although a fairly small PDP-8/I setup dissipates about one kilowatt of power, this article is purely concerned with PDP-8 memory, so I think we can safely repurpose this unit notation here. Original PDP-8 documentation would often just use "k" alone, and you were expected to understand that it meant kWords, not kBytes.
2. It avoids ambiguity with **kW** meaning a kilowatt. That is not an unlikely confusion, since a fairly small PDP-8/I setup dissipates about one kilowatt of power, and a large PDP-8 setup could dissipate multiple kilowatts.

# Bytes

The base PDP-8 memory configuration is 4&nbsp;kW of [core memory](https://en.wikipedia.org/wiki/Magnetic-core_memory). We speak of PDP-8 memory in terms of words, rather than bytes, because the PDP-8 predates the modern notion of an 8-bit byte. Back in the PDP-8's day, a "byte" was a more slippery concept. You could speak of 6-bit bytes, 7-bit bytes, 9-bit bytes... It all depended on what your particular task needed.²
The base PDP-8 memory configuration is 4&nbsp;kiW of [core memory](https://en.wikipedia.org/wiki/Magnetic-core_memory). We speak of PDP-8 memory in terms of words, rather than bytes, because the PDP-8 predates the modern notion of an 8-bit byte. Back in the PDP-8's day, a "byte" was a more slippery concept. You could speak of 6-bit bytes, 7-bit bytes, 9-bit bytes... It all depended on what your particular task needed.²

Since the PDP-8 uses a 12-bit native word size, 6-bit "bytes" are quite common in the PDP-8 world, often used for some kind of "packed [ASCII](https://en.wikipedia.org/wiki/ASCII)" representation. [One common scheme](http://homepage.cs.uiowa.edu/~jones/pdp8/faqs/#charsets) gets rid of most of the 32 control characters defined in 7-bit ASCII, all of the lowercase letters, and a whole bunch of the punctuation in order to pack two characters into a 12-bit PDP-8 word. There are actually a few different 6-bit packed ASCII representations for the PDP-8, so you have to know which scheme you're looking at before you can turn the data back into 7-bit ASCII.
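As a rough illustration of the idea, here is a Python sketch of one possible packing: keep only the low six bits of each ASCII code and put the first character in the high half of the word. This is an assumption chosen for simplicity, not necessarily the scheme any given PDP-8 program uses, and the function names are hypothetical.

```python
def pack_sixbit(a: str, b: str) -> int:
    """Pack two characters into one 12-bit word, first character in the high half."""
    for ch in (a, b):
        # Only ASCII 040-137 octal (space through underscore) survives this scheme.
        if not 0o40 <= ord(ch) <= 0o137:
            raise ValueError(f"{ch!r} has no 6-bit code in this scheme")
    return ((ord(a) & 0o77) << 6) | (ord(b) & 0o77)

def unpack_sixbit(word: int) -> str:
    """Turn a 12-bit word packed by pack_sixbit back into two ASCII characters."""
    chars = []
    for code in ((word >> 6) & 0o77, word & 0o77):
        # Codes 00-37 came from ASCII 100-137; codes 40-77 are unchanged.
        chars.append(chr(code + 0o100 if code < 0o40 else code))
    return "".join(chars)

word = pack_sixbit("H", "I")
print(f"{word:04o}", unpack_sixbit(word))
```

Lowercase letters and control characters raise an error here, which mirrors the trade-off described above: they simply have no place in a 6-bit code.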

The PDP-8 was being designed at about the same time as the first versions of ASCII,³ as well as around the same time as the first wildly popular ASCII terminal, the [Teletype Model 33](https://en.wikipedia.org/wiki/Teletype_Model_33).⁴

When dealing with such terminals and the included paper tape reader, PDP-8s generally deal in either 7-bit or 8-bit bytes. When we're talking about 8-bit bytes, we aren't talking about the "[high-ASCII](https://en.wikipedia.org/wiki/Extended_ASCII)" stuff that infested the PC world in the late 1970s and 1980s before [Unicode](https://en.wikipedia.org/wiki/Unicode) was invented. When a PDP-8 reads in plain ASCII text from a terminal as 8-bit bytes, the eighth bit is a [parity bit](https://en.wikipedia.org/wiki/Parity_bit), meant to detect read errors only.

Full 8-bit reads from the terminal did still commonly occur on PDP-8s though. The most common schemes are the [RIM loader](https://www.pdp8online.com/pdp8cgi/query_docs/tifftopdf.pl/pdp8docs/dec-08-lraa-d.pdf) and [BIN loader](http://www.pdp8online.com/pdp8cgi/query_docs/tifftopdf.pl/pdp8docs/dec-08-lbaa-d.pdf) binary paper tape formats. See those PDFs for details, but for our purposes here, it's only important to note that both formats split each 12-bit PDP-8 word into two 6-bit halves, punched one half per row on the 8-channel paper tape, with the two remaining channels used as markers. The tape is thus read back into the machine one 8-bit row at a time. A large part of the machine code in the RIM and BIN loaders is concerned with reassembling these rows into 12-bit PDP-8 words.

# Words

The PDP-8 has a 12-bit native word size. That is the smallest chunk of data you can address in a single instruction, and it is also the size of PDP-8 machine instructions.

Every PDP-8 instruction is a single 12-bit word, and data are stored in 12-bit core memory locations. All of the PDP-8 registers are 12 bits or smaller.

12 bits lets you address 2¹² = 4096 memory locations, which is why the basic core memory size on a PDP-8 is 4&nbsp;kW.⁵
12 bits lets you address 2¹² = 4096 memory locations, which is why the basic core memory size on a PDP-8 is 4&nbsp;kiW.

# The 3-Level Memory Addressing System

You may be aware that the PDP-8 can be expanded to 32&nbsp;kW of memory. How does that square with all of the above?
You may be aware that the PDP-8 can be expanded to 32&nbsp;kiW of memory. How does that square with all of the above?

In some CPU types, instructions are variable-width, so that an instruction which takes two operands is longer than one that takes a single operand, and a self-contained instruction is shorter still, but in the PDP-8, every instruction takes a single 12-bit word. How can a PDP-8 refer to a 12-bit address when the single-word instructions are 12 bits themselves? And how do we get beyond that to 32&nbsp;kW, which would apparently require a 15-bit address? (2¹⁵ words = 32&nbsp;kW.)
In some CPU types, instructions are variable-width, so that an instruction which takes two operands is longer than one that takes a single operand, and a self-contained instruction is shorter still, but in the PDP-8, every instruction takes a single 12-bit word. How can a PDP-8 refer to a 12-bit address when the single-word instructions are 12 bits themselves? And how do we get beyond that to 32&nbsp;kiW, which would apparently require a 15-bit address? (2¹⁵ words = 32&nbsp;kiW.)

You may have prior experience with the [16-bit Intel x86 segmentation scheme](https://en.wikipedia.org/wiki/X86#Segmentation). If you thought writing code to deal with that was a pain, buckle up, it gets wild from here.

# Level 1: Pages

The first memory access level is the “page,” 128 words. That limit comes from the fact that all [PDP-8 memory reference instructions](http://homepage.cs.uiowa.edu/~jones/pdp8/man/mri.html) set aside 7 of their 12 bits for the operand address. That is, if the PDP-8 is executing an instruction in page 0 and it needs to load something from memory address 100₈, it can do so in a single instruction because that address fits into 7 bits.
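Put another way, a 12-bit address is a 5-bit page number glued to a 7-bit offset within that page. Here is a minimal Python sketch of that split; the names `PAGE_BITS`, `PAGE_SIZE`, and `split_address` are mine, chosen for this example.

```python
PAGE_BITS = 7
PAGE_SIZE = 1 << PAGE_BITS          # 128 words per page

def split_address(addr: int) -> tuple[int, int]:
    """Split a 12-bit address into (page number, offset within the page)."""
    if not 0 <= addr < 4096:
        raise ValueError("not a 12-bit address")
    return addr >> PAGE_BITS, addr & (PAGE_SIZE - 1)

for addr in (0o100, 0o234, 0o7600):
    page, offset = split_address(addr)
    print(f"{addr:04o} -> page {page}, offset {offset:03o}")
```

Address 100₈ stays in page 0, which is why the single-instruction access in the example above works.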

# Level 2: Fields

To get beyond the page level, you have to use indirect memory accesses. That is, instead of using the 7-bit address to directly refer to the core memory address you're interested in, you store a 12-bit address in one of the core memory locations within the current page and refer indirectly through that location.

For example, let's say you're currently executing in page 0 and need to jump to code that resides at address 234₈, which is in page 1.⁶ You could store the 12-bit value 0234₈ at address 0177₈, then do an indirect jump through page address 177₈.
For example, let's say you're currently executing in page 0 and need to jump to code that resides at address 234₈, which is in page 1.⁵ You could store the 12-bit value 0234₈ at address 0177₈, then do an indirect jump through page address 177₈.

PDP-8 programmers normally don't have to think too much about such things, even at the assembly language level, because the assembler automatically generates "links" when needed to cross page boundaries like this. Thus, the assembly code for what we just described is written as a single assembly language instruction:

    JMP 0234

The thing is, this instruction takes two words of core memory: one for the indirect `JMP` instruction, and one for the link.

Since all those links chew into the 4 to 32&nbsp;kW core limit of any given PDP-8 machine, one of the common games PDP-8 masters play is reusing *instruction and data* words for target address values and operands to save a word or two.
Since all those links chew into the 4 to 32&nbsp;kiW core limit of any given PDP-8 machine, one of the common games PDP-8 masters play is reusing *instruction and data* words for target address values and operands to save a word or two.

PDP-8 master Rick Murphy taught me one of these games. Instead of saying:

    JMP 7600

and taking a "link" penalty, you find a nearby CLA instruction, change it from the [group 1](http://gunkies.org/wiki/PDP-8_architecture#Group_1_.22operate.22_instruction_operations) to the [group 2](http://gunkies.org/wiki/PDP-8_architecture#Group_2_.22operate.22_instruction_operations) form of the `OPR` instruction, give it a label in the assembly code, and jump indirectly through that label:

    OS8ENT, 7600            / group 2 form of CLA, also the OS/8 entry point
            ...             / some number of other instructions, but less than a page worth
            JMP I OS8ENT    / return control to OS/8

The `I` modifies the `JMP` instruction, telling the PDP-8 to go to the memory location referred to by the `OS8ENT` label, load the value it finds there (7600₈), and jump to *that* address.
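If it helps to see the mechanics spelled out, here is a simplified Python sketch of how a memory reference instruction's target address is worked out. It uses the standard instruction layout (indirect bit 0400₈, page bit 0200₈, 7-bit offset), but it ignores auto-indexing and the field registers, and the addresses I picked for `OS8ENT` and the `JMP` are made up for the example.

```python
I_BIT = 0o400   # indirect bit of a memory reference instruction
P_BIT = 0o200   # page bit: 0 = page 0, 1 = the instruction's own page
JMP = 0o5000    # JMP opcode in the top three bits

def effective_address(instr: int, instr_addr: int, memory: list[int]) -> int:
    """Return the 12-bit address a memory reference instruction refers to."""
    offset = instr & 0o177                              # 7-bit in-page address
    base = (instr_addr & 0o7600) if instr & P_BIT else 0
    addr = base | offset
    if instr & I_BIT:                                   # follow the pointer word
        addr = memory[addr] & 0o7777
    return addr

memory = [0] * 4096

# Hand-made link: the word at 0177 holds the target 0234, which is in page 1.
memory[0o177] = 0o0234
link_jump = JMP | I_BIT | 0o177                         # JMP I 177, via page 0
link_target = effective_address(link_jump, 0o0020, memory)

# The CLA trick: OS8ENT is assumed to sit at 0410, with the JMP at 0450.
memory[0o410] = 0o7600
os8_jump = JMP | I_BIT | P_BIT | (0o410 & 0o177)        # JMP I OS8ENT
os8_target = effective_address(os8_jump, 0o0450, memory)

print(f"{link_target:04o} {os8_target:04o}")            # prints "0234 7600"
```

Notice that the word at `OS8ENT` is never treated specially: the same 7600₈ is a CLA when executed and a pointer when jumped through.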

Barf bags are in the seat pocket ahead of you.

Once you've relieved yourself, please read on, because we're not done yet.

# Level 3: The Instruction and Data Field Registers

By this point, you'll be wondering how a PDP-8 can be expanded beyond its stock 4&nbsp;kW of core to its maximum of 32&nbsp;kW, and why is that the limit anyway?
By this point, you'll be wondering how a PDP-8 can be expanded beyond its stock 4&nbsp;kiW of core to its maximum of 32&nbsp;kiW, and why is that the limit anyway?

The answer to both questions is that the PDP-8 has a pair of 3-bit registers for setting the instruction field and the data field. 2³ = 8, and 8 fields &times; 4&nbsp;kW = 32&nbsp;kW.
The answer to both questions is that the PDP-8 has a pair of 3-bit registers for setting the instruction field and the data field. 2³ = 8, and 8 fields &times; 4&nbsp;kiW = 32&nbsp;kiW.

Thus, when a program is currently executing code in field 0 but wants to address data in field 1, it sets the data field (DF) register to 1, and now all data fetches pull data from field 1. Likewise, to jump to an address in field 1 from field 0, it sets the instruction field (IF) register to 1.

This is also why your PDP-8/I has two sets of 3 switches and associated indicator lights on its front panel labeled "Data Field" and "Inst Field." Together, this gives you a combined 15-bit extended address. A jump or fetch between fields thus takes two instructions, rather than the indirect addressing for in-field operations or the direct addressing of in-page operations. This gives a hierarchy of expense: a diligent PDP-8 programmer tries to keep data and instructions close together and logically bunched to avoid needless page and field transitions.
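The arithmetic behind that 15-bit extended address can be sketched in a few lines of Python. The function name `extended_address` is hypothetical; it just concatenates a 3-bit field number with a 12-bit in-field address.

```python
FIELD_BITS = 3
WORD_BITS = 12

def extended_address(field: int, addr: int) -> int:
    """Combine a 3-bit field number and a 12-bit address into 15 bits."""
    if not 0 <= field < (1 << FIELD_BITS):
        raise ValueError("field must fit in 3 bits")
    if not 0 <= addr < (1 << WORD_BITS):
        raise ValueError("address must fit in 12 bits")
    return (field << WORD_BITS) | addr

total_words = (1 << FIELD_BITS) * (1 << WORD_BITS)   # 8 fields of 4096 words

print(f"{extended_address(1, 0o0234):05o}")          # field 1, address 0234
print(f"{extended_address(7, 0o7777):05o}")          # the very last word
print(total_words)
```

Written in octal, the field number is simply a fifth digit in front of the usual four.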

# Conclusion

︙









3. Yes, "versions," plural. There were actually 3 major [versions of ASCII](https://en.wikipedia.org/wiki/ASCII#History), published in 1963, 1964, and 1967. Since the original PDP-8 came out in 1965 and the PDP-8/I model came out in 1968, you can see that the development of these computers coincided quite nicely with the development of ASCII itself. Thus, the PDP-8/I is one of the first machines that was aware from inception of what we now think of as 7-bit ASCII.

4. The most common terminal used with PDP-8s and other machines of its era was the Teletype Model 33 ASR, the "ASR" referring to the Automatic Send and Receive version of the Model 33. Although you will commonly see that terminal type referred to as an ASR-33, that is not its [proper](https://en.wikipedia.org/wiki/Teletype_Model_33#Model_33_ASR_vis-.C3.A0-vis_ASR-33) name.

As for my "wildly popular" characterization, they sold over half a million of them by the mid-1970s. Such success wouldn't be exceeded by a computer product until the first consumer-friendly microcomputers came out.

5. Remember the notation: 4096 12-bit words is 6144 bytes, but we always speak of core memory in terms of words, not bytes. It's "4&nbsp;k" or "4&nbsp;kW", not "6&nbsp;kB".

6. Pro tip: every time the third digit from the right side of an octal address goes up by 2, it is referring to a different PDP-8 page: page 1 begins at address 200₈, page 2 begins at address 400₈, etc. This is because 200₈ = 128₁₀ = 2⁷.
5. Pro tip: every time the third digit from the right side of an octal address goes up by 2, it is referring to a different PDP-8 page: page 1 begins at address 200₈, page 2 begins at address 400₈, etc. This is because 200₈ = 128₁₀ = 2⁷.

-----------------

## License

[sl]: https://tangentsoft.com/pidp8i/doc/trunk/SIMH-LICENSE.md