PiDP-8/I Software: Changes To PDP-8 Memory Addressing


# Notation

We must first settle on a bit of notation.

PDP-8 addresses are traditionally written as [octal](https://en.wikipedia.org/wiki/Octal) numbers. That means when we write 123 in this context, we aren't saying "one hundred and twenty-three," we're talking about the quantity 1 &times; 8² + 2 &times; 8¹ + 3 &times; 8⁰ = 83 in ordinary decimal notation. When I write an octal number below, I will write it as 123₈ to make this clear, meaning "base 8," also called "octal."
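As a quick check of that arithmetic, here is a small Python sketch. Python is used purely for illustration; it is not part of the PiDP-8/I software, and the variable names are made up for this example.

```python
# int(text, 8) parses a string of octal digits; Python's own octal
# literals carry a 0o prefix, e.g. 0o123.
value = int("123", 8)

# Expand the positional notation by hand to confirm the result.
by_hand = 1 * 8**2 + 2 * 8**1 + 3 * 8**0

# oct() converts back to Python's octal notation.
print(value, by_hand, oct(value))
```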

We use octal when talking about PDP-8 addresses and memory values because the [PDP-8's major registers](http://homepage.cs.uiowa.edu/~jones/pdp8/man/registers.html) are all multiples of 3 bits in size. (Well, there is also one program-accessible single-bit register, but we won't be talking more about that here.) Since an octal digit encodes as 3 [bits](https://en.wikipedia.org/wiki/Bit), that makes octal the most convenient way to write PDP-8 addresses. The other more common ways to write computer numbers are inconvenient: [binary](https://en.wikipedia.org/wiki/Binary_number) takes too many digits even with the tiny PDP-8 memories, and [hexadecimal](https://en.wikipedia.org/wiki/Hexadecimal) digits are 4 bits wide, so although a 12-bit word fits into three of them, they don't line up with 3-bit quantities such as the 3-bit opcode at the top of every instruction word.

Another bit of notation we need to establish here is that when we use **k** as a unit modifier, it is the old-style 1024 multiplier, not the more correct [SI](https://en.wikipedia.org/wiki/International_System_of_Units) 1000 multiplier. Thus, 6 kB is six [kibibytes](https://en.wikipedia.org/wiki/Kibibyte): 6144 bytes, not 6000 bytes.

On that note, I will use the unit "kWords" here to avoid confusion with "kW" meaning "kilowatt." Units of "kW" could mean either thing in the PDP-8 context. Yes, PDP-8s did actually draw kilowatts of power. :)

# Bytes

The base PDP-8 memory configuration is 4 kWords of core. We speak of PDP-8 memory in terms of words, rather than bytes, because the PDP-8 predates the modern notion of an 8-bit byte.

Back in the PDP-8's day, a "byte" was a more slippery concept. You could speak of 6-bit bytes, 7-bit bytes, 9-bit bytes... It all depended on what your particular task needed.

Since the PDP-8 uses a 12-bit native word size, 6-bit "bytes" are quite common in the PDP-8 world, often used for some kind of "packed [ASCII](https://en.wikipedia.org/wiki/ASCII)" representation. One common scheme gets rid of most of the 32 control characters defined in 7-bit ASCII, plus all of the lowercase letters, and a whole bunch of the punctuation in order to pack text two characters to a word. There are actually a few different 6-bit packed ASCII representations for the PDP-8, so you have to know which scheme you're looking at before you can turn the data back into 7-bit ASCII.
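To make the two-characters-per-word idea concrete, here is a minimal Python sketch of one possible 6-bit packing. It assumes a scheme that keeps only the low six bits of each character in the ASCII range from space through underscore; the function names are hypothetical, and as noted above, real PDP-8 software used several different schemes, so treat this as an illustration rather than a description of any particular one.

```python
def pack_sixbit(text):
    """Pack text two characters per 12-bit word (illustrative scheme only)."""
    codes = []
    for ch in text:
        c = ord(ch)
        # Only space (40 octal) through underscore (137 octal) survive:
        # no control characters and no lowercase letters.
        if not 0o40 <= c <= 0o137:
            raise ValueError(f"{ch!r} has no 6-bit code in this scheme")
        codes.append(c & 0o77)          # keep the low six bits
    if len(codes) % 2:
        codes.append(0)                 # pad odd-length text; 0 decodes as '@'
    # First character goes in the high half of the word, second in the low half.
    return [(hi << 6) | lo for hi, lo in zip(codes[0::2], codes[1::2])]


def unpack_sixbit(words):
    """Reverse pack_sixbit(), restoring 7-bit ASCII."""
    out = []
    for w in words:
        for v in ((w >> 6) & 0o77, w & 0o77):
            # Codes below 40 octal came from the 100-137 octal range.
            out.append(chr(v + 0o100 if v < 0o40 else v))
    return "".join(out)


words = pack_sixbit("PDP8")
print([oct(w) for w in words], unpack_sixbit(words))
```

The decoder only works because it knows which scheme produced the words, which is exactly the caveat raised above.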

The PDP-8 came out about the same time as the first versions of ASCII,¹ as well as around the same time as the first really popular ASCII terminal, the Teletype Model 33.² Thus, the PDP-8 was one of the first ASCII-aware machines in the world.

When dealing with such terminals and the included paper tape reader, PDP-8s generally deal in either 7-bit or 8-bit bytes. By 8-bit bytes, we don't mean the "[high-ASCII](https://en.wikipedia.org/wiki/Extended_ASCII)" stuff that infested the PC world in the late 1970s and 1980s, before Unicode was invented. When reading ASCII text, the eighth bit served as a [parity bit](https://en.wikipedia.org/wiki/Parity_bit), meant only to detect read errors.
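A parity bit is easy to sketch in Python. The example below assumes even parity, purely for illustration; the text above does not say which parity convention a given terminal or tape used, and the function names are hypothetical.

```python
def add_even_parity(ch7):
    """Return an 8-bit value: the 7-bit code plus an even-parity eighth bit."""
    if not 0 <= ch7 <= 0o177:
        raise ValueError("expected a 7-bit character code")
    ones = bin(ch7).count("1")
    # Set the eighth bit only when needed to make the count of 1 bits even.
    return ch7 | ((ones & 1) << 7)


def parity_ok(byte8):
    """True if the 8-bit value contains an even number of 1 bits."""
    return bin(byte8 & 0o377).count("1") % 2 == 0


def strip_parity(byte8):
    """Drop the eighth bit to recover the 7-bit ASCII code."""
    return byte8 & 0o177


sent = add_even_parity(ord("C"))
damaged = sent ^ 0o001                  # simulate a single-bit read error
print(oct(sent), parity_ok(sent), parity_ok(damaged))
```

A single flipped bit changes the count of 1 bits from even to odd, which is all the check can detect; it cannot say which bit was wrong.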

Full 8-bit reads from the terminal did still commonly occur on PDP-8s, though. The most common schemes are the [RIM loader](https://www.pdp8online.com/pdp8cgi/query_docs/tifftopdf.pl/pdp8docs/dec-08-lraa-d.pdf) and [BIN loader](http://www.pdp8online.com/pdp8cgi/query_docs/tifftopdf.pl/pdp8docs/dec-08-lbaa-d.pdf) formats. See those PDFs for details, but for our purposes here, it's only important to note that both formats punch each 12-bit PDP-8 word as two rows on the paper tape, six data bits per row, with the two remaining channels of each 8-bit row serving as markers. The tape is thus read back into the machine one 8-bit byte at a time, and a large part of the code in the RIM and BIN loaders is concerned with rearranging these bytes into 12-bit PDP-8 words.

# Words

The PDP-8 has a 12-bit native word size. That is the smallest chunk of data you can address in a single instruction, and it is also the size of PDP-8 machine instructions.

Every PDP-8 instruction is a single 12-bit word, and data are stored in 12-bit core memory locations. All of the PDP-8 registers are 12 bits or smaller.

12 bits lets you address 2¹² = 4096 memory locations, which is why the basic core memory size on a PDP-8 is 4 kWords.³
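The numbers work out as follows. This is just arithmetic written as a small Python sketch; the constant and variable names are made up for the example.

```python
WORD_BITS = 12

locations = 2 ** WORD_BITS              # distinct 12-bit addresses
highest = locations - 1                 # largest address a 12-bit value can hold
total_bits = locations * WORD_BITS      # storage in one 4 kWord memory
equivalent_bytes = total_bits // 8      # the same storage counted in 8-bit bytes

print(locations, oct(highest), equivalent_bytes)
```

The highest address comes out as 7777₈, four octal digits of all ones, which is one reason octal reads so naturally on this machine.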

# The 3-Level Memory Addressing System

Let's get back to that 4 kWord value. You may be aware that the PDP-8 can be expanded to 32 kWords. How does that square with all of the above?

In some CPU types, instructions are variable-width, so that an instruction that takes two operands is longer than one that takes a single operand, and a self-contained instruction is shorter still, but in the PDP-8, every instruction takes a single 12-bit word. How can a PDP-8 refer to a 12-bit address when the single-word instructions are 12 bits themselves? And how do we get beyond that to 32 kWords, which would apparently require a 15-bit address? (2¹⁵ = 32[k](https://en.wikipedia.org/wiki/Kibibyte).)

You may have prior experience with the 16-bit Intel x86 segmentation scheme. If you thought that was a pain, buckle up, it gets wild from here.

# Level 1: Pages

The first memory access level is the “page,” 128 words. That limit comes from the fact that all [PDP-8 memory reference instructions](http://homepage.cs.uiowa.edu/~jones/pdp8/man/mri.html) set aside 7 of their 12 bits for the operand address. That is, if the PDP-8 is executing an instruction in page 0 and it needs to load something from memory address 100₈, it can do so in a single instruction because that address fits into 7 bits.
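Here is a minimal Python sketch of how such an instruction word splits up. The 7-bit operand field is from the text above; the rest of the layout (a 3-bit opcode, an indirect bit, and a bit that selects between page 0 and the current page) is the standard PDP-8 memory reference format described on the linked page. The function names are hypothetical.

```python
def decode_mri(word):
    """Split a 12-bit memory reference instruction into its fields."""
    opcode = (word >> 9) & 0o7          # top 3 bits
    indirect = bool(word & 0o400)       # 1 = operand word holds an address
    current_page = bool(word & 0o200)   # 0 = page 0, 1 = page of the instruction
    offset = word & 0o177               # the 7-bit operand address
    return opcode, indirect, current_page, offset


def operand_address(word, pc):
    """Address named directly by the instruction at pc, before any indirection."""
    _, _, current_page, offset = decode_mri(word)
    # The page number is the top 5 bits of the instruction's own address.
    page_base = (pc & 0o7600) if current_page else 0
    return page_base | offset


# 1100 octal is opcode 1 with a page 0 operand address of 100 octal.
print(decode_mri(0o1100), oct(operand_address(0o1100, 0o0020)))
```

Five page bits plus seven offset bits make up the full 12-bit address, so there are 32 pages of 128 words in 4 kWords.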

To get beyond the page level, you have to use a JMP instruction. And since JMP is a memory reference instruction itself, that means to JMP to an address outside the current page, you have to do an indirect JMP through an address stored in the current page.

PDP-8 assemblers actually have features to generate these “links” for you. That is, you tell the assembler to JMP to a 12-bit address, and if that address isn’t in the current page, it steals one of the core locations in the current page to store the target address and generates an indirect JMP through that address.
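The link mechanism can be sketched in Python roughly as follows. This is a simplified illustration, not how any particular PDP-8 assembler is implemented: `assemble_jmp`, the `memory` dictionary, and the caller-supplied `link_addr` are all hypothetical, and a real assembler chooses the link location itself.

```python
JMP = 0o5000            # opcode 5 in the top three bits
INDIRECT = 0o400        # indirect bit
CURRENT_PAGE = 0o200    # page bit: operand is in the instruction's own page
PAGE_MASK = 0o7600      # top five address bits select the page


def assemble_jmp(pc, target, memory, link_addr):
    """Return the JMP word for location pc, storing a link word if one is needed."""
    if target < 0o200:
        # Page 0 is directly reachable from anywhere.
        return JMP | target
    if (target & PAGE_MASK) == (pc & PAGE_MASK):
        # Same page as the instruction: a direct jump fits in 7 bits.
        return JMP | CURRENT_PAGE | (target & 0o177)
    if (link_addr & PAGE_MASK) != (pc & PAGE_MASK):
        raise ValueError("the link word must live in the current page")
    # Off-page: spend one word of the current page on the full 12-bit target.
    memory[link_addr] = target
    return JMP | INDIRECT | CURRENT_PAGE | (link_addr & 0o177)


memory = {}
near = assemble_jmp(0o0200, 0o0250, memory, 0o0377)
far = assemble_jmp(0o0200, 0o7600, memory, 0o0377)
print(oct(near), oct(far), {oct(k): oct(v) for k, v in memory.items()})
```

The off-page case costs an extra word of core and an extra memory cycle at run time, which is why careful page layout matters on this machine.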

The same goes for things like “load accumulator”: to load a value not in the current page, you need to store the source address in a core location in the current page and do an indirect load, which takes longer.

Since all those target addresses chew into the 4 to 32 kWord core limit of any given PDP-8 machine, one of the common games PDP-8 masters play is reusing *instruction* words for target address values and operands to save a word or two.

Need to return to OS/8 from your program? Its entry point is at address 7600₈…which happens to be the same as the value of the group 2 form of the “clear accumulator” instruction, and most programs have at least one of those sitting around somewhere. So, do an indirect JMP via a nearby CLA instruction with that value, and you’ve returned control from your program to the host OS in a single instruction. &lt;barf&gt;

A 12-bit address can reach 4 kWords, a unit called a “field.” The PDP-8’s field size limit is 4k because of the PDP-8’s 12-bit nature. So how do we get beyond it to the 32 kWord limit, and why is that the limit anyway? Because there is a pair of 3-bit registers for setting the instruction field and the data field, that’s why. 8 fields &times; 4k = 32k.
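In other words, a 3-bit register value acts as the top three bits of a 15-bit address. A minimal Python sketch of that arithmetic follows; the function name is made up for this example.

```python
def physical_address(field, addr12):
    """Combine a 3-bit field number with a 12-bit address into 15 bits."""
    if not 0 <= field <= 0o7:
        raise ValueError("field must fit in 3 bits")
    if not 0 <= addr12 <= 0o7777:
        raise ValueError("address must fit in 12 bits")
    return (field << 12) | addr12


total_words = 8 * 4096                          # 8 fields of 4 kWords each
last_word = physical_address(0o7, 0o7777)       # highest location in the machine

print(total_words, oct(last_word))
```

Written in octal, the result is simply the field digit followed by the four-digit address, so the last word of a fully expanded machine is 77777₈.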

And why only 3 bits? Because a register bit is composed of several transistors, and transistors cost actual money back in the early 1960s, when all of this was being designed.

LSI? That was science fiction to the PDP-8’s creators.

---------------------

**Footnotes and Digressions**

1. Yes, "versions," plural. There were actually 3 major [versions of ASCII](https://en.wikipedia.org/wiki/ASCII#History), the 1963, 1964, and 1967 versions. Note that 1967 postdates the first few versions of the PDP-8, though not the PDP-8/I, which we're focused on here on this site. Thus, the PDP-8/I is one of the first machines aware of 7-bit ASCII as we now understand it.

2. The most common terminal used with PDP-8s and other machines of its era was the [Teletype Model 33](https://en.wikipedia.org/wiki/Teletype_Model_33) ASR, the latter referring to the Automatic Send and Receive version. Although you will commonly see this referred to as an ASR-33, that is not its [proper](https://en.wikipedia.org/wiki/Teletype_Model_33#Model_33_ASR_vis-.C3.A0-vis_ASR-33) name.

3. Remember the notation: 4096 12-bit words is 6144 bytes, but we always speak of core memory in terms of words, not bytes. It's "4k" or "4 kWords", not "6 kB".













































































Initial version of "PDP-8 Memory Addressing"