# Notation
We must first settle on a bit of notation.
PDP-8 addresses are traditionally written as [octal](https://en.wikipedia.org/wiki/Octal) numbers. That means when we write 123 in this context, we aren't saying "one hundred and twenty-three," we're talking about the quantity 1 × 8² + 2 × 8¹ + 3 × 8⁰ = 83 in ordinary decimal notation. When I write an octal number below, I will write it as 123₈ to make this clear, meaning "base 8," also called "octal." If you see a multi-digit number without the base-8 subscript, it's a decimal number.
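
As a quick check of the arithmetic above, here is a minimal Python sketch of the positional expansion. Python is used purely for illustration, and the helper name `octal_to_decimal` is hypothetical rather than part of any PDP-8 tooling.

```python
def octal_to_decimal(digits: str) -> int:
    """Expand an octal digit string positionally, as in the text above."""
    value = 0
    for d in digits:
        if d not in "01234567":
            raise ValueError(f"{d!r} is not an octal digit")
        value = value * 8 + int(d)  # shift one octal place left, add the next digit
    return value


print(octal_to_decimal("123"))  # 1*64 + 2*8 + 3 = 83
print(int("123", 8))            # Python's built-in base conversion agrees
print(format(83, "o"))          # and back again to the octal digits 123
```
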
We use octal when talking about PDP-8 addresses and memory values because the [PDP-8's major registers](http://homepage.cs.uiowa.edu/~jones/pdp8/man/registers.html) are all multiples of 3 bits in size. (Well, there is also one program-accessible single-bit register, but we won't be talking more about that here.) Since an octal digit encodes exactly 3 [bits](https://en.wikipedia.org/wiki/Bit), that makes octal the most convenient way to write PDP-8 addresses. The other common ways to write computer numbers are inconvenient: [binary](https://en.wikipedia.org/wiki/Binary_number) takes too many digits even with the tiny PDP-8 memories, and [hexadecimal](https://en.wikipedia.org/wiki/Hexadecimal) digits are 4 bits wide, so they straddle the 3-bit groupings the PDP-8 uses and line up with them only every 12 bits.
Another bit of notation we need to establish here is that when we use **k** as a unit modifier, it is the old-style 1024 multiplier, not the more correct [SI](https://en.wikipedia.org/wiki/International_System_of_Units) 1000 multiplier. Thus, 6 kB is six [kibibytes](https://en.wikipedia.org/wiki/Kibibyte): 6144 bytes, not 6000 bytes.
On that note, when I use the unit kW here, I mean 1024 words of PDP-8 memory, not "kilowatts." Although a fairly small PDP-8/I setup dissipates about one kilowatt of power, this article is purely concerned with PDP-8 memory, so I think we can safely repurpose this unit notation here. Original PDP-8 documentation would often just use "k" alone, and you were expected to understand that it meant kWords, not kBytes.
# Bytes
The base PDP-8 memory configuration is 4 kWords of core. We speak of PDP-8 memory in terms of words, rather than bytes, because the PDP-8 predates the modern notion of an 8-bit byte.¹
Back in the PDP-8's day, a "byte" was a more slippery concept. You could speak of 6-bit bytes, 7-bit bytes, 9-bit bytes... It all depended on what your particular task needed.
Since the PDP-8 uses a 12-bit native word size, 6-bit "bytes" are quite common in the PDP-8 world, often used for some kind of "packed [ASCII](https://en.wikipedia.org/wiki/ASCII)" representation. One common scheme gets rid of most of the 32 control characters defined in 7-bit ASCII, plus all of the lowercase letters, and a whole bunch of the punctuation in order to pack text two characters to a word. There are actually a few different 6-bit packed ASCII representations for the PDP-8, so you have to know which scheme you're looking at before you can turn the data back into 7-bit ASCII.
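
To make the idea concrete, here is a minimal Python sketch of one such packing. It is an illustration under stated assumptions, not a definitive description of any particular PDP-8 format: it assumes the scheme simply keeps the low six bits of each ASCII code, puts the first character in the high half of the word, and pads odd-length text with a space. The function names are hypothetical.

```python
def pack_sixbit(text: str) -> list[int]:
    """Pack text two characters per 12-bit word (assumed scheme, see above)."""
    if len(text) % 2:
        text += " "  # assumption: pad odd-length text with a space
    words = []
    for i in range(0, len(text), 2):
        first, second = ord(text[i]), ord(text[i + 1])
        for code in (first, second):
            # Only the 64 codes from space (040) through '_' (137) survive:
            # no control characters and no lowercase letters.
            if not 0o40 <= code <= 0o137:
                raise ValueError(f"cannot pack character code {code:o}")
        words.append(((first & 0o77) << 6) | (second & 0o77))
    return words


def unpack_sixbit(words: list[int]) -> str:
    """Turn packed 12-bit words back into 7-bit ASCII text."""
    chars = []
    for word in words:
        for six in ((word >> 6) & 0o77, word & 0o77):
            # Codes 00-37 came from 100-137 ('@', 'A'-'Z', '[' ... '_').
            chars.append(chr(six + 0o100 if six < 0o40 else six))
    return "".join(chars)


packed = pack_sixbit("HI")
print([format(w, "04o") for w in packed])  # ['1011']
print(unpack_sixbit(packed))               # HI
```

The unpacking step is where knowing the scheme matters: a different 6-bit convention would need a different mapping back to 7-bit ASCII.
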
The PDP-8 was being designed at about the same time as the first versions of ASCII,² as well as around the same time as the first really popular ASCII terminal, the Teletype Model 33.³ Thus, the PDP-8 was one of the first popular ASCII-aware machines in the world.
When dealing with such terminals and the included paper tape reader, PDP-8s generally deal in either 7-bit or 8-bit bytes. When we're talking about 8-bit bytes, we aren't talking about the "[high-ASCII](https://en.wikipedia.org/wiki/Extended_ASCII)" stuff that infested the PC world in the late 1970s and 1980s before [Unicode](https://en.wikipedia.org/wiki/Unicode) was invented. When a PDP-8 reads in plain ASCII text from a terminal as 8-bit bytes, the eighth bit is a [parity bit](https://en.wikipedia.org/wiki/Parity_bit), meant to detect read errors only.
Full 8-bit reads from the terminal did still commonly occur on PDP-8s though. The most common schemes are the [RIM loader](https://www.pdp8online.com/pdp8cgi/query_docs/tifftopdf.pl/pdp8docs/dec-08-lraa-d.pdf) and [BIN loader](http://www.pdp8online.com/pdp8cgi/query_docs/tifftopdf.pl/pdp8docs/dec-08-lbaa-d.pdf) binary paper tape formats. See those PDFs for details, but for our purposes here, it's only important to note that both schemes punch each 12-bit PDP-8 word as two rows on the paper tape, with 6 data bits per row, and the tape is read back into the machine one 8-bit row at a time. A large part of the machine code in the RIM and BIN loaders is concerned with reassembling these rows into 12-bit PDP-8 words.
# Words
The PDP-8 has a 12-bit native word size. That is the smallest chunk of data you can address in a single instruction, and every PDP-8 instruction is itself a single 12-bit word. Data are stored in 12-bit core memory locations, and all of the PDP-8 registers are 12 bits or smaller.
12 bits lets you address 2¹² = 4096 memory locations, which is why the basic core memory size on a PDP-8 is 4 kW.⁴
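
The word and byte arithmetic used throughout this article can be checked with a few lines of Python. This is only an illustrative calculation, and the variable names are my own.

```python
WORD_BITS = 12

addressable_words = 2 ** WORD_BITS            # locations a 12-bit address can name
total_bits = addressable_words * WORD_BITS    # every location holds one 12-bit word
equivalent_bytes = total_bits // 8            # the same storage in 8-bit bytes

print(addressable_words)         # 4096, i.e. 4 kW
print(equivalent_bytes)          # 6144, i.e. 6 kB
print(addressable_words // 1024) # 4
```
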
# The 3-Level Memory Addressing System
Let's get back to that 4 kW value. You may be aware that the PDP-8 can be expanded to 32 kW. How does that square with all of the above?
In some CPU types, instructions are variable-width, so that an instruction which takes two operands is longer than one that takes a single operand, and a self-contained instruction is shorter still, but in the PDP-8, every instruction takes a single 12-bit word. How can a PDP-8 refer to a 12-bit address when the single-word instructions are 12 bits themselves? And how do we get beyond that to 32 kW, which would apparently require a 15-bit address? (2¹⁵ = 32[k](https://en.wikipedia.org/wiki/Kibibyte).)
You may have prior experience with the 16-bit Intel x86 segmentation scheme. If you thought that was a pain, buckle up, it gets wild from here.
# Level 1: Pages
The first memory access level is the “page,” 128 words. That limit comes from the fact that all [PDP-8 memory reference instructions](http://homepage.cs.uiowa.edu/~jones/pdp8/man/mri.html) set aside 7 of their 12 bits for the operand address. That is, if the PDP-8 is executing an instruction in page 0 and it needs to load something from memory address 100₈, it can do so in a single instruction because that address fits into 7 bits.
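
Here is a minimal Python sketch of how a direct, in-page operand address might be computed. The text above only states that 7 bits are set aside for the operand address; the rest of the layout assumed here (a 3-bit opcode in the top bits and a "current page" flag at mask 0200₈) follows the usual PDP-8 references such as the one linked above. The function names are hypothetical.

```python
OFFSET_MASK = 0o0177   # low 7 bits: operand address within a page
PAGE_BIT = 0o0200      # assumed: 0 = page 0, 1 = page holding the instruction
PAGE_MASK = 0o7600     # upper 5 address bits select one of 32 pages


def page_of(address: int) -> int:
    """Return the page number (0-31) that a 12-bit address falls in."""
    return (address & 0o7777) >> 7


def direct_target(instruction: int, location: int) -> int:
    """Address named directly by the instruction stored at `location`."""
    offset = instruction & OFFSET_MASK
    base = (location & PAGE_MASK) if instruction & PAGE_BIT else 0
    return base | offset


# An instruction in page 0 referring to address 100 (octal) on page 0:
print(format(direct_target(0o1100, 0o0020), "04o"))  # 0100
# The same offset with the current-page flag set, executed from page 2:
print(format(direct_target(0o1300, 0o0410), "04o"))  # 0500
print(page_of(0o234))                                # 1
```

Seven offset bits give 2⁷ = 128 words per page, and the remaining five address bits give 32 pages, which together cover the 4096 words a 12-bit address can name.
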
# Level 2: Fields
To get beyond the page level, you have to use indirect memory accesses. That is, instead of using the 7-bit address to directly refer to the core memory address you're interested in, you store a 12-bit address in one of the core memory locations within the current page and refer indirectly through that location.
For example, let's say you're currently executing in page 0 and need to jump to code that resides at address 234₈, which is in page 1.⁵ You could store the 12-bit value 0234₈ at address 0177₈, then do an indirect jump through page address 177₈.
PDP-8 programmers normally don't have to think too much about such things, even at the assembly language level, because the assembler automatically generates "links" when needed to cross page boundaries like this. Thus, the assembly code looks like a single instruction:

    JMP 0234

But in reality, this instruction takes two words of core memory: one for the indirect JMP instruction, and one for the link.
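
The following Python sketch imitates what an assembler might do for that jump. It is a simplified illustration, not how any real PDP-8 assembler is implemented: the caller picks the link location, whereas real assemblers choose one themselves, and the bit masks assume the usual PDP-8 layout (JMP is opcode 5, with an indirect flag at 0400₈ and a current-page flag at 0200₈). All names are hypothetical.

```python
JMP = 0o5000
INDIRECT = 0o0400
CURRENT_PAGE = 0o0200
PAGE_MASK = 0o7600
OFFSET_MASK = 0o0177


def assemble_jmp(target: int, location: int, link_location: int) -> dict[int, int]:
    """Return {address: word} for a JMP at `location` to `target`."""
    here = location & PAGE_MASK
    if (target & PAGE_MASK) == here:
        # Target is in the same page: one word, direct addressing.
        return {location: JMP | CURRENT_PAGE | (target & OFFSET_MASK)}
    if (target & PAGE_MASK) == 0:
        # Page 0 is directly reachable from any page.
        return {location: JMP | (target & OFFSET_MASK)}
    if (link_location & PAGE_MASK) != here:
        raise ValueError("the link must live in the current page")
    # Otherwise: an indirect JMP plus a link word holding the full address.
    return {
        location: JMP | INDIRECT | CURRENT_PAGE | (link_location & OFFSET_MASK),
        link_location: target & 0o7777,
    }


def follow_jmp(core: dict[int, int], location: int) -> int:
    """Return the address the JMP stored at `location` transfers control to."""
    word = core[location]
    base = (location & PAGE_MASK) if word & CURRENT_PAGE else 0
    address = base | (word & OFFSET_MASK)
    return core[address] if word & INDIRECT else address


core = assemble_jmp(target=0o234, location=0o020, link_location=0o177)
print({format(a, "04o"): format(w, "04o") for a, w in core.items()})
print(format(follow_jmp(core, 0o020), "04o"))  # 0234
```

With the values from the example above, the sketch produces two words: the indirect JMP at 0020₈ and the link value 0234₈ stored at 0177₈.
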
Since all those links chew into the 4 to 32 kW core limit of any given PDP-8 machine, one of the common games PDP-8 masters play is reusing *instruction* words for target address values and operands to save a word or two.
PDP-8 master Rick Murphy taught me this one: if your program needs to return control to OS/8, the OS/8 entry point is at address 7600₈, which happens to be the same as the “clear accumulator” instruction’s value, and most programs have at least one of those sitting around somewhere. Therefore, instead of saying

    JMP 7600

and taking a "link" penalty, you find a nearby CLA instruction, give it a label in the assembly code, and jump indirectly through that label:

    OS8ENT, CLA         / CLA = 7600, the OS/8 entry point
    ...                 / some number of other instructions, but less than a page's worth
    JMP I OS8ENT        / return control to OS/8

Barf bags are in the seat pocket ahead of you.

Once you've relieved yourself, please read on, because we're not done yet.
# Level 3: Instruction and Data Fields
By this point, you'll be wondering how a PDP-8 can be expanded beyond its stock 4 kW of core to its maximum of 32 kW, and why that is the limit anyway.
The answer to both questions is that the PDP-8 has a pair of 3-bit registers for setting the instruction field and the data field. 2³ = 8, and 8 fields × 4 kW = 32 kW.
Thus, when a program is currently executing code in field 0 but wants to address data in field 1, it sets the data field (DF) register to 1, and now all data fetches pull data from field 1. Likewise, to jump to an address in field 1 from field 0, it sets the instruction field (IF) register to 1.
This is also why your PDP-8/I has two sets of 3 switches on the front and a set of indicator lights labeled "Data Field" and "Instruction Field." Together, this gives you a combined 15-bit extended address. A jump or fetch between fields thus takes two instructions, rather than the indirect addressing for in-field operations or the direct addressing of in-page operations. This gives a hierarchy of expense: a diligent PDP-8 programmer tries to keep data and instructions close together and logically bunched to avoid needless page and field transitions.
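
As a rough illustration of that 15-bit extended address, here is a minimal Python sketch that combines a 3-bit field number with a 12-bit in-field address. The function name is hypothetical, and the sketch only models the address arithmetic, not the instructions that load the IF and DF registers.

```python
FIELD_COUNT = 2 ** 3    # 3-bit field registers select one of 8 fields
FIELD_WORDS = 2 ** 12   # each field holds 4 kW


def extended_address(field: int, address: int) -> int:
    """Combine a field number (0-7) and a 12-bit address into 15 bits."""
    if not 0 <= field < FIELD_COUNT:
        raise ValueError("field must fit in 3 bits")
    if not 0 <= address < FIELD_WORDS:
        raise ValueError("address must fit in 12 bits")
    return (field << 12) | address


print(format(extended_address(1, 0o0234), "05o"))  # 10234
print(format(extended_address(7, 0o7777), "05o"))  # 77777
print(FIELD_COUNT * FIELD_WORDS)                   # 32768 words, i.e. 32 kW
```

Written in octal, the extended address is simply the field digit followed by the four digits of the in-field address, which is one more reason octal suits this machine.
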
---------------------
**Footnotes and Digressions**
1. You sometimes see the term [octet](https://en.wikipedia.org/wiki/Octet_(computing)) to unambiguously refer to a group of 8 bits as a result.
2. Yes, "versions," plural. There were actually 3 major [versions of ASCII](https://en.wikipedia.org/wiki/ASCII#History), the 1963, 1964, and 1967 versions. Note that 1967 postdates the first few versions of the PDP-8, though not the PDP-8/I, which we're focused on here on this site. Thus, the PDP-8/I is one of the first machines aware of 7-bit ASCII as we now understand it.
3. The most common terminal used with PDP-8s and other machines of its era was the [Teletype Model 33](https://en.wikipedia.org/wiki/Teletype_Model_33) ASR, the latter referring to the Automatic Send and Receive version. Although you will commonly see this referred to as an ASR-33, that is not its [proper](https://en.wikipedia.org/wiki/Teletype_Model_33#Model_33_ASR_vis-.C3.A0-vis_ASR-33) name.
4. Remember the notation: 4096 12-bit words is 6144 bytes, but we always speak of core memory in terms of words, not bytes. It's "4k" or "4 kW", not "6 kB".
5. Pro tip: every time the third digit from the right of an octal address goes up by 2, the address falls in a different PDP-8 page: page 1 begins at address 200₈, page 2 begins at address 400₈, etc.