PiDP-8/I Software

Changes To PDP-8 Memory Addressing

Changes to "PDP-8 Memory Addressing" between 2017-04-01 15:48:39 and 2017-04-01 15:52:31

# Notation

We must first settle on a bit of notation.

PDP-8 addresses are traditionally written as [octal](https://en.wikipedia.org/wiki/Octal) numbers. That means when we write 123 in this context, we aren't saying "one hundred and twenty-three," we're talking about the quantity (1 × 8²) + (2 × 8¹) + (3 × 8⁰) = 83 in ordinary decimal notation. When I write an octal number below, I will write it as 123₈ to make this clear, the subscript meaning "base 8," also called "octal." If you see a multi-digit number without the base-8 subscript, it's a decimal number.
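As a quick check of the arithmetic above, here is a small Python sketch. Python's `0o` prefix and `int(..., 8)` are that language's own octal notation, standing in here for the ₈ subscript used in this article.

```python
# Octal 123 expanded digit by digit, as in the text above.
expanded = (1 * 8**2) + (2 * 8**1) + (3 * 8**0)

# Python can also parse the same digits as base 8 directly.
parsed = int("123", 8)

print(expanded, parsed, 0o123)   # all three are decimal 83
print(oct(83))                   # and back to octal: 0o123
```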

We use octal when talking about PDP-8 addresses and memory values because the [PDP-8's major registers](http://homepage.cs.uiowa.edu/~jones/pdp8/man/registers.html) are all multiples of 3 bits in size.¹ Since an octal digit encodes as 3 [bits](https://en.wikipedia.org/wiki/Bit), that makes octal the most convenient way to write PDP-8 addresses. The other more common ways to write computer numbers are inconvenient: [binary](https://en.wikipedia.org/wiki/Binary_number) takes too many digits even with the tiny PDP-8 memories, and [hexadecimal](https://en.wikipedia.org/wiki/Hexadecimal) digits are 4 bits each, so although a 12-bit word fits in exactly three of them, their boundaries don't line up with the 3-bit groupings the PDP-8 uses, such as its 3-bit opcode.
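To make the "one octal digit per 3 bits" point concrete, here is a minimal Python sketch. The function name `octal_groups` and the example value are my own inventions for illustration.

```python
def octal_groups(word):
    """Return the four 3-bit groups of a 12-bit word, most significant first."""
    if not 0 <= word <= 0o7777:
        raise ValueError("not a 12-bit value")
    bits = format(word, "012b")
    return [bits[i:i + 3] for i in range(0, 12, 3)]

word = 0o5234                     # arbitrary example value
print(format(word, "012b"))       # the full 12-bit binary form
print(octal_groups(word))         # one 3-bit group per octal digit
print(format(word, "04o"))        # the same value as four octal digits
```

Each group in the list is the binary form of one octal digit, which is why a 12-bit PDP-8 word always reads as exactly four octal digits.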

Another bit of notation we need to establish here is that when we use **k** as a unit modifier, it is the old-style 1024 multiplier, not the more correct [SI](https://en.wikipedia.org/wiki/International_System_of_Units) 1000 multiplier. Thus, 6 kB is six [kibibytes](https://en.wikipedia.org/wiki/Kibibyte): 6144 bytes, not 6000 bytes.

On that note, when I use the unit kW here, I mean 1024 words of PDP-8 memory, not "kilowatts." Although a fairly small PDP-8/I setup dissipates about one kilowatt of power, this article is purely concerned with PDP-8 memory, so I think we can safely repurpose this unit notation here. Original PDP-8 documentation would often just use "k" alone, and you were expected to understand that it meant kWords, not kBytes.


# Bytes

The base PDP-8 memory configuration is 4 kWords of core. We speak of PDP-8 memory in terms of words, rather than bytes, because the PDP-8 predates the modern notion of an 8-bit byte. Back in the PDP-8's day, a "byte" was a more slippery concept. You could speak of 6-bit bytes, 7-bit bytes, 9-bit bytes... It all depended on what your particular task needed.²

Since the PDP-8 uses a 12-bit native word size, 6-bit "bytes" are quite common in the PDP-8 world, often used for some kind of "packed [ASCII](https://en.wikipedia.org/wiki/ASCII)" representation. One common scheme gets rid of most of the 32 control characters defined in 7-bit ASCII, plus all of the lowercase letters, and a whole bunch of the punctuation in order to pack text two characters to a word. There are actually a few different 6-bit packed ASCII representations for the PDP-8, so you have to know which scheme you're looking at before you can turn the data back into 7-bit ASCII.
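Here is a rough Python sketch of what packing two characters into one 12-bit word can look like. The scheme shown (keep the low 6 bits of each ASCII code, first character in the high half) is only one hypothetical choice made for illustration; as noted above, several different 6-bit schemes existed, and the names `pack_sixbit` and `unpack_sixbit` are mine.

```python
def pack_sixbit(a, b):
    """Pack two characters into one 12-bit word, first character in the high half."""
    for ch in (a, b):
        # Only ASCII 040..137 octal (space through underscore) survives this scheme.
        if not 0o40 <= ord(ch) <= 0o137:
            raise ValueError(f"{ch!r} is not representable in this 6-bit scheme")
    return ((ord(a) & 0o77) << 6) | (ord(b) & 0o77)

def unpack_sixbit(word):
    """Recover the two characters from a word produced by pack_sixbit()."""
    def expand(code):
        # Codes 00..37 octal came from ASCII 100..137; codes 40..77 are unchanged.
        return chr(code + 0o100) if code < 0o40 else chr(code)
    return expand((word >> 6) & 0o77) + expand(word & 0o77)

word = pack_sixbit("H", "I")
print(format(word, "04o"))    # the packed word in octal
print(unpack_sixbit(word))    # round-trips to the original pair
```

Lowercase letters and control characters are rejected outright, which is exactly the trade-off the paragraph above describes.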

The PDP-8 was being designed at about the same time as the first versions of ASCII,³ as well as around the same time as the first really popular ASCII terminal, the Teletype Model 33.⁴ Thus, the PDP-8 was one of the first popular ASCII-aware machines in the world.

When dealing with such terminals and the included paper tape reader, PDP-8s generally deal in either 7-bit or 8-bit bytes. When we're talking about 8-bit bytes, we aren't talking about the "[high-ASCII](https://en.wikipedia.org/wiki/Extended_ASCII)" stuff that infested the PC world in the late 1970s and 1980s before [Unicode](https://en.wikipedia.org/wiki/Unicode) was invented. When a PDP-8 reads in plain ASCII text from a terminal as 8-bit bytes, the eighth bit is a [parity bit](https://en.wikipedia.org/wiki/Parity_bit), meant to detect read errors only.
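A minimal Python sketch of the parity idea follows. It assumes even parity purely for illustration; the actual convention depended on how the terminal was set up, and the function names are hypothetical.

```python
def add_even_parity(code):
    """Return an 8-bit byte: a 7-bit ASCII code with an even-parity bit on top."""
    if not 0 <= code <= 0o177:
        raise ValueError("not a 7-bit code")
    ones = bin(code).count("1")
    parity = ones % 2              # 1 only when the 7 data bits hold an odd number of 1s
    return (parity << 7) | code

def check_even_parity(byte):
    """True if the 8-bit byte contains an even number of 1 bits."""
    return bin(byte & 0o377).count("1") % 2 == 0

byte = add_even_parity(ord("C"))
print(format(byte, "03o"))             # 'C' with its parity bit, in octal
print(check_even_parity(byte))         # an undamaged byte passes the check
print(check_even_parity(byte ^ 0o004)) # a single flipped bit is detected
```

Note that the eighth bit carries no character information here: stripping it with `byte & 0o177` gives back the original 7-bit ASCII code.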

Full 8-bit reads from the terminal did still commonly occur on PDP-8s, though. The most common schemes are the [RIM loader](https://www.pdp8online.com/pdp8cgi/query_docs/tifftopdf.pl/pdp8docs/dec-08-lraa-d.pdf) and [BIN loader](http://www.pdp8online.com/pdp8cgi/query_docs/tifftopdf.pl/pdp8docs/dec-08-lbaa-d.pdf) binary paper tape formats. See those PDFs for details, but for our purposes here, it's only important to note that both formats punch each 12-bit PDP-8 word as two rows on the paper tape, each row carrying 6 bits of the word plus marker bits in the remaining channels, so the tape is read back into the machine one 8-bit byte at a time. A large part of the machine code in the RIM and BIN loaders is concerned with reassembling these 8-bit bytes into 12-bit PDP-8 words.


# Words

The PDP-8 has a 12-bit native word size. That is the smallest chunk of data you can address in a single instruction, and it is also the size of PDP-8 machine instructions.

Every PDP-8 instruction is a single 12-bit word, and data are stored in 12-bit core memory locations. All of the PDP-8 registers are 12 bits or smaller.

12 bits lets you address 2¹² = 4096 memory locations, which is why the basic core memory size on a PDP-8 is 4 kW.
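The same arithmetic as a small Python sketch, using the 1024 multiplier from the Notation section. The byte figure is there only for comparison with modern machines; the variable names are mine.

```python
WORD_BITS = 12

words = 2 ** WORD_BITS                # locations reachable with a 12-bit address
kwords = words // 1024                # "k" is the 1024 multiplier here
equivalent_bytes = words * WORD_BITS // 8   # the same storage counted in 8-bit bytes

print(words, kwords, equivalent_bytes)
```

So 4 kW of 12-bit core holds as many bits as 6 kB of modern byte-addressed memory, even though nobody in the PDP-8 world would describe it that way.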


# The 3-Level Memory Addressing System

Let's get back to that 4 kW value. You may be aware that the PDP-8 can be expanded to 32 kW. How does that square with all of the above?

In some CPU types, instructions are variable-width, so that an instruction which takes two operands is longer than one that takes a single operand, and a self-contained instruction is shorter still, but in the PDP-8, every instruction takes a single 12-bit word. How can a PDP-8 refer to a 12-bit address when the single-word instructions are 12 bits themselves? And how do we get beyond that to 32 kW, which would apparently require a 15-bit address? (2¹⁵ = 32[k](https://en.wikipedia.org/wiki/Kibibyte).)
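To see where the 12-bit and 15-bit figures come from, here is a short Python sketch. The helper name `address_bits` is hypothetical.

```python
def address_bits(words):
    """Number of bits needed to give each of `words` locations its own address."""
    return (words - 1).bit_length()

base = address_bits(4 * 1024)      # the basic 4 kW machine
full = address_bits(32 * 1024)     # a fully expanded 32 kW machine

print(base, full, full - base)     # bits for 4 kW, bits for 32 kW, and the shortfall
```

The difference is 3 bits, which is the extra addressing information a 12-bit word has no room for.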

You may have prior experience with the 16-bit Intel x86 segmentation scheme. If you thought that was a pain, buckle up, it gets wild from here.


# Level 1: Pages

The first memory access level is the “page,” 128 words. That limit comes from the fact that all [PDP-8 memory reference instructions](http://homepage.cs.uiowa.edu/~jones/pdp8/man/mri.html) set aside 7 of their 12 bits for the operand address. That is, if the PDP-8 is executing an instruction in page 0 and it needs to load something from memory address 100₈, it can do so in a single instruction because that address fits into 7 bits.
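As a sketch of how a 12-bit address breaks down into a page number and a 7-bit offset within that page, consider the following Python. The names `PAGE_BITS`, `PAGE_SIZE`, and `split_address` are mine, not DEC's.

```python
PAGE_BITS = 7
PAGE_SIZE = 1 << PAGE_BITS          # 128 words per page

def split_address(addr):
    """Split a 12-bit address into (page number, offset within the page)."""
    if not 0 <= addr <= 0o7777:
        raise ValueError("not a 12-bit address")
    return addr >> PAGE_BITS, addr & (PAGE_SIZE - 1)

for addr in (0o100, 0o177, 0o200, 0o234):
    page, offset = split_address(addr)
    print(f"{addr:04o} -> page {page}, offset {offset:03o}")

print(4096 // PAGE_SIZE)            # number of pages in 4 kW
```

Address 100₈ lands in page 0, so an instruction executing in page 0 can reach it with the 7-bit operand alone, while anything from 200₈ upward is in a different page.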


# Level 2: Fields

To get beyond the page level, you have to use indirect memory accesses. That is, instead of using the 7-bit address to directly refer to the core memory address you're interested in, you store a 12-bit address in one of the core memory locations within the current page and refer indirectly through that location.

For example, let's say you're currently executing in page 0 and need to jump to code that resides at address 234₈, which is in page 1. You could store the 12-bit value 0234₈ at address 0177₈, then do an indirect jump through page address 177₈.

PDP-8 programmers normally don't have to think too much about such things, even at the assembly language level, because the assembler automatically generates "links" when needed to cross page boundaries like this. Thus, the assembly code looks like a single instruction:

            JMP 0234

But in reality, this instruction takes two words of core memory: one for the indirect `JMP` instruction, and one for the link.
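The example above can be sketched as a toy Python model. The bit positions used (opcode in the top three bits, then an indirect bit, a page bit, and the 7-bit operand) follow the standard PDP-8 memory reference instruction layout, but the model is deliberately incomplete: it ignores extended memory and every other special case, and the names and the location of the jump (`pc`) are assumptions for illustration.

```python
JMP = 0o5000            # opcode 5 in the top three bits
INDIRECT = 0o0400       # set: the operand names a word holding the real address
CURRENT_PAGE = 0o0200   # clear: operand is in page 0; set: in the instruction's own page

def effective_address(instr, pc, memory):
    """Resolve the address a memory reference instruction at `pc` refers to."""
    offset = instr & 0o177
    base = (pc & 0o7600) if instr & CURRENT_PAGE else 0
    addr = base | offset
    if instr & INDIRECT:
        addr = memory[addr]         # follow the link word
    return addr

memory = {0o177: 0o234}             # the link: 0234 stored at 0177
instr = JMP | INDIRECT | 0o177      # an indirect jump through 0177
pc = 0o020                          # hypothetical location of the jump, in page 0

print(format(instr, "04o"))                                  # the encoded instruction
print(format(effective_address(instr, pc, memory), "04o"))   # where the jump lands
```

The two words of core are visible here as `instr` and the single entry in `memory`: without the indirect bit, the same 7-bit operand could only reach 0177 itself.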

I hope you've found that interesting and enlightening. If you would like more on this topic, the various versions of DEC's Small Computer Handbook for the PDP-8 cover this in more detail and take you beyond it, into actual programming of a PDP-8.


---------------------

**Footnotes and Digressions**

1. The PDP-8 does have one programmer-accessible register that isn't an even multiple of 3 bits, the single-bit "link" register, but we won't be talking more about that register in this article. The term "link" in the article does not refer to this register.

2. Thus the term [octet](https://en.wikipedia.org/wiki/Octet_(computing\)), which unambiguously refers to a group of 8 bits.

3. Yes, "versions," plural. There were actually 3 major [versions of ASCII](https://en.wikipedia.org/wiki/ASCII#History), the 1963, 1964, and 1967 versions. Note that 1967 postdates the first few versions of the PDP-8, though not the PDP-8/I, which we're focused on here on this site. Thus, the PDP-8/I is one of the first machines aware of 7-bit ASCII as we now understand it.

4. The most common terminal used with PDP-8s and other machines of its era was the [Teletype Model 33](https://en.wikipedia.org/wiki/Teletype_Model_33) ASR, the latter referring to the Automatic Send and Receive version. Although you will commonly see this referred to as an ASR-33, that is not its [proper](https://en.wikipedia.org/wiki/Teletype_Model_33#Model_33_ASR_vis-.C3.A0-vis_ASR-33) name.

5. Remember the notation: 4096 12-bit words is 6144 bytes, but we always speak of core memory in terms of words, not bytes. It's "4k" or "4 kW", not "6 kB".

6. Pro tip: every time the third digit from the right of an octal address goes up by 2, it is referring to a different PDP-8 page: page 1 begins at address 200₈, page 2 begins at address 400₈, etc.