Notation

We must first settle on a bit of notation.

PDP-8 addresses are traditionally written as octal numbers. That means when we write 123 in this context, we aren't saying "one hundred and twenty-three," we're talking about the quantity (1 × 8²) + (2 × 8¹) + (3 × 8⁰) = 83 in ordinary decimal notation. When I write an octal number below, I will write it as 123₈ to make this clear, the subscript meaning "base 8," also called "octal." If you see a multi-digit number without the base-8 subscript, it's a decimal number.
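If you'd like to check that arithmetic for yourself, here is a minimal Python sketch. The helper name `octal_to_decimal` is my own invention for illustration; Python's built-in `int(text, 8)` does the same job.

```python
def octal_to_decimal(digits: str) -> int:
    """Expand an octal digit string by positional weight: '123' -> 1*8**2 + 2*8**1 + 3*8**0."""
    value = 0
    for d in digits:
        if d not in "01234567":
            raise ValueError(f"{d!r} is not an octal digit")
        value = value * 8 + int(d)  # shift left one octal place, then add the new digit
    return value


print(octal_to_decimal("123"))  # 83
print(int("123", 8))            # 83, using the built-in conversion
print(oct(83))                  # 0o123, going back the other way
```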

We use octal when talking about PDP-8 addresses and memory values because the PDP-8's major registers are all multiples of 3 bits in size,¹ and the PDP-8's switch register is divided into groups of 3 switches. Since an octal digit encodes as 3 bits, that makes octal the most convenient way to write PDP-8 addresses. The more common ways to write computer numbers are inconvenient:

Binary takes too many digits, even with the tiny PDP-8 memories.
Hexadecimal digits encode 4 bits each, and since the least common multiple of 3 and 4 is 12, hex digits don't line up evenly with PDP-8 values until you have a full 12-bit PDP-8 word. Since there are many cases in PDP-8 usage where it is useful to look at a 4- or 5-digit octal number and think about the digits separately, octal simply makes more sense than hex in this case.
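To see the difference, here is a rough Python sketch that splits a 12-bit word into the 3-bit groups you would set on the switch register, then prints the same word in octal and in hex. The function name `switch_groups` is made up for this illustration.

```python
def switch_groups(word: int) -> list:
    """Split a 12-bit word into four 3-bit groups, leftmost (most significant) first."""
    if not 0 <= word < 4096:
        raise ValueError("a PDP-8 word is 12 bits, 0 through 4095")
    bits = format(word, "012b")
    return [bits[i:i + 3] for i in range(0, 12, 3)]


word = 0o7600
print(switch_groups(word))  # ['111', '110', '000', '000']
print(format(word, "04o"))  # 7600: one octal digit per group of 3 switches
print(format(word, "03X"))  # F80: each hex digit straddles the switch groups
```

Each octal digit matches one group of switches exactly, while the hex digits only agree with the switch groups at the two ends of the word.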

Hexadecimal notation didn't start to become the predominant way to write computer numbers until the de facto standardization of the 8-bit byte, a subject we will return to in more detail below.

Another bit of notation unique to this article, but which I hope will spread, is the unit kiW, meaning 1024 words of PDP-8 memory. The unit is named by analogy to kibibytes, abbreviated kiB. This new unit does two things for us:

It distinguishes between the proper SI definition of k and the "computer memory" meaning of k: 1000 vs 1024. DEC manuals will often just use k, and you're expected to understand that it means 1024 words, not kibibytes or kilobytes.
It avoids ambiguity with kW meaning a kilowatt. That is not an unlikely confusion, since a fairly small PDP-8/I setup dissipates about one kilowatt of power, and a large PDP-8 setup could dissipate multiple kilowatts.

Bytes

The PDP-8 predates the modern notion of an 8-bit byte. Back in the PDP-8's day, a "byte" was a more slippery concept. You could speak of 6-bit bytes, 7-bit bytes, 9-bit bytes... It all depended on what your particular task needed.²

Don't believe me? Consider this sentence found in the FORTRAN IV chapter in the OS/8 Language Reference Manual:

A real constant occupies three words (i.e., six bytes) of storage.

Since words are 12 bits in the PDP-8, that can only be interpreted as referring to 6-bit "bytes."

6-bit bytes are quite common in the PDP-8 world, often used for some kind of "packed ASCII" representation. One common scheme gets rid of most of the 32 control characters defined in 7-bit ASCII, all of the lowercase letters, and a whole bunch of the punctuation in order to pack two characters into a 12-bit PDP-8 word. There are actually a few different 6-bit packed ASCII representations for the PDP-8, so you have to know which scheme you're looking at before you can turn the data back into 7-bit ASCII.
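As a concrete illustration, here is a Python sketch of one plausible packing of this kind: keep only the low 6 bits of each character code and put the first character in the high half of the word. Treat the details as an assumption on my part rather than a description of any particular DEC format; as noted above, there are several schemes, and the names `pack_pair` and `unpack_pair` are hypothetical.

```python
def pack_pair(a: str, b: str) -> int:
    """Pack two characters into one 12-bit word, first character in the high 6 bits."""
    for ch in (a, b):
        # Only ASCII 040 through 137 octal survive: space, digits, punctuation, uppercase.
        if not 0o40 <= ord(ch) <= 0o137:
            raise ValueError(f"{ch!r} cannot be represented in this 6-bit scheme")
    return ((ord(a) & 0o77) << 6) | (ord(b) & 0o77)


def unpack_pair(word: int) -> str:
    """Recover the two characters from a word produced by pack_pair()."""
    chars = []
    for code in ((word >> 6) & 0o77, word & 0o77):
        # Codes 00-37 came from ASCII 100-137; codes 40-77 are unchanged ASCII.
        chars.append(chr(code + 0o100 if code < 0o40 else code))
    return "".join(chars)


word = pack_pair("O", "S")
print(format(word, "04o"))  # 1723
print(unpack_pair(word))    # OS
```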

The PDP-8 was being designed at about the same time as the first versions of ASCII,³ as well as around the same time as the first wildly popular ASCII terminal, the Teletype Model 33.⁴

When working with such terminals and their included paper tape readers, PDP-8s generally deal in either 7-bit or 8-bit bytes. The 8-bit bytes here are not the "high-ASCII" stuff that infested the PC world in the late 1970s and 1980s before Unicode was invented.

Much existing PDP-8 software that reads in plain ASCII text from a terminal as 8-bit bytes treats the eighth bit on standard DEC format paper tapes as a mark parity bit, rather than as an error-detecting parity bit or as a data bit.
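"Mark parity" simply means the eighth bit is always set, whatever the other seven bits are. A minimal Python sketch of adding and stripping that bit follows; the function names are mine, not from any DEC software.

```python
MARK_BIT = 0o200    # the eighth bit, always set under mark parity
ASCII_MASK = 0o177  # the low seven bits that carry the character


def add_mark(ch: str) -> int:
    """Return the 8-bit value for a 7-bit ASCII character with the mark bit set."""
    code = ord(ch)
    if code > ASCII_MASK:
        raise ValueError("not a 7-bit ASCII character")
    return code | MARK_BIT


def strip_mark(byte: int) -> str:
    """Throw away the eighth bit and return the 7-bit ASCII character."""
    return chr(byte & ASCII_MASK)


print(format(add_mark("A"), "03o"))  # 301
print(strip_mark(0o301))             # A
```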

Full 8-bit reads from the terminal did still commonly occur on PDP-8s though. The most common schemes are the RIM loader and BIN loader binary paper tape formats. See those PDFs for details, but for our purposes here, it's only important to note that both schemes split each 12-bit PDP-8 word into two 6-bit halves, punched one per 8-bit row on the paper tape, and thus read back into the machine one 8-bit byte at a time. A large part of the machine code in the RIM and BIN loaders is concerned with reassembling these 8-bit bytes into 12-bit PDP-8 words.

Words

We speak of PDP-8 memory in terms of words, rather than bytes, because the smallest addressable unit of memory is the word, and the modern 8-bit byte doesn't divide evenly into the PDP-8 word size. That is, you can't retrieve just one byte of memory, so it is more useful to think about it in words.

The PDP-8 has a 12-bit native word size. That is the smallest chunk of data you can address in a single instruction, and it is also the size of PDP-8 machine instructions.

Every PDP-8 instruction is a single 12-bit word, and data are stored in 12-bit core memory locations. All of the PDP-8 registers are 12 bits or smaller.

12 bits lets you address 2¹² = 4096 memory locations, which is why the basic core memory size on a PDP-8 is 4 kiW.

The 3-Level Memory Addressing System

You may be aware that PDP-8 family computers up through the PDP-8/e can be expanded to a maximum of 32 kiW of core memory. How does that square with all of the above?

In some CPU types, instructions are variable-width, so that an instruction which takes two operands is longer than one that takes a single operand, and a self-contained instruction is shorter still, but in the PDP-8, every instruction takes a single 12-bit word. How can a PDP-8 refer to a 12-bit address when the single-word instructions are 12 bits themselves? And how do we get beyond that to 32 kiW, which would apparently require a 15-bit address? (2¹⁵ words = 32 kiW.)

You may have prior experience with the 16-bit Intel x86 segmentation scheme. If you thought writing code to deal with that was a pain, buckle up, it gets wild from here.

Level 1: Pages

The first memory access level is the "page," 128 words. That limit comes from the fact that all PDP-8 memory reference instructions set aside 7 of their 12 bits for the operand address, and 2⁷ = 128. That is, if the PDP-8 is executing an instruction in page 0 and it needs to load something from memory address 100₈, it can do so in a single instruction because that address fits into 7 bits.
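Here is a rough Python sketch of pulling those 7 bits out of an instruction word. The text above only says that 7 bits hold the operand address; the rest of the layout used here (a 3-bit opcode, then an indirect bit, then a page bit) is the standard PDP-8 memory reference format as I understand it, so check it against the handbook before relying on it. The name `decode_mri` is hypothetical.

```python
def decode_mri(word: int) -> dict:
    """Split a 12-bit memory reference instruction into its fields."""
    opcode = (word >> 9) & 0o7
    if opcode > 5:
        raise ValueError("opcodes 6 and 7 are IOT and OPR, not memory reference instructions")
    return {
        "opcode": opcode,                    # 0-5: AND, TAD, ISZ, DCA, JMS, JMP
        "indirect": bool(word & 0o400),      # the I bit
        "current_page": bool(word & 0o200),  # clear means page 0, set means current page
        "offset": word & 0o177,              # the 7-bit operand address within the page
    }


fields = decode_mri(0o1100)   # TAD 100, a direct reference to page 0
print(fields["opcode"])       # 1
print(oct(fields["offset"]))  # 0o100
print(2 ** 7)                 # 128 words reachable with a 7-bit offset
```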

Level 2: Fields

To get beyond the page level, you have to use indirect memory accesses. That is, instead of using the 7-bit address to directly refer to the core memory address you're interested in, you store a 12-bit address in one of the core memory locations within the current page and refer indirectly through that location.

For example, let's say you're currently executing in page 0 and need to jump to code that resides at address 234₈, which is in page 1.⁵ You could store the 12-bit value 0234₈ at address 0177₈, then do an indirect jump through page address 177₈.
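The page arithmetic in that example can be sketched in a few lines of Python. The helper names are mine, and the `memory` dictionary is just a stand-in for core.

```python
def page_of(addr: int) -> int:
    """Page number of a 12-bit address: everything above the low 7 bits."""
    return addr >> 7


def offset_in_page(addr: int) -> int:
    """Position of a 12-bit address within its 128-word page."""
    return addr & 0o177


print(page_of(0o234), oct(offset_in_page(0o234)))  # 1 0o34
print(page_of(0o177))                              # 0: the pointer lives in page 0

# The indirect jump: the pointer word at 177 holds the full 12-bit target.
memory = {0o177: 0o234}
program_counter = memory[0o177]
print(format(program_counter, "04o"))              # 0234
```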

PDP-8 programmers may opt to ignore such things, even at the assembly language level, because the assembler automatically generates "links" when needed to cross page boundaries like this. Thus, the assembly code for what we just described is written as a single assembly language instruction:

        JMP 0234

The thing is, this instruction takes two words of core memory: one for the indirect JMP instruction, and one for the link.
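To make the two-word cost concrete, here is a Python sketch of what an assembler might emit for a JMP. It is a toy under stated assumptions, not how any real DEC assembler is written: the function name `assemble_jmp` is hypothetical, and the caller picks where the link goes, whereas a real assembler chooses that location itself.

```python
JMP = 0o5000           # opcode 5 in the top three bits
INDIRECT = 0o400       # the I bit
CURRENT_PAGE = 0o200   # page bit: clear for page 0, set for the current page


def assemble_jmp(here: int, target: int, link_addr: int) -> dict:
    """Return {address: word} for 'JMP target' assembled at address 'here'."""
    def page(addr):
        return addr >> 7

    if page(target) == 0:
        return {here: JMP | (target & 0o177)}                 # direct, page 0
    if page(target) == page(here):
        return {here: JMP | CURRENT_PAGE | (target & 0o177)}  # direct, same page
    # Out of reach: jump indirectly through a link holding the full address.
    if page(link_addr) == 0:
        ref = link_addr & 0o177
    elif page(link_addr) == page(here):
        ref = CURRENT_PAGE | (link_addr & 0o177)
    else:
        raise ValueError("the link must sit in page 0 or in the current page")
    return {here: JMP | INDIRECT | ref, link_addr: target}


words = assemble_jmp(here=0o020, target=0o234, link_addr=0o177)
for addr, word in sorted(words.items()):
    print(format(addr, "04o"), format(word, "04o"))
# 0020 5577
# 0177 0234
```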

Since all those links chew into the 4 to 32 kiW core limit of any given PDP-8 machine, one of the common games PDP-8 masters play is reusing instruction and data words for target address values and operands to save a word or two.

PDP-8 master Rick Murphy taught me one of these games. Instead of saying:

        JMP 7600

and taking a "link" penalty, you find a nearby CLA instruction, change it from the group 1 to the group 2 form of the OPR instruction, give it a label in the assembly code, and jump indirectly through that label:

OS8ENT, 7600         / group 2 form of CLA, also the OS/8 entry point
        ...          / some number of other instructions, but less than a page worth
        JMP I OS8ENT / return control to OS/8

The I modifies the JMP instruction, telling the PDP-8 to load the value at the memory location referred to by the OS8ENT label (7600₈) into the processor's program counter, causing a jump to that address. Savings: one whole PDP-8 word. Woo!
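If you want to trace that trick by hand, here is a small Python sketch of how a JMP resolves its target. The addresses 0400₈ and 0410₈ are made up for illustration, `jmp_target` is a hypothetical helper, and the sketch ignores fields and the auto-index locations entirely.

```python
def jmp_target(mem: dict, pc: int, word: int) -> int:
    """Work out where a JMP instruction at address 'pc' sends the program counter."""
    offset = word & 0o177
    # Page bit set: same page as the instruction. Page bit clear: page 0.
    ea = (pc & 0o7600) | offset if word & 0o200 else offset
    if word & 0o400:   # I bit set: the word at 'ea' holds the real target
        ea = mem[ea]
    return ea


mem = {
    0o400: 0o7600,  # OS8ENT: executes as a group 2 CLA, and doubles as a pointer
    0o410: 0o5600,  # JMP I OS8ENT: opcode 5, indirect, current page, offset 0
}
print(format(jmp_target(mem, 0o410, mem[0o410]), "04o"))  # 7600
```

The same word at OS8ENT is read twice in two different roles: once as an instruction when execution passes through it, and once as an address when the JMP goes indirect through it.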

Barf bags are in the seat pocket ahead of you.

Once you've relieved yourself, please read on, because we're not done yet.

Level 3: The Instruction and Data Field Registers

By this point, you'll be wondering how a PDP-8 can be expanded beyond its stock 4 kiW of core to its maximum of 32 kiW, and why is that the limit anyway?

The answer to both questions is that the PDP-8 has a pair of 3-bit registers for setting the instruction field and the data field. 2³ = 8, and 8 fields × 4 kiW = 32 kiW.

Thus, when a program is currently executing code in field 0 but wants to address data in field 1, it sets the data field (DF) register to 1 with a CDF instruction, and now all data fetches pull data from field 1. Likewise, to jump to an address in field 1 from field 0, it sets the instruction field (IF) register to 1 with a CIF instruction and jumps to an address in that field.⁶

This is also why your PDP-8/I has two sets of 3 switches and associated indicator lights on its front panel labeled "Data Field" and "Inst Field." Either 3-bit field number, combined with a 12-bit address within that field, gives you a 15-bit extended address. A jump or fetch between fields thus takes two instructions, rather than the indirect addressing for in-field operations or the direct addressing of in-page operations. This gives a hierarchy of expense: a diligent PDP-8 programmer tries to keep data and instructions close together and logically bunched to avoid needless page and field transitions.
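The extended address arithmetic is easy to sketch in Python. The function name `extended_address` is my own; nothing in the PDP-8 is called that.

```python
def extended_address(field: int, addr: int) -> int:
    """Combine a 3-bit field number with a 12-bit address into a 15-bit address."""
    if not 0 <= field <= 7:
        raise ValueError("field numbers are 3 bits, 0 through 7")
    if not 0 <= addr <= 0o7777:
        raise ValueError("in-field addresses are 12 bits, 0 through 7777 octal")
    return (field << 12) | addr


print(format(extended_address(1, 0o234), "05o"))  # 10234: field 1, address 0234
print(extended_address(7, 0o7777) + 1)            # 32768 words in total
print(8 * 4096 // 1024)                           # 32, as in 32 kiW
```

This is one place where the 5-digit octal numbers come from: the leading digit is the field, and the remaining four are the address within it.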

Keep in mind that this is all a simplified overview, meant more for explaining how addressing works to a PDP-8 newbie than as a guide to memory management for a PDP-8 programmer. There are still further complications here, such as how indirect addressing interacts with links and the IF and DF registers.

Conclusion

I hope you've found this overview interesting and enlightening. If you would like more on this topic, the various versions of DEC's Small Computer Handbook for the PDP-8 cover this in more detail and take you beyond it, into actual programming of a PDP-8:

The 1967-1968 edition was first published before the PDP-8/I was formally released, so its first drafts must have been written with reference to a pre-production unit. (PDF, 21 MB.)

A telltale of this is that the machines pictured on the front cover of that edition are an original PDP-8 (a.k.a. the "Straight 8") and a PDP-8/S, even though the book claims to cover the PDP-8/I.

I have here in my possession a paper copy of a later edition, also labelled as the "1967-1968" edition but with an actual PDP-8/I pictured on the front. (Two of them, actually: the standard rack cabinet version and the rare console/desk version.) You can quickly tell the difference between these two editions of the manual by the greenish-yellow background behind the machines on the later version, whereas the first edition has an orange background behind the older machines depicted. Unfortunately, I'm not aware of a PDF source of the newer edition.
The 1973 edition for the PDP-8/e, PDP-8/f, and PDP-8/m is not entirely relevant to the PDP-8/I we're primarily concerned with here on this web site, but they did refine the tutorial material quite a bit over the earlier editions, so it is sometimes helpful to read the equivalent material in this newer edition when trying to figure something out. Just be aware that not everything applies to a PDP-8/I. (PDF, 76 MB.)

Footnotes and Digressions

¹ The PDP-8 does have one programmer-accessible register that isn't an even multiple of 3 bits, the single-bit "link" register, but we won't be talking more about that register in this article. The term "link" in this article does not refer to the PDP-8's Link register.
² Thus the term octet, which unambiguously refers to a group of 8 bits.
³ Yes, "versions," plural. There were actually 3 major versions of ASCII, published in 1963, 1965, and 1967. Since the original PDP-8 came out in 1965 and the PDP-8/I model came out in 1968, you can see that the development of these computers coincided quite nicely with the development of ASCII itself. Thus, the PDP-8/I is one of the first machines that was aware from inception of what we now think of as 7-bit ASCII.
⁴ The most common terminal used with PDP-8s and other machines of its era was the Teletype Model 33 ASR, the "ASR" referring to the Automatic Send and Receive version of the Model 33. Although you will commonly see that terminal type referred to as an ASR-33, that is not its proper name.

As for my "wildly popular" characterization, they sold over half a million of them by the mid 1970s. Such success wouldn't be exceeded by a computer product until the first consumer-friendly microcomputers came out.
⁵ Pro tip: every time the third digit from the right side of an octal address goes up by 2, it is referring to a different PDP-8 page: page 1 begins at address 200₈, page 2 begins at address 400₈, etc. This is because 200₈ = 128₁₀ = 2⁷.
⁶ The IF register doesn't actually change until the processor encounters a JMP or JMS instruction. Then the address being jumped to is computed based on the new IF value, transferring execution out of the current field.
