PiDP-8/I Software

Check-in [573aba2f6a]

Overview
Comment: Rewrote the "Inline Assembly in the Native CC8 Compiler" section in the CC8 user manual after learning more about its behavior and limitations.
SHA3-256: 573aba2f6a1b9dc64781e259c226b62e1e13d788ff7f807335e40c6cd0cbedee
User & Date: tangent 2019-02-13 18:52:54.192
Context
2019-02-13
19:12  Added more detail about file I/O limitations to the LIBC user documentation section of the CC8 manual. (What used to be the "stdio" section is now broken up into several sections at the same level.) check-in: 23f92ab553 user: tangent tags: trunk
18:52  Rewrote the "Inline Assembly in the Native CC8 Compiler" section in the CC8 user manual after learning more about its behavior and limitations. check-in: 573aba2f6a user: tangent tags: trunk
06:13  Documented the new ./configure --boot-tape-* options, and updated some of the other configuration option docs in README.md. check-in: f5a05c7790 user: tangent tags: trunk
Changes
Changes to doc/cc8-manual.md.
@@ -70,16 +70,16 @@


 ## Requirements

 The CC8 system generally assumes the availability of:

 *   [At least 12 kWords of core](#memory) at run time for programs
-    compiled with CC8.  The stages of the native OS/8 CC8 compiler require
-    20 kWords to compile programs.
+    compiled with CC8.  The [native OS/8 CC8 compiler passes](#ncpass)
+    require 20 kWords to compile programs.

     CC8 provides no built-in way to use more memory than this, so you
     will probably have to resort to [inline assembly](#asm) or FORTRAN
     II library linkage to get access to more than 16 kWords of core.

 *   A PDP-8/e or higher class processor.  The CC8 compiler code and its
     [LIBC implementation](#libc) make liberal use of the MQ register
@@ -255,35 +255,35 @@
 the `os8-run` documentation to understand this process better.

 If you change the OS/8 CC8 source code, saying `make` at the PiDP-8/I
 build root will update `bin/v3d.rk05` with new binaries automatically.

 Because the CC8 native compiler is compiled by the CC8 *cross*-compiler,
 the [standard memory layout](#memory) applies to both.  Among other
-things, this means each phase of the native compiler requires
+things, this means each pass of the native compiler requires
 approximately 20 kWords of core.

-The phases are:
+<a id="ncpass"></a>The compiler passes are:

 1.  `c8.c` &rarr; `c8.sb` &rarr; `CC.SV`: The compiler driver: accepts
     the input file name from the user, and calls the first proper
-    compiler stage, `CC1`.
+    compiler pass, `CC1`.

 2.  `n8.c` &rarr; `n8.sb` &rarr; `CC1.SV`: The parser/tokeniser section
     of the compiler.

 3.  `p8.c` &rarr; `p8.sb` &rarr; `CC2.SV`: The token to SABR code
     converter section of the compiler.

 `CC.SV` contains extremely rudimentary preprocessor features
 documented [below](#os8pp).

 There is also `libc.c` &rarr; `libc.sb` &rarr; `LIBC.RL`, the [C
 library](#libc) linked to any program built with CC8, including the
-stages above, but also to your own programs.
+passes above, but also to your own programs.

 All of these binaries end up on the automatically-built OS/8 boot disk:
 `CC?.SV` on `SYS:`, and everything else on `DSK:`, based on the defaults
 our OS/8 distribution is configured to use when seeking out files.

 Input programs should go on `DSK:`. Compiler outputs are also placed on
 `DSK:`.
@@ -399,17 +399,17 @@
         It also means that if you take a program that the cross-compiler
         handles correctly and just copy it straight into OS/8 and try to
         compile it, it probably still has the `#include <libc.h>` line and
         possibly one for `init.h` as well. *Such code will fail to compile.*
         You must strip such lines out when copying C files into OS/8.

         (The native compiler emits startup code automatically, and it
-        hard-codes the LIBC call table in the `CC2` compiler stage,
-        implemented in `p8.c`, so it doesn’t need `#include` to make these
-        things work.)
+        hard-codes the LIBC call table in the [final compiler
+        pass](#ncpass), implemented in `p8.c`, so it doesn’t need
+        `#include` to make these things work.)

     *   [Broken](#os8asm) handling of [inline assmembly](#asm) via `#asm`.

     *   No support for `#if`, `#ifdef`, etc.

 5.  Variables are implicitly `static`, even when local.

@@ -1578,16 +1578,17 @@
 [ub]:     https://en.wikipedia.org/wiki/Undefined_behavior
 [zp]:     https://homepage.divms.uiowa.edu/~jones/pdp8/man/mri.html#pagezero


 <a id="asm"></a>
 ## Inline Assembly Code

-The [cross-compiler](#cross) allows [SABR][sabr] assembly code between
-`#asm` and `#endasm` markers in the C source code:
+Both the [cross-compiler](#cross) and the [native compiler](#native)
+allow inline [SABR][sabr] assembly code between `#asm` and `#endasm`
+markers in the C source code:

     #asm
         TAD (42      / add 42 to AC
     #endasm

 Such code is copied literally from the input C source file into the
 compiler’s SABR output file, so it must be written with that context in
@@ -1601,15 +1602,15 @@
 body in assembly:

     add48(a)
     int a
     {
         a;          /* load 'a' into AC; explained below */
     #asm
-        TAD (48
+        TAD (D48
     #endasm
     }

 Doing it this way saves you from having to understand the way the CC8
 software stack works, which we’ve chosen not to document here yet, apart
 from [its approximate location in core memory](#memory). All you need to
 know is that parameters are passed on the stack and *somehow* extracted
@@ -1700,36 +1701,74 @@
 main program and the C library. This constitutes a compile time linkage
 system to allow for standard and vararg functions to be called in the
 library.

 **TODO:** Explain this.


-### <a id="os8asm"></a>Inline Assembly and the OS/8 CC8 Compiler
+### <a id="os8asm"></a>Inline Assembly in the Native CC8 Compiler

-The native CC8 compiler does not properly do any of the above.
+#### Limitations
+
+The native compiler has some significant limitations in the way it
+handles inline assembly.

-There is a start at handling of `#asm` in this compiler, but it isn’t
-properly integrated into the code generation stage. Instead, the
-preprocessing stage just gathers up what it finds and dumps it to a
-temporary file, which the code generation stage unceremoniously dumps in
-at the *end* of the resulting `CC.SB` output file.
+The primary one is that snippets of inline assembly are gathered by the
+[first pass](#ncpass) of the compiler in a core memory buffer that’s
+only 1024 characters in size. If the total amount of inline assembly in
+your program exceeds this amount, `CC.SV` will overrun this buffer and
+produce corrupt output.
+
+It’s difficult to justify increasing the size of that buffer, because
+it’s already over [&frac14; the space given](#udf) in CC8 to global
+variables.
+
+It all has to be gathered in one pass, because this 1&nbsp;kWord buffer
+is written to a text file (`CASM.TX`) at the end of the [first compiler
+pass](#ncpass), where it waits for the final compiler pass to read it
+back in to be inserted into the output SABR code.  Since LIBC’s
+[`fopen()`](#fopen) is limited to a [single output file at a
+time](#stdio) and it cannot append to an existing file, it’s got one
+shot to write everything it collected.
+
+This is one reason the CC8 LIBC has to be cross-compiled: its inline
+assembly is over 6&times; the size of this buffer.
+
+
+#### Incompatibilities

-Furthermore, this process is currently limited to 1&nbsp;kiB of total
-text: if the preprocessor gathers any more than that, it’s likely to
-crash the preprocessor.
+The only known incompatibility between the compilers in the way they
+handle inline assembly is that the native compiler inserts a `DECIM`
+directive early in its SABR output, so all constants in inline assembly
+that aren’t specifically given a radix are treated as decimal numbers:
+
+    #asm
+        TAD (42
+    #endasm

-Since such code is not injected inline into the output SABR code at a
-corresponding point to where it’s coded in the C source file, it’s not
-even clear to us how you’d call such code. It may be possible to declare
-an assembly subroutine this way, but we currently don’t know how you’d
-call an assembly function that has no prototype in the C code.
+That instruction adds 42 decimal to AC when compiled with the native
+compiler, but it adds 34 decimal (42₈) with the cross-compiler because
+the cross-compiler leaves SABR in its default octal mode!
+
+If you want code to work with both, use the SABR-specific `D` and `K`
+prefix feature on constants:
+
+    #asm
+        TAD (D42      / add 42 *decimal* to AC
+    #endasm
+
+We cannot recommend using the `DECIM` and `OCTAL` SABR pseudo-ops in
+code that has to work with both compilers because there’s no way to tell
+what directive to give at the end of the `#asm` block to restore prior
+behavior. If you switch the mode without switching it back properly,
+SABR code emitted by the compiler itself will be misinterpreted.

-At this time, we recommend that only low-level experimenters attempt to
-use this feature of the native OS/8 CC8 compiler.
+There’s a `DECIM` directive high up in the implementation of LIBC, but
+that’s fine since it knows it will be compiled by the cross-compiler
+only.


 ### <a id="opdef"></a>Predefined OPDEFs

 In addition to the op-codes predefined for SABR — which you can find in
 [Appendix C of the OS/8 Handbook, 1974 edition][os8hac] — the following
 `OPDEF` directives are inserted at the top of every SABR file output
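
Note on the radix change above: the following is a minimal sketch in standard C (for a host machine, not for CC8 itself) of how an unprefixed constant such as `42` could come out differently under the two compilers. The function `sabr_literal_value` is hypothetical and exists only for illustration; it is not part of CC8 or SABR. It assumes that a `D` prefix forces decimal and a `K` prefix forces octal, and that the default radix is 8 for cross-compiler output and 10 after the native compiler's `DECIM` directive.

```c
#include <stddef.h>

/* Hypothetical helper, not part of CC8 or SABR: returns the value of a
 * numeric literal as read under the given default radix (8 or 10).
 * Assumption: a leading 'D' forces decimal and a leading 'K' forces
 * octal. Returns -1 if the literal is empty or has an invalid digit. */
int sabr_literal_value(const char *text, int default_radix)
{
    int radix = default_radix;
    int value = 0;

    if (text == NULL || *text == '\0') {
        return -1;
    }
    if (*text == 'D') {            /* explicit decimal prefix */
        radix = 10;
        text++;
    } else if (*text == 'K') {     /* explicit octal prefix */
        radix = 8;
        text++;
    }
    if (*text == '\0') {
        return -1;                 /* prefix with no digits */
    }
    for (; *text != '\0'; text++) {
        int digit = *text - '0';
        if (digit < 0 || digit >= radix) {
            return -1;             /* e.g. '8' is not an octal digit */
        }
        value = value * radix + digit;
    }
    return value;
}
```

Under these assumptions, `sabr_literal_value("42", 8)` and `sabr_literal_value("42", 10)` differ (34 versus 42), while `sabr_literal_value("D42", 8)` and `sabr_literal_value("D42", 10)` agree, which is the reason the diff changes `TAD (48` to `TAD (D48`: `48` is not even a valid octal number.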