︙
## Requirements
The CC8 system generally assumes the availability of:
* [At least 12 kWords of core](#memory) at run time for programs
- compiled with CC8. The stages of the native OS/8 CC8 compiler require
- 20 kWords to compile programs.
+ compiled with CC8. The [native OS/8 CC8 compiler passes](#ncpass)
+ require 20 kWords to compile programs.
CC8 provides no built-in way to use more memory than this, so you
will probably have to resort to [inline assembly](#asm) or FORTRAN
II library linkage to get access to more than 16 kWords of core.
* A PDP-8/e or higher class processor. The CC8 compiler code and its
[LIBC implementation](#libc) make liberal use of the MQ register
︙
the `os8-run` documentation to understand this process better.
If you change the OS/8 CC8 source code, saying `make` at the PiDP-8/I
build root will update `bin/v3d.rk05` with new binaries automatically.
Because the CC8 native compiler is compiled by the CC8 *cross*-compiler,
the [standard memory layout](#memory) applies to both. Among other
- things, this means each phase of the native compiler requires
+ things, this means each pass of the native compiler requires
approximately 20 kWords of core.
- The phases are:
+ <a id="ncpass"></a>The compiler passes are:
1. `c8.c` → `c8.sb` → `CC.SV`: The compiler driver: accepts
the input file name from the user, and calls the first proper
- compiler stage, `CC1`.
+ compiler pass, `CC1`.
2. `n8.c` → `n8.sb` → `CC1.SV`: The parser/tokeniser section
of the compiler.
3. `p8.c` → `p8.sb` → `CC2.SV`: The token to SABR code
converter section of the compiler.
`CC.SV` contains extremely rudimentary preprocessor features
documented [below](#os8pp).
There is also `libc.c` → `libc.sb` → `LIBC.RL`, the [C
library](#libc) linked to any program built with CC8, including the
- stages above, but also to your own programs.
+ passes above, but also to your own programs.
All of these binaries end up on the automatically-built OS/8 boot disk:
`CC?.SV` on `SYS:`, and everything else on `DSK:`, based on the defaults
our OS/8 distribution is configured to use when seeking out files.
Input programs should go on `DSK:`. Compiler outputs are also placed on
`DSK:`.
︙
It also means that if you take a program that the cross-compiler
handles correctly and just copy it straight into OS/8 and try to
compile it, it probably still has the `#include <libc.h>` line and
possibly one for `init.h` as well. *Such code will fail to compile.*
You must strip such lines out when copying C files into OS/8.
(The native compiler emits startup code automatically, and it
- hard-codes the LIBC call table in the `CC2` compiler stage,
- implemented in `p8.c`, so it doesn’t need `#include` to make these
- things work.)
+ hard-codes the LIBC call table in the [final compiler
+ pass](#ncpass), implemented in `p8.c`, so it doesn’t need
+ `#include` to make these things work.)
* [Broken](#os8asm) handling of [inline assembly](#asm) via `#asm`.
* No support for `#if`, `#ifdef`, etc.
5. Variables are implicitly `static`, even when local.
︙
[ub]: https://en.wikipedia.org/wiki/Undefined_behavior
[zp]: https://homepage.divms.uiowa.edu/~jones/pdp8/man/mri.html#pagezero
<a id="asm"></a>
## Inline Assembly Code
+ Both the [cross-compiler](#cross) and the [native compiler](#native)
- The [cross-compiler](#cross) allows [SABR][sabr] assembly code between
- `#asm` and `#endasm` markers in the C source code:
+ allow inline [SABR][sabr] assembly code between `#asm` and `#endasm`
+ markers in the C source code:
#asm
TAD (42 / add 42 to AC
#endasm
Such code is copied literally from the input C source file into the
compiler’s SABR output file, so it must be written with that context in
︙
body in assembly:
add48(a)
int a
{
a; /* load 'a' into AC; explained below */
#asm
- TAD (48
+ TAD (D48
#endasm
}
Doing it this way saves you from having to understand the way the CC8
software stack works, which we’ve chosen not to document here yet, apart
from [its approximate location in core memory](#memory). All you need to
know is that parameters are passed on the stack and *somehow* extracted
︙
main program and the C library. This constitutes a compile time linkage
system to allow for standard and vararg functions to be called in the
library.
**TODO:** Explain this.
- ### <a id="os8asm"></a>Inline Assembly and the OS/8 CC8 Compiler
+ ### <a id="os8asm"></a>Inline Assembly in the Native CC8 Compiler
+ #### Limitations
- The native CC8 compiler does not properly do any of the above.
+ The native compiler has some significant limitations in the way it
+ handles inline assembly.
+ The primary one is that snippets of inline assembly are gathered by the
+ [first pass](#ncpass) of the compiler in a core memory buffer that’s
+ only 1024 characters in size. If the total amount of inline assembly in
+ your program exceeds this amount, `CC.SV` will overrun this buffer and
+ produce corrupt output.
- There is a start at handling of `#asm` in this compiler, but it isn’t
- properly integrated into the code generation stage. Instead, the
- preprocessing stage just gathers up what it finds and dumps it to a
- temporary file, which the code generation stage unceremoniously dumps in
- at the *end* of the resulting `CC.SB` output file.
+ It’s difficult to justify increasing the size of that buffer, because
+ it’s already over [¼ the space given](#udf) in CC8 to global
+ variables.
+ It all has to be gathered in one pass, because this 1 kWord buffer
+ is written to a text file (`CASM.TX`) at the end of the [first compiler
+ pass](#ncpass), where it waits for the final compiler pass to read it
+ back in to be inserted into the output SABR code. Since LIBC’s
+ [`fopen()`](#fopen) is limited to a [single output file at a
+ time](#stdio) and it cannot append to an existing file, it’s got one
+ shot to write everything it collected.
+ This is one reason the CC8 LIBC has to be cross-compiled: its inline
+ assembly is over 6× the size of this buffer.
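To illustrate the overrun described in the added text, here is a minimal sketch in host C of a bounded append that reports an error instead of writing past the end of a fixed buffer. It is not taken from the CC8 sources: `ASM_BUF_SIZE`, `asm_buf`, `asm_len`, and `asm_append` are hypothetical names chosen for this example, and only the 1024-character size comes from the text.

```c
#include <stddef.h>
#include <string.h>

#define ASM_BUF_SIZE 1024          /* buffer size quoted in the text */

static char   asm_buf[ASM_BUF_SIZE]; /* hypothetical gather buffer */
static size_t asm_len = 0;           /* characters stored so far */

/* Append one line of inline assembly to the buffer.
 * Returns 0 on success, or -1 if the line would not fit,
 * in which case the buffer is left unchanged. */
static int asm_append(const char *line)
{
    size_t n = strlen(line);

    if (n > ASM_BUF_SIZE - asm_len)
        return -1;                 /* would overrun: refuse */

    memcpy(asm_buf + asm_len, line, n);
    asm_len += n;
    return 0;
}
```

A caller would check the return value of each `asm_append()` call and stop with a diagnostic on `-1`, rather than continuing to collect text that can no longer be stored.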
+ #### Incompatibilities
- Furthermore, this process is currently limited to 1 kiB of total
- text: if the preprocessor gathers any more than that, it’s likely to
- crash the preprocessor.
+ The only known incompatibility between the compilers in the way they
+ handle inline assembly is that the native compiler inserts a `DECIM`
+ directive early in its SABR output, so all constants in inline assembly
+ that aren’t specifically given a radix are treated as decimal numbers:
+ #asm
+ TAD (42
+ #endasm
+ That instruction adds 42 decimal to AC when compiled with the native
+ compiler, but it adds 34 decimal (42₈) with the cross-compiler because
+ the cross-compiler leaves SABR in its default octal mode!
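The arithmetic behind that difference can be checked on any host C compiler. The sketch below is illustrative only and is not CC8 or SABR code: `parse_in_radix` is a hypothetical helper that reads the same digit string the way an assembler would in a given radix mode, using the standard `strtol()` function.

```c
#include <stdlib.h>

/* Hypothetical helper: interpret the digits of an assembly constant
 * in the given radix, as an assembler in that mode would read them. */
static long parse_in_radix(const char *digits, int radix)
{
    return strtol(digits, NULL, radix);
}
```

With this helper, `parse_in_radix("42", 10)` yields 42, matching the native compiler's decimal mode, while `parse_in_radix("42", 8)` yields 34, matching the cross-compiler's octal default (4 × 8 + 2).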
- Since such code is not injected inline into the output SABR code at a
- corresponding point to where it’s coded in the C source file, it’s not
- even clear to us how you’d call such code. It may be possible to declare
+ If you want code to work with both, use the SABR-specific `D` and `K`
+ prefix feature on constants:
+ #asm
+ TAD (D42 / add 42 *decimal* to AC
+ #endasm
- an assembly subroutine this way, but we currently don’t know how you’d
- call an assembly function that has no prototype in the C code.
+ We cannot recommend using the `DECIM` and `OCTAL` SABR pseudo-ops in
+ code that has to work with both compilers because there’s no way to tell
+ what directive to give at the end of the `#asm` block to restore prior
+ behavior. If you switch the mode without switching it back properly,
+ SABR code emitted by the compiler itself will be misinterpreted.
+ There’s a `DECIM` directive high up in the implementation of LIBC, but
+ that’s fine since it knows it will be compiled by the cross-compiler
- At this time, we recommend that only low-level experimenters attempt to
+ only.
- use this feature of the native OS/8 CC8 compiler.
### <a id="opdef"></a>Predefined OPDEFs
In addition to the op-codes predefined for SABR — which you can find in
[Appendix C of the OS/8 Handbook, 1974 edition][os8hac] — the following
`OPDEF` directives are inserted at the top of every SABR file output
︙