PiDP-8/I SoftwareCheck-in [23f92ab553]
Not logged in

Many hyperlinks are disabled.
Use anonymous login to enable hyperlinks.

Overview
Comment:Added more detail about file I/O limitations to the LIBC user documentation section of the CC8 manual. (What used to be the "stdio" section is now broken up into several sections at the same level.)
Downloads: Tarball | ZIP archive | SQL archive
Timelines: family | ancestors | descendants | both | trunk
Files: files | file ages | folders
SHA3-256: 23f92ab553d56fe8ead96cdc2d230421068d97363f6b6fa73c44b15e410657ff
User & Date: tangent 2019-02-13 19:12:06
Context
2019-02-13
19:13
Moved the "Strings are of Words, Not of Bytes or Characters" section of the CC8 user manual up within the document to be after the "Character Set" section, where it fits better. (This wasn't really possible back when the latter section was part of the stdio section.) check-in: 7f8d8bcff5 user: tangent tags: trunk
19:12
Added more detail about file I/O limitations to the LIBC user documentation section of the CC8 manual. (What used to be the "stdio" section is now broken up into several sections at the same level.) check-in: 23f92ab553 user: tangent tags: trunk
18:52
Rewrote the "Inline Assembly in the Native CC8 Compiler" section in the CC8 user manual after learning more about its behavior and limitations. check-in: 573aba2f6a user: tangent tags: trunk
Changes
Hide Diffs Unified Diffs Ignore Whitespace Patch

Changes to doc/cc8-manual.md.

101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
...
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
...
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579














580
581
582
583
584
585
586
...
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
...
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
....
1104
1105
1106
1107
1108
1109
1110
1111
1112
1113
1114
1115
1116
1117
1118
....
1723
1724
1725
1726
1727
1728
1729
1730
1731
1732
1733
1734
1735
1736
1737
    disable them at run time with configuration directives. If you have
    a PiDP-8/I and were expecting a strict PDP-8/I simulation underneath
    that pretty front panel, we’re sorry to pop your bubble, but the
    fact of the matter is that a PiDP-8/I is a Family-of-8 mongrel.)

*   At build time, the OS/8 FORTRAN II/SABR subsystem must be available.

*   At run time, any [stdio](#stdio) operation involving file I/O
    assumes it is running atop OS/8. For instance, file name arguments
    to [`fopen()`](#fopen) are passed to OS/8 for interpretation.

There is likely a subset of CC8-built programs which will run
independently of OS/8, but the bounds on that class of programs is not
currently clear to us.

................................................................................

The ISO C Standard does not define what the `is*()` functions do when
the passed character is not representable as `unsigned char`. Since this
C compiler [does not distinguish types](#typeless), our `is*()`
functions return false for any value outside of the ASCII range, 0-127.


### <a id="stdio"></a>stdio

The stdio implementation currently assumes US-ASCII 7-bit text I/O.

Input characters have their upper 5 bits masked off so that only the
lower 7 bits are valid in the returned 12 bit PDP-8 word. Code using
[`fgetc`](#fgetc) cannot be used on arbitrary binary data because its
“end of file” return case is indistinguishable from reading a 0 byte.
................................................................................
implementation. Even if you are reading your files with some other code
which is capable of handling 8-bit data, there are further difficulties
such as a lack of functions taking an explicit length, like `fwrite()`,
which makes dealing with ASCII NUL difficult. You could write a NUL to
an output file with `fputc()`, but not with `fputs()`, since NUL
terminates the output string.

The stdio implementation only allows one file to be open at a time for
reading and one for writing. Since there is no input buffering to speak
of in the stdio implementation, there is no need to close files opened
for input, there being no resources to free. Together, those two facts
explain why [the `fclose()` implementation](#fclose) takes no argument:
it always closes the lone output file, if any.

This means that to open multiple output files, you have to `fclose` each
file before calling [`fopen("FILENA.ME", "w")`](#fopen) to open the next.
To open multiple input files, simply call `fopen()` to open each
subsequent file, implicitly closing the prior input file.

The CC8 LIBC file I/O implementation is built atop the OS/8 FORTRAN II
subsystem’s file I/O functions. This has some important implications in
the documentation below:

1.  We have not tried to “hoist” descriptions of what the FORTRAN II
    subsystem does up into this documentation where that documentation
    is correct and complete already. It may be necessary for you to read
    the FORTRAN II documentation in the OS/8 manual to understand how
    the CC8 LIBC’s stdio file I/O functions behave in certain respects.

2.  Programs built with CC8 which use its file I/O functions are
    dependent upon OS/8.


#### Ctrl-C Handling















Unlike on modern operating systems, there is nothing like `SIGINT` in
OS/8, which means Ctrl-C only kills programs that explicitly check for
it.  The keyboard input loop in the CC8 LIBC standard library does do
this.

The thing to be aware of is, this won’t happen while a program is stuck
................................................................................


### <a id="wordstr"></a>Strings are of Words, Not of Bytes or Characters

In several places, the Standard says a conforming C library is supposed
to operate on “bytes” or “characters,” at least according to [our chosen
interpretation][cppr]. Except for the text I/O restrictions called out
[above](#stdio), LIBC operates on strings of PDP-8 words, not on these
modern notions of fixed 8-bit bytes or the ever-nebulous “characters.”

Because you may be used to the idea that string and memory functions
like [`memcpy()`](#memcpy) and [`strcat()`](#strcat) will operate on
bytes, we’ve marked all of these cases with a reference back to this
section.

................................................................................

**Standard Violations:**

*   Does not take a `FILE*` argument.  (See [`fopen()`](#fopen) for
    justification.)

*   Always closes the last-opened *output* file, only, there being
    [no point](#stdio) in explicitly closing input files in this
    implementation.

[f2fio]: https://archive.org/details/bitsavers_decpdp8os8_39414792/page/n700


### <a id="fgets"></a>`fgets(s)`

................................................................................
    that terminated the loop inside each function’s implementation.

*   `fputs()` detects no I/O error conditions, and thus cannot return
    EOF to signal an error. It always returns 0, whether an error
    occurred or not.

*   `fputs()` does not take a `FILE*` as its first parameter due to the
    [implicit single output file](#stdio).


### <a id="revcpy"></a>`revcpy(dst, src, n)`

For non-overlapping buffers, has the same effect as
[`memcpy()`](#memcpy), using less efficient code.

................................................................................
variables.

It all has to be gathered in one pass, because this 1&nbsp;kWord buffer
is written to a text file (`CASM.TX`) at the end of the [first compiler
pass](#ncpass), where it waits for the final compiler pass to read it
back in to be inserted into the output SABR code.  Since LIBC’s
[`fopen()`](#fopen) is limited to a [single output file at a
time](#stdio) and it cannot append to an existing file, it’s got one
shot to write everything it collected.

This is one reason the CC8 LIBC has to be cross-compiled: its inline
assembly is over 6&times; the size of this buffer.


#### Incompatibilities







|







 







|







 







|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
>
>
>
>
>
>
>
>
>
>
>
>
>
>







 







|







 







|







 







|







 







|







101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
...
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
...
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
...
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
...
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
....
1118
1119
1120
1121
1122
1123
1124
1125
1126
1127
1128
1129
1130
1131
1132
....
1737
1738
1739
1740
1741
1742
1743
1744
1745
1746
1747
1748
1749
1750
1751
    disable them at run time with configuration directives. If you have
    a PiDP-8/I and were expecting a strict PDP-8/I simulation underneath
    that pretty front panel, we’re sorry to pop your bubble, but the
    fact of the matter is that a PiDP-8/I is a Family-of-8 mongrel.)

*   At build time, the OS/8 FORTRAN II/SABR subsystem must be available.

*   At run time, any [stdio](#fiolim) operation involving file I/O
    assumes it is running atop OS/8. For instance, file name arguments
    to [`fopen()`](#fopen) are passed to OS/8 for interpretation.

There is likely a subset of CC8-built programs which will run
independently of OS/8, but the bounds on that class of programs is not
currently clear to us.

................................................................................

The ISO C Standard does not define what the `is*()` functions do when
the passed character is not representable as `unsigned char`. Since this
C compiler [does not distinguish types](#typeless), our `is*()`
functions return false for any value outside of the ASCII range, 0-127.


### <a id="cset"></a>Character Set

The stdio implementation currently assumes US-ASCII 7-bit text I/O.

Input characters have their upper 5 bits masked off so that only the
lower 7 bits are valid in the returned 12 bit PDP-8 word. Code using
[`fgetc`](#fgetc) cannot be used on arbitrary binary data because its
“end of file” return case is indistinguishable from reading a 0 byte.
................................................................................
implementation. Even if you are reading your files with some other code
which is capable of handling 8-bit data, there are further difficulties
such as a lack of functions taking an explicit length, like `fwrite()`,
which makes dealing with ASCII NUL difficult. You could write a NUL to
an output file with `fputc()`, but not with `fputs()`, since NUL
terminates the output string.


### <a id="fiolim"></a>File I/O Limitations

Because LIBC’s stdio implementation is built atop the OS/8 FORTRAN II
library, it only allows one file to be open at a time for reading and
one for writing. OS/8’s underlying limit is 5 output files and 9 input
files, which appears to be an accommodation specifically for its FORTRAN
IV implementation, so it is possible that a future CC8 would be
retargeted at FORTRAN IV to lift this limitation, but it would be a
nontrivial amount of work.

Meanwhile, we generally defer to the OS/8 FORTRAN II manual where it
comes to documentation of these functions behavior.  The only time we
bring it up in this manual is when there is either a mismatch between
expected C behavior and actual FORTRAN II behavior or between the way
OS/8 FORTRAN II is documented and the way things actually work when it’s
being driven by CC8.

This underlying base has an important implication: programs built with
CC8 which use its file I/O functions are dependent upon OS/8.  That
underlying base determines how file names are interpreted, what devices
get used, etc.

Because of this single-file limitation, the stdio functions operating on
files do not take a `FILE*` argument as in Standard C, there being no
need to specify which file is meant. Output functions use the one and
only output file, and input functions use the one and only input file.
Our [`fopen()`](#fopen) doesn’t return a `FILE*` because the caller
doesn’t need one to pass to any of the other functions. That leaves only
[`fclose()`](#fclose), which would be an ambiguous call without a
`FILE*` argument if it wasn’t for the fact that OS/8 FORTRAN II doesn’t
have an `ICLOSE` library function, there apparently being no resources
to free on closing an input file.

All of this means that to open multiple output files, you have to
`fclose` each file before calling [`fopen("FILENA.ME", "w")`](#fopen) to
open the next.  To open multiple input files, simply call `fopen()` to
open each subsequent file, implicitly closing the prior input file.


### <a id="ctrlc"></a>Ctrl-C Handling

Unlike on modern operating systems, there is nothing like `SIGINT` in
OS/8, which means Ctrl-C only kills programs that explicitly check for
it.  The keyboard input loop in the CC8 LIBC standard library does do
this.

The thing to be aware of is, this won’t happen while a program is stuck
................................................................................


### <a id="wordstr"></a>Strings are of Words, Not of Bytes or Characters

In several places, the Standard says a conforming C library is supposed
to operate on “bytes” or “characters,” at least according to [our chosen
interpretation][cppr]. Except for the text I/O restrictions called out
[above](#cset), LIBC operates on strings of PDP-8 words, not on these
modern notions of fixed 8-bit bytes or the ever-nebulous “characters.”

Because you may be used to the idea that string and memory functions
like [`memcpy()`](#memcpy) and [`strcat()`](#strcat) will operate on
bytes, we’ve marked all of these cases with a reference back to this
section.

................................................................................

**Standard Violations:**

*   Does not take a `FILE*` argument.  (See [`fopen()`](#fopen) for
    justification.)

*   Always closes the last-opened *output* file, only, there being
    [no point](#fiolim) in explicitly closing input files in this
    implementation.

[f2fio]: https://archive.org/details/bitsavers_decpdp8os8_39414792/page/n700


### <a id="fgets"></a>`fgets(s)`

................................................................................
    that terminated the loop inside each function’s implementation.

*   `fputs()` detects no I/O error conditions, and thus cannot return
    EOF to signal an error. It always returns 0, whether an error
    occurred or not.

*   `fputs()` does not take a `FILE*` as its first parameter due to the
    [implicit single output file](#fiolim).


### <a id="revcpy"></a>`revcpy(dst, src, n)`

For non-overlapping buffers, has the same effect as
[`memcpy()`](#memcpy), using less efficient code.

................................................................................
variables.

It all has to be gathered in one pass, because this 1&nbsp;kWord buffer
is written to a text file (`CASM.TX`) at the end of the [first compiler
pass](#ncpass), where it waits for the final compiler pass to read it
back in to be inserted into the output SABR code.  Since LIBC’s
[`fopen()`](#fopen) is limited to a [single output file at a
time](#fiolim) and it cannot append to an existing file, it’s got one
shot to write everything it collected.

This is one reason the CC8 LIBC has to be cross-compiled: its inline
assembly is over 6&times; the size of this buffer.


#### Incompatibilities