The Perl INTERCAL compiler
... Character Sets
Normally, the compiler requires the program source to be in EBCDIC, although
there are compiler options to translate from ASCII or Baudot. Since there is no
such thing as a standard EBCDIC, we have designed our own non-standard one.
The principle is simple: for each character, we selected a code which was
used for that character by at least one IBM terminal. However, to guarantee
incompatibility, our set differs in at least one character from any IBM
hardware for which we have been able to find documentation.
Here's the EBCDIC character table (the row label plus the column label gives the hexadecimal code):
| + | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | a | b | c | d | e | f |
| 00 | | | | | | | | | BSP | TAB | LF | | | CR | | |
| 10 | | | | | | | | | | | | | | | | |
| 20 | | | | | | | | | | | | | | | | |
| 30 | | | | | | | | | | | | | | | | |
| 40 | SP | | | | | | | | | | ¢ | . | < | ( | + | ! |
| 50 | & | | | | | | | | | | ] | $ | * | ) | ; | ¬ |
| 60 | - | / | | | | xor | | | | | \| | , | % | _ | > | ? |
| 70 | | | | | | | | | | | : | # | @ | ' | = | " |
| 80 | | a | b | c | d | e | f | g | h | i | | | | | | |
| 90 | | j | k | l | m | n | o | p | q | r | | | { | | [ | |
| a0 | | ~ | s | t | u | v | w | x | y | z | | | | | | ® |
| b0 | ^ | £ | | | © | | | | | | | | | | | |
| c0 | | A | B | C | D | E | F | G | H | I | | | | | | |
| d0 | | J | K | L | M | N | O | P | Q | R | | | } | | | |
| e0 | | | S | T | U | V | W | X | Y | Z | | | | | | |
| f0 | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | | | | | | DEL |
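To show how the table is read, here is a minimal Python sketch that decodes a few bytes using a partial copy of the entries above. The names `build_ebcdic_table` and `decode_ebcdic` are hypothetical and are not part of the compiler; only the letters, digits, and a handful of punctuation codes are included, and unmapped bytes are shown as U+FFFD.

```python
def build_ebcdic_table():
    """Partial byte-to-character map copied from the table (not every entry)."""
    table = {0x09: "\t", 0x0A: "\n", 0x0D: "\r", 0x40: " ", 0x60: "-"}
    # Each key is the code of the first character in a consecutive run.
    runs = {
        0x81: "abcdefghi", 0x91: "jklmnopqr", 0xA2: "stuvwxyz",
        0xC1: "ABCDEFGHI", 0xD1: "JKLMNOPQR", 0xE2: "STUVWXYZ",
        0xF0: "0123456789",
        0x4B: ".<(+!",     # 4b..4f
        0x5B: "$*);",      # 5b..5e
        0x7A: ":#@'=\"",   # 7a..7f
    }
    for start, chars in runs.items():
        for offset, ch in enumerate(chars):
            table[start + offset] = ch
    return table


EBCDIC_TABLE = build_ebcdic_table()


def decode_ebcdic(data):
    """Decode bytes with the partial table; unknown bytes become U+FFFD."""
    return "".join(EBCDIC_TABLE.get(byte, "\ufffd") for byte in data)


source = bytes([0xC4, 0xD6, 0x40, 0x4B, 0xF1, 0x40,
                0x4C, 0x60, 0x40, 0x7B, 0xF1])
print(decode_ebcdic(source))  # prints: DO .1 <- #1
```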
While the compiler and runtime accept ASCII and EBCDIC for input/output,
internally everything is represented in extended Baudot. The "letters"
and "figures" sets are identical to the standard Baudot, but we have a
nonstandard convention that shifting to letters while already in letters
causes a shift to lowercase letters, and shifting to figures while
already in figures causes a shift to a set containing special characters.
Thus to guarantee uppercase letters one would first shift to figures and
then to letters, for example. If this extended Baudot is sent to a
teletype which understands standard Baudot, the result will be a text in
ALL CAPS with some of the symbols it cannot print replaced with others
it can.
Here's the extended Baudot character table (codes are decimal):
| Code | Uppercase | Lowercase | Figures | Symbols |
| 00 | Invalid code |
| 01 | E | e | 3 | ¢ |
| 02 | Line Feed |
| 03 | A | a | - | + |
| 04 | Space |
| 05 | S | s | Bell | \ |
| 06 | I | i | 8 | # |
| 07 | U | u | 7 | = |
| 08 | Carriage Return |
| 09 | D | d | $ | * |
| 10 | R | r | 4 | { |
| 11 | J | j | ' | ~ |
| 12 | N | n | , | xor |
| 13 | F | f | ! | \| |
| 14 | C | c | : | ^ |
| 15 | K | k | ( | < |
| 16 | T | t | 5 | [ |
| 17 | Z | z | " | } |
| 18 | W | w | ) | > |
| 19 | L | l | 2 | ] |
| 20 | H | h | Invalid | backspace |
| 21 | Y | y | 6 | @ |
| 22 | P | p | 0 | Invalid |
| 23 | Q | q | 1 | £ |
| 24 | O | o | 9 | ¬ |
| 25 | B | b | ? | delete |
| 26 | G | g | & | Invalid |
| 27 | Figures | Symbols |
| 28 | M | m | . | % |
| 29 | X | x | / | _ |
| 30 | V | v | ; | Invalid |
| 31 | Lowercase | Uppercase |
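The shift rules can be sketched as a small state machine in Python. This is an illustration rather than the compiler's own code, and it rests on several assumptions: the decoder starts in uppercase, a letters shift while in lowercase returns to uppercase, a figures shift while in symbols returns to figures, and the table's "xor" entry is represented by the placeholder glyph U+22BB. The name `decode_baudot` is hypothetical.

```python
# Index = decimal code from the table; "·" marks control, shift, and invalid slots.
UPPER = "·E·A·SIU·DRJNFCKTZWLHYPQOBG·MXV·"
SETS = {
    "upper": UPPER,
    "lower": UPPER.lower(),
    "figures": "·3·-·\a87·$4',!:(5\")2·6019?&·./;·",
    # "\u22bb" stands in for the table's "xor" entry (assumed glyph).
    "symbols": "·¢·+·\\#=·*{~\u22bb|^<[}>]\b@·£¬\x7f··%_··",
}
assert all(len(chars) == 32 for chars in SETS.values())

CONTROL = {2: "\n", 4: " ", 8: "\r"}  # same meaning in every set
FIGS, LTRS = 27, 31                   # shift codes


def decode_baudot(codes, start="upper"):
    """Decode a sequence of 5-bit codes, tracking the current character set."""
    state, out = start, []
    for code in codes:
        if not 0 <= code <= 31:
            raise ValueError(f"not a 5-bit code: {code}")
        if code == LTRS:
            # Letters shift while already in uppercase selects lowercase.
            # Returning to uppercase from lowercase is an assumption.
            state = "lower" if state == "upper" else "upper"
        elif code == FIGS:
            # Figures shift while already in figures selects symbols.
            # Returning to figures from symbols is an assumption.
            state = "symbols" if state == "figures" else "figures"
        elif code in CONTROL:
            out.append(CONTROL[code])
        else:
            ch = SETS[state][code]
            if ch == "·":
                raise ValueError(f"code {code} is invalid in set {state!r}")
            out.append(ch)
    return "".join(out)


print(decode_baudot([9, 31, 24, 27, 28, 23, 27, 15]))  # prints: Do.1<
```

Prefixing a message with the codes 27, 31 forces uppercase regardless of the starting set, which is the "shift to figures and then to letters" trick described in the text: `decode_baudot([27, 31, 20, 6], start="lower")` gives `"HI"`.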