CREATE statement and the ability to assign
to special registers (to control the runtime operating mode).
This document describes the syntax of a CREATE
statement, and shows an example of a simple parser for a subset of
INTERCAL-72
CREATE statement has the form:
CREATE grammar class template AS code
The grammar is optional, and defaults to the contents of the
value contained in special register %GR. If a grammar is
specified, it must be a constant (computed grammar are not yet supported
at the syntax level, although fully supported if you CREATE
the appropriate syntax), specified by a crawling horror ("_") followed
by a number. For example, assuming that %GR contains #2,
the following are equivalent:
CREATE _2 THE ,UNIVERSE, AS GOD
CREATE THE ,UNIVERSE, AS GOD
(Please forgive us the minor poetic licence: there is actually no code for
GOD in the distribution, although you can always add it to the
assembler, asm.iacc).
The class specifies a syntactic class (some other languages might call it a nonterminal). Usually, this takes the form of a what ("?") followed by some alphanumerics, although anything which evaluates to a number will do. Please note that this in CLC-INTERCAL the what does not introduce a unary logical operator as in C-INTERCAL, and it always produces a named constant (of sorts).
The template is composed of a sequence of terminals and nonterminals
which vaguely resemble the syntax you are trying to define. Nonterminals
are specified as a special type of constant, usually introduced by the
"what" discussed before. Terminals are specified as "array slices", that
is sequences of numbers enclosed in tails or hybrids and representing a
16- or 32-bit array. The text which would be produced by READing OUT these
arrays specifies the syntax to define. For example, consider the following
line from 1972.iacc:
DO CREATE _1 ?VERB ?LVALUE ,#91 + #91 + #79, ,#95 + #91 + #67,
?EXPRESSION AS GER + #1 + STO + ?EXPRESSION #1 + ?LVALUE #1
Ignore for just now the bit after the AS. We are creating
something in class ?VERB of grammar _1 (which
by default contains the compiler - the compiler compiler is in grammar
_2). There is a nonterminal (?LVALUE), which
presumably will match a register and anything else you can assign to; then
there are two terminals. To see what these are, you can run the following
program:
PLEASE ,1 <- #3
DO ,1 SUB #1 <- #91
DO ,1 SUB #2 <- #91
DO ,1 SUB #3 <- #72
PLEASE ,2 <- #3
DO ,2 SUB #1 <- #95
DO ,2 SUB #2 <- #91
DO ,2 SUB #3 <- #67
PLEASE READ OUT ,1 + ,2
DO GIVE UP
We assign the values to 16-bit arrays because the slices are encloses in
tails. If they had been hybrids, we'd use 32-bit arrays. Running the
program, we get "<-". This is not a coincidence: the above statement
CREATEs the assignment. It should not surprise anybody that the next
nonterminal is called ?EXPRESSION.
CREATE statements can define the syntax
for all the known variants of INTERCAL. The semantics is defined by the
code part of the CREATE statements. Before detailing how
the semantics is specified, we must take a detour.
The INTERCAL Common Bytecode Machine (ICBM) is an assembler-like language which is used by the CLC-INTERCAL compiler as an intermediate language. It can be interpreted directly by the bytecode interpreter, or run directly on any computer with the INTERCAL Operating System; alternatively, a second level compiler will be made available to convert ICBM programs to other languages, for example assembler, Perl, or DD/SH. Please note that it is no longer possible to create a code generator for C, FORTRAL, COBOL or Pascal because the target language must allow self-modifying programs.
ICBM assumes a memory model with multiple stashes: any value can be stashed and retrieved on any of the stashes, which function independently of each other. There is no limit to the size of an element on a stash, or to the number of elements in each stash. Memory allocated to a stash element is automatically freed when the element is removed from the stash.
ICBM programs are inherently multi-threaded. Memory between the threads is shared unless otherwise specified. When a thread terminates, it remains in memory until it is absolutely clear that the thread cannot came back to life. This is important, since threads in quantum programs can have their behaviour modified after they have completed execution.
ICBM programs also contain a number of compilers. Usually, there will be a "program compiler" and a "compiler compiler", provided by the system. When compiling a compiler, however, just the "compiler compiler" is provided: the "program compiler" is built during the execution. A compiler compiler is a special type of compiler, which replaces the original compiler compiler after (re-)building itself. In other words, the compiler compiler coincides with the compiler compiler compiler; this is necessary to avoid infinite regression.
Each block of ICBM code in a program has an associated source code and a compiler used to compile it. This allows to avoid to recompile a block if the compiler has not changed since the last time it has been compiled. A separate document describes this "just too late" compiler in some more detail.
ICBM code is described in detail in the documentation provided with the
Perl module Language::INTERCAL::ByteCode. Use the perldoc
program to access it.
CREATE
statement after the AS keyword. It consists of a list of
mnemonics (separated by intersection symbols). If the template refers to
n other templates, the codes generated by these templates will be
available in the code by repeating the template identifier (with
its own "what" or whatever was used to introduce it) followed by an expression.
If there is only one instance of each different template, the expression must
be #1, otherwise it indicates which instance to use. For example, considering
again the assignment statement from 1972.iacc:
DO CREATE _1 ?VERB ?LVALUE ,#91 + #91 + #79, ,#95 + #91 + #67,
?EXPRESSION AS GER + #1 + STO + ?EXPRESSION #1 + ?LVALUE #1
We see that we refer to both the ?EXPRESSION and the
?LVALUE: since there is one of each, these references are
followed by #1. Suppose the statement to execute is DO .1 <- #2.
Then the ?LVALUE will generate SPO + #1 and the
?EXPRESSION generates #2. So, replacing what needs
to be replaced, the assignment produces:
GER + #1 + STO + #2 + SPO + #1
Executing it, we first check whether gerund #1 is to be ABSTAINed FROM; if
it is not, we STOre #2 to register SPOt #1. Just as we required.
Example
See the various compiler parsers provided with the distribution for plenty
of examples.
These examples should be self-explanatory. If not, reread this document.
Back