Initial revision

1987-03-03 10:25:25 +00:00
parent f195cb9aa6
commit 4a8a8e67ed
12 changed files with 1354 additions and 0 deletions
--- a/doc/ego/ca/ca1
+++ b/doc/ego/ca/ca1
@@ -0,0 +1,75 @@
+.bp
+.NH 1
+Compact assembly generation
+.NH 2
+Introduction
+.PP
+The "Compact Assembly generation phase" (CA) transforms the
+intermediate code of the optimizer into EM code in
+Compact Assembly Language (CAL) format.
+In the intermediate code, all program entities
+(such as procedures, labels, global variables)
+are denoted by a unique identifying number (see 3.5).
+In the CAL output of the optimizer these numbers have to
+be replaced by normal identifiers (strings).
+The original identifiers of the input program are used whenever possible.
+Recall that the IC phase generates two files that can be
+used to map unique identifying numbers to procedure names and
+global variable names.
+For instruction labels CA always generates new names.
+The reasons for doing so are:
+.IP -
+instruction labels are only visible inside one procedure, so they
+cannot be referenced in other modules
+.IP -
+the names are not very suggestive anyway, as they must be integer numbers
+.IP -
+the optimizer considerably changes the control structure of the program,
+so there is no one-to-one mapping between the instruction labels of
+the input program and those of the output program.
+.LP
+As the optimizer combines all input modules into one module,
+visibility problems may occur.
+Two modules M1 and M2 can both define an identifier X (provided that
+X is not externally visible in either of these modules).
+If M1 and M2 are combined into one module M, two distinct
+entities with the same name would exist in M, which
+is not allowed.
+.[~[
+tanenbaum machine architecture
+.], section 11.1.4.3]
+In these cases, CA invents a new unique name for one of the entities.
+.NH 2
+Implementation
+.PP
+CA first reads the files containing the procedure and global variable names
+and stores the names in two tables.
+It scans these tables to make sure that all names are different.
+Subsequently it reads the EM text, one procedure at a time,
+and outputs it in CAL format.
+The major part of the code that does the latter transformation
+is adapted from the EM Peephole Optimizer.
+.PP
+The main problem in implementing CA is to
+ensure that the visibility rules are obeyed.
+If an identifier must be externally visible (i.e.
+it was externally visible in the input program)
+and the identifier is defined (in the output program) before
+being referenced,
+an EXA or EXP pseudo must be generated for it.
+(Note that the optimizer may change the order of definitions and
+references, so some pseudos may be needed that were not
+present in the input program.)
+On the other hand, an identifier may be only internally visible.
+If such an identifier is referenced before being defined,
+an INA or INP pseudo must be emitted prior to its first reference.
+.UH
+Acknowledgements
+.PP
+The author would like to thank Andy Tanenbaum for his guidance,
+Duk Bekema for implementing the Common Subexpression Elimination phase
+and writing the initial documentation of that phase,
+Dick Grune for reading the manuscript of this report
+and Ceriel Jacobs, Ed Keizer, Martin Kersten, Hans van Staveren
+and the members of the S.T.W. user's group for their
+interest and assistance.
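
Editorial illustration, not part of the committed file: the visibility rules given under "Implementation" can be summarised as a small decision function. The C sketch below is only a reading of those two rules. The type names, the function name and the idea of passing "defined first" as a flag are assumptions made for this example and are not taken from the CA sources, which presumably track definitions and references while scanning the EM text.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical types for this sketch; not the data structures used by CA. */
enum kind  { DATA, PROC };
enum order { DEFINED_FIRST, REFERENCED_FIRST };

/*
 * Return the mnemonic of the pseudo that has to be emitted for an
 * identifier, or NULL if none is needed:
 *   - externally visible and defined before being referenced: EXA/EXP
 *   - internally visible and referenced before being defined: INA/INP
 */
static const char *needed_pseudo(enum kind k, int external, enum order o)
{
    if (external && o == DEFINED_FIRST)
        return k == PROC ? "EXP" : "EXA";
    if (!external && o == REFERENCED_FIRST)
        return k == PROC ? "INP" : "INA";
    return NULL;
}

/* Usage example: one case for each branch of the rules. */
static void self_check(void)
{
    assert(strcmp(needed_pseudo(PROC, 1, DEFINED_FIRST), "EXP") == 0);
    assert(strcmp(needed_pseudo(DATA, 0, REFERENCED_FIRST), "INA") == 0);
    assert(needed_pseudo(DATA, 0, DEFINED_FIRST) == NULL);
}
```

The two remaining combinations (external and referenced first, internal and defined first) return NULL here because the text names no pseudo for them.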