Initial revision

1987-03-03 10:59:52 +00:00
parent 4d4c8b45fb
commit 004f017550
30 changed files with 3903 additions and 0 deletions
--- a/doc/ego/ud/ud1
+++ b/doc/ego/ud/ud1
@@ -0,0 +1,58 @@
+.bp
+.NH 1
+Use-Definition analysis
+.NH 2
+Introduction
+.PP
+The "Use-Definition analysis" phase (UD) consists of two related optimization
+techniques that both depend on "Use-Definition" information.
+The techniques are Copy Propagation and Constant Propagation.
+They are best explained via an example (see Figs. 11.1 and 11.2).
+.DS
+   (1)  A := B                  A := B
+	 ...          -->        ...
+   (2)  use(A)                  use(B)
+
+Fig. 11.1 An example of Copy Propagation
+.DE
+.DS
+   (1)  A := 12                  A := 12
+	 ...          -->        ...
+   (2)  use(A)                  use(12)
+
+Fig. 11.2 An example of Constant Propagation
+.DE
+Both optimizations have to check that the value of A at line (2)
+can only have been assigned at line (1).
+Copy Propagation also has to ensure that the value of B is
+the same at line (1) as at line (2).
+.PP
+One purpose of both transformations is to introduce
+opportunities for the Dead Code Elimination optimization.
+If the variable A is used nowhere else, the assignment A := B
+becomes useless and can be eliminated.
+.sp 0
+If B is less expensive to access than A (e.g. this is sometimes the case
+if A is a local variable and B is a global variable),
+Copy Propagation directly improves the code itself.
+If A is cheaper to access, the transformation will not be performed.
+Likewise, a constant as operand may be cheaper than a variable.
+Having a constant as operand may also facilitate other optimizations.
+.PP
+The design of UD is based on the theory described in sections
+14.1 and 14.3 of.
+.[
+aho compiler design
+.]
+As a main departure from that theory,
+we do not demand the statement A := B to become redundant after
+Copy Propagation.
+If B is cheaper to access than A, the optimization is always performed;
+if B is more expensive than A, we never do the transformation.
+If A and B are equally expensive, UD uses the heuristic rule of
+replacing infrequently used variables by frequently used ones.
+This rule increases the chance that the assignment becomes useless.
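The decision rule above can be sketched as follows. This is only an illustration: the function name, the numeric cost values and the use counts are hypothetical, and the treatment of an exact tie in use counts (no propagation) is an assumption, since the text does not specify it.

```python
def should_propagate(cost_a, cost_b, uses_a, uses_b):
    """Decide whether a use of A should be replaced by B after A := B.

    cost_a / cost_b: assumed access costs of A and B.
    uses_a / uses_b: assumed use counts of A and B.
    """
    if cost_b < cost_a:
        return True           # B is cheaper: always propagate
    if cost_b > cost_a:
        return False          # B is more expensive: never propagate
    # Equal cost: replace the infrequently used variable by the
    # frequently used one (a tie is assumed to mean "do nothing").
    return uses_a < uses_b


decisions = [
    should_propagate(3, 1, 5, 5),   # B cheaper
    should_propagate(1, 3, 5, 5),   # B more expensive
    should_propagate(2, 2, 1, 7),   # equal cost, A rarely used
    should_propagate(2, 2, 7, 1),   # equal cost, A frequently used
]
```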
+.PP
+In the next section we will give a brief outline of the data
+flow theory used
+for the implementation of UD.
--- a/doc/ego/ud/ud2
+++ b/doc/ego/ud/ud2
@@ -0,0 +1,64 @@
+.NH 2
+Data flow information
+.PP
+.NH 3
+Use-Definition information
+.PP
+A \fIdefinition\fR of a variable A is an assignment to A.
+A definition is said to \fIreach\fR a point p if there is a
+path in the control flow graph from the definition to p, such that
+A is not redefined on that path.
+.PP
+For every basic block b, we define the following sets:
+.IP GEN[b] 9
+the set of definitions in b that reach the end of b.
+.IP KILL[b]
+the set of definitions outside b that define a variable that
+is changed in b.
+.IP IN[b]
+the set of all definitions reaching the beginning of b.
+.IP OUT[b]
+the set of all definitions reaching the end of b.
+.LP
+GEN and KILL can be determined by inspecting the code of the procedure.
+IN and OUT are computed by solving the following data flow equations
+('+' denotes set union, '-' denotes set difference):
+.DS
+(1)    OUT[b] = IN[b] - KILL[b] + GEN[b]
+(2)    IN[b]  = OUT[p1] + ... + OUT[pn],
+	 where PRED(b) = {p1, ... , pn}
+.DE
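A minimal sketch of how equations (1) and (2) might be solved by iteration until nothing changes. The block names, definition names and the small control flow graph are invented for the example, and Python sets stand in for whatever set representation an implementation would use.

```python
def solve_reaching(blocks, preds, gen, kill):
    """Iterate OUT[b] = (IN[b] - KILL[b]) + GEN[b], IN[b] = union of OUT[p]."""
    in_sets = {b: set() for b in blocks}
    out_sets = {b: set(gen[b]) for b in blocks}
    changed = True
    while changed:
        changed = False
        for b in blocks:
            new_in = set()
            for p in preds[b]:
                new_in |= out_sets[p]
            new_out = (new_in - kill[b]) | gen[b]
            if new_in != in_sets[b] or new_out != out_sets[b]:
                in_sets[b], out_sets[b] = new_in, new_out
                changed = True
    return in_sets, out_sets


# Hypothetical graph: b0 -> b1, b0 -> b2, b1 -> b2.
# d1 and d3 define A (in b0 and b1), d2 defines B (in b0).
blocks = ["b0", "b1", "b2"]
preds = {"b0": [], "b1": ["b0"], "b2": ["b0", "b1"]}
gen = {"b0": {"d1", "d2"}, "b1": {"d3"}, "b2": set()}
kill = {"b0": {"d3"}, "b1": {"d1"}, "b2": set()}

IN, OUT = solve_reaching(blocks, preds, gen, kill)
```

In this example both definitions of A reach the beginning of b2, so a use of A in b2 has no unique reaching definition.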
+.NH 3
+Copy information
+.PP
+A \fIcopy\fR is a definition of the form "A := B".
+A copy is said to be \fIgenerated\fR in a basic block n if
+it occurs in n and there is no subsequent assignment to B in n.
+A copy is said to be \fIkilled\fR in n if:
+.IP (i)
+it occurs in n and there is a subsequent assignment to B within n, or
+.IP (ii)
+it occurs outside n, the definition A := B reaches the beginning of n
+and B is changed in n (note that a copy also is a definition).
+.LP
+A copy \fIreaches\fR a point p, if there are no assignments to B
+on any path in the control flow graph from the copy to p.
+.PP
+We define the following sets:
+.IP C_GEN[b] 11
+the set of all copies generated in b.
+.IP C_KILL[b]
+the set of all copies killed in b.
+.IP C_IN[b]
+the set of all copies reaching the beginning of b.
+.IP C_OUT[b]
+the set of all copies reaching the end of b.
+.LP
+C_IN and C_OUT are computed by solving the following equations:
+(root is the entry node of the current procedure; '*' denotes
+set intersection)
+.DS
+(1)    C_OUT[b] = C_IN[b] - C_KILL[b] + C_GEN[b]
+(2)    C_IN[b]  = C_OUT[p1] * ... * C_OUT[pn],
+	 where PRED(b) = {p1, ... , pn} and b /= root
+       C_IN[root] = {all copies}
+.DE
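The copy equations can be solved in the same iterative way, with intersection in place of union. The sketch below follows the boundary condition exactly as stated above (C_IN[root] holds all copies). Starting every other block from the full set as well is an assumption of this example, as are all names and the sample graph.

```python
def solve_copies(blocks, preds, root, c_gen, c_kill, all_copies):
    """Iterate C_OUT[b] = (C_IN[b] - C_KILL[b]) + C_GEN[b],
    C_IN[b] = intersection of C_OUT[p] for b /= root."""
    c_in = {b: set(all_copies) for b in blocks}
    c_out = {b: (c_in[b] - c_kill[b]) | c_gen[b] for b in blocks}
    changed = True
    while changed:
        changed = False
        for b in blocks:
            new_in = set(all_copies)
            if b != root:
                for p in preds[b]:
                    new_in &= c_out[p]
            new_out = (new_in - c_kill[b]) | c_gen[b]
            if new_in != c_in[b] or new_out != c_out[b]:
                c_in[b], c_out[b] = new_in, new_out
                changed = True
    return c_in, c_out


# Hypothetical graph: b0 -> b1, b0 -> b2, b1 -> b2.
# c1 is "A := B" in b0, c2 is "C := D" in b1; b1 also assigns B.
blocks = ["b0", "b1", "b2"]
preds = {"b0": [], "b1": ["b0"], "b2": ["b0", "b1"]}
c_gen = {"b0": {"c1"}, "b1": {"c2"}, "b2": set()}
c_kill = {"b0": set(), "b1": {"c1"}, "b2": set()}

C_IN, C_OUT = solve_copies(blocks, preds, "b0", c_gen, c_kill, {"c1", "c2"})
```

Here c1 is absent from C_IN of b2 because B changes on the path through b1. Membership in C_IN by itself only says that the source of the copy is unchanged; as far as the text indicates, it is meant to be combined with the reaching-definition information for A.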
--- a/doc/ego/ud/ud3
+++ b/doc/ego/ud/ud3
@@ -0,0 +1,26 @@
+.NH 2
+Pointers and subroutine calls
+.PP
+The theory outlined above assumes that variables can
+only be changed by a direct assignment.
+This condition does not hold for EM.
+In case of an assignment through a pointer variable,
+it is in general impossible to see which variable is affected
+by the assignment.
+Similar problems occur in the presence of procedure calls.
+Therefore we distinguish two kinds of definitions:
+.IP -
+an \fIexplicit\fR definition is a direct assignment to one
+specific variable
+.IP -
+an \fIimplicit\fR definition is the potential alteration of
+a variable as a result of a procedure call or an indirect assignment.
+.LP
+An indirect assignment causes implicit definitions to
+all variables that may be accessed indirectly, i.e. 
+all local variables for which no register message was generated
+and all global variables.
+If a called procedure contains an indirect assignment, the call may change the
+same set of variables; otherwise it may only change some global variables directly.
+The KILL, GEN, IN and OUT sets contain explicit as well
+as implicit definitions.
--- a/doc/ego/ud/ud4
+++ b/doc/ego/ud/ud4
@@ -0,0 +1,78 @@
+.NH 2
+Implementation
+.PP
+UD first builds a number of tables:
+.IP locals: 9
+contains information about the local variables of the
+current procedure (offset,size,whether a register message was found
+for it and, if so, the score field of that message)
+.IP defs:
+a table of all explicit definitions appearing in the
+current procedure.
+.IP copies:
+a table of all copies appearing in the
+current procedure.
+.LP
+Every variable (local as well as global), definition and copy
+is identified by a unique number, which is the index
+in the table.
+All tables are constructed by traversing the EM code.
+A fourth table, "vardefs", is indexed by variable number
+and contains for every variable the set of its explicit definitions.
+Also, for each basic block b, the set CHGVARS containing all variables
+changed by it is computed.
+.PP
+The GEN sets are obtained in one scan over the EM text,
+by analyzing every EM instruction.
+The KILL set of a basic block b is computed by looking at the
+set of variables
+changed by b (i.e. CHGVARS[b]).
+For every such variable v, all explicit definitions to v
+(i.e. vardefs[v]) that are not in GEN[b] are added to KILL[b].
+Also, the implicit definition of v is added to KILL[b].
+Next, the data flow equations for use-definition information
+are solved,
+using a straightforward, iterative algorithm.
+All sets are represented as bitvectors, so the operations
+on sets (union, difference) can be implemented efficiently.
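The KILL computation described above might look roughly like this when sets are bitvectors. Python integers serve as bitvectors here; the numbering of definitions, the table contents and the function name are hypothetical.

```python
def kill_set(chgvars_b, vardefs, implicit_def, gen_b):
    """Build KILL[b] as a bitvector.

    chgvars_b:    variables changed by block b (CHGVARS[b])
    vardefs:      variable -> bitvector of its explicit definitions
    implicit_def: variable -> number of its implicit definition
    gen_b:        bitvector GEN[b]
    """
    kill = 0
    for v in chgvars_b:
        kill |= vardefs[v] & ~gen_b      # explicit definitions not in GEN[b]
        kill |= 1 << implicit_def[v]     # plus the implicit definition of v
    return kill


# Hypothetical numbering: definitions 0 and 1 assign A, 2 assigns B;
# 3 and 4 are the implicit definitions of A and B.
vardefs = {"A": 0b00011, "B": 0b00100}
implicit_def = {"A": 3, "B": 4}

# Block b changes only A, and its own definition 0 reaches the end of b.
kill_b = kill_set({"A"}, vardefs, implicit_def, gen_b=0b00001)
```

With this representation, union, difference and intersection each become a single bitwise operation per machine word.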
+.PP
+The C_GEN and C_KILL sets are computed simultaneously in one scan
+over the EM text.
+For every copy A := B appearing in basic block b we do
+the following:
+.IP 1.
+for every basic block n /= b that changes B, see if the definition A := B
+reaches the beginning of n (i.e. check if the index number of A := B in
+the "defs" table is an element of IN[n]);
+if so, add the copy to C_KILL[n]
+.IP 2.
+if B is redefined later on in b, add the copy to C_KILL[b], else
+add it to C_GEN[b]
+.LP
+C_IN and C_OUT are computed from C_GEN and C_KILL via the second set of
+data flow equations.
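Steps 1 and 2 can be sketched as a single routine that is called once for every copy. All parameter names are assumptions made for this example; in particular, the caller is assumed to know already which blocks change B and whether B is redefined later in the copy's own block.

```python
def classify_copy(copy_id, def_id, home, redefined_later,
                  blocks_changing_src, in_sets, c_gen, c_kill):
    """Enter one copy A := B, occurring in block `home`, into C_GEN and C_KILL.

    def_id is the number of the same statement in the "defs" table.
    """
    # Step 1: other blocks that change B kill the copy if the
    # definition A := B reaches their beginning.
    for n in blocks_changing_src:
        if n != home and def_id in in_sets[n]:
            c_kill[n].add(copy_id)
    # Step 2: within its own block the copy is either killed or generated.
    if redefined_later:
        c_kill[home].add(copy_id)
    else:
        c_gen[home].add(copy_id)


blocks = ["b0", "b1", "b3"]
c_gen = {b: set() for b in blocks}
c_kill = {b: set() for b in blocks}
# Hypothetical IN sets: definition d1 reaches b1 but not b3.
in_sets = {"b0": set(), "b1": {"d1"}, "b3": set()}

classify_copy("c1", "d1", "b0", False, ["b1", "b3"], in_sets, c_gen, c_kill)
```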
+.PP
+Finally, in one last scan all opportunities for optimization are
+detected.
+For every use u of a variable A, we check if
+there is a unique explicit definition d reaching u.
+.sp
+If the definition is a copy A := B and B has the same value at d as
+at u, then the use of A at u may be changed into B.
+The latter condition can be verified as follows:
+.IP -
+if u and d are in the same basic block, the condition is satisfied
+if there is no assignment to B between d and u
+.IP -
+if u and d are in different basic blocks, the condition is
+satisfied if there is no assignment to B in the block of u prior to u
+and d is in C_IN[b], where b is the block containing u.
+.LP
+Before the transformation is actually done, UD first makes sure the
+alteration is really desirable, as described in the introduction.
+The information needed for this purpose (access costs of local and
+global variables) is read from a machine descriptor file.
+.sp
+If the only definition reaching u has the form "A := constant", the use
+of A at u is replaced by the constant.
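The checks made in the last scan can be summarised in a small sketch. The function and parameter names are hypothetical, and the facts about assignments to B are assumed to have been collected by the caller while scanning the EM text.

```python
def unique_explicit_def(reaching_defs_of_a, explicit_defs):
    """Return the single definition of A reaching the use, if it is explicit."""
    if len(reaching_defs_of_a) == 1:
        (d,) = reaching_defs_of_a
        if d in explicit_defs:
            return d
    return None


def copy_is_safe(same_block, b_assigned_between, b_assigned_before_use,
                 copy_id, c_in_use_block):
    """Check that B has the same value at the copy d as at the use u."""
    if same_block:
        return not b_assigned_between
    return (not b_assigned_before_use) and copy_id in c_in_use_block


explicit = {"d1", "d2"}
found = unique_explicit_def({"d1"}, explicit)           # one explicit definition
ambiguous = unique_explicit_def({"d1", "d2"}, explicit)  # two reach the use
implicit_only = unique_explicit_def({"i7"}, explicit)    # only an implicit one

safe_same = copy_is_safe(True, False, False, "c1", set())
safe_other = copy_is_safe(False, False, False, "c1", {"c1"})
unsafe_other = copy_is_safe(False, False, False, "c1", set())
```

Only when both checks succeed does the cost comparison decide whether the use is actually rewritten.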
+
--- a/doc/ego/ud/ud5
+++ b/doc/ego/ud/ud5
@@ -0,0 +1,19 @@
+
+.NH 2
+Source files of UD
+.PP
+The sources of UD are in the following files and packages:
+.IP ud.h: 14
+declarations of global variables and data structures
+.IP ud.c:
+the routine main; initialization of target machine dependent tables
+.IP defs:
+routines to compute the GEN and KILL sets and routines to analyse
+EM instructions
+.IP const:
+routines involved in constant propagation
+.IP copy:
+routines involved in copy propagation
+.IP aux:
+contains auxiliary routines
+.LP