Initial revision

1987-03-03 10:59:52 +00:00
parent 1c5405b71c
commit 9ea069fca3
30 changed files with 3903 additions and 0 deletions
--- a/doc/ego/sr/sr1
+++ b/doc/ego/sr/sr1
@@ -0,0 +1,44 @@
+.bp
+.NH 1
+Strength reduction
+.NH 2
+Introduction
+.PP
+The Strength Reduction optimization technique (SR)
+tries to replace expensive operators
+by cheaper ones,
+in order to decrease the execution time
+of the program.
+A classical example is replacing a 'multiplication by 2'
+by an addition or a shift instruction.
+These kinds of local transformations are already
+done by the EM Peephole Optimizer.
+Strength reduction can also be applied
+more generally to operators used in a loop.
+.DS
+i := 1;                    i := 1;
+while i < 100 loop  -->    TMP := i * 118;
+   put(i * 118);           while i < 100 loop
+   i := i + 1;                put(TMP);
+end loop;                     i := i + 1;
+			      TMP := TMP + 118;
+			   end loop;
+
+Fig. 6.1 An example of Strength Reduction
+.DE
+In Fig. 6.1, a multiplication inside a loop is
+replaced by an addition inside the loop and a multiplication
+outside the loop.
+Clearly, this is a global optimization; it cannot
+be done by a peephole optimizer.
+.PP
+In some cases a related technique, \fItest replacement\fR,
+can be used to eliminate the
+loop variable i.
+This technique will not be discussed in this report.
+.sp 0
+In the example above, the resulting code
+can be further optimized by using
+constant propagation.
+Obviously, this is not the task of the
+Strength Reduction phase.
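The transformation of Fig. 6.1 can be checked with a small sketch. It is not part of the optimizer: the function names `original` and `reduced`, the parameters `limit` and `factor`, and the use of a list in place of `put` are all assumptions made for illustration.

```python
def original(limit=100, factor=118):
    """Loop of Fig. 6.1 before strength reduction: one multiplication per iteration."""
    out = []
    i = 1
    while i < limit:
        out.append(i * factor)   # stands in for put(i * 118)
        i = i + 1
    return out


def reduced(limit=100, factor=118):
    """Same loop after strength reduction: the multiplication moves in front of the loop."""
    out = []
    i = 1
    tmp = i * factor             # initialization, executed once before the loop
    while i < limit:
        out.append(tmp)          # the recognized expression is replaced by TMP
        i = i + 1
        tmp = tmp + factor       # TMP follows i: step value 1 times constant 118
    return out
```

Both functions produce the same sequence of values; the second one performs a single multiplication in total, and an addition per iteration instead.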
--- a/doc/ego/sr/sr2
+++ b/doc/ego/sr/sr2
@@ -0,0 +1,217 @@
+.NH 2
+The model of strength reduction
+.PP
+In this section we will describe 
+the transformations performed by
+Strength Reduction (SR).
+Before doing so, we will introduce the
+central notion of an induction variable.
+.NH 3
+Induction variables
+.PP
+SR looks for variables whose
+values at the beginning of successive loop iterations
+form an arithmetic progression.
+These variables are called induction variables.
+The most frequently occurring example of such
+a variable is a loop-variable in a high-level
+programming language.
+Several quite sophisticated models of strength
+reduction can be found in the literature.
+.[
+cocke reduction strength cacm
+.]
+.[
+allen cocke kennedy reduction strength
+.]
+.[
+lowry medlock cacm
+.]
+.[
+aho compiler design
+.]
+In these models the notion of an induction variable
+is far more general than the intuitive notion
+of a loop-variable.
+The definition of an induction variable we present here
+is more restricted,
+yielding a simpler model and simpler transformations.
+We think the principal source for strength reduction lies in
+expressions using a loop-variable,
+i.e. a variable that is incremented or decremented
+by the same amount after every loop iteration,
+and that cannot be changed in any other way.
+.PP
+Of course, the EM code does not contain high-level constructs
+such as for-statements.
+We will define an induction variable in terms
+of the Intermediate Code of the optimizer.
+Note that the notions of a loop in the
+EM text and of a firm basic block
+were defined in section 3.3.5.
+.sp
+.UL definition
+.sp 0
+An induction variable i of a loop L is a local variable
+that is never accessed indirectly,
+whose size is the word size of the target machine, and
+that is assigned exactly once within L,
+the assignment:
+.IP -
+being of the form i := i + c or i := c + i,
+where c is a constant
+called the \fIstep value\fR of i.
+.IP -
+occurring in a firm block of L.
+.LP
+(Note that the first restriction on the assignment
+is not described in terms of the Intermediate Code;
+we will give such a description later. The current
+definition is, however, easier to understand.)
+.NH 3
+Recognized expressions
+.PP
+SR recognizes certain expressions using
+an induction variable and replaces
+them by cheaper ones.
+Two kinds of expensive operations are recognized:
+multiplication and array address computations.
+The expressions that are simplified must
+use an induction variable
+as an operand of
+a multiplication or as index in an array expression.
+.PP
+Often a linear function of an induction variable is used,
+rather than the variable itself.
+In these cases optimization is still possible.
+We call such expressions \fIiv-expressions\fR.
+.sp
+.UL definition:
+.sp 0
+An iv-expression of an induction variable i of a loop L is
+an expression that:
+.IP -
+uses only the operators + and - (unary as well as binary)
+.IP -
+uses i as operand exactly once
+.IP -
+uses (besides i) only constants or variables that are
+never changed in L as operands.
+.LP
+.PP
+The expressions recognized by SR are of the following forms:
+.IP (1)
+iv-expression * constant
+.IP (2)
+constant * iv-expression
+.IP (3)
+A[iv-expression] :=       (assign to array element)
+.IP (4)
+A[iv-expression]          (use array element)
+.IP (5)
+& A[iv-expression]        (take address of array element)
+.LP
+(Note that EM has different instructions to use an array element,
+store into one, or take the address of one: LAR, SAR, and AAR, respectively.)
+.sp 0
+The size of the elements of A must
+be known statically.
+In cases (3) and (4) this size 
+must equal the word size of the
+target machine.
+.NH 3
+Transformations
+.PP
+With every recognized expression we associate
+a new temporary local variable TMP,
+allocated in the stack frame of the
+procedure containing the expression.
+At any program point within the loop, TMP will
+contain the following value:
+.IP multiplication: 18
+the current value of iv-expression * constant
+.IP arrays:
+the current value of &A[iv-expression].
+.LP
+In the second case, TMP essentially is a pointer variable,
+pointing to the element of A that is currently in use.
+.sp 0
+If the same expression occurs several times in the loop,
+the same temporary local is used each time.
+.PP
+Three transformations are applied to the EM text:
+.IP (1)
+TMP is initialized with the right value.
+This initialization takes place just
+before the loop.
+.IP (2)
+The recognized expression is simplified.
+.IP (3)
+TMP is incremented; this takes place just
+after the induction variable is incremented.
+.LP
+For multiplication, the initial value of TMP
+is the value of the recognized expression at
+the program point immediately before the loop.
+For arrays, TMP is initialized with the address
+of the first array element that is accessed.
+So the initialization code is:
+.DS
+TMP := iv-expression * constant;  or
+TMP := &A[iv-expression]
+.DE
+At the point immediately before the loop,
+the induction variable will already have been
+initialized,
+so the value used in the code above will be the
+value it has during the first iteration.
+.PP
+For multiplication, the recognized expression can simply be
+replaced by TMP.
+For array optimizations, the replacement
+depends on the form:
+.DS
+\fIform\fR                         \fIreplacement\fR
+(3) A[iv-expr] :=            *TMP :=     (assign indirect)
+(4) A[iv-expr]               *TMP        (use indirect)
+(5) &A[iv-expr]              TMP
+.DE
+The '*' denotes the indirect operator. (Note that
+EM has different instructions to do
+an assign-indirect and a use-indirect).
+As the size of the array elements is restricted
+to be the word size in case (3) and (4),
+only one EM instruction needs to
+be generated in all cases.
+.PP
+The amount by which TMP is incremented is:
+.IP multiplication: 18
+step value * constant
+.IP arrays:
+step value * element size
+.LP
+Note that the step value (see definition of induction variable above),
+the constant, and the element size (see previous section) can all
+be determined statically.
+If the sign of the induction variable in the
+iv-expression is negative, the amount
+must be negated.
+.PP
+The transformations are demonstrated by an example.
+.DS
+i := 100;                     i := 100;
+while i > 1 loop              TMP := (6-i) * 5;
+   X := (6-i) * 5 + 2;        while i > 1 loop
+   Y := (6-i) * 5 - 8;   -->     X := TMP + 2;
+   i := i - 3;                   Y := TMP - 8;
+end loop;                        i := i - 3;
+			         TMP := TMP + 15;
+			      end loop;
+
+Fig. 6.2 Example of complex Strength Reduction transformations
+.DE
+The expression '(6-i)*5' is recognized twice. The constant
+is 5.
+The step value is -3.
+The sign of i in the recognized expression is '-'.
+So the increment value of TMP is -(-3*5) = +15.
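The increment rule and the example of Fig. 6.2 can be traced with the following sketch. The helper `tmp_increment` and its parameter names are hypothetical; the sign of the induction variable is encoded as +1 or -1, which is an assumption of this sketch rather than something the text prescribes.

```python
def tmp_increment(step, constant, iv_sign):
    """Amount added to TMP for a multiplication: step value * constant,
    negated when the induction variable occurs with a negative sign."""
    return iv_sign * step * constant


def original():
    """Loop of Fig. 6.2 before the transformation."""
    i, trace = 100, []
    while i > 1:
        x = (6 - i) * 5 + 2
        y = (6 - i) * 5 - 8
        trace.append((x, y))
        i = i - 3
    return trace


def reduced():
    """Loop of Fig. 6.2 after the transformation; one TMP serves both uses."""
    i, trace = 100, []
    tmp = (6 - i) * 5                                      # initialization before the loop
    inc = tmp_increment(step=-3, constant=5, iv_sign=-1)   # -(-3 * 5)
    while i > 1:
        trace.append((tmp + 2, tmp - 8))                   # X and Y now use TMP
        i = i - 3
        tmp = tmp + inc                                    # directly after the update of i
    return trace
```

At the start of every iteration, `tmp` equals `(6 - i) * 5`, so both loops assign the same values to X and Y.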
--- a/doc/ego/sr/sr3
+++ b/doc/ego/sr/sr3
@@ -0,0 +1,232 @@
+.NH 2
+Implementation
+.PP
+Like most phases, SR deals with one procedure
+at a time.
+Within a procedure, SR works on one loop at a time.
+Loops are processed in textual order.
+If loops are nested inside each other,
+SR starts with the outermost loop and works
+inwards.
+This order is chosen
+because it enables the optimization
+of multi-dimensional array address computations,
+if the elements are accessed in the usual way
+(i.e. row after row, rather than column after column).
+For every loop, SR first detects all induction variables
+and then tries to recognize
+expressions that can be optimized.
+.NH 3
+Finding induction variables
+.PP
+The process of finding induction variables
+can conveniently be split up
+into two parts.
+First, the EM text of the loop is scanned to find
+all \fIcandidate\fR induction variables,
+which are word-sized local variables
+that are assigned precisely once
+in the loop, within a firm block.
+Second, for every candidate, the single assignment
+is inspected, to see if it has the form
+required by the definition of an induction variable.
+.PP
+Candidates are found by scanning the EM code of the loop.
+During this scan, two sets are maintained.
+The set "cand" contains all variables that were
+assigned exactly once so far, within a firm block.
+The set "dismiss" contains all variables that
+should not be made a candidate.
+Initially, both sets are empty.
+If a variable is assigned to, it is put
+in the cand set, if three conditions are met:
+.IP 1.
+the variable was not in cand or dismiss already
+.IP 2.
+the assignment takes place in a firm block
+.IP 3.
+the assignment is not a ZRL instruction (assignment of zero)
+or a SDL instruction (store double local).
+.LP
+If any condition fails, the variable is dismissed from cand
+(if it was there already) and put in dismiss
+(if it was not there already).
+.sp 0
+All variables for which no register message was generated (i.e. those
+variables that may be accessed indirectly) are assumed
+to be changed in the loop.
+.sp 0
+All variables that remain in cand are candidate induction variables.
+.PP
+From the set of candidates, the induction variables can
+be determined, by inspecting the single assignment.
+The assignment must match one of the EM patterns below.
+('x' is the candidate. 'ws' is the word size of the target machine.
+'n' is any number.)
+.DS
+\fIpattern\fR                                     \fIstep value\fR
+INL x  |                                      +1
+DEL x  |                                      -1
+LOL x ; (INC | DEC) ; STL x  |                +1 | -1
+LOL x ; LOC n ; (ADI ws | SBI ws) ; STL x  |  +n | -n
+LOC n ; LOL x ; ADI ws ; STL x.               +n
+.DE
+From the patterns the step value of the induction variable
+can also be determined.
+These step values are displayed on the right hand side.
+.sp
+For every induction variable we maintain the following information:
+.IP -
+the offset of the variable in the stackframe of its procedure
+.IP -
+a pointer to the EM text of the assignment statement
+.IP -
+the step value
+.LP
+.NH 3
+Optimizing expressions
+.PP
+If any induction variables of the loop were found,
+the EM text of the loop is scanned again,
+to detect expressions that can be optimized.
+SR scans for multiplication and array instructions.
+Whenever it finds such an instruction, it analyses the
+code in front of it.
+If an expression is to be optimized, it must
+be generated by the following syntax rules.
+.DS
+   optimizable_expr:
+		iv_expr const mult |
+		const iv_expr mult |
+		address iv_expr address array_instr;
+   mult:
+		MLI ws |
+		MLU ws ;
+   array_instr:
+		LAR ws |
+		SAR ws |
+		AAR ws ;
+   const:
+		LOC n ;
+.DE
+An 'address' is an EM instruction that loads an
+address on the stack.
+An instruction like LOL may be an 'address', if
+the size of an address (pointer size, =ps) is
+the same as the word size.
+If the pointer size is twice the word size,
+instructions like LDL are an 'address'.
+(The addresses in the third grammar rule
+denote, respectively, the array address and the
+array descriptor address).
+.DS
+   address:
+		LAE |
+		LAL |
+		LOL if ps=ws |
+		LOE    ,,    |
+		LIL    ,,    |
+		LDL if ps=2*ws |
+		LDE    ,,      ;
+.DE
+The notion of an iv-expression was introduced earlier.
+.DS
+   iv_expr:
+		iv_expr unary_op |
+		iv_expr iv_expr binary_op |
+		loopconst |
+		iv ;
+   unary_op:
+		NGI ws |
+		INC |
+		DEC ;
+   binary_op:
+		ADI ws |
+		ADU ws |
+		SBI ws |
+		SBU ws ;
+   loopconst:
+		const |
+		LOL x  if x is not changed in loop ;
+   iv:
+		LOL x  if x is an induction variable ;
+.DE
+An iv-expression must satisfy one additional constraint:
+it must use exactly one operand that is an induction
+variable.
+A simple, hand-written, top-down parser is used
+to recognize an iv-expression.
+It scans the EM code from right to left
+(recall that EM is essentially postfix).
+It uses semantic attributes (inherited as well as
+derived) to check the additional constraint.
+.PP
+All information assembled during the recognition
+process is put in a 'code_info' structure.
+This structure contains the following information:
+.IP -
+the optimizable code itself
+.IP -
+the loop and basic block the code is part of
+.IP -
+the induction variable
+.IP -
+the iv-expression
+.IP -
+the sign of the induction variable in the
+iv-expression
+.IP -
+the offset and size of the temporary local variable
+.IP -
+the expensive operator (MLI, LAR etc.)
+.IP -
+the instruction that loads the constant
+(for multiplication) or the array descriptor
+(for arrays).
+.LP
+The entire transformation process is driven
+by this information.
+As the EM text is represented internally
+as a list, this process consists
+mainly of straightforward list manipulations.
+.sp 0
+The initialization code must be put
+immediately before the loop entry.
+For this purpose a \fIheader block\fR is
+created that has the loop entry block as
+its only successor and that dominates the
+entry block.
+The CFG and all relations (SUCC, PRED, IDOM, LOOPS etc.)
+are updated.
+.sp 0
+An EM instruction that will
+replace the optimizable code
+is created and put at the place of the old code.
+The list representing the old optimizable code
+is used to create a list for the initializing code,
+as they are similar.
+Only two modifications are required:
+.IP -
+if the expensive operator is a LAR or SAR,
+it must be replaced by an AAR, as the initial value
+of TMP is the \fIaddress\fR of the first
+array element that is accessed.
+.IP -
+code must be appended to store the result of the
+expression in TMP.
+.LP
+Finally, code to increment TMP is created and put after
+the code of the single assignment to the
+induction variable.
+The generated code uses either an integer addition
+(ADI) or an integer-to-pointer addition (ADS)
+to do the increment.
+.PP
+SR maintains a set of all expressions that have already
+been recognized in the present loop.
+Such expressions are said to be \fIavailable\fR.
+If an expression is recognized that is
+already available,
+no new temporary local variable is allocated for it,
+and the code to initialize and increment the local
+is not generated.
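The candidate scan described under "Finding induction variables" can be sketched as follows. This is a simplified model, not the code in the cand package: the function name `find_candidates` and the encoding of each assignment as a `(variable, opcode, in_firm_block)` tuple are assumptions. The sketch also treats SDL as an assignment to the single named variable, and it leaves out the rule for variables that have no register message.

```python
def find_candidates(assignments):
    """Return the candidate induction variables of one loop.

    assignments: iterable of (variable, opcode, in_firm_block) tuples,
    in the order the assignments are met while scanning the loop.
    """
    cand, dismiss = set(), set()       # initially, both sets are empty
    for var, opcode, in_firm_block in assignments:
        ok = (var not in cand and var not in dismiss   # condition 1
              and in_firm_block                        # condition 2
              and opcode not in ("ZRL", "SDL"))        # condition 3
        if ok:
            cand.add(var)
        else:
            cand.discard(var)          # dismissed from cand if it was there
            dismiss.add(var)           # and never reconsidered
    return cand
```

For example, a variable that is assigned twice enters `cand` on its first assignment and moves to `dismiss` on the second, so only variables with exactly one qualifying assignment remain.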
--- a/doc/ego/sr/sr4
+++ b/doc/ego/sr/sr4
@@ -0,0 +1,28 @@
+.NH 2
+Source files of SR
+.PP
+The sources of SR are in the following files
+and packages:
+.IP sr.h: 14
+declarations of global variables and
+data structures
+.IP sr.c:
+the routine main; a driving routine to process
+(possibly nested) loops in the right order
+.IP iv
+implements a procedure that finds the induction variables
+of a loop
+.IP reduce
+implements a procedure that finds optimizable expressions
+and that does the transformations
+.IP cand
+implements a procedure that finds the candidate induction
+variables; used to implement iv
+.IP xform
+implements several useful routines that transform
+lists of EM text or a CFG; used to implement reduce
+.IP expr
+implements a procedure that parses iv-expressions
+.IP aux
+implements several auxiliary procedures.
+.LP