Initial revision
44
doc/ego/sr/sr1
Normal file
@@ -0,0 +1,44 @@
.bp
.NH 1
Strength reduction
.NH 2
Introduction
.PP
The Strength Reduction optimization technique (SR)
tries to replace expensive operators
by cheaper ones,
in order to decrease the execution time
of the program.
A classic example is replacing a 'multiplication by 2'
by an addition or a shift instruction.
These kinds of local transformations are already
done by the EM Peephole Optimizer.
Strength reduction can also be applied
more generally to operators used in a loop.
.DS
i := 1;                         i := 1;
while i < 100 loop       -->    TMP := i * 118;
   put(i * 118);                while i < 100 loop
   i := i + 1;                     put(TMP);
end loop;                          i := i + 1;
                                   TMP := TMP + 118;
                                end loop;

Fig. 6.1 An example of Strength Reduction
.DE
In Fig. 6.1, a multiplication inside a loop is
replaced by an addition inside the loop and a multiplication
outside the loop.
Clearly, this is a global optimization; it cannot
be done by a peephole optimizer.
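.PP
The transformation of Fig. 6.1 can be written out in C to check that
both loops produce the same values.
This is only an illustrative sketch: the function names and the output
buffer that stands in for put() are assumptions made for this example
and are not part of the optimizer.

```c
#include <stddef.h>

/* Original loop of Fig. 6.1: one multiplication per iteration.
 * The values passed to put() are stored in out[]; the count is returned. */
size_t loop_original(int *out, size_t cap)
{
    size_t n = 0;
    int i = 1;
    while (i < 100 && n < cap) {
        out[n++] = i * 118;
        i = i + 1;
    }
    return n;
}

/* Strength-reduced loop: the multiplication is done once, before the
 * loop, and an addition keeps tmp equal to i * 118 afterwards. */
size_t loop_reduced(int *out, size_t cap)
{
    size_t n = 0;
    int i = 1;
    int tmp = i * 118;
    while (i < 100 && n < cap) {
        out[n++] = tmp;
        i = i + 1;
        tmp = tmp + 118;   /* follows the increment of i */
    }
    return n;
}
```

Calling both functions with buffers of at least 99 elements should yield
identical contents, which is the invariant SR relies on.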
.PP
In some cases a related technique, \fItest replacement\fR,
can be used to eliminate the
loop variable i.
This technique will not be discussed in this report.
.sp 0
In the example above, the resulting code
can be further optimized by using
constant propagation.
Obviously, this is not the task of the
Strength Reduction phase.
217
doc/ego/sr/sr2
Normal file
@@ -0,0 +1,217 @@
.NH 2
The model of strength reduction
.PP
In this section we will describe
the transformations performed by
Strength Reduction (SR).
Before doing so, we will introduce the
central notion of an induction variable.
.NH 3
Induction variables
.PP
SR looks for variables whose values,
taken at the beginning of successive loop iterations,
form an arithmetic progression.
These variables are called induction variables.
The most frequently occurring example of such
a variable is a loop-variable in a high-level
programming language.
Several quite sophisticated models of strength
reduction can be found in the literature.
.[
cocke reduction strength cacm
.]
.[
allen cocke kennedy reduction strength
.]
.[
lowry medlock cacm
.]
.[
aho compiler design
.]
In these models the notion of an induction variable
is far more general than the intuitive notion
of a loop-variable.
The definition of an induction variable we present here
is more restricted,
yielding a simpler model and simpler transformations.
We think the principal source for strength reduction lies in
expressions using a loop-variable,
i.e. a variable that is incremented or decremented
by the same amount after every loop iteration,
and that cannot be changed in any other way.
.PP
Of course, the EM code does not contain high level constructs
such as for-statements.
We will define an induction variable in terms
of the Intermediate Code of the optimizer.
Note that the notions of a loop in the
EM text and of a firm basic block
were defined in section 3.3.5.
.sp
.UL definition:
.sp 0
An induction variable i of a loop L is a local variable
that is never accessed indirectly,
whose size is the word size of the target machine, and
that is assigned exactly once within L,
the assignment:
.IP -
being of the form i := i + c or i := c + i,
where c is a constant
called the \fIstep value\fR of i.
.IP -
occurring in a firm block of L.
.LP
(Note that the first restriction on the assignment
is not described in terms of the Intermediate Code;
we will give such a description later.
The current definition is easier to understand, however.)
.NH 3
Recognized expressions
.PP
SR recognizes certain expressions using
an induction variable and replaces
them by cheaper ones.
Two kinds of expensive operations are recognized:
multiplication and array address computations.
The expressions that are simplified must
use an induction variable
as an operand of
a multiplication or as index in an array expression.
.PP
Often a linear function of an induction variable is used,
rather than the variable itself.
In these cases optimization is still possible.
We call such expressions \fIiv-expressions\fR.
.sp
.UL definition:
.sp 0
An iv-expression of an induction variable i of a loop L is
an expression that:
.IP -
uses only the operators + and - (unary as well as binary)
.IP -
uses i as operand exactly once
.IP -
uses (besides i) only constants or variables that are
never changed in L as operands.
.LP
.PP
The expressions recognized by SR are of the following forms:
.IP (1)
iv-expression * constant
.IP (2)
constant * iv-expression
.IP (3)
A[iv-expression] :=      (assign to array element)
.IP (4)
A[iv-expression]         (use array element)
.IP (5)
& A[iv-expression]       (take address of array element)
.LP
(Note that EM has different instructions to use an array element,
store into one, or take the address of one: LAR, SAR, and AAR, respectively.)
.sp 0
The size of the elements of A must
be known statically.
In cases (3) and (4) this size
must equal the word size of the
target machine.
.NH 3
Transformations
.PP
With every recognized expression we associate
a new temporary local variable TMP,
allocated in the stack frame of the
procedure containing the expression.
At any program point within the loop, TMP will
contain the following value:
.IP multiplication: 18
the current value of iv-expression * constant
.IP arrays:
the current value of &A[iv-expression].
.LP
In the second case, TMP essentially is a pointer variable,
pointing to the element of A that is currently in use.
.sp 0
If the same expression occurs several times in the loop,
the same temporary local is used each time.
.PP
Three transformations are applied to the EM text:
.IP (1)
TMP is initialized with the right value.
This initialization takes place just
before the loop.
.IP (2)
The recognized expression is simplified.
.IP (3)
TMP is incremented; this takes place just
after the induction variable is incremented.
.LP
For multiplication, the initial value of TMP
is the value of the recognized expression at
the program point immediately before the loop.
For arrays, TMP is initialized with the address
of the first array element that is accessed.
So the initialization code is:
.DS
TMP := iv-expression * constant;  or
TMP := &A[iv-expression]
.DE
At the point immediately before the loop,
the induction variable will already have been
initialized,
so the value used in the code above will be the
value it has during the first iteration.
.PP
For multiplication, the recognized expression can simply be
replaced by TMP.
For array optimizations, the replacement
depends on the form:
.DS
\fIform\fR                    \fIreplacement\fR
(3) A[iv-expr] :=       *TMP :=     (assign indirect)
(4) A[iv-expr]          *TMP        (use indirect)
(5) &A[iv-expr]         TMP
.DE
The '*' denotes the indirect operator. (Note that
EM has different instructions to do
an assign-indirect and a use-indirect).
As the size of the array elements is restricted
to be the word size in cases (3) and (4),
only one EM instruction needs to
be generated in all cases.
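.PP
The array case can be pictured in C as replacing an indexed access by a
pointer that walks through the array.
The sketch below is an analogy only: it assumes an array of long and a
step value of 1, and the function names are invented for this example.

```c
#include <stddef.h>

/* Form (4) before SR: every iteration recomputes the address of a[i]. */
long sum_indexed(const long *a, size_t n)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += a[i];           /* address is a + i * sizeof(long) */
    return sum;
}

/* After SR: tmp holds &a[i]. The use becomes an indirect load through
 * tmp, and tmp advances by one element alongside the increment of i. */
long sum_reduced(const long *a, size_t n)
{
    long sum = 0;
    const long *tmp = &a[0];   /* initialization just before the loop */
    for (size_t i = 0; i < n; i++) {
        sum += *tmp;           /* A[iv-expr] replaced by *TMP */
        tmp += 1;              /* C scales this by the element size */
    }
    return sum;
}
```

Both functions return the same sum for any array; the second one no
longer multiplies the index by the element size inside the loop.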
.PP
The amount by which TMP is incremented is:
.IP multiplication: 18
step value * constant
.IP arrays:
step value * element size
.LP
Note that the step value (see definition of induction variable above),
the constant, and the element size (see previous section) can all
be determined statically.
If the sign of the induction variable in the
iv-expression is negative, the amount
must be negated.
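.PP
The rule for the increment can be expressed as two small C functions.
This is a sketch under assumptions: the enum and the function names are
hypothetical, and the element size is taken to be in bytes.

```c
/* Sign with which the induction variable occurs in the iv-expression. */
enum iv_sign { IV_POSITIVE = 1, IV_NEGATIVE = -1 };

/* Amount added to TMP for a recognized multiplication:
 * step value * constant, negated when i occurs negatively. */
long tmp_increment_mult(long step, long constant, enum iv_sign sign)
{
    return (long)sign * step * constant;
}

/* Amount added to TMP for a recognized array access:
 * step value * element size, negated when i occurs negatively. */
long tmp_increment_array(long step, long elem_size, enum iv_sign sign)
{
    return (long)sign * step * elem_size;
}
```

For the expression (6-i)*5 with step value -3, as used in Fig. 6.2,
tmp_increment_mult(-3, 5, IV_NEGATIVE) gives 15.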
.PP
The transformations are demonstrated by an example.
.DS
i := 100;                       i := 100;
while i > 1 loop                TMP := (6-i) * 5;
   X := (6-i) * 5 + 2;          while i > 1 loop
   Y := (6-i) * 5 - 8;   -->       X := TMP + 2;
   i := i - 3;                     Y := TMP - 8;
end loop;                          i := i - 3;
                                   TMP := TMP + 15;
                                end loop;

Fig. 6.2 Example of complex Strength Reduction transformations
.DE
The expression '(6-i)*5' is recognized twice. The constant
is 5.
The step value is -3.
The sign of i in the recognized expression is '-'.
So the increment value of TMP is -(-3*5) = +15.
232
doc/ego/sr/sr3
Normal file
@@ -0,0 +1,232 @@
.NH 2
Implementation
.PP
Like most phases, SR deals with one procedure
at a time.
Within a procedure, SR works on one loop at a time.
Loops are processed in textual order.
If loops are nested inside each other,
SR starts with the outermost loop and proceeds
inwards.
This order is chosen
because it enables the optimization
of multi-dimensional array address computations,
if the elements are accessed in the usual way
(i.e. row after row, rather than column after column).
For every loop, SR first detects all induction variables
and then tries to recognize
expressions that can be optimized.
.NH 3
Finding induction variables
.PP
The process of finding induction variables
can conveniently be split up
into two parts.
First, the EM text of the loop is scanned to find
all \fIcandidate\fR induction variables,
which are word-sized local variables
that are assigned precisely once
in the loop, within a firm block.
Second, for every candidate, the single assignment
is inspected, to see if it has the form
required by the definition of an induction variable.
.PP
Candidates are found by scanning the EM code of the loop.
During this scan, two sets are maintained.
The set "cand" contains all variables that were
assigned exactly once so far, within a firm block.
The set "dismiss" contains all variables that
should not be made a candidate.
Initially, both sets are empty.
If a variable is assigned to, it is put
in the cand set, if three conditions are met:
.IP 1.
the variable was not in cand or dismiss already
.IP 2.
the assignment takes place in a firm block
.IP 3.
the assignment is not a ZRL instruction (which assigns zero)
or an SDL instruction (store double local).
.LP
If any condition fails, the variable is removed from cand
(if it was there already) and put in dismiss
(if it was not there already).
.sp 0
All variables for which no register message was generated (i.e. those
variables that may be accessed indirectly) are assumed
to be changed in the loop.
.sp 0
All variables that remain in cand are candidate induction variables.
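.PP
The scan can be sketched in C as follows.
The data layout is an assumption made for illustration: variables are
numbered from 0, the assignments of the loop are given as an array, and
the two sets are folded into one state per variable.
Treating a variable without a register message as dismissed from the
start is one possible reading of the rule above, not necessarily what
the real implementation does.

```c
#include <stddef.h>

enum assign_kind { ASSIGN_NORMAL, ASSIGN_ZRL, ASSIGN_SDL };
enum var_state   { UNSEEN, IN_CAND, IN_DISMISS };

struct assignment {
    int var;               /* index of the local variable assigned to */
    int in_firm_block;     /* nonzero if the assignment is in a firm block */
    enum assign_kind kind;
};

/* Classify every variable of one loop. state[] and has_reg_msg[] have
 * nvars entries; has_reg_msg[v] is nonzero if a register message exists
 * for v, i.e. v is never accessed indirectly. */
void find_candidates(const struct assignment *as, size_t nas,
                     const int *has_reg_msg, enum var_state *state,
                     size_t nvars)
{
    /* No register message: assume the variable is changed in the loop. */
    for (size_t v = 0; v < nvars; v++)
        state[v] = has_reg_msg[v] ? UNSEEN : IN_DISMISS;

    for (size_t k = 0; k < nas; k++) {
        int v = as[k].var;
        int ok = state[v] == UNSEEN             /* condition 1 */
              && as[k].in_firm_block            /* condition 2 */
              && as[k].kind == ASSIGN_NORMAL;   /* condition 3 */
        state[v] = ok ? IN_CAND : IN_DISMISS;
    }
}
```

After the call, the variables whose state is IN_CAND are the candidate
induction variables; a second assignment to the same variable moves it
from IN_CAND to IN_DISMISS because condition 1 then fails.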
.PP
From the set of candidates, the induction variables can
be determined, by inspecting the single assignment.
The assignment must match one of the EM patterns below.
('x' is the candidate. 'ws' is the word size of the target machine.
'n' is any number.)
.DS
\fIpattern\fR                                          \fIstep value\fR
INL x                                         |  +1
DEL x                                         |  -1
LOL x ; (INC | DEC) ; STL x                   |  +1 | -1
LOL x ; LOC n ; (ADI ws | SBI ws) ; STL x     |  +n | -n
LOC n ; LOL x ; ADI ws ; STL x                |  +n
.DE
From the patterns the step value of the induction variable
can also be determined.
These step values are displayed on the right hand side.
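.PP
A matcher for these five patterns might look as follows in C.
The instruction representation (an opcode plus one argument) and all
names are assumptions made for this sketch; the optimizer's own data
structures are not shown in this document.

```c
#include <stddef.h>

enum opcode { OP_INL, OP_DEL, OP_LOL, OP_STL, OP_LOC,
              OP_INC, OP_DEC, OP_ADI, OP_SBI };

struct instr {
    enum opcode op;
    long arg;      /* local offset, constant, or operand size */
};

/* Match the single assignment to candidate x (a local offset) against
 * the patterns; ws is the word size. On success the step value is
 * stored in *step and 1 is returned, otherwise 0. */
int match_step(const struct instr *c, size_t n, long x, long ws, long *step)
{
    if (n == 1 && c[0].arg == x
               && (c[0].op == OP_INL || c[0].op == OP_DEL)) {
        *step = (c[0].op == OP_INL) ? 1 : -1;
        return 1;
    }
    if (n == 3 && c[0].op == OP_LOL && c[0].arg == x
               && (c[1].op == OP_INC || c[1].op == OP_DEC)
               && c[2].op == OP_STL && c[2].arg == x) {
        *step = (c[1].op == OP_INC) ? 1 : -1;
        return 1;
    }
    if (n == 4 && c[3].op == OP_STL && c[3].arg == x && c[2].arg == ws) {
        /* LOL x ; LOC n ; (ADI ws | SBI ws) ; STL x */
        if (c[0].op == OP_LOL && c[0].arg == x && c[1].op == OP_LOC
                && (c[2].op == OP_ADI || c[2].op == OP_SBI)) {
            *step = (c[2].op == OP_ADI) ? c[1].arg : -c[1].arg;
            return 1;
        }
        /* LOC n ; LOL x ; ADI ws ; STL x (addition only: n - x is
         * not an increment of x) */
        if (c[0].op == OP_LOC && c[1].op == OP_LOL && c[1].arg == x
                && c[2].op == OP_ADI) {
            *step = c[0].arg;
            return 1;
        }
    }
    return 0;
}
```

For example, the sequence LOL x ; LOC 3 ; SBI ws ; STL x is accepted
with step value -3, whereas LOC 3 ; LOL x ; SBI ws ; STL x is rejected.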
.sp
For every induction variable we maintain the following information:
.IP -
the offset of the variable in the stack frame of its procedure
.IP -
a pointer to the EM text of the assignment statement
.IP -
the step value
.LP
.NH 3
Optimizing expressions
.PP
If any induction variables of the loop were found,
the EM text of the loop is scanned again,
to detect expressions that can be optimized.
SR scans for multiplication and array instructions.
Whenever it finds such an instruction, it analyses the
code in front of it.
If an expression is to be optimized, it must
be generated by the following syntax rules.
.DS
optimizable_expr:
        iv_expr const mult |
        const iv_expr mult |
        address iv_expr address array_instr;
mult:
        MLI ws |
        MLU ws ;
array_instr:
        LAR ws |
        SAR ws |
        AAR ws ;
const:
        LOC n ;
.DE
An 'address' is an EM instruction that loads an
address on the stack.
An instruction like LOL may be an 'address', if
the size of an address (the pointer size, ps) is
the same as the word size.
If the pointer size is twice the word size,
instructions like LDL are an 'address'.
(The addresses in the third grammar rule
denote, respectively, the array address and the
array descriptor address).
.DS
address:
        LAE |
        LAL |
        LOL     if ps=ws   |
        LOE     ,,         |
        LIL     ,,         |
        LDL     if ps=2*ws |
        LDE     ,,         ;
.DE
The notion of an iv-expression was introduced earlier.
.DS
iv_expr:
        iv_expr unary_op |
        iv_expr iv_expr binary_op |
        loopconst |
        iv ;
unary_op:
        NGI ws |
        INC |
        DEC ;
binary_op:
        ADI ws |
        ADU ws |
        SBI ws |
        SBU ws ;
loopconst:
        const |
        LOL x   if x is not changed in loop ;
iv:
        LOL x   if x is an induction variable ;
.DE
An iv-expression must satisfy one additional constraint:
it must use exactly one operand that is an induction
variable.
A simple, hand-written, top-down parser is used
to recognize an iv-expression.
It scans the EM code from right to left
(recall that EM is essentially postfix).
It uses semantic attributes (inherited as well as
derived) to check the additional constraint.
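.PP
The idea of such a right-to-left parser can be sketched in C.
Everything here is hypothetical and simplified: the instruction and
context structures are invented, the unsigned operators ADU and SBU and
the operand sizes are left out, and the sign of the subexpression is the
only inherited attribute.

```c
#include <stddef.h>

enum op { OP_LOC, OP_LOL, OP_NGI, OP_INC, OP_DEC, OP_ADI, OP_SBI };

struct instr { enum op op; long arg; };

/* Hypothetical description of the loop being processed. */
struct loop_ctx {
    const long *ivs;        size_t nivs;  /* induction variables */
    const long *invariants; size_t ninv;  /* locals never changed in loop */
};

struct iv_info {
    int  iv_count;  /* derived: number of induction-variable operands */
    int  iv_sign;   /* derived: +1 or -1, sign of the induction variable */
    long iv;        /* derived: local offset of the induction variable */
};

static int contains(const long *a, size_t n, long x)
{
    for (size_t k = 0; k < n; k++)
        if (a[k] == x)
            return 1;
    return 0;
}

/* Parse one iv_expr ending at c[*pos], moving *pos to the left.
 * 'sign' is inherited: the sign with which this subexpression
 * contributes to the whole expression. Returns 1 on success. */
static int parse_iv_expr(const struct instr *c, long *pos, int sign,
                         const struct loop_ctx *ctx, struct iv_info *out)
{
    if (*pos < 0)
        return 0;
    struct instr in = c[*pos];
    *pos -= 1;
    switch (in.op) {
    case OP_LOC:
        return 1;
    case OP_LOL:
        if (contains(ctx->ivs, ctx->nivs, in.arg)) {
            out->iv_count++;
            out->iv_sign = sign;
            out->iv = in.arg;
            return 1;
        }
        return contains(ctx->invariants, ctx->ninv, in.arg);
    case OP_NGI:
        return parse_iv_expr(c, pos, -sign, ctx, out);
    case OP_INC:
    case OP_DEC:
        return parse_iv_expr(c, pos, sign, ctx, out);
    case OP_ADI:
    case OP_SBI:
        /* Postfix: the right operand lies directly left of the operator. */
        return parse_iv_expr(c, pos, in.op == OP_SBI ? -sign : sign, ctx, out)
            && parse_iv_expr(c, pos, sign, ctx, out);
    }
    return 0;
}

/* Accept c[0..n-1] if it is an iv_expr with exactly one iv operand. */
int recognize_iv_expr(const struct instr *c, size_t n,
                      const struct loop_ctx *ctx, struct iv_info *out)
{
    long pos = (long)n - 1;
    out->iv_count = 0;
    out->iv_sign = 0;
    out->iv = 0;
    return parse_iv_expr(c, &pos, 1, ctx, out)
        && pos == -1 && out->iv_count == 1;
}
```

For the postfix code LOC 6 ; LOL i ; SBI ws, which corresponds to 6-i,
the sketch accepts the expression and reports a negative sign for i.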
.PP
All information assembled during the recognition
process is put in a 'code_info' structure.
This structure contains the following information:
.IP -
the optimizable code itself
.IP -
the loop and basic block the code is part of
.IP -
the induction variable
.IP -
the iv-expression
.IP -
the sign of the induction variable in the
iv-expression
.IP -
the offset and size of the temporary local variable
.IP -
the expensive operator (MLI, LAR etc.)
.IP -
the instruction that loads the constant
(for multiplication) or the array descriptor
(for arrays).
.LP
The entire transformation process is driven
by this information.
As the EM text is represented internally
as a list, this process consists
mainly of straightforward list manipulations.
.sp 0
The initialization code must be put
immediately before the loop entry.
For this purpose a \fIheader block\fR is
created that has the loop entry block as
its only successor and that dominates the
entry block.
The CFG and all relations (SUCC, PRED, IDOM, LOOPS etc.)
are updated.
.sp 0
An EM instruction that will
replace the optimizable code
is created and put at the place of the old code.
The list representing the old optimizable code
is used to create a list for the initializing code,
as they are similar.
Only two modifications are required:
.IP -
if the expensive operator is a LAR or SAR,
it must be replaced by an AAR, as the initial value
of TMP is the \fIaddress\fR of the first
array element that is accessed.
.IP -
code must be appended to store the result of the
expression in TMP.
.LP
Finally, code to increment TMP is created and put after
the code of the single assignment to the
induction variable.
The generated code uses either an integer addition
(ADI) or an integer-to-pointer addition (ADS)
to do the increment.
.PP
SR maintains a set of all expressions that have already
been recognized in the present loop.
Such expressions are said to be \fIavailable\fR.
If an expression is recognized that is
already available,
no new temporary local variable is allocated for it,
and the code to initialize and increment the local
is not generated.
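.PP
The bookkeeping for available expressions can be sketched in C.
This is an assumption-laden illustration: expressions are compared by a
textual key, temporaries are handed out at decreasing stack-frame
offsets, and all names and sizes are invented for the example.

```c
#include <stddef.h>
#include <string.h>

#define MAX_AVAIL 32

struct avail_expr {
    char text[64];     /* hypothetical canonical text of the expression */
    int  tmp_offset;   /* stack-frame offset of its temporary */
};

struct avail_set {
    struct avail_expr e[MAX_AVAIL];
    size_t n;          /* number of available expressions */
    int next_offset;   /* offset handed to the next new temporary */
    int word_size;     /* size of one temporary */
};

/* Look up the temporary for expr, allocating one if it is not yet
 * available. *is_new tells the caller whether initialization and
 * increment code still has to be generated. Returns 0 if the set is
 * full or the key is too long, 1 otherwise. */
int get_tmp(struct avail_set *s, const char *expr,
            int *tmp_offset, int *is_new)
{
    for (size_t k = 0; k < s->n; k++) {
        if (strcmp(s->e[k].text, expr) == 0) {
            *tmp_offset = s->e[k].tmp_offset;
            *is_new = 0;
            return 1;
        }
    }
    if (s->n == MAX_AVAIL || strlen(expr) >= sizeof s->e[0].text)
        return 0;
    strcpy(s->e[s->n].text, expr);
    s->e[s->n].tmp_offset = s->next_offset;
    *tmp_offset = s->next_offset;
    *is_new = 1;
    s->next_offset -= s->word_size;
    s->n++;
    return 1;
}
```

In Fig. 6.2 the expression (6-i)*5 occurs twice: the first lookup would
allocate TMP, the second would return the same offset with is_new set
to 0, so only the replacement of the expression is performed.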
28
doc/ego/sr/sr4
Normal file
@@ -0,0 +1,28 @@
.NH 2
Source files of SR
.PP
The sources of SR are in the following files
and packages:
.IP sr.h: 14
declarations of global variables and
data structures
.IP sr.c:
the routine main; a driving routine to process
(possibly nested) loops in the right order
.IP iv
implements a procedure that finds the induction variables
of a loop
.IP reduce
implements a procedure that finds optimizable expressions
and that does the transformations
.IP cand
implements a procedure that finds the candidate induction
variables; used to implement iv
.IP xform
implements several useful routines that transform
lists of EM text or a CFG; used to implement reduce
.IP expr
implements a procedure that parses iv-expressions
.IP aux
implements several auxiliary procedures.
.LP