Initial revision
This commit is contained in:
132
doc/ego/il/il4
Normal file
132
doc/ego/il/il4
Normal file
@@ -0,0 +1,132 @@
|
||||
.NH 2
|
||||
Heuristic rules
|
||||
.PP
|
||||
Using the information described
|
||||
in the previous section,
|
||||
we can find all calls that can
|
||||
be expanded in line, and for which
|
||||
this expansion is desirable.
|
||||
In general, we cannot expand all these calls,
|
||||
so we have to choose the 'best' ones.
|
||||
With every CAL instruction
|
||||
that may be expanded, we associate
|
||||
a \fIpay off\fR,
|
||||
which expresses how desirable it is
|
||||
to expand this specific CAL.
|
||||
.sp
|
||||
Let Tc denote the portion of EM text involved
|
||||
in a specific call, i.e. the pushing of the actual
|
||||
parameter expressions, the CAL itself,
|
||||
the popping of the parameters and the
|
||||
pushing of the result (if any, via an LFR).
|
||||
Let Te denote the EM text that would be obtained
|
||||
by expanding the call in line.
|
||||
Let Pc be the original program and Pe the program
|
||||
with Te substituted for Tc.
|
||||
The pay off of the CAL depends on two factors:
|
||||
.IP -
|
||||
T = execution_time(Pe) - execution_time(Pc)
|
||||
.IP -
|
||||
S = code_size(Pe) - code_size(Pc)
|
||||
.LP
|
||||
The change in execution time (T) depends on:
|
||||
.IP -
|
||||
T1 = execution_time(Te) - execution_time(Tc)
|
||||
.IP -
|
||||
N = number of times Te or Tc get executed.
|
||||
.LP
|
||||
We assume that T1 will be the same every
|
||||
time the code gets executed.
|
||||
This is a reasonable assumption.
|
||||
(Note that we are talking about one CAL,
|
||||
not about different calls to the same procedure).
|
||||
Hence
|
||||
.DS
|
||||
T = N * T1
|
||||
.DE
|
||||
T1 can be estimated by a careful analysis
|
||||
of the transformations that are performed.
|
||||
Below, we list everything that will be
|
||||
different when a call is expanded in line:
|
||||
.IP -
|
||||
The CAL instruction is not executed.
|
||||
This saves a subroutine jump.
|
||||
.IP -
|
||||
The instructions in the procedure prolog
|
||||
are not executed.
|
||||
These instructions, generated from the PRO pseudo,
|
||||
save some machine registers
|
||||
(including the old LB), set the new LB and allocate space
|
||||
for the locals of the called routine.
|
||||
The savings may be less if there are no
|
||||
locals to allocate.
|
||||
.IP -
|
||||
In line parameters are not evaluated before the call
|
||||
and are not pushed on the stack.
|
||||
.IP -
|
||||
All remaining parameters are stored in local variables,
|
||||
instead of being pushed on the stack.
|
||||
.IP -
|
||||
If the number of parameters is nonzero,
|
||||
the ASP instruction after the CAL is not executed.
|
||||
.IP -
|
||||
Every reference to an in line parameter is
|
||||
substituted by the parameter expression.
|
||||
.IP -
|
||||
RET (return) instructions are replaced by
|
||||
BRA (branch) instructions.
|
||||
If the called procedure 'falls through'
|
||||
(i.e. it has only one RET, at the end of its code),
|
||||
even the BRA is not needed.
|
||||
.IP -
|
||||
The LFR (fetch function result) is not executed
|
||||
.PP
|
||||
Besides these changes, which are caused directly by IL,
|
||||
other changes may occur as IL influences other optimization
|
||||
techniques, such as Register Allocation and Constant Propagation.
|
||||
Our heuristic rules do not take into account the quite
|
||||
inpredictable effects on Register Allocation.
|
||||
It does, however, favour calls that have numeric \fIconstants\fR
|
||||
as parameter; especially the constant "0" as an inline
|
||||
parameter gets high scores,
|
||||
as further optimizations may often be possible.
|
||||
.PP
|
||||
It cannot be determined statically how often a CAL instruction gets
|
||||
executed.
|
||||
We will use \fIloop nesting\fR information here.
|
||||
The nesting level of the loop in which
|
||||
the CAL appears (if any) will be used as an
|
||||
indication for the number of times it gets executed.
|
||||
.PP
|
||||
Based on all these facts,
|
||||
the pay off of a call will be computed.
|
||||
The following model was developed empirically.
|
||||
Assume procedure P calls procedure Q.
|
||||
The call takes place in basic block B.
|
||||
.DS
|
||||
ZP = # zero parameters
|
||||
CP = # constant parameters - ZP
|
||||
LN = Loop Nesting level (0 if outside any loop)
|
||||
F = \fIif\fR # formal parameters of Q > 0 \fIthen\fR 1 \fIelse\fR 0
|
||||
FT = \fIif\fR Q falls through \fIthen\fR 1 \fIelse\fR 0
|
||||
S = size(Q) - 1 - # inline_parameters - F
|
||||
L = \fIif\fR # local variables of P > 0 \fIthen\fR 0 \fIelse\fR -1
|
||||
A = CP + 2 * ZP
|
||||
N = \fIif\fR LN=0 and P is never called from a loop \fIthen\fR 0 \fIelse\fR (LN+1)**2
|
||||
FM = \fIif\fR B is a firm block \fIthen\fR 2 \fIelse\fR 1
|
||||
|
||||
pay_off = (100/S + FT + F + L + A) * N * FM
|
||||
.DE
|
||||
S stands for the size increase of the program,
|
||||
which is slightly less than the size of Q.
|
||||
The size of a procedure is taken to be its number
|
||||
of (non-pseudo) EM instructions.
|
||||
The terms "loop nesting level" and "firm" were defined
|
||||
in the chapter on the Intermediate Code (section "loop tables").
|
||||
If a call is not inside a loop and the calling procedure
|
||||
is itself never called from a loop (transitively),
|
||||
then the call will probably be executed at most once.
|
||||
Such a call is never expanded in line (its pay off is zero).
|
||||
If the calling procedure doesn't have local variables, a penalty (L)
|
||||
is introduced, as it will most likely get local variables if the
|
||||
call gets expanded.
|
||||
Reference in New Issue
Block a user