Initial revision
94
doc/ego/cf/cf1
Normal file
@@ -0,0 +1,94 @@
|
||||
.bp
|
||||
.NH
|
||||
The Control Flow Phase
|
||||
.PP
|
||||
In the previous chapter we described the intermediate
|
||||
code of the global optimizer.
|
||||
We also specified which part of this code
|
||||
was constructed by the IC phase of the optimizer.
|
||||
The Control Flow Phase (\fICF\fR) does
|
||||
the remainder of the job,
|
||||
i.e. it determines:
|
||||
.IP -
|
||||
the control flow graphs
|
||||
.IP -
|
||||
the loop tables
|
||||
.IP -
|
||||
the calling, change and use attributes of
|
||||
the procedure table entries
|
||||
.LP
|
||||
CF operates on one procedure at a time.
|
||||
For every procedure it first reads the EM instructions
|
||||
from the EM-text file and groups them into basic blocks.
|
||||
For every basic block, its successors and
|
||||
predecessors are determined,
|
||||
resulting in the control flow graph.
|
||||
Next, the immediate dominator of every basic block
|
||||
is computed.
|
||||
Using these dominators, any loop in the
|
||||
procedure is detected.
|
||||
Finally, interprocedural analysis is done,
|
||||
after which we will know the global effects of
|
||||
every procedure call on its environment.
|
||||
.sp
|
||||
CF uses the same internal data structures
|
||||
for the procedure table and object table as IC.
|
||||
.NH 2
|
||||
Partitioning into basic blocks
|
||||
.PP
|
||||
With regard to flow of control, we distinguish
|
||||
three kinds of EM instructions:
|
||||
jump instructions, instruction label definitions and
|
||||
normal instructions.
|
||||
Jump instructions are all conditional or unconditional
|
||||
branch instructions,
|
||||
the case instructions (CSA/CSB)
|
||||
and the RET (return) instruction.
|
||||
A procedure call (CAL) is not considered to be a jump.
|
||||
A defining occurrence of an instruction label
|
||||
is regarded as an EM instruction.
|
||||
.PP
|
||||
An instruction starts
|
||||
a new basic block in any of the following cases:
|
||||
.IP 1.
|
||||
It is the first instruction of a procedure
|
||||
.IP 2.
|
||||
It is the first of a list of instruction label
|
||||
defining occurrences
|
||||
.IP 3.
|
||||
It follows a jump
|
||||
.LP
|
||||
If there are several consecutive instruction labels
|
||||
(which is highly unusual),
|
||||
all of them are put in the same basic block.
|
||||
Note that several cases may overlap,
|
||||
e.g. a label definition at the beginning of a procedure
|
||||
or a label following a jump.
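.PP
The three rules can be sketched in a few lines of Python.
This is only an illustration, not the code of CF:
the encoding of an instruction as a (kind, text) pair
and the function name are assumptions made for the example.
.DS
```python
def partition(instrs):
    """Split a list of (kind, text) pairs into basic blocks.

    kind is "label", "jump" or "normal" (a hypothetical encoding).
    """
    blocks = []
    prev_kind = None
    for kind, text in instrs:
        starts_block = (
            prev_kind is None                              # rule 1
            or (kind == "label" and prev_kind != "label")  # rule 2
            or prev_kind == "jump"                         # rule 3
        )
        if starts_block:
            blocks.append([])
        blocks[-1].append((kind, text))
        prev_kind = kind
    return blocks


example = [
    ("label", "1"),
    ("normal", "some instruction"),
    ("jump", "conditional branch to 2"),
    ("normal", "another instruction"),
    ("label", "2"),
    ("label", "3"),
    ("jump", "RET"),
]
blocks = partition(example)
```
.DE
.LP
For this input the sketch yields three blocks;
the consecutive labels 2 and 3 end up in the same block.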
|
||||
.PP
|
||||
A simple Finite State Machine is used to model
|
||||
the above rules.
|
||||
It also recognizes the end of a procedure,
|
||||
marked by an END pseudo.
|
||||
The basic blocks are stored internally as a doubly linked
|
||||
linear list.
|
||||
The blocks are linked in textual order.
|
||||
Every node of this list has the attributes described
|
||||
in the previous chapter (see syntax rule for
|
||||
basic_block).
|
||||
Furthermore, every node contains a pointer to its
|
||||
EM instructions,
|
||||
which are represented internally
|
||||
as a linear, doubly linked list,
|
||||
just as in the IC phase.
|
||||
However, instead of one list per procedure (as in IC)
|
||||
there is now one list per basic block.
|
||||
.PP
|
||||
On the fly, a table is built that maps
|
||||
every label identifier to the label definition
|
||||
instruction.
|
||||
This table is used for computing the control flow.
|
||||
The table is stored as a dynamically allocated array.
|
||||
The length of the array is the number of labels
|
||||
of the current procedure;
|
||||
this value can be found in the procedure table,
|
||||
where it was stored by IC.
|
||||
50
doc/ego/cf/cf2
Normal file
@@ -0,0 +1,50 @@
|
||||
.NH 2
|
||||
Control Flow
|
||||
.PP
|
||||
A \fIsuccessor\fR of a basic block B is a block C
|
||||
that can be executed immediately after B.
|
||||
B is then said to be a \fIpredecessor\fR of C.
|
||||
A block ending with a RET instruction
|
||||
has no successors.
|
||||
Such a block is called a \fIreturn block\fR.
|
||||
Any block that has no predecessors cannot be
|
||||
executed at all (i.e. it is unreachable),
|
||||
unless it is the first block of a procedure,
|
||||
called the \fIprocedure entry block\fR.
|
||||
.PP
|
||||
Internally, the successor and predecessor
|
||||
attributes of a basic block are stored as \fIsets\fR.
|
||||
Alternatively, one may regard all these
|
||||
sets of all basic blocks as a conceptual \fIgraph\fR,
|
||||
in which there is an edge from B to C if C
|
||||
is in the successor set of B.
|
||||
We call this conceptual graph
|
||||
the \fIControl Flow Graph\fR.
|
||||
.PP
|
||||
The only successor of a basic block ending in an
|
||||
unconditional branch instruction is the block that
|
||||
contains the label definition of the target of the jump.
|
||||
The target instruction can be found via the LAB_ID
|
||||
that is the operand of the jump instruction,
|
||||
by using the label-map table mentioned
|
||||
above.
|
||||
If the last instruction of a block is a
|
||||
conditional jump,
|
||||
the successors are the target block and the textually
|
||||
next block.
|
||||
The last instruction can also be a case jump
|
||||
instruction (CSA or CSB).
|
||||
We then analyze the case descriptor,
|
||||
to find all possible target instructions
|
||||
and their associated blocks.
|
||||
We require the case descriptor to be allocated in
|
||||
a ROM, so it cannot be changed dynamically.
|
||||
A case jump via an alterable descriptor could in principle
|
||||
go to any label in the program.
|
||||
In the presence of such an uncontrolled jump,
|
||||
hardly any optimization can be done.
|
||||
We do not expect any front end to generate such a descriptor,
|
||||
however, because of the controlled nature
|
||||
of case statements in high level languages.
|
||||
If the basic block does not end in a jump instruction,
|
||||
its only successor is the textually next block.
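.PP
The rules for successors can be summarized in a small Python sketch.
It is an illustration under assumptions, not the code of CF:
each block is a dictionary with the labels it defines and
a hypothetical pair (kind, targets) describing its last instruction,
where kind is "jump", "cond", "case", "ret" or "fall".
The example encodes the four-block program shown as Fig. 4.2
in the section on loop detection, with blocks numbered from 0.
.DS
```python
def control_flow(blocks):
    """Return (succ, pred): lists of sets of block indices."""
    label_map = {}                  # label id -> index of defining block
    for i, blk in enumerate(blocks):
        for lab in blk["labels"]:
            label_map[lab] = i

    n = len(blocks)
    succ = [set() for _ in range(n)]
    for i, blk in enumerate(blocks):
        kind, targets = blk["end"]
        if kind in ("jump", "cond", "case"):
            succ[i].update(label_map[t] for t in targets)
        if kind in ("cond", "fall") and i + 1 < n:
            succ[i].add(i + 1)      # the textually next block
        # kind == "ret": a return block has no successors

    pred = [set() for _ in range(n)]
    for i in range(n):
        for j in succ[i]:
            pred[j].add(i)
    return succ, pred


blocks = [
    {"labels": [1], "end": ("fall", [])},   # 1: S1
    {"labels": [2], "end": ("cond", [4])},  # 2: S2; if cond then goto 4
    {"labels": [], "end": ("jump", [1])},   # S3; goto 1
    {"labels": [4], "end": ("jump", [1])},  # 4: S4; goto 1
]
succ, pred = control_flow(blocks)
```
.DE
.LP
A case jump is handled like an unconditional jump with several targets,
which presumes that the targets have already been extracted
from the case descriptor.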
|
||||
53
doc/ego/cf/cf3
Normal file
@@ -0,0 +1,53 @@
|
||||
.NH 2
|
||||
Immediate dominators
|
||||
.PP
|
||||
A basic block B dominates a block C if every path
|
||||
in the control flow graph from the procedure entry block
|
||||
to C goes through B.
|
||||
The immediate dominator of C is the closest dominator
|
||||
of C, other than C itself, on any path from the entry block.
|
||||
See also
|
||||
.[~[
|
||||
aho compiler design
|
||||
.], section 13.1.]
|
||||
.PP
|
||||
There are a number of algorithms to compute
|
||||
the immediate dominator relation.
|
||||
.IP 1.
|
||||
Purdom and Moore give an algorithm that is
|
||||
easy to program and easy to describe (although the
|
||||
description they give is unreadable;
|
||||
it is given in a very messy Algol60 program full of gotos).
|
||||
.[
|
||||
predominators
|
||||
.]
|
||||
.IP 2.
|
||||
Aho and Ullman present a bitvector algorithm, which is also
|
||||
easy to program and to understand.
|
||||
(See
|
||||
.[~[
|
||||
aho compiler design
|
||||
.], section 13.1.]).
|
||||
.IP 3.
|
||||
Lengauer and Tarjan introduce a fast algorithm that is
|
||||
hard to understand, yet remarkably easy to implement.
|
||||
.[
|
||||
lengauer dominators
|
||||
.]
|
||||
.LP
|
||||
The Purdom-Moore algorithm is very slow if the
|
||||
number of basic blocks in the flow graph is large.
|
||||
The Aho-Ullman algorithm in fact computes the
|
||||
dominator relation,
|
||||
from which the immediate dominator relation can be computed
|
||||
in time quadratic in the number of basic blocks in the worst case.
|
||||
The storage requirement is also quadratic in the number
|
||||
of blocks.
|
||||
The running time of the third algorithm is proportional
|
||||
to:
|
||||
.DS
|
||||
(number of edges in the graph) * log(number of blocks).
|
||||
.DE
|
||||
We have chosen this algorithm because it is fast
|
||||
(as shown by experiments done by Lengauer and Tarjan),
|
||||
it is easy to program and requires little data space.
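.PP
To make the notions of dominator and immediate dominator concrete,
the following Python sketch computes both by simple iteration over sets,
in the spirit of the second algorithm listed above.
It is \fInot\fR the Lengauer-Tarjan algorithm that CF uses;
the function names and the representation of the graph
(a list of successor sets, block 0 being the entry block)
are assumptions, and all blocks are assumed to be reachable.
.DS
```python
def dominators(succ, entry=0):
    """Return dom, where dom[b] is the set of blocks dominating b."""
    n = len(succ)
    pred = [set() for _ in range(n)]
    for b in range(n):
        for c in succ[b]:
            pred[c].add(b)

    dom = [set(range(n)) for _ in range(n)]
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for b in range(n):
            if b == entry:
                continue
            preds = [dom[p] for p in pred[b]]
            new = set.intersection(*preds) if preds else set(range(n))
            new = new | {b}
            if new != dom[b]:
                dom[b] = new
                changed = True
    return dom


def immediate_dominators(dom, entry=0):
    """Pick, for every block, its closest strict dominator."""
    idom = {}
    for b, ds in enumerate(dom):
        if b == entry:
            continue
        strict = ds - {b}
        # The strict dominators of b form a chain; the closest one
        # is dominated by all the others, so its own set is largest.
        idom[b] = max(strict, key=lambda d: len(dom[d]))
    return idom


succ = [{1}, {2, 3}, {0}, {0}]      # an example graph with four blocks
dom = dominators(succ)
idom = immediate_dominators(dom)
```
.DE
.LP
The quadratic storage mentioned above is visible here:
one set of blocks is kept for every block.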
|
||||
93
doc/ego/cf/cf4
Normal file
@@ -0,0 +1,93 @@
|
||||
.NH 2
|
||||
Loop detection
|
||||
.PP
|
||||
Loops are detected by using the loop construction
|
||||
algorithm of
|
||||
.[~[
|
||||
aho compiler design
|
||||
.], section 13.1.]
|
||||
This algorithm uses \fIback edges\fR.
|
||||
A back edge is an edge from B to C in the CFG,
|
||||
whose head (C) dominates its tail (B).
|
||||
The loop associated with this back edge
|
||||
consists of C plus all nodes in the CFG
|
||||
that can reach B without going through C.
|
||||
.PP
|
||||
As an example of how the algorithm works,
|
||||
consider the piece of program of Fig. 4.1.
|
||||
First just look at the program and think for
|
||||
yourself what part of the code constitutes the loop.
|
||||
.DS
|
||||
loop
   if cond then                         1
      -- lots of simple
      -- assignment
      -- statements                 2       3
      exit; -- exit loop
   else
      S; -- one statement
   end if;
end loop;

Fig. 4.1 A misleading loop
|
||||
.DE
|
||||
Although a human being may be easily deceived
|
||||
by the brackets "loop" and "end loop",
|
||||
the loop detection algorithm will correctly
|
||||
reply that only the test for "cond" and
|
||||
the single statement in the false-part
|
||||
of the if statement are part of the loop!
|
||||
The statements in the true-part only get
|
||||
executed once, so there really is no reason at all
|
||||
to say they're part of the loop too.
|
||||
The CFG contains one back edge, "3->1".
|
||||
As node 3 cannot be reached from node 2,
|
||||
the latter node is not part of the loop.
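.PP
The construction of the loop belonging to one back edge
can be sketched in Python as follows.
The function name and the dictionary of predecessor sets are
assumptions made for this illustration;
block 4, standing for the code after "end loop", is also an assumption,
added only to give the exit in block 2 a destination.
.DS
```python
def natural_loop(pred, tail, head):
    """Blocks of the loop belonging to the back edge tail->head."""
    loop = {head, tail}
    work = [tail] if tail != head else []
    while work:
        b = work.pop()
        for p in pred[b]:
            if p not in loop:       # head is in the set already, so
                loop.add(p)         # the search never goes past it
                work.append(p)
    return loop


# Predecessor sets of the CFG of Fig. 4.1.
pred = {1: {3}, 2: {1}, 3: {1}, 4: {2}}
loop = natural_loop(pred, tail=3, head=1)
```
.DE
.LP
The result is the set {1, 3}: block 2 is not included,
because the backward search from block 3 never reaches it.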
|
||||
.PP
|
||||
A source of problems with the algorithm is the fact
|
||||
that different back edges may result in
|
||||
the same loop.
|
||||
Such an ill-structured loop is
|
||||
called a \fImessy\fR loop.
|
||||
After a loop has been constructed, it is checked
|
||||
whether it is really a new loop.
|
||||
.PP
|
||||
Loops can partly overlap, without one being nested
|
||||
inside the other.
|
||||
This is the case in the program of Fig. 4.2.
|
||||
.DS
|
||||
 1:                                     1
    S1;
 2:
    S2;                                 2
    if cond then
       goto 4;
    S3;                             3       4
    goto 1;
 4:
    S4;
    goto 1;

Fig. 4.2 Partly overlapping loops
|
||||
.DE
|
||||
There are two back edges "3->1" and "4->1",
|
||||
resulting in the loops {1,2,3} and {1,2,4}.
|
||||
With every basic block we associate a set of
|
||||
all loops it is part of.
|
||||
It is not sufficient just to record its
|
||||
innermost enclosing loop.
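.PP
A Python sketch of the whole detection step for Fig. 4.2 is given below.
It is an illustration, not the code of CF:
the function name and the dictionaries are assumptions,
and the dominator sets are written out by hand
(every block is taken to dominate itself).
.DS
```python
def find_loops(succ, dom):
    """Return the distinct loops as a list of (entry, body) pairs."""
    pred = {b: set() for b in succ}
    for b, targets in succ.items():
        for c in targets:
            pred[c].add(b)

    loops = []
    for tail in sorted(succ):
        for head in sorted(succ[tail]):
            if head not in dom[tail]:
                continue                    # not a back edge
            body = {head, tail}
            work = [tail] if tail != head else []
            while work:
                for p in pred[work.pop()]:
                    if p not in body:
                        body.add(p)
                        work.append(p)
            if (head, body) not in loops:   # is it really a new loop?
                loops.append((head, body))
    return loops


succ = {1: {2}, 2: {3, 4}, 3: {1}, 4: {1}}
dom = {1: {1}, 2: {1, 2}, 3: {1, 2, 3}, 4: {1, 2, 4}}
loops = find_loops(succ, dom)

# For every block, the indices of all loops it is part of.
member_of = {b: [i for i, (_, body) in enumerate(loops) if b in body]
             for b in succ}
```
.DE
.LP
Blocks 1 and 2 belong to both loops, block 3 only to the first
and block 4 only to the second.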
|
||||
.PP
|
||||
After all loops of a procedure are detected, we determine
|
||||
the nesting level of every loop.
|
||||
Finally, we find all strong and firm blocks of each loop.
|
||||
If the loop has only one back edge (i.e. it is not messy),
|
||||
the set of firm blocks consists of the
|
||||
tail of this back edge and its dominators
|
||||
in the loop (including the loop entry block).
|
||||
A firm block is also strong if it is not a
|
||||
successor of a block that may exit the loop;
|
||||
a block may exit a loop if it has an (immediate) successor
|
||||
that is not part of the loop.
|
||||
For messy loops we do not determine the strong
|
||||
and firm blocks. These loops are expected
|
||||
to occur very rarely.
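.PP
The following Python sketch shows one reading of these rules
for a loop with a single back edge.
It takes the block from which the back edge leaves
(the block that jumps back to the loop entry) as the starting point
for the firm blocks;
this reading, the function name and the data layout are assumptions.
The example is the loop {1, 3} of Fig. 4.1,
with a hypothetical block 4 for the code after the loop.
.DS
```python
def firm_and_strong(succ, dom, body, source):
    """source is the block the single back edge leaves from."""
    # Firm: the source block and its dominators inside the loop.
    firm = {b for b in body if b in dom[source]}
    # Blocks with a successor outside the loop may exit it.
    exiting = {b for b in body if succ[b] - body}
    after_exit = set()
    for b in exiting:
        after_exit |= succ[b]
    # Strong: firm, and not a successor of an exiting block.
    strong = firm - after_exit
    return firm, strong


succ = {1: {2, 3}, 2: {4}, 3: {1}, 4: set()}
dom = {1: {1}, 2: {1, 2}, 3: {1, 3}, 4: {1, 2, 4}}
firm, strong = firm_and_strong(succ, dom, body={1, 3}, source=3)
```
.DE
.LP
Here both blocks are firm, but only block 1 is strong:
block 3 follows the test in block 1, which may leave the loop.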
|
||||
79
doc/ego/cf/cf5
Normal file
@@ -0,0 +1,79 @@
|
||||
.NH 2
|
||||
Interprocedural analysis
|
||||
.PP
|
||||
It is often desirable to know the effects
|
||||
a procedure call may have.
|
||||
The optimization below is only possible if
|
||||
we know for sure that the call to P cannot
|
||||
change A.
|
||||
.DS
|
||||
A := 10;                          A := 10;
P;       -- procedure call   -->  P;
B := A + 2;                       B := 12;
|
||||
.DE
|
||||
Although it is not possible to predict exactly
|
||||
all the effects a procedure call has, we may
|
||||
determine a kind of upper bound for it.
|
||||
So we compute all variables that may be
|
||||
changed by P, although they need not be
|
||||
changed at every invocation of P.
|
||||
We can get hold of this set by just looking
|
||||
at all assignment (store) instructions
|
||||
in the body of P.
|
||||
EM also has a set of \fIindirect\fR assignment
|
||||
instructions,
|
||||
i.e. assignment through a pointer variable.
|
||||
In general, it is not possible to determine
|
||||
which variable is affected by such an assignment.
|
||||
In these cases, we just record the fact that P
|
||||
does an indirect assignment.
|
||||
Note that this does not mean that all variables
|
||||
are potentially affected, as the front ends
|
||||
may generate messages telling that certain
|
||||
variables can never be accessed indirectly.
|
||||
We also set a flag if P does an indirect use (load).
|
||||
Note that we only have to look at \fIglobal\fR
|
||||
variables.
|
||||
If P changes or uses any of its locals,
|
||||
this has no effect on its environment.
|
||||
Local variables of a lexically enclosing
|
||||
procedure can only be accessed indirectly.
|
||||
.PP
|
||||
A procedure P may of course call another procedure.
|
||||
To determine the effects of a call to P,
|
||||
we must also know the effects of a call to the second procedure.
|
||||
This second one may call a third one, and so on.
|
||||
Effectively, we need to compute the \fItransitive closure\fR
|
||||
of the effects.
|
||||
To do this, we determine for every procedure
|
||||
which other procedures it calls.
|
||||
This set is the "calling" attribute of a procedure.
|
||||
One may regard all these sets as a conceptual graph,
|
||||
in which there is an edge from P to Q
|
||||
if Q is in the calling set of P. This graph will
|
||||
be referred to as the \fIcall graph\fR.
|
||||
(Note the resemblance to the control flow graph.)
|
||||
.PP
|
||||
We can detect which procedures are called by P
|
||||
by looking at all CAL instructions in its body.
|
||||
Unfortunately, a procedure may also be
|
||||
called indirectly, via a CAI instruction.
|
||||
Yet, only procedures that are used as the operand of an LPI
|
||||
instruction can be called indirectly,
|
||||
because this is the only way to take the address of a procedure.
|
||||
We determine for every procedure whether it does
|
||||
a CAI instruction.
|
||||
We also build a set of all procedures used as
|
||||
operand of an LPI.
|
||||
.sp
|
||||
After all procedures have been processed (i.e. all CFGs
|
||||
are constructed, all loops are detected,
|
||||
all procedures are analyzed to see which variables
|
||||
they may change, which procedures they call,
|
||||
whether they do a CAI or are used in an LPI) the
|
||||
transitive closure of all interprocedural
|
||||
information is computed.
|
||||
During the same process,
|
||||
the calling set of every procedure that uses a CAI
|
||||
is extended with the above-mentioned set of all
|
||||
procedures that can be called indirectly.
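.PP
The closure step can be sketched in Python as a simple iteration
that runs until nothing changes any more.
This is an illustration under assumptions, not the code of CF:
the dictionary layout, the function name and the procedure names are
hypothetical, and only the change sets are propagated
(the use sets and the indirect flags could be treated the same way).
.DS
```python
def close_effects(procs, lpi_targets):
    """procs maps a procedure name to a dict with the keys
    "calls" (set of names), "change" (set of globals), "cai" (bool).
    """
    calling = {p: set(info["calls"]) for p, info in procs.items()}
    change = {p: set(info["change"]) for p, info in procs.items()}

    # A procedure doing a CAI may call any procedure whose
    # address is taken by an LPI.
    for p, info in procs.items():
        if info["cai"]:
            calling[p] |= lpi_targets

    # Propagate change sets along the call graph until stable.
    changed = True
    while changed:
        changed = False
        for p in procs:
            for q in calling[p]:
                if not change[q] <= change[p]:
                    change[p] |= change[q]
                    changed = True
    return calling, change


procs = {
    "P": {"calls": {"Q"}, "change": {"A"}, "cai": False},
    "Q": {"calls": set(), "change": {"B"}, "cai": True},
    "R": {"calls": set(), "change": {"C"}, "cai": False},
}
calling, change = close_effects(procs, lpi_targets={"R"})
```
.DE
.LP
In this example Q does a CAI and R is the only operand of an LPI,
so a call to P may change A, B and C.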
|
||||
21
doc/ego/cf/cf6
Normal file
@@ -0,0 +1,21 @@
|
||||
.NH 2
|
||||
Source files
|
||||
.PP
|
||||
The sources of CF are in the following files and packages:
|
||||
.IP cf.h: 14
|
||||
declarations of global variables and data structures
|
||||
.IP cf.c:
|
||||
the routine main; interprocedural analysis;
|
||||
transitive closure
|
||||
.IP succ:
|
||||
control flow (successor and predecessor)
|
||||
.IP idom:
|
||||
immediate dominators
|
||||
.IP loop:
|
||||
loop detection
|
||||
.IP get:
|
||||
read object and procedure table;
|
||||
read EM text and partition it into basic blocks
|
||||
.IP put:
|
||||
write tables, CFGs and EM text
|
||||
.LP
|
||||