Initial revision

1987-03-03 10:25:25 +00:00
parent f195cb9aa6
commit 4a8a8e67ed
12 changed files with 1354 additions and 0 deletions
--- a/doc/ego/cf/cf1
+++ b/doc/ego/cf/cf1
@@ -0,0 +1,94 @@
+.bp
+.NH
+The Control Flow Phase
+.PP
+In the previous chapter we described the intermediate
+code of the global optimizer.
+We also specified which part of this code
+was constructed by the IC phase of the optimizer.
+The Control Flow Phase (\fICF\fR) does
+the remainder of the job,
+i.e. it determines:
+.IP -
+the control flow graphs
+.IP -
+the loop tables
+.IP -
+the calling, change and use attributes of
+the procedure table entries
+.LP
+CF operates on one procedure at a time.
+For every procedure it first reads the EM instructions
+from the EM-text file and groups them into basic blocks.
+For every basic block, its successors and
+predecessors are determined,
+resulting in the control flow graph.
+Next, the immediate dominator of every basic block
+is computed.
+Using these dominators, any loop in the
+procedure is detected.
+Finally, interprocedural analysis is done,
+after which we will know the global effects of
+every procedure call on its environment.
+.sp
+CF uses the same internal data structures
+for the procedure table and object table as IC.
+.NH 2
+Partitioning into basic blocks
+.PP
+With regard to flow of control, we distinguish
+three kinds of EM instructions:
+jump instructions, instruction label definitions and
+normal instructions.
+Jump instructions are all conditional or unconditional
+branch instructions,
+the case instructions (CSA/CSB)
+and the RET (return) instruction.
+A procedure call (CAL) is not considered to be a jump.
+A defining occurrence of an instruction label
+is regarded as an EM instruction.
+.PP
+An instruction starts
+a new basic block, in any of the following cases:
+.IP 1.
+It is the first instruction of a procedure
+.IP 2.
+It is the first of a list of instruction label
+defining occurrences
+.IP 3.
+It follows a jump
+.LP
+If there are several consecutive instruction labels
+(which is highly unusual),
+all of them are put in the same basic block.
+Note that several cases may overlap,
+e.g. a label definition at the beginning of a procedure
+or a label following a jump.
+.PP
+A simple Finite State Machine is used to model
+the above rules.
+It also recognizes the end of a procedure,
+marked by an END pseudo.
+The basic blocks are stored internally as a doubly linked
+linear list.
+The blocks are linked in textual order.
+Every node of this list has the attributes described
+in the previous chapter (see syntax rule for
+basic_block).
+Furthermore, every node contains a pointer to its
+EM instructions,
+which are represented internally
+as a linear, doubly linked list,
+just as in the IC phase.
+However, instead of one list per procedure (as in IC)
+there is now one list per basic block.
+.PP
+On the fly, a table is built that maps
+every label identifier to the label definition
+instruction.
+This table is used for computing the control flow.
+The table is stored as a dynamically allocated array.
+The length of the array is the number of labels
+of the current procedure;
+this value can be found in the procedure table,
+where it was stored by IC.
--- a/doc/ego/cf/cf2
+++ b/doc/ego/cf/cf2
@@ -0,0 +1,50 @@
+.NH 2
+Control Flow
+.PP
+A \fIsuccessor\fR of a basic block B is a block C
+that can be executed immediately after B.
+B is said to be a \fIpredecessor\fR of C.
+A block ending with a RET instruction
+has no successors.
+Such a block is called a \fIreturn block\fR.
+Any block that has no predecessors cannot be
+executed at all (i.e. it is unreachable),
+unless it is the first block of a procedure,
+called the \fIprocedure entry block\fR.
+.PP
+Internally, the successor and predecessor
+attributes of a basic block are stored as \fIsets\fR.
+Alternatively, one may regard all these
+sets of all basic blocks as a conceptual \fIgraph\fR,
+in which there is an edge from B to C if C
+is in the successor set of B.
+We call this conceptual graph
+the \fIControl Flow Graph\fR.
+.PP
+The only successor of a basic block ending in an
+unconditional branch instruction is the block that
+contains the label definition of the target of the jump.
+The target instruction can be found via the LAB_ID
+that is the operand of the jump instruction,
+by using the label-map table mentioned
+above.
+If the last instruction of a block is a
+conditional jump,
+the successors are the target block and the textually
+next block.
+The last instruction can also be a case jump
+instruction (CSA or CSB).
+We then analyze the case descriptor,
+to find all possible target instructions
+and their associated blocks.
+We require the case descriptor to be allocated in
+a ROM, so it cannot be changed dynamically.
+A case jump via an alterable descriptor could in principle
+go to any label in the program.
+In the presence of such an uncontrolled jump,
+hardly any optimization can be done.
+We do not expect any front end to generate such a descriptor,
+however, because of the controlled nature
+of case statements in high level languages.
+If the basic block does not end in a jump instruction,
+its only successor is the textually next block.
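The successor rules of cf2 above can be sketched in Python as follows. The block and label-map encoding, the mnemonic subsets, and the case_targets parameter (standing in for the analysis of a ROM case descriptor) are assumptions for illustration only.

```python
UNCONDITIONAL = {"BRA"}
CONDITIONAL = {"BEQ", "BNE", "ZEQ", "ZNE"}  # illustrative subset
CASE_JUMPS = {"CSA", "CSB"}


def successors(blocks, label_map, case_targets=None):
    """Return one successor set per block index.

    blocks: list of basic blocks, each a list of (mnemonic, operand)
    label_map: label identifier -> index of the defining block
    case_targets: hypothetical dict mapping the index of a block that
        ends in CSA/CSB to the labels found in its case descriptor
    """
    case_targets = case_targets or {}
    succ = [set() for _ in blocks]
    for i, block in enumerate(blocks):
        mnem, arg = block[-1]
        has_next = i + 1 < len(blocks)
        if mnem == "RET":
            continue                        # return block: no successors
        if mnem in UNCONDITIONAL:
            succ[i].add(label_map[arg])     # target block only
        elif mnem in CONDITIONAL:
            succ[i].add(label_map[arg])     # target block ...
            if has_next:
                succ[i].add(i + 1)          # ... and textually next block
        elif mnem in CASE_JUMPS:
            succ[i].update(label_map[l] for l in case_targets.get(i, ()))
        elif has_next:
            succ[i].add(i + 1)              # no jump: fall through
    return succ


def predecessors(succ):
    """Invert the successor sets."""
    pred = [set() for _ in succ]
    for b, targets in enumerate(succ):
        for c in targets:
            pred[c].add(b)
    return pred


blocks = [
    [("LOL", 0), ("ZEQ", 1)],
    [("LOC", 7), ("BRA", 2)],
    [("LABEL", 1), ("LOC", 9)],
    [("LABEL", 2), ("RET", 0)],
]
label_map = {1: 2, 2: 3}
succ = successors(blocks, label_map)
pred = predecessors(succ)
```

Storing both directions as sets mirrors the text; the control flow graph itself is never built as a separate structure.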
--- a/doc/ego/cf/cf3
+++ b/doc/ego/cf/cf3
@@ -0,0 +1,53 @@
+.NH 2
+Immediate dominators
+.PP
+A basic block B dominates a block C if every path
+in the control flow graph from the procedure entry block
+to C goes through B.
+The immediate dominator of C is the closest dominator
+of C, other than C itself, on any path from the entry block.
+See also
+.[~[
+aho compiler design
+.], section 13.1.]
+.PP
+There are a number of algorithms to compute
+the immediate dominator relation.
+.IP 1.
+Purdom and Moore give an algorithm that is
+easy to program and easy to describe (although the
+description they give is unreadable;
+it is given in a very messy Algol60 program full of gotos).
+.[
+predominators 
+.]
+.IP 2.
+Aho and Ullman present a bitvector algorithm, which is also
+easy to program and to understand.
+(See 
+.[~[
+aho compiler design
+.], section 13.1.]).
+.IP 3.
+Lengauer and Tarjan introduce a fast algorithm that is
+hard to understand, yet remarkably easy to implement.
+.[
+lengauer dominators
+.]
+.LP
+The Purdom-Moore algorithm is very slow if the
+number of basic blocks in the flow graph is large.
+The Aho-Ullman algorithm in fact computes the
+dominator relation,
+from which the immediate dominator relation can be computed
+in time quadratic in the number of basic blocks, worst case.
+The storage requirement is also quadratic in the number
+of blocks.
+The running time of the third algorithm is proportional
+to:
+.DS
+(number of edges in the graph) * log(number of blocks).
+.DE
+We have chosen this algorithm because it is fast
+(as shown by experiments done by Lengauer and Tarjan),
+it is easy to program and requires little data space.
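To make the definitions in cf3 above concrete, here is a small Python sketch that computes the dominator relation by simple iteration over sets and then derives the immediate dominators from it. This is in the spirit of the second (dominator relation) approach listed above; it is not the Lengauer-Tarjan algorithm that the text says was chosen, and it has the quadratic storage cost noted there. The function names and the graph encoding are assumptions, and every block is assumed to be reachable from the entry block.

```python
def dominators(succ, entry=0):
    """succ: list of successor sets indexed by block number.

    Returns a list dom where dom[b] is the set of blocks dominating b
    (every block dominates itself). Assumes all blocks are reachable.
    """
    n = len(succ)
    pred = [set() for _ in range(n)]
    for b in range(n):
        for c in succ[b]:
            pred[c].add(b)
    everything = set(range(n))
    dom = [set(everything) for _ in range(n)]
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for b in range(n):
            if b == entry:
                continue
            pred_doms = [dom[p] for p in pred[b]]
            new = set.intersection(*pred_doms) if pred_doms else set(everything)
            new = new | {b}
            if new != dom[b]:
                dom[b] = new
                changed = True
    return dom


def immediate_dominators(dom, entry=0):
    """Pick, for each block, its closest strict dominator.

    The strict dominators of a reachable block form a chain, so the
    closest one is the one with the largest dominator set.
    """
    idom = {}
    for b, ds in enumerate(dom):
        if b == entry:
            continue
        strict = ds - {b}
        idom[b] = max(strict, key=lambda d: len(dom[d]))
    return idom


# A diamond: block 0 branches to 1 and 2, which both continue to 3.
succ = [{1, 2}, {3}, {3}, set()]
dom = dominators(succ)
idom = immediate_dominators(dom)
```

Block 3 is reached through either 1 or 2, so neither of them dominates it and its immediate dominator is block 0.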
--- a/doc/ego/cf/cf4
+++ b/doc/ego/cf/cf4
@@ -0,0 +1,93 @@
+.NH 2
+Loop detection
+.PP
+Loops are detected by using the loop construction
+algorithm of
+.[~[
+aho compiler design
+.], section 13.1.]
+This algorithm uses \fIback edges\fR.
+A back edge is an edge from B to C in the CFG,
+whose head (C) dominates its tail (B).
+The loop associated with this back edge
+consists of C plus all nodes in the CFG
+that can reach B without going through C.
+.PP
+As an example of how the algorithm works,
+consider the piece of program of Fig. 4.1.
+First just look at the program and think for
+yourself what part of the code constitutes the loop.
+.DS
+loop
+   if cond then                       1
+      -- lots of simple
+      -- assignment
+      -- statements              2          3
+      exit; -- exit loop
+   else
+      S; -- one statement
+   end if;
+end loop;
+
+Fig. 4.1 A misleading loop
+.DE
+Although a human being may be easily deceived
+by the brackets "loop" and "end loop",
+the loop detection algorithm will correctly
+reply that only the test for "cond" and
+the single statement in the false-part
+of the if statement are part of the loop!
+The statements in the true-part only get
+executed once, so there really is no reason at all
+to say they're part of the loop too.
+The CFG contains one back edge, "3->1".
+As node 3 cannot be reached from node 2,
+the latter node is not part of the loop.
+.PP
+A source of problems with the algorithm is the fact
+that different back edges may result in
+the same loop.
+Such an ill-structured loop is
+called a \fImessy\fR loop.
+After a loop has been constructed, it is checked
+if it is really a new loop.
+.PP
+Loops can partly overlap, without one being nested
+inside the other.
+This is the case in the program of Fig. 4.2.
+.DS
+1:                              1
+   S1;
+2:
+   S2;                          2
+   if cond then
+      goto 4;
+   S3;                     3         4
+   goto 1;
+4:
+   S4;
+   goto 1;
+
+Fig. 4.2 Partly overlapping loops
+.DE
+There are two back edges "3->1" and "4->1",
+resulting in the loops {1,2,3} and {1,2,4}.
+With every basic block we associate a set of
+all loops it is part of.
+It is not sufficient just to record its
+innermost enclosing loop.
+.PP
+After all loops of a procedure are detected, we determine
+the nesting level of every loop.
+Finally, we find all strong and firm blocks of every loop.
+If the loop has only one back edge (i.e. it is not messy),
+the set of firm blocks consists of the
+tail of this back edge and its dominators
+in the loop (including the loop entry block).
+A firm block is also strong if it is not a
+successor of a block that may exit the loop;
+a block may exit a loop if it has an (immediate) successor
+that is not part of the loop.
+For messy loops we do not determine the strong
+and firm blocks. These loops are expected
+to occur very rarely.
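The back-edge construction of cf4 above can be sketched in Python as follows, applied to the graphs of Fig. 4.1 and Fig. 4.2. The dictionary encoding, the function names, and the hand-written dominator sets are assumptions for illustration; block 2 of Fig. 4.1 is given no successors because the code after the loop is not shown.

```python
def predecessors(succ):
    pred = {b: set() for b in succ}
    for b, targets in succ.items():
        for c in targets:
            pred[c].add(b)
    return pred


def natural_loop(pred, tail, head):
    """Loop of back edge tail -> head: the head plus every block that
    can reach the tail without going through the head."""
    loop = {head}
    stack = [tail]
    while stack:
        b = stack.pop()
        if b not in loop:
            loop.add(b)
            stack.extend(pred[b])  # walk backwards, never past the head
    return loop


def find_loops(succ, dom):
    """Return a list of (entry block, set of blocks), one per distinct loop.

    dom[b] is the set of blocks that dominate b.
    """
    pred = predecessors(succ)
    loops = []
    for tail in sorted(succ):
        for head in sorted(succ[tail]):
            if head in dom[tail]:        # back edge: head dominates tail
                loop = (head, natural_loop(pred, tail, head))
                if loop not in loops:    # different back edges, same loop
                    loops.append(loop)
    return loops


# Fig. 4.1: 1 -> 2, 1 -> 3, 3 -> 1
fig41 = find_loops({1: {2, 3}, 2: set(), 3: {1}},
                   {1: {1}, 2: {1, 2}, 3: {1, 3}})

# Fig. 4.2: 1 -> 2, 2 -> 3, 2 -> 4, 3 -> 1, 4 -> 1
fig42 = find_loops({1: {2}, 2: {3, 4}, 3: {1}, 4: {1}},
                   {1: {1}, 2: {1, 2}, 3: {1, 2, 3}, 4: {1, 2, 4}})
```

For Fig. 4.1 the only loop is {1, 3}, which excludes the true-part (block 2); for Fig. 4.2 the two back edges give two overlapping loops with the same entry block.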
--- a/doc/ego/cf/cf5
+++ b/doc/ego/cf/cf5
@@ -0,0 +1,79 @@
+.NH 2
+Interprocedural analysis
+.PP
+It is often desirable to know the effects
+a procedure call may have.
+The optimization below is only possible if
+we know for sure that the call to P cannot
+change A.
+.DS
+A := 10;                        A := 10;
+P;  -- procedure call    -->    P;
+B := A + 2;                     B := 12;
+.DE
+Although it is not possible to predict exactly
+all the effects a procedure call has, we may
+determine a kind of upper bound for it.
+So we compute all variables that may be
+changed by P, although they need not be
+changed at every invocation of P.
+We can get hold of this set by just looking
+at all assignment (store) instructions
+in the body of P.
+EM also has a set of \fIindirect\fR assignment
+instructions,
+i.e. assignment through a pointer variable.
+In general, it is not possible to determine
+which variable is affected by such an assignment.
+In these cases, we just record the fact that P
+does an indirect assignment.
+Note that this does not mean that all variables
+are potentially affected, as the front ends
+may generate messages telling that certain
+variables can never be accessed indirectly.
+We also set a flag if P does an indirect use (load).
+Note that we only have to look at \fIglobal\fR
+variables.
+If P changes or uses any of its locals,
+this has no effect on its environment.
+Local variables of a lexically enclosing
+procedure can only be accessed indirectly.
+.PP
+A procedure P may of course call another procedure.
+To determine the effects of a call to P,
+we also must know the effects of a call to the second procedure.
+This second one may call a third one, and so on.
+Effectively, we need to compute the \fItransitive closure\fR
+of the effects.
+To do this, we determine for every procedure
+which other procedures it calls.
+This set is the "calling" attribute of a procedure.
+One may regard all these sets as a conceptual graph,
+in which there is an edge from P to Q
+if Q is in the calling set of P. This graph will
+be referred to as the \fIcall graph\fR.
+(Note the resemblance to the control flow graph.)
+.PP
+We can detect which procedures are called by P
+by looking at all CAL instructions in its body.
+Unfortunately, a procedure may also be
+called indirectly, via a CAI instruction.
+Yet, only procedures that are used as operand of an LPI
+instruction can be called indirectly,
+because this is the only way to take the address of a procedure.
+We determine for every procedure whether it does
+a CAI instruction.
+We also build a set of all procedures used as
+operand of an LPI.
+.sp
+After all procedures have been processed (i.e. all CFGs
+are constructed, all loops are detected,
+all procedures are analyzed to see which variables
+they may change, which procedures they call,
+whether they do a CAI or are used in an LPI) the
+transitive closure of all interprocedural
+information is computed.
+During the same process,
+the calling set of every procedure that uses a CAI
+is extended with the above mentioned set of all
+procedures that can be called indirectly.
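A rough Python sketch of the closure step described in cf5 above. The dictionary layout, the parameter names, and the choice to close the calling sets along with the change sets are assumptions for illustration; the use attribute and the indirect load/store flags would be propagated in the same way and are left out for brevity.

```python
def close_effects(calls, changes, does_cai, lpi_procs):
    """Propagate effects along the call graph until nothing changes.

    calls:     proc -> set of procs it calls directly (CAL)
    changes:   proc -> set of global variables it stores to directly
    does_cai:  set of procs that contain a CAI instruction
    lpi_procs: set of procs used as operand of an LPI
    """
    calling = {p: set(qs) for p, qs in calls.items()}
    for p in does_cai:
        calling[p] |= lpi_procs          # a CAI may reach any LPI operand
    change = {p: set(vs) for p, vs in changes.items()}
    changed = True
    while changed:
        changed = False
        for p in calling:
            for q in list(calling[p]):   # snapshot: the set grows below
                new_calls = calling[q] - calling[p]
                new_vars = change[q] - change[p]
                if new_calls or new_vars:
                    calling[p] |= new_calls
                    change[p] |= new_vars
                    changed = True
    return calling, change


calls = {"main": {"P"}, "P": {"Q"}, "Q": set(), "R": set()}
changes = {"main": set(), "P": {"A"}, "Q": {"B"}, "R": {"C"}}
calling, change = close_effects(calls, changes,
                                does_cai={"Q"}, lpi_procs={"R"})
```

In this example Q contains a CAI and R is the only procedure whose address is taken, so a call to P may change A, B and C even though P itself only stores to A.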
--- a/doc/ego/cf/cf6
+++ b/doc/ego/cf/cf6
@@ -0,0 +1,21 @@
+.NH 2
+Source files
+.PP
+The sources of CF are in the following files and packages:
+.IP cf.h: 14
+declarations of global variables and data structures
+.IP cf.c:
+the routine main; interprocedural analysis;
+transitive closure
+.IP succ:
+control flow (successor and predecessor)
+.IP idom:
+immediate dominators
+.IP loop:
+loop detection
+.IP get:
+read object and procedure table;
+read EM text and partition it into basic blocks
+.IP put:
+write tables, CFGs and EM text
+.LP