Lesson 5: Global Analysis
- discussion thread
- global analysis & optimization
- tasks due February 24
Gist
Dominators
Lots of definitions!
- Reminders: Successors & predecessors. Paths in CFGs.
- A dominates B iff all paths from the entry to B include A.
- The dominator tree is a convenient data structure for storing the dominance relationships in an entire function. The recursive children of a given node in a tree are the nodes that that node dominates.
- A strictly dominates B iff A dominates B and A ≠ B. (Dominance is reflexive, so “strict” dominance just takes that part away.)
- A immediately dominates B iff A dominates B but A does not strictly dominate any other node that strictly dominates B. (In which case A is B’s direct parent in the dominator tree.)
- A dominance frontier is the set of nodes that are just “one edge away” from being dominated by a given node. Put differently, A’s dominance frontier contains B iff A does not strictly dominate B, but A does dominate some predecessor of B.
- Post-dominance is the reverse of dominance. A post-dominates B iff all paths from B to the exit include A. (You can extend the strict version, the immediate version, trees, etc. to post-dominance.)
An algorithm for finding dominators:
dom = {every block -> all blocks}
dom[entry] = {entry}
while dom is still changing:
for vertex in CFG except entry:
dom[vertex] = {vertex} ∪ ⋂(dom[p] for p in vertex.preds}
The dom
relation will, in the end, map each block to its set of dominators.
We initialize it as the “complete” relation, i.e., mapping every block to the set of all blocks.
The exception is the entry block, which we ensure always only has itself as a dominator.
(This keeps the algorithm from being confused by blocks that jump back to the entry node.)
The loop pares down the sets by iterating to convergence.
The running time is O(n²) in the worst case. But there’s a trick: if you iterate over the CFG in reverse post-order, and the CFG is well behaved (reducible), it runs in linear time—the outer loop runs a constant number of times.
Natural Loops
Some things about loops:
- Natural loops are strongly connected components in the CFG with a single entry.
- Natural loops are formed around backedges, which are edges from A to B where B dominates A.
- (Side note: There are actually two common definitions of backedges: this one, and one that relies on a depth-first search (DFS). By the other definition, a backedge is any edge that takes you to an already-visited node during DFS. The relationship between these two definitions is not 100% clear to me, although they are certainly not equivalent, at least for irreducible CFGs.)
- A natural loop is the smallest set of vertices L including A and B such that, for every v in L, either all the predecessors of v are in L or v=B.
- A CFG is reducible iff every backedge has a natural loop.
- A language that only has
for
,while
,if
,break
,continue
, etc. can only generate reducible CFGs. You needgoto
or something to generate irreducible CFGs.
- A language that only has
Loop-Invariant Code Motion (LICM)
And finally, loop-invariant code motion (LICM) is an optimization that works on natural loops. It moves code from inside a loop to before the loop, if the computation always does the same thing on every iteration of the loop.
A loop’s preheader is its header’s unique predecessor. LICM moves code to the preheader. But while natural loops need to have a unique header, the header does not necessarily have a unique predecessor. So it’s often convenient to invent an empty preheader block that jumps directly to the header, and then move all the in-edges to the header to point there instead.
LICM needs two ingredients: identifying loop-invariant instructions in the loop body, and deciding when it’s safe to move one from the body to the preheader.
To identify loop-invariant instructions:
iterate to convergence:
for every instruction in the loop:
mark it as LI iff, for all arguments x, either:
all reaching defintions of x are outside of the loop, or
there is exactly one definition, and it is already marked as
loop invariant
(This determination requires that you already calculated reaching definitions! Presumably using data flow.)
It’s safe to move a loop-invariant instruction to the preheader iff:
- The definition dominates all of its uses, and
- No other definitions of the same variable exist in the loop, and
- The instruction dominates all loop exits.
The last criterion is somewhat tricky: it ensures that the computation would have been computed eventually anyway, so it’s safe to just do it earlier. But it’s not true of loops that may execute zero times, which, when you think about it, rules out most for
loops! It’s possible to relax this condition if:
- The assigned-to variable is dead after the loop, and
- The instruction can’t have side effects, including exceptions—generally ruling out division because it might divide by zero. (A thing that you generally need to be careful of in such speculative optimizations that do computations that might not actually be necessary.)
Tasks
- Implement some dominance utilities:
- Find dominators for a function.
- Construct the dominance tree.
- Compute the dominance frontier.
- Devise a way to test your implementations. For example, is there a way you can algorithmically confirm that a block A dominates a block B? While computing these sets should be cheap, checking their output could use slow, naive algorithms.