Control-flow analysis

\( \newcommand{\IN}{\mathit{in}} \newcommand{\OUT}{\mathit{out}} \newcommand{\USE}{\mathit{use}} \newcommand{\DEF}{\mathit{def}} \newcommand{\GEN}{\mathit{gen}} \newcommand{\KILL}{\mathit{kill}} \newcommand{\DEFS}{\mathit{defs}} \newcommand\INV{\mathit{INV}} \newcommand{\dom}{\mathop{~\text{dom}~}} \newcommand{\idom}{\mathop{~\text{idom}~}} \)

Loops, dominators, natural loops, control trees, dominator analysis

Loop optimizations are a fruitful class of optimizations because most execution time in programs is spent in loops: a 90/10 split between loops and non-loops is typical. Consequently, many loop optimizations have been developed: for example, loop-invariant code motion, loop unrolling, loop peeling, strength reduction using induction variables, removing bounds checks, and loop tiling.

When should we do loop optimizations? In source or high-level IR form, loops are easy to recognize, but there may be many kinds of loops, all of which can benefit from the same optimizations. Furthermore, loop optimizations often benefit from other optimizations that we want to do on a lower-level representation. We want to be able to interleave loop optimizations with these other optimizations.

In order to do loop optimizations, the first problem we must tackle is to define what we mean by “a loop” at the IR level, and to efficiently find these loops.

Definition of a loop

At the level of a control-flow graph, a loop is a set of nodes that form a strongly connected subgraph: every loop node is reachable from every other loop node by following edges within the subgraph. In addition, there is one distinguished node called the loop header. There are no edges entering the loop from the rest of the CFG except edges to the loop header. There may be any number (including 0) of nodes with outgoing edges, however; these are loop exit nodes.

Loops in a CFG

For example, the CFG on the right contains three loops as indicated, with each header node marked in the corresponding color.

A given CFG may contain multiple loops, and loops, as sets of nodes, may contain each other. If the nodes in loop 1 are a strict superset of the nodes in loop 2, we say that these are nested loops, with loop 2 nested inside loop 1.

Assuming that any two loops only intersect when one is nested inside the other, the loops in the CFG form a control tree in which the nodes are loops and the edges define the nesting relationship.

Control-flow analysis builds on the key idea of dominators, which we have seen earlier. Recall that a node \(A\) dominates another node \(B\) (written \(A\dom B\)) if every path from the start node of the CFG to node \(B\) includes \(A\). An edge from \(A\) to \(B\) is called a forward edge if \(A\dom B\) and a back edge if \(B\dom A\). Every loop must contain at least one back edge.
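Given precomputed dominator sets, back edges can be identified with a single scan over the edges. The following is a minimal sketch, not from the text; the representation (`dom[b]` as the set of nodes dominating block \(b\), including \(b\) itself) is an illustrative assumption.

```python
# Hypothetical sketch: classify CFG edges as back edges using
# precomputed dominator sets. dom[b] is the set of nodes that
# dominate block b (including b itself).

def find_back_edges(edges, dom):
    """Return the edges (a, b) whose target b dominates their source a."""
    return [(a, b) for (a, b) in edges if b in dom[a]]

# A small CFG: 0 -> 1, 1 -> 2, 2 -> 1, 1 -> 3.
# The edge 2 -> 1 closes a loop with header 1.
edges = [(0, 1), (1, 2), (2, 1), (1, 3)]
dom = {0: {0}, 1: {0, 1}, 2: {0, 1, 2}, 3: {0, 1, 3}}
print(find_back_edges(edges, dom))  # [(2, 1)]
```

Each back edge found this way then seeds the construction of one natural loop.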

Natural loops

Each back edge from node \(n\) to node \(h\) in the CFG defines a natural loop with \(h\) as its header node. The natural loop is a strongly connected subgraph that contains both \(n\) and \(h\). It consists of the nodes that are dominated by \(h\) and that can reach \(n\) without going through \(h\). (Any reached nodes that are not dominated by \(h\) must be unreachable from the start node.) Thus, the CFG above contains three natural loops, with their header nodes colored accordingly.

Finding natural loops and control trees

The first step in construction of the program control tree is to compute the domination relation, as described previously. Then, for each back edge \(n⟶h\) (that is, \(h \dom n\)), we run a depth-first search starting from \(n\) in the transposed CFG: that is, following edges backward in the CFG. However, the search is stopped when node \(h\) is encountered. The set of reached nodes that are dominated by \(h\), together with \(h\) itself, forms the natural loop.
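The backward search might be sketched as below. This is an illustrative sketch, not code from the text: `preds[b]` (the predecessors of block \(b\), i.e., the transposed edges) is an assumed representation, and any reached nodes not dominated by \(h\) (which, as noted above, must be unreachable from the start node) would still need to be filtered out.

```python
# Sketch of natural-loop construction for a back edge n -> h.
# preds[b] lists the predecessors of block b in the CFG, so
# following preds is a DFS in the transposed graph.

def natural_loop(n, h, preds):
    """Nodes of the natural loop of back edge n -> h: walk backward
    from n, stopping at the header h (h starts in the set, so the
    search never continues past it)."""
    loop = {h, n}
    stack = [n]
    while stack:
        b = stack.pop()
        for p in preds[b]:
            if p not in loop:
                loop.add(p)
                stack.append(p)
    return loop

# CFG: 0 -> 1 -> 2 -> 3 -> 1, so 3 -> 1 is a back edge.
preds = {0: [], 1: [0, 3], 2: [1], 3: [2]}
print(natural_loop(3, 1, preds))  # {1, 2, 3}
```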

Overlapping natural loops

Having found the natural loops, we can assemble them into a control tree. Observe that two natural loops can be disjoint or can be nested. Or they can intersect without being nested, but only if they share the same header node. In this case, we can treat them as a single loop for the purpose of constructing the control tree. The figure on the right shows an example of two natural loops that share a header and can be merged together in this fashion.

Once overlapping natural loops are merged, all loops are either disjoint or nested. The control tree then falls out directly from subset inclusion on the various loops.
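As a sketch of this last step (with an assumed, illustrative representation of each loop as a header paired with its node set), the parent of each loop is the smallest loop strictly containing it:

```python
# Sketch: build the control tree by subset inclusion, assuming loops
# sharing a header have already been merged, so loops are pairwise
# disjoint or nested. Each loop is (header, frozenset of nodes).

def control_tree(loops):
    """Map each loop's header to the header of the smallest loop
    strictly containing it (its parent), or None if top-level."""
    parent = {}
    for h, nodes in loops:
        enclosing = [(h2, n2) for (h2, n2) in loops if nodes < n2]
        if enclosing:
            # smallest strict superset is the direct parent
            ph, _ = min(enclosing, key=lambda hn: len(hn[1]))
            parent[h] = ph
        else:
            parent[h] = None
    return parent

loops = [(1, frozenset({1, 2, 3, 4})), (2, frozenset({2, 3}))]
print(control_tree(loops))  # {1: None, 2: 1}
```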

Loop-invariant code motion

Loop-invariant code motion is an optimization in which computations are moved out of loops, making the loops less expensive to execute. The first step is to identify loop-invariant expressions: expressions that take the same value every time they are computed.

An expression is loop-invariant if:

  1. Any memory operands it contains are unaffected by execution of the loop (i.e., they do not alias any memory locations updated during the loop). To be conservative, we could simply disallow memory operands altogether, though fetching an array's length is a good example of a loop-invariant memory computation that can be profitably hoisted before the loop.
  2. And, the definitions it uses (in the sense of reaching definitions) either come from outside the loop, or come from inside the loop but are loop-invariant themselves.

Analysis

The recursive nature of this definition suggests that we should use an iterative algorithm to find the loop-invariant expressions, as a fixed point. The algorithm works as follows:

  1. Run a reaching definitions analysis.
  2. Initialize \(\INV := \{\text{all expressions in loop, including subexpressions}\} \).
  3. Repeat until no change: remove from \(\INV\) any expression that uses a definition reaching it from inside the loop whose defining expression is not in \(\INV\).
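Under strong simplifying assumptions, the fixed-point computation can be sketched as follows. This is illustrative only: the loop body is modeled as a list of assignments `(x, uses)` where `uses` is the set of variables the right-hand side reads, and each variable is assumed to have at most one definition in the loop, so reaching definitions are trivial.

```python
# Sketch of the iterative loop-invariance analysis as a fixed point.
# Starts optimistically with every assignment marked invariant, then
# removes any whose operands are defined in the loop by a
# non-invariant assignment, until nothing changes.

def invariant_assignments(body, defined_in_loop):
    """Variables whose (unique) defining assignment in the loop is
    loop-invariant."""
    inv = {x for x, _ in body}            # optimistic initialization
    changed = True
    while changed:
        changed = False
        for x, uses in body:
            if x in inv and any(u in defined_in_loop and u not in inv
                                for u in uses):
                inv.discard(x)
                changed = True
    return inv

body = [('a', {'n'}),   # a := n + 1   (n defined outside the loop)
        ('b', {'a'}),   # b := a * 2   (a is itself invariant)
        ('c', {'i'})]   # c := i + 1   (i varies in the loop)
defined_in_loop = {'a', 'b', 'c', 'i'}
print(invariant_assignments(body, defined_in_loop))  # {'a', 'b'}
```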

Code transformation

There are actually two kinds of loop-invariant code motion. The first hoists an assignment to a variable before the loop; the second hoists some computation done inside the loop.

In the first version, we can move the assignment \(x←e\) with loop-invariant expression \(e\) before the loop header if:

  1. it is the only definition of \(x\) in the loop,
  2. it dominates all loop exits where \(x\) is live-out, and
  3. it is the only definition of \(x\) reaching uses of \(x\) in the loop: it is not live-in at the loop header.

If these conditions are not satisfied, we may still be able to perform the second kind of loop-invariant code motion. Here the idea is to hoist the computation of a loop-invariant expression (or subexpression) \(e\) out of the loop and assign it to a new variable \(t\). Then the occurrences of the expression \(e\) are replaced with \(t\). If the value of \(e\) is assigned to a variable \(x\), the assignment becomes \(x ← t\), which may enable copy propagation, approximating the effect of the first version of the optimization.
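A concrete before/after picture of this second transformation, written as ordinary source code rather than IR (the function and variable names are purely illustrative):

```python
# The product a * b is loop-invariant: computing it once in a fresh
# temporary t before the loop and reusing t inside preserves behavior.

def before(a, b, xs):
    out = []
    for x in xs:
        out.append(a * b + x)   # a * b recomputed every iteration
    return out

def after(a, b, xs):
    t = a * b                   # hoisted loop-invariant computation
    out = []
    for x in xs:
        out.append(t + x)       # occurrences of a * b replaced by t
    return out

print(before(2, 3, [1, 2]) == after(2, 3, [1, 2]))  # True
```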