Brief comments re: Prelim #2.
Today:
preserving the semantics of the language
Compilers versus Evaluators (or: all about PS#6!)
The evaluator takes a program as input and runs it, returning its value
Evaluator contract is
Program ---> [ Evaluator ] ---> Value
Compiler contract is
Program ---> [ Compiler ] ---> Program' ---> [ Evaluator ] ---> Value
where we preserve the semantics of the language:
(eval P env) = (eval (compile P) env)
Typically the output of the compiler is a different language (such as
PPC assembler, which is interpreted by the PPC chip). In CS212, the
output of the compiler will be a subset of Scheme.
Our compiler thus performs a source-to-source transformation; such compilers are sometimes called "translators".
The input is a Scheme program (represented as a list); the output is a Scheme
program. In many respects the compiler and evaluator are similar --
they are programs that "walk over" source code. An evaluator computes
a value on each recursive call, while a compiler computes code which
will eventually compute a value.
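To see the parallel concretely, here is a minimal sketch (illustrative only, not the PS#6 code) for a tiny language of numbers, variables, and two-argument +:

(define (tiny-eval exp env)            ; env: assoc list of (name . value)
  (cond ((number? exp) exp)
        ((symbol? exp) (cdr (assq exp env)))
        (else (+ (tiny-eval (cadr exp) env)
                 (tiny-eval (caddr exp) env)))))

(define (tiny-compile exp)
  (cond ((number? exp) exp)            ; the code for a number is the number
        ((symbol? exp) exp)            ; likewise for a variable
        (else (list '+ (tiny-compile (cadr exp))
                       (tiny-compile (caddr exp))))))

;; (tiny-eval '(+ x (+ 2 3)) '((x . 1)))  ==> 6
;; (tiny-compile '(+ x (+ 2 3)))          ==> (+ x (+ 2 3))

This tiny-compile does no optimization yet; the point is only that the two procedures walk the source code with exactly the same shape, one returning values and the other returning code.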
Why bother? Program' is just like Program, only faster. To do
this, the compiler reasons about Scheme programs (although the
reasoning is quite simple).
Combining 2 themes behind 212: reasoning about programs, and efficiency. Note that the efficiency gained never shows up in asymptotic analysis, but it's still important.
To see why this might be useful, consider defining
(define useless (lambda (x) (+ (* 3 5) x)))
(map useless '(1 2 3 ... 100))
How does this work in the evaluator?
... extend global env by [x: 1] ... evaluate (+ (* 3 5) x) ... evaluate (* 3 5)...
... extend global env by [x: 2] ... evaluate (+ (* 3 5) x) ... evaluate (* 3 5)...
.
.
.
... extend global env by [x: 100] ... evaluate (+ (* 3 5) x) ... evaluate (* 3 5)...
That's a lot of evaluations of (* 3 5)
Note: you might not write code like this, but a macro could (see lecture in 1 week). Or in-line functions could (suppose you call someone else's code).
Usually there is a Program' that is a *lot* faster (typically 100-1000 times). Compilers involve getting from Program to Program'
In this example, we want to get from
(lambda (x) (+ (* 3 5) x)) to (lambda (x) (+ 15 x))
This is (a simple) part of PS#6.
To make life easier, we will consider only compiling a subset of Scheme programs.
Note: we'd really need LETREC too (why?)
Even this language subset includes very complicated expressions. Our strategy is to produce an intermediate form from an expression and then optimize that intermediate form. [Note: this is essentially how all compilers work.]
To see why this is necessary, consider the expression
(f (g x) (g x))
We want to turn this into something like (let ((temp (g x))) (f temp temp))
To evaluate the original expression, we evaluate (g x), then we evaluate
(g x) again, and then we invoke f on the two results. But
in Scheme, these intermediate results are implicit. We need to make
them explicit, through a process we call LINEARIZATION.
It's a little hair-raising in places (we'll provide the code for those
who want to look at it). You should know what a linearized expression
is, but not necessarily how to write code to linearize one.
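For the curious, here is a compressed sketch of the idea for combinations only (the helper names like fresh-var are ours; the real code also handles IF and LAMBDA):

(define lin-counter 0)
(define (fresh-var)                    ; hypothetical name generator
  (set! lin-counter (+ lin-counter 1))
  (string->symbol (string-append "val" (number->string lin-counter))))

;; lin walks the expression; k is a procedure that, given an atom naming
;; the expression's value, builds the rest of the Linear-S code.
(define (linearize exp)
  (lin exp (lambda (atom) atom)))

(define (lin exp k)
  (if (pair? exp)
      (lin-args exp
                (lambda (atoms)
                  (let ((v (fresh-var)))
                    (list 'let (list (list v atoms)) (k v)))))
      (k exp)))

(define (lin-args exps k)              ; linearize each piece, left to right
  (if (null? exps)
      (k '())
      (lin (car exps)
           (lambda (a)
             (lin-args (cdr exps)
                       (lambda (rest) (k (cons a rest))))))))

;; (linearize '(f (g x) (g x))) ==>
;; (let ((val1 (g x)))
;;   (let ((val2 (g x)))
;;     (let ((val3 (f val1 val2)))
;;       val3)))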
Linearization will produce an intermediate form like:
(let ((val1 (g x)))
(let ((val2 (g x)))
(let ((val3 (f val1 val2)))
val3)))
We will then optimize this intermediate form to produce
(let ((val1 (g x)))
(let ((val2 val1))
(let ((val3 (f val1 val2)))
val3)))
which optimizes out one call to g. [NOTE: this is safe only if g has no side effects!]
You will write this optimization in PS#6
In the linearized form two things are made explicit: the order in which subexpressions are evaluated, and the intermediate values they produce.
There are thus 2 parts to the compilation process:
Program ---> [Linearizer] ---> Linearized Program ---> [Optimizer] ---> Program'
As before, everything will be a Scheme subset [label the languages above]
The output of the linearizer, which will also be the output of the
optimizer (and hence of the compiler) will be a very restricted
subset of Scheme, called Linear Scheme (Linear-S for short).
Scheme subset ---> [Linearizer] ---> Linear-S ---> [Optimizer] ---> Linear-S
The key property of Linear-S is that all combinations are SIMPLE.
A combination is SIMPLE if the operator and the operands are all
atomic (i.e., symbols or numbers). For example, (f a 23) is simple,
while ((f) (g)) is not.
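A sketch of the test (illustrative; the PS#6 interface may differ):

(define (atomic? x) (or (symbol? x) (number? x)))

(define (simple-combination? exp)
  (and (pair? exp)
       (atomic? (car exp))             ; the operator is atomic...
       (let loop ((args (cdr exp)))    ; ...and so is every operand
         (or (null? args)
             (and (atomic? (car args)) (loop (cdr args)))))))

;; (simple-combination? '(f a 23))  ==> #t
;; (simple-combination? '((f) (g))) ==> #f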
In addition, in conditionals the test is required to be atomic. So
(if x 1 2) is simple, while (if (not x) 1 2) is not. In fact, the
latter expression would be linearized to
(let ((val1 (not x)))
(let ((val2 (if val1 1 2)))
val2))
[Note: we can not turn (if e1 e2 e3) into
(let ((v1 e1))
(let ((v2 e2))
(let ((v3 e3))
(let ((v4 (if v1 v2 v3)))
v4))))
Why?]
A Linear-S expression is essentially a giant series of LETs which
eventually returns a value in the body. Every let involves a single
simple computation (no nesting).
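Along the same lines, here is a sketch of a Linear-S recognizer, reusing atomic? and simple-combination? from above (the IF and LAMBDA cases are omitted for brevity):

(define (simple? exp)
  (or (atomic? exp) (simple-combination? exp)))

(define (linear-s? exp)
  (or (atomic? exp)                         ; the final returned value
      (and (pair? exp) (eq? (car exp) 'let)
           (simple? (cadar (cadr exp)))     ; the one binding is simple
           (linear-s? (caddr exp)))))       ; and the body is Linear-S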
An important part of linearization is called ALPHA-renaming.
Basically, whenever we see a LAMBDA we need to give its parameters
unique names, or we will get confused. [Note that this would have simplified the change! prelim question considerably...]
For example, consider
((lambda (f) (f x)) (lambda (f) f))
(which applies the identity function to x)
will be alpha-renamed to
((lambda (f1) (f1 x)) (lambda (f2) f2))
After alpha-renaming, we can be sure that any two variables with the same name are the same variable.
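Here is a sketch for one-parameter lambdas (the helper names are ours; the real code handles full parameter lists):

(define rename-counter 0)
(define (fresh-name x)                 ; hypothetical name generator
  (set! rename-counter (+ rename-counter 1))
  (string->symbol
   (string-append (symbol->string x) (number->string rename-counter))))

(define (alpha exp env)                ; env: assoc list of (old . new)
  (cond ((symbol? exp)
         (let ((b (assq exp env)))
           (if b (cdr b) exp)))        ; renamed if bound, else untouched
        ((and (pair? exp) (eq? (car exp) 'lambda))
         (let* ((old (caadr exp))
                (new (fresh-name old)))
           (list 'lambda (list new)
                 (alpha (caddr exp) (cons (cons old new) env)))))
        ((pair? exp)
         (map (lambda (e) (alpha e env)) exp))
        (else exp)))

;; (alpha '((lambda (f) (f x)) (lambda (f) f)) '())
;; ==> ((lambda (f1) (f1 x)) (lambda (f2) f2))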
OK, we now have linearized code. How do we optimize it? The
optimizations we will consider are all fairly simple, although they
can improve your code a lot. Remember: compile Program to Program' (once), then run Program' (possibly, many times).
To understand optimization, we need to go back to our first example
and think about the relationship between compilation and evaluation
(define useless (lambda (x) (+ (* 3 5) x)))
We can try and turn this into a better piece of code, but we have to
bear in mind that we have no idea what the value of x is. In fact,
we won't know until we actually apply this procedure to something
(i.e., at run time).
On the other hand, we know what the value of (* 3 5) is, irrespective
of the value of x (i.e., at compile time).
Important lesson: some things (like (* 3 5)) are known at compile time; others (like x) are known only at run time.
Obvious consequence: if you want to optimize a program that doesn't
contain any procedures, you can simply compute the value. If the
compiler is given some complex arithmetic expression, it should simply
return the value.
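A sketch of that consequence, for + and * only (illustrative, not PS#6 code): the compiler simply acts as an evaluator, leaving residual code behind only where a variable blocks the computation.

(define (fold-constants exp)
  (if (pair? exp)
      (let ((op (car exp))
            (args (map fold-constants (cdr exp))))
        (if (and (memq op '(+ *))
                 (let loop ((a args))   ; are all arguments now numbers?
                   (or (null? a)
                       (and (number? (car a)) (loop (cdr a))))))
            (apply (if (eq? op '+) + *) args)  ; compute at compile time
            (cons op args)))                   ; otherwise keep the code
      exp))

;; (fold-constants '(+ (* 3 5) x))        ==> (+ 15 x)
;; (fold-constants '(* (+ 1 2) (+ 3 4)))  ==> 21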
[Non-obvious consequence: compiler writers can (and do!) "cheat" on various benchmarks, by emitting the answer or special purpose code.]
Four optimization rules to live by:
Here is a short description of some optimizations we will look at (and implement!) Note that we are always doing substitutions of some kind.
OPTIMIZATION                      | DESCRIPTION
----------------------------------+------------------------------------------
Constant folding                  | Replace VARIABLES with VALUES
Common Subexpression Elimination  | Replace EXPRESSIONS with VARIABLES
Inlining                          | Replace PROCEDURE CALLS with EXPRESSIONS
Dead code elimination             | Replace CONDITIONALS with the live BRANCH
Part of what a compiler does can be described as PARTIAL EVALUATION.
We take code like:
(lambda (x) (+ (* 3 5) x))
and return code somewhat like
(lambda (x) (+ 15 x))
In essence, anything that can be computed at compile time should be
computed.
The simplest such optimization is called CONSTANT FOLDING, which
replaces operations by constants where possible.
Given a Linear-S expression like
(let ((val1 (* 3 5)))
(let ((val2 (+ val1 x)))
val2))
constant folding will produce a Linear-S expression like
(let ((val1 15))
(let ((val2 (+ val1 x)))
val2))
Basic idea:
(let ((val1 (* 3 5)))
BODY)
---> replace with a new BODY with val1 replaced by 15 wherever it occurs (and no let)
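A sketch of the substitution step (the helper name is ours; PS#6 spells out the real interface):

(define (subst val var body)           ; replace var with val in body
  (cond ((eq? body var) val)
        ((pair? body) (map (lambda (e) (subst val var e)) body))
        (else body)))

;; (subst 15 'val1 '(let ((val2 (+ val1 x))) val2))
;; ==> (let ((val2 (+ 15 x))) val2)

Naive substitution like this is safe only because alpha-renaming has already made every variable name unique.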
Note: what if, in a combination (F G H), F is known to be a lambda at
compile time and G and H are constants? We need something like an
evaluator in our compiler! A real partial evaluator requires a lot of work...
Sample:
(let ((val1 14))
(let ((val2 (lambda (val3) (* val1 3))))
(let ((val3 (val2 run-time-variable)))
val3)))
==> 42
We'll come back to this kind of constant folding later -- it's called inlining, and is more or less what macros do. [One way to think of a macro is as code run at compile time, which produces code run at run time...]
There are other related optimizations which aren't quite partial
evaluation, but which are similar in flavor.
Example: algebraic simplification. The simplest examples can be
handled by pattern matching -- look for (let ((x (* y 0))) ...) and
the like. We've done something like this already, just not as part of
a compiler.
More complex examples involve quite non-trivial computation; do two
arbitrary expressions compute the same value? For arithmetic
expressions there is actually a pretty simple algorithm, based on the fact that zeros of polynomials are sparse.
An interesting example is COMMON SUBEXPRESSION ELIMINATION. Compute
something once - why compute it again?
Consider an expression like (f (+ a b) (+ a b))
Linearizer produces
(let ((val1 (+ a b)))
(let ((val2 (+ a b)))
(let ((val3 (f val1 val2)))
val3)))
We'd like to avoid computing (+ a b) twice.
This is actually pretty similar to constant folding --
if we know something at compile time we don't need to
recompute it. However, while in constant folding we replace
variables with values, in common subexpression elimination
we replace expressions with variables.
This needs to be converted to
(let ((val1 (+ a b)))
(let ((val2 (f val1 val1)))
val2))
The key is to see that val1 and val2 are bound to the same expression,
and to replace the two computations by one.
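A sketch of the key step (illustrative; it assumes alpha-renamed Linear-S with no side effects). The table bindings maps already-computed expressions to the variables holding them:

(define (cse exp bindings)
  (if (and (pair? exp) (eq? (car exp) 'let))
      (let* ((var  (caar (cadr exp)))
             (rhs  (cadar (cadr exp)))
             (body (caddr exp))
             (seen (assoc rhs bindings)))   ; computed this before?
        (if seen
            (list 'let (list (list var (cdr seen)))  ; reuse old variable
                  (cse body bindings))
            (list 'let (list (list var rhs))
                  (cse body (cons (cons rhs var) bindings)))))
      exp))

;; (cse '(let ((val1 (+ a b)))
;;         (let ((val2 (+ a b)))
;;           (let ((val3 (f val1 val2))) val3))) '())
;; ==> (let ((val1 (+ a b)))
;;       (let ((val2 val1))
;;         (let ((val3 (f val1 val2))) val3)))

The useless-let pass described below then removes the (let ((val2 val1)) ...) binding, yielding the (f val1 val1) form shown above.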
Another example: dead code elimination
When an IF's test value is known at compile time, we can eliminate the
consequent or alternate.
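A sketch for Linear-S conditionals (illustrative; recall that in Scheme anything other than #f counts as true):

(define (fold-if exp)
  (if (and (pair? exp) (eq? (car exp) 'if)
           (not (symbol? (cadr exp))))      ; test is a known constant
      (if (eq? (cadr exp) #f)
          (cadddr exp)                      ; known false: keep alternate
          (caddr exp))                      ; known true: keep consequent
      exp))

;; (fold-if '(if #t (y) (f x))) ==> (y)
;; (fold-if '(if x 1 2))        ==> (if x 1 2)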
A similar procedure can get rid of useless lets like (let ((var1
var2)) ...)
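A sketch of this useless-let elimination (sometimes called copy propagation), reusing the subst sketch from the constant-folding discussion:

(define (drop-useless-lets exp)
  (if (and (pair? exp) (eq? (car exp) 'let))
      (let ((var  (caar (cadr exp)))
            (rhs  (cadar (cadr exp)))
            (body (caddr exp)))
        (if (symbol? rhs)                   ; (let ((var1 var2)) ...)
            (drop-useless-lets (subst rhs var body))
            (list 'let (list (list var rhs))
                  (drop-useless-lets body))))
      exp))

;; (drop-useless-lets
;;  '(let ((val1 (+ a b)))
;;     (let ((val2 val1))
;;       (let ((val3 (f val1 val2))) val3))))
;; ==> (let ((val1 (+ a b))) (let ((val3 (f val1 val1))) val3))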
Procedure inlining
There is overhead involved in a procedure call. If we have
(define foo (lambda () (* a c)))
then
(+ (foo) (foo))
is slower than
(+ (* a c) (* a c))
[not much, but it can matter inside a loop!]
One solution is to write your code as macros. Disadvantages?
* Kind of painful (macros are hard to debug)
* Space versus time
Alternative: inlining
Note that in Linear-S we leave calls to lambdas alone. Thus
(+ (foo) (foo))
is linearized into
(let ((val1 (foo)))
(let ((val2 (foo)))
(let ((val3 (+ val1 val2)))
val3)))
which after eliminating common subexpressions becomes:
(let ((val1 (foo)))
(let ((val2 val1))
(let ((val3 (+ val1 val2)))
val3)))
but we might be better off inlining the call to foo...
This depends, of course, on what foo does. Can you tell what a
function does without running it? NO! See last lecture of CS212.
Summary: reasoning about programs has some inherent limits.
For some simple functions, we can "inline" (or "open-code") them.
When is this a fatal error? Recursion!
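For a non-recursive thunk like foo, here is a sketch of the replacement (the defs interface is ours, purely for illustration):

(define (inline exp defs)              ; defs: assoc list of (name . body)
  (if (pair? exp)
      (let ((d (and (null? (cdr exp))        ; a call with no arguments...
                    (assq (car exp) defs)))) ; ...to a known procedure?
        (if d
            (cdr d)                          ; replace the call by the body
            (map (lambda (e) (inline e defs)) exp)))
      exp))

;; (inline '(+ (foo) (foo)) '((foo . (* a c))))
;; ==> (+ (* a c) (* a c))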
Still, this can be useful. C (since C99) and C++ support an "inline" declaration.
Use this at your own risk (the time-space tradeoffs are not always
obvious!)
Limited inlining of recursive code can be very useful (it's called loop
unrolling).
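For instance, one level of unrolling, done by hand here (a compiler would do this mechanically):

(define (sum-to n acc)                 ; sum of 1..n, one call per step
  (if (= n 0)
      acc
      (sum-to (- n 1) (+ acc n))))

(define (sum-to-unrolled n acc)        ; two steps per call
  (cond ((= n 0) acc)
        ((= n 1) (+ acc 1))
        (else (sum-to-unrolled (- n 2) (+ acc n (- n 1))))))

;; (sum-to 100 0)          ==> 5050
;; (sum-to-unrolled 100 0) ==> 5050, with roughly half the calls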
Final note: many of these optimizations enable each other. Doing
inlining can enable common subexpression, for example. This
combination can be pretty similar to simply memoizing (which we
talked about in streams).
(+ (foo) (bar)) ==>                  [inline foo and bar; suppose bar is (lambda () (* c a))]
(+ (* a c) (* c a)) ==>              [common subexpression elimination, using commutativity of *]
(let ((val1 (* a c)))
  (let ((val2 (+ val1 val1)))
    val2))
Or eliminating dead code can enable partial evaluation:
(lambda (x) (if (foo) (y) (f x)))  ==> [inlining, if foo is (lambda () #t)]
(lambda (x) (if #t (y) (f x)))     ==> [dead code elimination]
(lambda (x) (y))  etc...
A typical compiler makes several passes over the code, doing a bunch
of different optimizations. Some passes need to be done more than
once. How this is done is beyond the scope of this course. Some of it is beyond the scope of the instructor! Typically, they don�t do all possible optimizations � this is one of the things that compiler flags control!
Just-in-time compilation (a.k.a. dynamic compilation). Examples: Java, Apple's 68K emulator on the PowerPC.
Compiler:
advantage: runs "offline".
disadvantage: don't know values of run-time parameters
Interpreter tradeoffs are the opposite.
Suppose someone runs a big piece of code in the interpreter (why would anyone do this?). At run time, the parameters are known [lookup rule failure in the compiler ==> "oh well"; in the interpreter ==> the debugger]. It may well be worthwhile to compile the user's code, especially if there are loops. But we can now do even better than the regular compiler, because we know the values of all parameters!
This can be done even if the original code is compiled.
Big lessons:
* A compiler must preserve the semantics of the language.
* Optimization is just (simple) reasoning about programs, done once at compile time.
* Reasoning about programs has inherent limits.