In the second part of the problem set, you will write an optimizer that will take in Scheme programs written in a subset of Scheme and output an equivalent program that runs faster.
The subset consists of expressions that have no side effects, only arithmetic and boolean primitives, with special forms limited to if, lambda and let. Furthermore, the let expressions contain only one binding.
Even this language subset includes very complicated expressions. Our strategy is to produce an *intermediate form* from an expression and then optimize that intermediate form. To see why this is necessary, consider the expression (f (g x) (g x)). We want to turn this into something like
(let ((val1 (g x))) (let ((val2 (g x))) (let ((val3 (f val1 val2))) val3))).
To evaluate the original expression, we evaluate (g x), then we evaluate (g x), then we invoke f on the first result and the second result. But in Scheme, these intermediate results are implicit. We need to make them explicit, through a process we call linearization.
We have provided you with a function linearize that takes an expression in the subset of Scheme mentioned above and returns a linear expression. A linear expression is essentially a giant series of let's which eventually returns a value in the body. Every let binding involves a single computation. Experiment with linearize to see what it is doing. You don't need to understand how it works for this problem set. Especially, try (+ 2 3), (if (< 2 3) (+ 4 5) (/6 0))), (lambda (x) (+ x 2)). Note that (linearize '(if (< 2 3) 4 (/5 0)))) is not
(let ((var1 (< 2 3))) (let ((var2 4)) (let ((var3 (/ 5 0))) (let ((var4 (if var1 var2 var3))) var4))))
WHY?
You are also given a function optimize. Currently it doesn't do anything useful. You will extend it so that it can do several program optimizations on the input. The input is assumed to be a linear expression, as it is easier to do the optimizations on a linear expression.
Hence, for this subset, compilation will be first linearizing the input expression, then optimizing it and then reducing it for the interpreter.
Consider the Scheme expression
(let ((x (+ 2 3))) x)
Here, if we replace (+ 2 3) with 5, the expression will have exactly the same value, except it will evaluate faster. In the source code, you are given a function const-fold that currently does nothing. Using the utility functions provided, extend const-fold so that given an expression e1, it will return a new expression e2, such that all arithmetic expressions with known operator and known operands are replaced by their values:
==>(const-fold '(* 2 3)) 6 ==>(const-fold '(* x 2)) (* x 2)Then add calls to your const-fold in the optimizer so that optimize now does constant folding:
==>(optimize '(let ((x (+ 2 3))) x)) (let ((x 5)) x) ==>(optimize '(let ((f (lambda (x) (let ((y (+ 3 5))) y)))) (let ((x 2)) x))) (let ((f (lambda (x) (let ((y 8)) y)) (let ((x 2)) x))
Note that the input expressions to optimize can be assumed to be linear. Make sure that you can handle +, -, *, /. The above examples are not enough for output. You need to test all the possibilities and convince us that you did so.
The optimizer folds integer constants. In this problem, extend const-fold so that it can deal with booleans. Eventually const-fold will behave like:
==>(const-fold '(= 3 4)) #f ==>(const-fold '(not #f)) #t
Make sure that you can handle =, <, <=, >, >=, not. optimize should behave as follows:
==>(optimize '(let ((x (< 2 3))) (let ((y (if x (let ((z (+ 1 2))) z ) 4))) y))) (let ((x #t)) (let ((y (if x (let (( z 3)) z) 4))) y))
Give output for both const-fold and optimize. The above examples are not enough for output. You need to test all the possibilities and convince us that you did so.
Using the same ideas, expressions containing variables can be simplified:
==>(optimize '(let ((x (* y 1))) x)) (let ((x y)) x)
Extend const-fold, so that it can make the following simplifications:
(+ x 0) = x (+ 0 x) = x (+ x x) = (* 2 x) (- x 0) = x (- x x) = 0 (* x 1) = x (* 1 x) = x (* x 0) = 0 (* 0 x) = 0 (/ x 1) = x
As usual, give convincing output.
Dead code in a program is parts of that program that will not affect the output of the program. Hence, e2 in (if #t e1 e2) is dead code. In this problem, extend optimize-op so that you delete dead code in the if-expressions:
==>(optimize '(let ((x (if #t (let ((y (+ 2 2))) y) (let ((z 3)) z)))) x))something equivalent to (let ((y 4)) (let ((x y)) x)) (the same expression, with a change of variable names).
Make sure the output is linear. As in the above cases, give plenty of examples of output.
Previous optimizations can create expressions like:
(let ((x 1)) (let ((y x)) (let ((z (+ y 1))) z)))
Here, y is useless and we can put an x for each y in the rest of the code to get the following equivalent expression:
(let ((x 1)) (let ((z (+ x 1))) z))
Extend optimize-op so that you clean up useless variables. You can use the functions given to you in the source code. As usual, give convincing output.
We can sometimes have other redundancies in our code. For example, (+ a b) is unnecessarily computed twice in the following:
(let ((x (+ a b))) (let ((y 3)) (let ((z (+ a b))) z)))
The equivalent code
(let ((x (+ a b))) (let ((y 3)) (let ((z x)) z)))
is better.
The process of eliminating this type of redundancies is called common subexpression elimination. Note that we can do this transformation only if there are no name clashes:
(let ((x (+ a b))) (let ((a 3)) (let ((y (+ a b))) y)))
cannot be eliminated. However, the function linearize renames variables and makes sure there are no such name clashes.
Extend optimize-op so that you eliminate common subexpressions. Hint: you can use some of the functions in the source code. As usual, give convincing output.
Make sure your code is written in such a way that useless variable elimination will clean up useless variables at the end. So, the above expression will be further optimized to:
(let ((x (+ a b))) (let ((y 3)) x))
We can do better than what we did in problem 1. (+ b a) in the following can also be eliminated:
(let ((x (+ a b))) (let ((y 3)) (let ((z (+ b a))) z)))
Write a function equivalent? that checks whether two basic expressions are equivalent:
==>(equivalent? '(+ x y) '(+ y x)) #t ==>(equivalent? '(- x y) '(- y x)) #f
Make sure you can handle the arithmetic and boolean operators. Now, use this function and extend any other function you might need to in the source code to handle this extended case of common subexpression elimination. Don't forget to give convincing output.