The substitution method is a condensed way of proving an asymptotic bound on a recurrence by induction. In the substitution method, instead of trying to find an exact closed-form solution, we only try to find a closed-form bound on the recurrence. This is often much easier than finding a full closed-form solution, as there is much greater leeway in dealing with constants.
The substitution method is a powerful approach that is able to prove upper bounds for almost all recurrences. However, its power is not always needed; for certain types of recurrences, the master method (see below) can be used to derive a tight bound with less work. In those cases, it is better to simply use the master method, and to save the substitution method for recurrences that actually need its full power.
Note that the substitution method still requires the use of induction. The induction will always be of the same basic form, but it is still important to state the property you are trying to prove, split into one or more base cases and the inductive case, and note when the inductive hypothesis is being used.
Consider the following recurrence relation, which shows up fairly frequently for some types of algorithms:
T(1) = 1
T(n) = 2T(n-1) + c1
By expanding this out a bit (using the "iteration method"), we can guess that this will be O(2^n). To use the substitution method to prove this bound, we now need to guess a closed-form upper bound based on this asymptotic bound. We will guess an upper bound of k·2^n - b, where k and b are constants. We include the b in anticipation of having to deal with the constant c1 that appears in the recurrence relation, and because it does no harm. In the process of proving this bound by induction, we will generate a set of constraints on k and b, and if b turns out to be unnecessary, we will be able to set it to whatever we want at the end.
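As a quick sanity check on the guess (separate from the proof itself), we can evaluate the recurrence directly and watch the ratio T(n)/2^n. The sketch below is a minimal Python script; the choice c1 = 5 is arbitrary, since the recurrence only says c1 is some constant.

# Sanity check for the guess T(n) = O(2^n).
c1 = 5  # arbitrary; the recurrence only requires some constant

def T(n):
    """Evaluate T(1) = 1, T(n) = 2T(n-1) + c1 iteratively."""
    t = 1
    for _ in range(2, n + 1):
        t = 2 * t + c1
    return t

for n in range(1, 11):
    print(n, T(n), T(n) / 2**n)  # the ratio settles toward a constant

For this choice of c1, the ratio T(n)/2^n approaches 3, which is consistent with the O(2^n) guess.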
Our property, then, is T(n) ≤ k·2^n - b, for some constants k and b. Note that this property logically implies that T(n) is O(2^n), which can be verified with reference to the definition of O.
Base case: n = 1. T(1) = 1 ≤ k·2^1 - b = 2k - b. This is true as long as k ≥ (b + 1)/2.
Inductive case: We assume our property is true for n - 1. We now want to show that it is true for n.
T(n) = 2T(n-1) + c1 ≤ 2(k·2^(n-1) - b) + c1 (by IH) = k·2^n - 2b + c1 ≤ k·2^n - b
This is true as long as b ≥ c1.
So we end up with two constraints that need to be satisfied for this proof to work, and we can satisfy them simply by letting b = c1 and k = (b + 1)/2, which is always possible, as the definition of O allows us to choose any constant. Therefore, we have proved that our property is true, and so T(n) is O(2n).
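We can spot-check the chosen constants numerically (again with the arbitrary c1 = 5, giving b = 5 and k = 3); this is only a check under those assumptions, not a substitute for the induction:

c1 = 5            # arbitrary constant from the recurrence
b = c1            # from the constraint b >= c1
k = (b + 1) / 2   # from the constraint k >= (b + 1)/2

t = 1  # T(1)
assert t <= k * 2**1 - b
for n in range(2, 40):
    t = 2 * t + c1            # T(n) = 2T(n-1) + c1
    assert t <= k * 2**n - b  # the bound proved by induction
print("T(n) <= k*2^n - b holds for n = 1..39")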
The most important thing to note about this proof is the value of adding extra terms to the guessed upper bound. In almost all cases in which the recurrence has constants or lower-order terms, it will be necessary to have additional terms in the upper bound to "cancel out" those constants or lower-order terms. Without the right additional terms, the inductive case of the proof will get stuck in the middle or generate an impossible constraint; this is a signal to go back to your upper bound and determine what needs to be added to it so that the proof can proceed without changing the bound in asymptotic terms.
The master method gives us a quick way to find solutions to recurrence relations of the form T(n) = aT(n/b) + h(n), where a and b are constants, a ≥ 1 and b > 1. Conceptually, a represents how many recursive calls are made, b represents the factor by which the work is reduced in each recursive call, and h(n) represents how much work is done by each call apart from the recursion, as a function of n.
Once we have a recurrence relation of that form, the master method tells us the solution based on the relation between a, b, and h(n), as follows:
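Case 1: If h(n) is O(n^(log_b a - ε)) for some constant ε > 0, then T(n) is Θ(n^(log_b a)).
Case 2: If h(n) is Θ(n^(log_b a)), then T(n) is Θ(n^(log_b a) log n).
Case 3: If h(n) is Ω(n^(log_b a + ε)) for some constant ε > 0, and additionally a·h(n/b) ≤ k·h(n) for some constant k < 1 and all sufficiently large n (the "regularity condition"), then T(n) is Θ(h(n)).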
The simplest application of the master method is to a recurrence relation with fixed a, b, and h(n). Given such a recurrence relation, all you need to do is prove it falls under a particular case and then present the conclusion derived from that case. Note that the master method does not help you come up with a recurrence relation in the first place; all it does is help you solve recurrence relations once you have them. Therefore, for analyzing the runtime of algorithms, the first step still must be to derive a recurrence relation for the runtime.
Example 1: Say you have derived the recurrence relation T(n) = 8T(n/2) + cn^2, where c is some positive constant. We see that this has the appropriate form for applying the master method, and that a=8, b=2, and h(n) = cn^2. cn^2 is O(n^(log_2 8 - ε)) = O(n^(3 - ε)) for any ε ≤ 1, so this falls into case 1. Therefore, T(n) is Θ(n^3).
Example 2: Say you have derived the recurrence relation T(n) = T(n/2) + cn, where c is some positive constant. We see that this has the appropriate form for applying the master method, and that a=1, b=2, and h(n) = cn. cn is Ω(n^(log_2 1 + ε)) = Ω(n^ε) for any ε ≤ 1, so this falls into case 3. The regularity condition also holds: a·h(n/b) = cn/2 = (1/2)cn ≤ k·h(n) for k = 1/2 < 1. Therefore, T(n) is Θ(n).
Example 3: Say you have derived the recurrence relation T(n) = 8T(n/4) + cn^(3/2), where c is some positive constant. We see that this has the appropriate form for applying the master method, and that a=8, b=4, and h(n) = cn^(3/2). cn^(3/2) is Θ(n^(log_4 8)) = Θ(n^(3/2)), so this falls into case 2. Therefore, T(n) is Θ(n^(3/2) log n).
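For the common special case h(n) = c·n^k, picking the case reduces to comparing k with the critical exponent log_b a. The sketch below packages that comparison as a small Python helper (the name master_case and its interface are ours, purely for illustration); it reproduces the three examples above.

import math

def master_case(a, b, k):
    """Classify T(n) = a*T(n/b) + c*n^k by the master method, assuming
    a >= 1, b > 1, k >= 0. For h(n) = c*n^k the regularity condition in
    case 3 always holds, since a*h(n/b) = (a/b^k)*h(n) and a/b^k < 1."""
    crit = math.log(a, b)  # the critical exponent log_b(a)
    if math.isclose(k, crit):
        return f"case 2: Theta(n^{k:g} log n)"
    if k < crit:
        return f"case 1: Theta(n^{crit:g})"
    return f"case 3: Theta(n^{k:g})"

print(master_case(8, 2, 2))    # Example 1: Theta(n^3)
print(master_case(1, 2, 1))    # Example 2: Theta(n^1) = Theta(n)
print(master_case(8, 4, 1.5))  # Example 3: Theta(n^1.5 log n)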
The master method can also be used to analyze recurrences where one of a, b, or h(n) varies. The structure of the master method's three cases is roughly that of a "less than" case, an "equal" case, and a "greater than" case, so given a recurrence relation with a, b, or h(n) unknown, we can use the master method to analyze it for a range of values of the unknown by determining the value of the unknown at which the recurrence relation falls into case 2 (the "equal" case). Once that is shown, it is usually easy to prove that all greater values of the unknown will be covered by one of the two remaining cases, and all lesser values of the unknown will be covered by the final case.
For a and b, the range will typically be all values for which the master method can be applied. For h(n), it is impossible to analyze all possible functions, so the range will normally be restricted to functions of a particular type, such as cn^k for k ≥ 0.
Consider, for example, the recurrence relation
T(n) = 6T(n/b) + cn^2, with a=6, h(n) = cn^2, and b unknown. To fall into case 2, we need cn^2 to be Θ(n^(log_b 6)), which requires log_b 6 = 2, so b = √6. We can see that for b > √6 we will have log_b 6 < 2, so we will fall into case 3 (the regularity condition holds, since a·h(n/b) = (6/b^2)cn^2 and 6/b^2 < 1); the reverse is true for b < √6, so there we will fall into case 1.
Applying those three cases, we determine that T(n) is Θ(n^(log_b 6)) for b < √6, Θ(n^2 log n) for b = √6, and Θ(n^2) for b > √6.
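The tipping point at b = √6 ≈ 2.449 can also be seen numerically with the master_case helper sketched above (assuming that definition is in scope):

# Assumes the master_case sketch from above has been defined.
print(master_case(6, 2, 2))        # b < sqrt(6): case 1, Theta(n^(log_2 6))
print(master_case(6, 6**0.5, 2))   # b = sqrt(6): case 2, Theta(n^2 log n)
print(master_case(6, 3, 2))        # b > sqrt(6): case 3, Theta(n^2)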
Note that, as occurred in the example, analyzing a class of recurrence relations in this fashion may produce a "tipping point" for the case 2 value that is not an integer (and might not make sense in the original context). This is not worrisome; even if the original context is integer-only, we still learn where the split between case 1 and case 3 occurs.
Recurrence relations not solvable by the master method
While the master method is very useful in practice, it is worth keeping in mind that there are recurrence relations that it cannot solve. It is always important to verify that the conditions required by the master method hold, as it is possible for a recurrence relation to superficially look to be in the right form, but still violate the requirements or not fall into any of the cases.
For example, T(n) = 2T(n/2) + n/(log n) satisfies all the explicit requirements: we have a=2, b=2, and h(n) = n/(log n). Nevertheless, it does not fit any of the cases:
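Case 1 would require n/(log n) to be O(n^(1 - ε)) for some ε > 0, but n/(log n) grows faster than n^(1 - ε) for every ε > 0, since log n grows more slowly than any positive power of n. Case 2 would require n/(log n) to be Θ(n^(log_2 2)) = Θ(n), which fails because n/(log n) is asymptotically smaller than n. Case 3 would require n/(log n) to be Ω(n^(1 + ε)) for some ε > 0, which fails even more clearly. The problem is the gap between the cases: n/(log n) is smaller than n, but not polynomially smaller.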
So we actually can't use the master method to solve this recurrence relation. We can, however, still derive an upper bound for this recurrence by using a little trick: we find a similar recurrence that is larger than T(n), analyze the new recurrence using the master method, and use the result as an upper bound for T(n).
T(n) = 2T(n/2) + n/(log n) ≤ 2T(n/2) + n, so if we define R(n) by the recurrence R(n) = 2R(n/2) + n (with the same base case), a simple induction shows that R(n) ≥ T(n). This is something we can apply the master method to: n is Θ(n^(log_2 2)) = Θ(n), so case 2 applies and R(n) is Θ(n log n). Since T(n) ≤ R(n), we can conclude that T(n) is O(n log n). Note that we only end up with O, not Θ; since we were only able to apply the master method indirectly, we could not show a tight bound.
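As an illustration (not part of the argument), the sketch below evaluates both recurrences at powers of two, taking T(1) = R(1) = 1 as base cases; T(n) stays below R(n), and R(n)/(n log n) stays bounded.

import math
from functools import lru_cache

@lru_cache(maxsize=None)
def T(n):
    """T(n) = 2T(n/2) + n/log2(n) at powers of two, with T(1) = 1."""
    if n == 1:
        return 1.0
    return 2 * T(n // 2) + n / math.log2(n)

@lru_cache(maxsize=None)
def R(n):
    """The majorizing recurrence R(n) = 2R(n/2) + n, with R(1) = 1."""
    if n == 1:
        return 1.0
    return 2 * R(n // 2) + n

for i in range(1, 16):
    n = 2**i
    print(n, T(n) <= R(n), R(n) / (n * math.log2(n)))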