3.2. Recursion

As we have seen throughout the book, there are many cases where we must repeat operations and can do so with a for or while loop. This form of repetition is formally known as iteration and involves defining a loop condition and a block of statements to be repeated as long as that condition is true. Each repetition of the block of statements is called an iteration and an algorithm that uses this style of repetition is called an iterative algorithm.

In this chapter, we will discuss a different way of expressing repetition: recursion. Recursion is equally expressive as iteration, meaning that anything we can do with iteration can also be done with recursion (and vice versa). There are some algorithms, however, where a recursive solution will require less code and will result in cleaner and more intuitive code than the equivalent iterative solution. In this chapter, we will introduce recursion and work through several problems that lend themselves naturally to a recursive solution.

3.2.1. Factorials

Let’s look at the factorial operation:

\[4! = 4\cdot 3 \cdot 2 \cdot 1 = 24\]

One of the formal definitions of factorial is:

\[n!=\prod_{k=1}^n k \!\]

We can implement this definition using a for loop:

def factorial(n):
    '''  Compute n! iteratively '''

    rv = 1
    for k in range(1, n+1):
        rv = rv * k
    return rv
>>> factorial(4)
24
>>> factorial(5)
120

Factorials can also be defined as follows:

\[\begin{split}n!=\left\{ \begin{array}{ll} 1 & \textrm{if n=1}\\ n\cdot(n-1)! & \textrm{if n>1} \end{array} \right.\end{split}\]

This is a recursive definition. Notice how there is no reference to loops or to repetition. Instead, factorials are defined in terms of themselves. At first, this approach may seem odd. Imagine finding a definition in the dictionary that looked like this:

Recursion: See recursion.

You would keep coming back to the definition of “recursion” infinitely!

However, the recursive definition of factorial works because it is divided into two cases:

  • The base case: In this case, the value of the factorial can be obtained immediately and trivially. In the case where \(n=1\), the value of the factorial is known immediately and without further computation: it is simply 1.

  • The recursive case: In this case, we define the factorial in terms of itself. For factorials, we can define \(n!\) as \(n\) times \((n-1)!\)

So, if we were computing \(4!\), we would start in the recursive case, which tells us that:

\[4! = 4\cdot 3!\]

To evaluate this formula, we need to find the value of \(3!\) which, again, involves the recursive case:

\[3! = 3\cdot 2!\]

Similarly for \(2!\):

\[2! = 2\cdot 1!\]

But, when we get to \(1!\), we know that \(1!=1\), so we can plug \(1\) into the above formula, and we get:

\[2! = 2\cdot 1 = 2\]

Now that we know the value of \(2!\), we can plug that into this formula:

\[3! = 3\cdot 2!\]

And we get:

\[3! = 3\cdot 2 = 6\]

And, finally, now that we know the value of \(3!\), plug it into the formula for \(4!\):

\[4! = 4\cdot 3!\]

And we get:

\[4! = 4\cdot 6 = 24\]

Notice that the recursive case must be defined so that it gets us closer to the base case when we use it; otherwise, we would fall into infinite recursion.

We can implement our recursive definition of factorial in Python like this:

def factorial_r(n):
    '''  Compute n! recursively '''

    if n == 1:
       return 1
    elif n > 1:
       return n * factorial_r(n-1)
>>> factorial_r(4)
24
>>> factorial_r(5)
120

Notice that our function factorial_r calls itself. We refer to such functions as recursive functions, and we refer to the point where the function calls itself as a recursive call. While the concept of recursion can be easy to understand at a high level (think, for example, of how easily we defined factorials recursively), writing recursive functions and understanding what happens during a recursive call often stumps beginning programmers.

So, we are going to spend some time dissecting exactly what happens in a recursive function. After that, we will work through several examples of recursive algorithms that will help us to understand how to design recursive functions, as well as when to use a recursive algorithm instead of an iterative solution.

3.2.2. The anatomy of a recursive function call

To show what happens in the factorial_r function, we have prepared a more verbose version that will do the same thing as the function shown earlier, but will print messages to help us understand exactly what’s going on. Don’t worry if you don’t understand how we’re formatting the output (especially how we indent the messages, which requires using an extra parameter). Focus instead on following what happens during each recursive call.

def factorial_r(n, indent=""):
    '''
    Compute n! recursively and print information about the
    computation along the way.

    Inputs:
      n (int): operand
      indent (string): spaces to use as a prefix when printing.

    Returns (int): n!
    '''

    if n == 1:
       print(indent + "factorial_r(1) -- BASE CASE -- The value of 1! is 1")
       print()
       return 1
    elif n > 1:
       print(indent + "factorial_r({}) -- START OF RECURSIVE CASE".format(n))
       print(indent + "                  The value of {}! is {}*{}!".format(n, n, n-1))
       print(indent + "                  I need to find out the value of {}!".format(n-1))
       print()
       x = factorial_r(n-1, indent=indent+"    ")
       print(indent + "factorial_r({}) -- END OF RECURSIVE CASE".format(n))
       print(indent + "                  I now know that {}! is {}".format(n-1, x))
       print(indent + "                  Which means that {}! = {}*{}".format(n, n, x))
       print()
       return n * x
>>> factorial_r(4)
factorial_r(4) -- START OF RECURSIVE CASE
                  The value of 4! is 4*3!
                  I need to find out the value of 3!

    factorial_r(3) -- START OF RECURSIVE CASE
                      The value of 3! is 3*2!
                      I need to find out the value of 2!

        factorial_r(2) -- START OF RECURSIVE CASE
                          The value of 2! is 2*1!
                          I need to find out the value of 1!

            factorial_r(1) -- BASE CASE -- The value of 1! is 1

        factorial_r(2) -- END OF RECURSIVE CASE
                          I now know that 1! is 1
                          Which means that 2! = 2*1

    factorial_r(3) -- END OF RECURSIVE CASE
                      I now know that 2! is 2
                      Which means that 3! = 3*2

factorial_r(4) -- END OF RECURSIVE CASE
                  I now know that 3! is 6
                  Which means that 4! = 4*6

24

Notice how, whenever we run into a recursive case, we put that function call on “hold” while we go deeper into the recursion rabbit hole until we get to a base case and can start “wrapping up” all of the recursive cases that we put on hold. Examining the function call stack (see The function call stack in chapter Introduction to Functions) can help explain what happens during a call to a recursive function. This process is a fairly low-level aspect of how recursive calls work, and so you can skip this discussion for now if you want. However, if you’re the kind of person who understands concepts better if you know exactly what happens under the hood, read on.

When a call is made to factorial_r(4), the following entry is added to the function call stack. For simplicity, we will omit the indent parameter, which is used purely for formatting purposes.

Function: factorial_r

Parameters:

  • n: 4

Variables:

  • x: undefined

Return value: None

When the function reaches the recursive call (factorial_r(n-1)), it will need to make a call to factorial_r(3) before it can set a value for x. So, we add an entry for factorial_r(3):

Function: factorial_r

Parameters:

  • n: 4

Variables:

  • x: undefined

Return value: None

Function: factorial_r

Parameters:

  • n: 3

Variables:

  • x: undefined

Return value: None

Recall that our stacks grow down the page, so we added the frame for the call factorial_r(3) beneath the frame for the call factorial_4.

This process is repeated for every recursive call, until we reach the recursive call that triggers the base case:

Function: factorial_r

Parameters:

  • n: 4

Variables:

  • x: undefined

Return value: None

Function: factorial_r

Parameters:

  • n: 3

Variables:

  • x: undefined

Return value: None

Function: factorial_r

Parameters:

  • n: 2

Variables:

  • x: undefined

Return value: None

Function: factorial_r

Parameters:

  • n: 1

Variables:

  • x: undefined

Return value: None

At this point, we have a function call stack that is holding information for all the calls from factorial_r(4) to factorial_r(1). When we reach the base case, the recursion starts to unwind because we have reached a point where factorial_r can return a value without having to make any more recursive calls. The last entry in the function call stack will return 1:

Function: factorial_r

Parameters:

  • n: 1

Variables:

  • x: undefined

Return value: 1

The function call represented by the previous entry in the function call stack can now assign a value to variable x because it has finished evaluating the function call factorial_r(1), so the function call stack will look like this:

Function: factorial_r

Parameters:

  • n: 4

Variables:

  • x: undefined

Return value: None

Function: factorial_r

Parameters:

  • n: 3

Variables:

  • x: undefined

Return value: None

Function: factorial_r

Parameters:

  • n: 2

Variables:

  • x: 1

Return value: 2

Remember that once a function call ends, its entry is removed (or popped) from the function call stack, so we no longer have an entry for factorial_r(1).

Now, the function call to factorial_r(2) will be able to return the value 2 (i.e., n multiplied by x), which will become the value for x in factorial_r(3):

Function: factorial_r

Parameters:

  • n: 4

Variables:

  • x: undefined

Return value: None

Function: factorial_r

Parameters:

  • n: 3

Variables:

  • x: 2

Return value: 6

We repeat this process one final time, to obtain the return value of factorial_r(4):

Function: factorial_r

Parameters:

  • n: 4

Variables:

  • x: 6

Return value: 24

Thinking about recursive calls in terms of their function call stack reinforces the notion that a fresh set of parameters and local variables is created for a every call to a function. Keeping this fact in mind can help avoid a common mistake: thinking that the recursive call simply “jumps” back to the beginning of the function. For example, in the factorial_r function:

def factorial_r(n):
    '''  Compute n! recursively '''

    if n == 1:
       return 1
    elif n > 1:
       x = factorial_r(n-1)
       return n * x

When we reach the factorial_r(n-1) call, we do not change the value of n and “jump” back up to the beginning of the function that is currently running. Instead, an entirely new entry in the function call stack is created, with its own values for n and x. If you find yourself struggling to understand how a recursive function works, try working through the function call stack step-by-step as we did above.

3.2.3. Recursion vs. iteration

At this point, we have seen that we can implement the factorial function using iteration:

def factorial(n):
    '''  Compute n! iteratively '''

    rv = 1
    for k in range(1, n+1):
        rv = rv * k
    return rv

And using recursion:

def factorial_r(n):
    '''  Compute n! recursively '''

    if n == 1:
       return 1
    elif n > 1:
       return n * factorial_r(n-1)

In this simple example, the amount of code we write for both functions is roughly the same, so why would we choose recursion over iteration? In general, there are a number of algorithms and data structures that lend themselves naturally to a recursive definition. In those cases, a recursive implementation will typically be simpler and more elegant than the equivalent iterative implementation. This will become more apparent as we see more recursive algorithms.

However, you should resist the urge to use recursion as just another way of performing repetition in your code. While it is true that recursion and iteration are equally expressive (every iterative algorithm can be converted to a recursive one, and vice versa), you will usually use recursion only when you are faced with problems that have an inherently recursive definition.

For example, let’s say we want to create a list with all the numbers between a lower bound lb and ub (both inclusive). We can do this work very simply with iteration (for the sake of argument, let’s assume we cannot use the range function):

def gen_nums(lb, ub):
    '''
    Generate a list of integers from a lower bound to an upper
    bound inclusive.

    Inputs:
      lb (int): lower bound
      ub (int): upper bound

    Returns (list of ints): list of integers from lb to ub
      inclusive
    '''

    l = []
    i = lb
    while i <= ub:
       l.append(i)
       i += 1

    return l
>>> gen_nums(1,5)
[1, 2, 3, 4, 5]

Although we can also write a recursive solution to this problem, it is arguably not as easy to understand as the equivalent iterative solution:

def gen_nums_r(lb, ub):
    '''
    Generate a list of integers from a lower bound to an upper
    bound inclusive.

    Inputs:
      lb (int): lower bound
      ub (int): upper bound

    Returns (list of ints): list of integers from lb to ub
      inclusive
    '''

    if lb > ub:
       return []
    else:
       return [lb] + gen_nums_r(lb+1, ub)
>>> gen_nums_r(1,5)
[1, 2, 3, 4, 5]

Notice that our base case is the case when the lower bound is greater than the upper bound. In this case, the list can be obtained trivially: it will simply be an empty list. In the recursive case, we create a list whose first element is the lower bound and the rest of the list (from lb+1 to ub) is obtained recursively (we’re guaranteed to reach the base case eventually because we increment lb by one in each call and get closer to ub with each recursive call). As suggested earlier, you may want to work through the function call stack for a simple example call, such as gen_nums_r(2,5), if you have trouble understanding how this function works.

Choosing between iteration and recursion can also be a question of taste. Most functional programming languages depend on recursion to repeat operations and many of them do not include any iterative constructs at all, so someone who learned how to program using a functional programming language may find the recursive version of gen_nums_r to be more elegant and easier to understand. Programmers who have learned to program under the imperative paradigm (using loops for repetition), however, tend to use recursion only when dealing with problems that have an inherently recursive definition because it is easier to translate the problem into code using that definition rather than trying to figure out the equivalent iterative solution.

3.2.4. Permutations (or how to think recursively)

Generating permutations is a good example of a problem that has a simple recursive definition: given \(n\) elements, the permutations of those elements are all possible ways of arranging those elements.

For example, these are all of the possible permutations of (1,2,3):

1 2 3
1 3 2
2 1 3
2 3 1
3 1 2
3 2 1

And these are all of the possible permutations of (1,2,3,4):

1 2 3 4
1 2 4 3
1 3 2 4
1 3 4 2
1 4 2 3
1 4 3 2

2 1 3 4
2 1 4 3
2 3 1 4
2 3 4 1
2 4 1 3
2 4 3 1

3 1 2 4
3 1 4 2
3 2 1 4
3 2 4 1
3 4 1 2
3 4 2 1

4 1 2 3
4 1 3 2
4 2 1 3
4 2 3 1
4 3 1 2
4 3 2 1

Look closely at both sets of permutations. In particular, look at all the permutations of (1,2,3,4) that start with 4:

4 1 2 3
4 1 3 2
4 2 1 3
4 2 3 1
4 3 1 2
4 3 2 1

These permutations are generated by taking all of the permutations of (1,2,3) and adding 4 as the first element. In fact, the permutations of (1,2,3,4) are:

  • All permutations of (2,3,4) with 1 as the first element;

  • All permutations of (1,3,4) with 2 as the first element;

  • All permutations of (1,2,4) with 3 as the first element; plus

  • All permutations of (1,2,3) with 4 as the first element.

And the permutations of (1,2,3) are:

  • All permutations of (2,3) with 1 as the first element;

  • All permutations of (1,3) with 2 as the first element; plus

  • All permutations of (1,2) with 3 as the first element.

And so on.

Intuitively, it looks like permutations are defined recursively, because we can define permutations of \(n\) elements in terms of permutations of \(n-1\) elements. To write a recursive solution, we will generally take the following three steps.

Step #1: Determine the input(s) and output(s) to the function. This step may seem obvious given that we do this work any time we write a function, but when solving a recursive problem, it is especially important to make these decisions first.

In this case, the input to our function will be a list of elements we want to generate permutations on. The output, or return value, will be a list of permutations, with each permutation being a list of elements. For example, if we call the function with the following list:

[1, 2, 3]

We would expect it to return the following:

[[1, 2, 3],
 [1, 3, 2],
 [2, 1, 3],
 [2, 3, 1],
 [3, 1, 2],
 [3, 2, 1]]

The reason we need to figure the types out first is because we need to ensure that both our base case and our recursive case expect the same type of inputs and return the same type of outputs.

Step #2: Determine the base case(s) for the function. We need to think about the case (or cases) where solving the problem is trivial. With permutations, this case is when we’re asked to produce all the permutations of one element. In this case, there is only one such permutation: the permutation with that element.

So, we can start writing our permutations_r function as follows:

def permutations_r(p):
    '''
    Compute all the permutations of the values in p.

    Inputs:
      p (list): list of values to permute

    Returns (list of lists): permutations of p
    '''

    if len(p) == 1:
        # Base case
        return p
    else:
        # TODO: Recursive case
        return []

However, the above code for the base case is not correct! Let’s try calling the function in such a way that we hit the base case immediately:

>>> permutations_r([1])
[1]

The function returns [1], but we decided that the return value of our function would be a list of permutations (and [1] represents a single permutation). What we really want to get from the function is [ [1] ]. So we re-write the function as follows:

def permutations_r(p):
    '''
    Compute all the permutations of the values in p.

    Inputs:
      p (list): list of values to permute

    Returns (list of lists): permutations of p
    '''

    if len(p) == 1:
        # Base case
        return [ p ]
    else:
        # TODO: Recursive case
        return []
>>> permutations_r([1])
[[1]]

This is a common way to mess up the base case. While it may make intuitive sense for the function to return [1] in the trivial case because we only have one permutation, we must make sure that the code uses inputs and outputs consistently. Otherwise, the recursive case will not work.

Make sure that your function works correctly for the base case before moving onto the next step. Test it informally using the Python interpreter on inputs that will immediately hit the base case and make sure the return value is consistent with your desired type of output. Paying careful attention to the input and output types will save you a fair amount of trouble and debugging time farther down the road.

Step #3: Determine the recursive case(s) for the function. This step can be tricky because thinking recursively doesn’t come naturally to many people. We suggest taking the following approach: if you are writing a function that takes some input \(x\), write a solution that assumes that calls to that function with a smaller value for \(x\) will “just work”.

For example, let’s go back to the factorial function:

def factorial_r(n):
    '''  Compute n! recursively '''

    if n == 1:
       return 1
    elif n > 1:
       return n * factorial_r(n-1)

After designing our base case, we wrote a recursive case that assumes that the call to factorial_r(n-1) will “just work”. If we actually write out the entire sequence of function calls and the function call stack we can understand how this works, but when starting to write a recursive solution , it is better to not go down the rabbit hole of trying to trace every recursive call all the way down to the base case.

Instead, we implement the recursive case under the assumption that recursive calls will “just work” as long as the parameters we pass to the recursive call move us closer to the base case. This is certainly true in the case of factorial: for any positive value of n, calling the function with n-1 gets us closer to the base case of n == 1.

So, what is the recursive case for permutations? Our function is called with a list of elements we want to generate permutations on, and our base case is reached whenever that list contains a single element. So, if the function is called like this:

permutations_r([1,2,3,4])

We can assume that the following calls will “just work”:

permutations_r([2,3,4])
permutations_r([1,3,4])
permutations_r([1,2,4])
permutations_r([1,2,3])

Or, more generally, if the function is called with a list of size n, we should implement our recursive case under the assumption that recursive calls with a list of size n-1 will work. Notice how this approach will eventually get us to the base case of calling the function with a list of size n == 1.

Now, remember the example permutations we showed above. We remarked that the permutations of (1,2,3,4) are simply:

  • All permutations of (2,3,4) with 1 as the first element;

  • All permutations of (1,3,4) with 2 as the first element;

  • All permutations of (1,2,4) with 3 as the first element; plus

  • All permutations of (1,2,3) with 4 as the first element

So, let’s implement this logic under the assumption that the recursive calls (with those smaller lists) will return the correct permutations. First of all, we said our function will return a list of permutations, so let’s start there:

def permutations_r(p):
    '''
    Compute all the permutations of the values in p.

    Inputs:
      p (list): list of values to permute

    Returns (list of lists): permutations of p
    '''

    if len(p) == 1:
        # Base case
        return [ p ]
    else:
        # Recursive case
        rv = []

        # rv will contain the list of permutations

        return rv

Then, for each element of p, we want to obtain the permutations resulting from removing that element.

def permutations_r(p):
    '''
    Compute all the permutations of the values in p.

    Inputs:
      p (list): list of values to permute

    Returns (list of lists): permutations of p
    '''

    if len(p) == 1:
        # Base case
        return [ p ]
    else:
        # Recursive case
        rv = []

        for x in p:
           p_minus_x = [v for v in p if v != x]
           perms_without_x = permutations_r(p_minus_x)

        return rv

p_minus_x is simply a copy of the list p but with x removed from it. Next, we make the recursive call with that list. If p is [1,2,3,4] and x is 4, then this recursive call would return the following list:

[[1, 2, 3],
 [1, 3, 2],
 [2, 1, 3],
 [2, 3, 1],
 [3, 1, 2],
 [3, 2, 1]]

Remember: avoid the urge to go down the rabbit hole of understanding exactly what happens in this recursive call! For now, we take the approach of writing the code under the assumption that the recursive call will return exactly what we expect (according to how we specified the inputs and outputs of our function).

However, our function is not yet finished. You should also resist the urge to try and test it as soon as you’ve written the recursive call. We said our function has to return a list of permutations, so we need to take the permutations returned by the recursive call, add x to each permutation, and then add it to our rv list. For example, in the specific case where p is [1,2,3,4] and x is 4, we want to add the following permutations to rv:

[[4, 1, 2, 3],
 [4, 1, 3, 2],
 [4, 2, 1, 3],
 [4, 2, 3, 1],
 [4, 3, 1, 2],
 [4, 3, 2, 1]]

So we add a loop that iterates over the permutations returned by the recursive call. For each permutation, we construct a new list containing x and the values from permutation and add it to the list we are going to return:

def permutations_r(p):
    '''
    Compute all the permutations of the values in p.

    Inputs:
      p (list): list of values to permute

    Returns (list of lists): permutations of p
    '''

    if len(p) == 1:
        # Base case
        return [ p ]
    else:
        # Recursive case
        rv = []

        for x in p:
           p_minus_x = [v for v in p if v!=x]
           perms_without_x = permutations_r(p_minus_x)
           for perm in perms_without_x:
               pl = [x] + perm
               rv.append(pl)
        return rv

At this point, our function is finished:

>>> permutations_r([1])
[[1]]
>>> permutations_r([1,2,3])
[[1, 2, 3], [1, 3, 2], [2, 1, 3], [2, 3, 1], [3, 1, 2], [3, 2, 1]]

Recursion and Induction

If you’re familiar with inductive proofs, they can also provide a good framework to think about recursion. When doing an inductive proof, we prove a statement for one or more base cases (such as \(n=0\) and \(n=1\)) and once we’ve done that, we take the inductive step: assuming the statement holds for \(n\), prove it holds for \(n+1\).

Thinking recursively is similar to doing an inductive proof: once we’ve implemented the base case, we then implement the recursive case under the assumption that a recursive call (with parameters that get us closer to the base case) will work.

Like we did for the factorial function, here is a version of the permutations function that prints out messages explaining what happens at each recursive call. We encourage you to play around with it so you can understand what happens when you make recursive calls. This function has several extra parameters: n, which is the desired number of elements in each permutation, verbose, which controls the printing of information about the computation, and level, which is used internally by the function to track the depth of the recursion for printing purposes. Notice that the code handles the fact that there might be more than one element in p when n == 1.

def permutations(p, n, verbose=False, level=0):
    '''
    Compute all the permutations of length n of the values in p.

    Inputs:
      p (list): list of values to permute
      n (int): desired size of the permutations
      verbose (boolean): indicates whether information about the
        computation should be printed (default: False)
      level (int): depth of the recursion (default: 0)

    Returns (list of lists): permutations of length n of values in p
    '''

    assert len(p) >= n

    if verbose: print(("    " * level) + "permutations({}, {})".format(p, n))
    if n == 1:
        rv = [[x] for x in p]
        if verbose: print(("    " * level) + "result: {}".format(rv))
        return rv
    elif len(p) == 1:
        if verbose: print(("    " * level) + "result: {}".format([p]))
        return [p]
    else:
        rv = []
        for x in p:
            if verbose: print(("    " * level) + "{} with...".format(x))
            rem = [v for v in p if v != x]
            for perms in permutations(rem, n-1, verbose, level+1):
                rv.append([x] + perms)
        if verbose: print(("    " * level) + "result: {}".format(rv))
        if verbose: print()
        return rv
>>> ps = permutations([1,2,3], 3, verbose=True)
permutations([1, 2, 3], 3)
1 with...
    permutations([2, 3], 2)
    2 with...
        permutations([3], 1)
        result: [[3]]
    3 with...
        permutations([2], 1)
        result: [[2]]
    result: [[2, 3], [3, 2]]

2 with...
    permutations([1, 3], 2)
    1 with...
        permutations([3], 1)
        result: [[3]]
    3 with...
        permutations([1], 1)
        result: [[1]]
    result: [[1, 3], [3, 1]]

3 with...
    permutations([1, 2], 2)
    1 with...
        permutations([2], 1)
        result: [[2]]
    2 with...
        permutations([1], 1)
        result: [[1]]
    result: [[1, 2], [2, 1]]

result: [[1, 2, 3], [1, 3, 2], [2, 1, 3], [2, 3, 1], [3, 1, 2], [3, 2, 1]]