Reverse a linked list (recursive solution) - explanation

Suppose we have the following linked list:

[A] -> [B] -> [C] -> [D] -> [E] -> NULL

If we wanted to reverse this list, the desired result would be:

[E] -> [D] -> [C] -> [B] -> [A] -> NULL

This procedure can be accomplished both iteratively and recursively. We will examine the recursive implementation for fun and education :)

Thinking recursively

When I take on a recursive mindset, my thoughts center around what I can do when I assume my recursive function works for a subset of the data I'm given. Assuming the recursive function "works" and taking advantage of that assumption is a practical way to approach recursion.

To clarify, we cannot assume that the recursive function "works" for the entire set of data - otherwise we would end up with code like

/*
 * Returns the head of the linked list after reversal
 */
struct node* reverse(struct node* head){
    reverse(head);
}

This code will not do anything other than infinitely call itself.

We can, however, suppose that reverse() works for a portion of the linked list and work with that. Suppose that we can trust reverse() will successfully reverse a linked list starting with the node immediately following the HEAD. Using our example, without writing any code we will assume that reverse() can reverse the linked list starting from B.

/*
 * Returns the head of the linked list after reversal
 */
struct node* reverse(struct node* head){
    // We can assume this correctly reverses the linked list after HEAD but
    // does not include HEAD in the reversal.
    struct node *newHead = reverse(head->next);
}

So we will assume that reverse() recursively works just as long as we don't attempt to use it to reverse the entire linked list. How can we leverage this assumption? The code above still doesn't do anything meaningful yet.

To demonstrate, let's reverse the linked list starting from B, but keep A pointing to B. Before this reverse, the linked list looks like:

HEAD
|A| -> |B| -> |C| -> |D| -> |E| -> NULL

Afterwards, the linked list looks like:

HEAD
|A| |E| -> |D| -> |C| -> |B| -> NULL (A points to B)
 |------------------------^

This assumption that reverse() works for a subset of our linked list (but not the whole thing) has gotten us pretty far. Now we need to do the rest of the work to have the whole linked list reversed.

Working with the recursive assumption

From the current situation with a partially reversed linked list, we can achieve a fully reversed linked list with the following operations:

1) Move A to after B in the linked list by having B->next point to A. A and B will now be pointing at each other. We will fix this next.

                             HEAD
|E| -> |D| -> |C| -> |B| <-> |A|

2) Update A to point to NULL.

                            HEAD
|E| -> |D| -> |C| -> |B| -> |A| -> NULL

3) HEAD is updated to point to E and returned as the new head of the linked list

HEAD
|E| -> |D| -> |C| -> |B| -> |A| -> NULL

The original linked list is now fully reversed.

Handling the base case

When working within the context of a recursive assumption, we often must define a terminating / simple case so the function knows when to stop recursing. Generally the base case defines when there is nothing to do. In the case of reversing a linked list, if we want to reverse a linked list of length 1 (a linked list with just a HEAD) then we just return the HEAD. This is because a linked list of length 1 is the same when reversed, so there is no work to be done.

For further details on base cases in recursive functions, view my Introduction to Recursion YouTube Video.

Healthy skepticism

It would be very reasonable to question why we get to assume that reverse() works for a subset of the linked list. Isn't that cheating? Or magic?

This approach is rooted in mathematics, namely the principle of mathematical induction. The principle of mathematical induction is often used to prove theorems about infinite sequences of numbers. (See an example induction proof). To prove that a sequence of numbers has a certain property, induction proofs will assume that a subsequence of the numbers have that property. Then assumption is leveraged in the proof along with some extra work to show that the entire sequence has that property.

Sound familiar? This is exactly how we approached the reverse() recursion!

Generalizing the algorithm

We demonstrated how we can use a recursive assumption to sort our example linked list, but now we must generalize the algorithm so our reverse() function can sort any linked list we give it. The pseudo-code is as follows:

struct node* reverse(struct node* head){
    // BASE CASE: If no next node (AKA this linked list has length 1),
    // just return the current HEAD

    // reverse the linked list starting from the next node. store
    // the address of the head of this reversed list.
    // We can now assume the entire linked list after the head is correctly
    // reversed. Now we just need to get the head to the end of the list to
    // complete the full reversal.

    // head's next node is now at the end of the reversed list. (Refer to our
    // A-E example if you need help visualizing this). We need head to be
    // placed after it. In other words, the next node of head's current next node needs
    // to be updated to point to head

    // update head to point to NULL, as it is now the tail of the reversed
    // list. This makes sense as the first node in the list has now
    // become the last node.
}

The C implementation of the pseudo-code:

struct node* reverse(struct node* head){
    if(head->next == NULL){
        return head;
    }

    struct node* newHead = reverse(head->next);
    head->next->next = head;
    head->next = NULL;

    return newHead;
}

Notes on mindset

When recursion is discussed, often times the conversation goes straight to recursive functions resolving or recursive calls "bubbling up". These are useful conversations to have when you need to analyze the implementation of your recursive function, but these are not the conversations you need to have when first authoring your recursive function. Understanding the assumptions you can make and how the "leftover" work relates to the assumptions you made is a much simpler approach to designing recursive functions.

Tagged as c

Date published - September 10, 2018