Sorting a linked list

For God’s sake, don’t try sorting a linked list.

— Steve Yegge

Why would it be a bad idea to sort a linked list? A few things come to mind:

  1. To locate the kth item you have to traverse the k−1 items that precede it. This makes the usual formulation of many sorting algorithms unusable for linked lists: the basic “swap” operation no longer takes constant time.
  2. Linked lists are inefficient in general: going over the items in a list means jumping around memory at random. This is bad for the CPU caches, and it might even be faster to copy the linked list into an array and then sort the array.
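For reference, the copy-out approach from point 2 is only a few lines. Here is a minimal sketch (the `Node` class and the method name are my own stand-ins, and it rewrites the values in place rather than relinking nodes):

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Sketch of the copy-out approach: one pass to collect the values,
// a cache-friendly array sort, and one pass to write them back.
public class CopyOutSort {
    static class Node<T> {
        T value;
        Node<T> next;
        Node(T value) { this.value = value; }
    }

    static <T> void sortViaArray(Node<T> head, Comparator<T> cmp) {
        List<T> values = new ArrayList<>();
        for (Node<T> n = head; n != null; n = n.next)
            values.add(n.value);            // copy out
        values.sort(cmp);                   // sort the array
        Node<T> n = head;
        for (T v : values) {                // copy back
            n.value = v;
            n = n.next;
        }
    }

    public static void main(String[] args) {
        Node<Integer> head = new Node<>(4);
        head.next = new Node<>(2);
        head.next.next = new Node<>(5);
        head.next.next.next = new Node<>(1);
        sortViaArray(head, Comparator.naturalOrder());
        StringBuilder sb = new StringBuilder();
        for (Node<Integer> n = head; n != null; n = n.next)
            sb.append(n.value).append(' ');
        System.out.println(sb.toString().trim());  // prints "1 2 4 5"
    }
}
```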

But let’s not be discouraged: in defiance of Steve, let’s sort a linked list, and do it efficiently, with minimal cache misses!

The Algorithm

What we can do is use a modified merge sort algorithm. The basic idea behind merge sort is:

  1. If the input list is empty or has exactly one element, return because it’s already sorted.
  2. Otherwise, break the list into two roughly equal sized parts.
  3. Sort the two parts recursively using merge sort.
  4. Merge the two sorted lists into one sorted list.

Since we don’t know how long the list is, breaking it into two equal parts is not trivial. It can be done in a single pass over the items, but that means traversing many links, each of which is a potential cache miss. Instead we’ll use the following variation of the algorithm: first sort the first 2 items of the list with the merge operation, then the next 2, and merge these so that the first 4 items are sorted; repeat for the next 4 elements and merge so that the first 8 items are sorted, and so on until the end of the list. In pseudocode:

  1. k := 1, p := head[list].next
  2. while p ≠ nil:
  3.     Sort k items of the list that starts at p
  4.     Merge two lists of size k that start at head[list] and at p
  5.     k := 2k, p := next item of the last node that was merged
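Before diving into the implementation, here is a purely illustrative sketch that runs the same doubling pass on an ArrayList, with indices standing in for node pointers (and a library sort standing in for the recursive sort of the next k items), printing the sorted prefix after each merge:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Illustration of the doubling pass: the sorted prefix is [0, p);
// each round sorts the next k items and merges them into the prefix.
public class DoublingTrace {
    // Standard merge of two sorted lists into a new list.
    static List<Integer> merge(List<Integer> a, List<Integer> b) {
        List<Integer> out = new ArrayList<>();
        int i = 0, j = 0;
        while (i < a.size() && j < b.size())
            out.add(a.get(i) <= b.get(j) ? a.get(i++) : b.get(j++));
        while (i < a.size()) out.add(a.get(i++));
        while (j < b.size()) out.add(b.get(j++));
        return out;
    }

    public static void main(String[] args) {
        List<Integer> xs = new ArrayList<>(Arrays.asList(4, 2, 5, 1, 3));
        int k = 1, p = 1;
        while (p < xs.size()) {
            int end = Math.min(p + k, xs.size());
            Collections.sort(xs.subList(p, end));   // "sort k items at p"
            List<Integer> merged =
                merge(xs.subList(0, p), xs.subList(p, end));
            for (int i = 0; i < merged.size(); i++) xs.set(i, merged.get(i));
            p = end;                                // past the merged range
            k = 2 * k;
            System.out.println("sorted prefix: " + xs.subList(0, p));
        }
    }
}
```

Running this prints prefixes of length 2, 4, and finally 5, mirroring how the sorted prefix doubles on each pass.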

The Implementation

The above pseudocode looks simple, but can be tricky to implement. For example, you have to remember that when you sort a list its head moves to the middle of the list, so the sort operation has to return the new head of the list somehow. This may seem obvious, but if you start coding without thinking things through it may come as a surprise!
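A tiny illustration of the pitfall, using a hand-relinked two-node list (names are my own):

```java
// Relink a two-node list [2, 1] into sorted order and observe that
// the node we originally called "head" is now second.
public class HeadMoves {
    static class Node { int value; Node next; Node(int v) { value = v; } }

    public static void main(String[] args) {
        Node head = new Node(2);
        Node second = new Node(1);
        head.next = second;
        // "sorting" by relinking: 1 must come first
        second.next = head;
        head.next = null;
        Node newHead = second;
        System.out.println("old head now holds " + head.value
                           + ", new head holds " + newHead.value);
    }
}
```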

First we’ll need some representation for linked list nodes. A class that holds the item and a pointer to the next node does nicely:

    static class Node<T> {
        T value;
        Node<T> next;
    }

The merge operation as we use it has to return both the first node of the merged list and the node past the end of the merged list. Let’s call this construct a bracket:

    private static class Bracket<T> {
        Node<T> top;
        Node<T> bot;
    }

Now let’s define the topmost sort method, following the pseudocode:

    public static <T> Node<T> sort(Comparator<T> cmp,
                                   Node<T> head) {
        if (head == null) return null;  // an empty list is already sorted
        Bracket<T> bracket = new Bracket<T>();
        int k = 1;
        Node<T> p = head.next;
        while (p != null) {
            bracket.top = p;
            sort(cmp, k, bracket);      // sort k items starting at p
            merge(cmp, head, k, bracket.top, k, bracket);
            head = bracket.top; p = bracket.bot;
            k = 2*k;
        }
        return head;                    // the head may have moved
    }

What’s left is the method that sorts the first k items of a linked list with a recursive merge sort, and the merge operation that actually does the work of putting items in order. Both return a bracket spanning from the head of the sorted list to the item past its end. No surprises here, pretty standard stuff. I’ll use whitespace creatively to save space though, so you don’t have to scroll so much to skip ahead:

private static <T> void sort(Comparator<T> cmp, int k,
                             Bracket<T> bracket) {
    if (k < 2) { bracket.bot = bracket.top.next; return; }  // 1 item: sorted
    sort(cmp, k/2, bracket);                 // sort the first half
    if (bracket.bot != null) {               // is there a second half at all?
        Node<T> top = bracket.top;           // head of the sorted first half
        bracket.top = bracket.bot;
        sort(cmp, k/2, bracket);             // sort the second half
        Node<T> bot = bracket.top;           // head of the sorted second half
        merge(cmp, top, k/2, bot, k/2, bracket);
    }
}

private static <T> void merge(Comparator<T> comparator,
                              Node<T> top, int ctop,
                              Node<T> bot, int cbot,
                              Bracket<T> bracket) {
    Node<T> head=null, tail=null, next;
    while (ctop + cbot > 0) {
        // take the next node from whichever list has the smaller head
        if (cbot == 0)      { next = top; top = top.next; ctop--; }
        else if (ctop == 0) { next = bot; bot = bot.next; cbot--; }
        else {
            int cmp = comparator.compare(top.value, bot.value);
            if (cmp > 0) { next = bot; bot = bot.next; cbot--; }
            else         { next = top; top = top.next; ctop--; }
        }
        if (head == null) head = next;
        else tail.next = next;
        tail = next;
        if (bot == null) cbot = 0;  // the second list ran off the end
    }
    tail.next = bot;                // reattach the rest of the list

    bracket.top = head;
    bracket.bot = tail.next;
}
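To check that the pieces fit together, here is everything collected into one compilable class (the class name is my choice; `sort` returns the new head of the list, as discussed above) with a small smoke test:

```java
import java.util.Comparator;

// The listings from this article collected into one standalone class,
// plus a main method that sorts a small list and prints the result.
public class ListMergeSort {
    static class Node<T> {
        T value;
        Node<T> next;
    }

    private static class Bracket<T> {
        Node<T> top;
        Node<T> bot;
    }

    public static <T> Node<T> sort(Comparator<T> cmp, Node<T> head) {
        if (head == null) return null;
        Bracket<T> bracket = new Bracket<T>();
        int k = 1;
        Node<T> p = head.next;
        while (p != null) {
            bracket.top = p;
            sort(cmp, k, bracket);
            merge(cmp, head, k, bracket.top, k, bracket);
            head = bracket.top; p = bracket.bot;
            k = 2*k;
        }
        return head;
    }

    private static <T> void sort(Comparator<T> cmp, int k,
                                 Bracket<T> bracket) {
        if (k < 2) { bracket.bot = bracket.top.next; return; }
        sort(cmp, k/2, bracket);
        if (bracket.bot != null) {
            Node<T> top = bracket.top;
            bracket.top = bracket.bot;
            sort(cmp, k/2, bracket);
            Node<T> bot = bracket.top;
            merge(cmp, top, k/2, bot, k/2, bracket);
        }
    }

    private static <T> void merge(Comparator<T> comparator,
                                  Node<T> top, int ctop,
                                  Node<T> bot, int cbot,
                                  Bracket<T> bracket) {
        Node<T> head = null, tail = null, next;
        while (ctop + cbot > 0) {
            if (cbot == 0)      { next = top; top = top.next; ctop--; }
            else if (ctop == 0) { next = bot; bot = bot.next; cbot--; }
            else {
                int cmp = comparator.compare(top.value, bot.value);
                if (cmp > 0) { next = bot; bot = bot.next; cbot--; }
                else         { next = top; top = top.next; ctop--; }
            }
            if (head == null) head = next;
            else tail.next = next;
            tail = next;
            if (bot == null) cbot = 0;
        }
        tail.next = bot;
        bracket.top = head;
        bracket.bot = tail.next;
    }

    public static void main(String[] args) {
        Integer[] data = {9, 1, 8, 2, 7, 3, 6, 4, 5, 0};
        Node<Integer> head = null;
        for (int i = data.length - 1; i >= 0; i--) {
            Node<Integer> n = new Node<>();
            n.value = data[i]; n.next = head; head = n;
        }
        head = sort(Comparator.<Integer>naturalOrder(), head);
        StringBuilder sb = new StringBuilder();
        for (Node<Integer> n = head; n != null; n = n.next)
            sb.append(n.value).append(' ');
        System.out.println(sb.toString().trim());
    }
}
```

The test list has an odd tail (10 is not a power of two), which exercises the bookkeeping for partial runs at the end of the list.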

The Analysis

Like any serious comparison-based sorting algorithm, this one makes O(n log n) comparisons between list items when sorting a list of length n. Merge sort can be implemented in constant space as shown by Simon Tatham, but this implementation will end up using O(log n) space due to the use of the stack for recursive calls. How about our goal of minimizing cache misses?

List items are being traversed only in the merge method, so that’s the only place where cache misses can be a problem. (Let’s assume for simplicity that the local variables can fit in registers, or never leave the cache). The algorithm will make:

  • n/2 merges of size 2, for a total of n potential cache misses
  • n/4 merges of size 4, for a total of n potential cache misses
  • …
  • 1 merge of size 2^(log₂ n) = n, for a total of n potential cache misses

As there are log n levels of merges, in total there are n log n potential cache misses. Every merge sort on linked lists has to do this same number of merges (with the exception of natural merge sort, perhaps), in addition to any other list traversals it may do, so the algorithm presented here is optimal for CPU cache usage.

Whether this algorithm is measurably better than the many alternatives (see, for example, Tatham or GeeksforGeeks) remains to be seen.
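One way to start finding out is a rough micro-benchmark. The sketch below only compares `Collections.sort` on a `LinkedList` (which internally copies to an array) against sorting an `ArrayList`; the in-place sort above can be slotted in as a third contender. Timings are machine-dependent, and a real measurement should use a harness like JMH:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedList;
import java.util.List;
import java.util.Random;

// A rough timing harness sketch; numbers are indicative only.
public class SortBench {
    static long timeNanos(Runnable r) {
        long t0 = System.nanoTime();
        r.run();
        return System.nanoTime() - t0;
    }

    public static void main(String[] args) {
        Random rng = new Random(42);
        List<Integer> data = new ArrayList<>();
        for (int i = 0; i < 100_000; i++) data.add(rng.nextInt());

        LinkedList<Integer> linked = new LinkedList<>(data);
        ArrayList<Integer> array = new ArrayList<>(data);

        long tLinked = timeNanos(() -> Collections.sort(linked));
        long tArray  = timeNanos(() -> Collections.sort(array));

        System.out.printf("linked: %d us, array: %d us%n",
                          tLinked / 1000, tArray / 1000);
        // sanity check: both end up sorted the same way
        System.out.println(linked.equals(array));
    }
}
```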