Balanced Search Trees Made Simple (In C#)

Balanced Binary Search Trees (BST) is nothing new. The two most popular variants of them are AVL trees and Red-Black trees. Apart from standard textbooks on algorithms and data structures (like Cormen et al.) a great resource on this is GNU libavl. Despite its name it deals not only with AVL trees but with Red-Black trees too. Also it is not quite a library. It is a complete book with detailed discussion of BSTs in general and accompinied with the working C-library.

Even though BSTs are known for decades and the afore-mentioned two types of BSTs found widespread use in real-world software the computer scientists keep inventing new BSTs. For instance, just a few years ago there was a lot of research papers on Relaxed-Balance BSTs. Supposedly this type of trees should work better in case of parallel insert and search access.

However here I am going to provide an implementation of a BST that is not very new. It is described in Arne Andersson's paper that dates back to 1993. Sometimes this variant of tree is called AA-tree upon the initials of its inventor. I am not sure if this is correct as there is yet another balanced tree variant invented by him. However as I am going to stick to the former tree here I will call it AA-tree too.

The distinguishing feature of the AA-tree is that it has the most simple implementation among balanced trees. Its performance might be somewhat worse than the performance of a Red-Black tree. However it might still be useful in a case when you need to quickly hatch together a tree and the time spent on debugging more complex code outweights a few percents of performance advantage. Some may say that when things like STL is available who cares about writing binary trees themselves. In the vast majority of cases this is absolutely true. There is no need to re-invent the wheel. However sometimes unexpected happens. Enough to say that I have used AA-trees once in a real project.

The original paper provides Pascal implementation of the tree. This is unfortunate as Real Programmers Don't Use Pascal. But not everything is lost, there is very nice and funny (well, funny only for geek-types perhaps) Eternally Confuzzled web-site that has an AA-tree tutorial. By itself the tutorial is very good. It describes exactly what AA-trees are. It provides insight into their internal working that might be very useful for programmers as Andersson's paper is perhaps a little bit too dry.

Another thing that the tutorial demonstrates very well is the sort of things the Real C-Programmers might do to optimize the code. The tutorial provides few different C-implementaions of AA-trees. In a series of gradual improvements it comes up with the code that much better than the original one from the point of a view of any competent C-programmer. However this code looks more complex. I would say if one cares about performance more than about simplicity then the right thing would be to start with a Red-Black tree in the first place.

Unfortunately, the Wikipedia article on AA-trees also provides an "optimized" C-code that does not resemble the original Pascal code. So I am here to fix this. [Update: it seems that now wikipedians fixed this too] I will provide my version of AA-trees. But I've chosen C# as the implementation language.

And I am sorry I provide only the code. I will be unable to explain AA-trees any better than the Eternally Confuzzled tutorial does.

C# Implementation

First of all, the supplementary Node class definition. A typical tree node, nothing special.

        private class Node
        {
                // node internal data
                internal int level;
                internal Node left;
                internal Node right;

                // user data
                internal TKey key;
                internal TValue value;

                // constuctor for the sentinel node
                internal Node()
                {
                        this.level = 0;
                        this.left = this;
                        this.right = this;
                }

                // constuctor for regular nodes (that all start life as leaf nodes)
                internal Node(TKey key, TValue value, Node sentinel)
                {
                        this.level = 1;
                        this.left = sentinel;
                        this.right = sentinel;
                        this.key = key;
                        this.value = value;
                }
        }

The Node class is defined inside the AATree class and is not externally visible. The AATree class is generic. It has two type parameters: TKey and TValue. The TKey type is the one used to order tree nodes. It has to implement IComparable interface. Now let's see the AATree class, its member fields and the constructor that initializes them.

public class AATree<TKey, TValue> where TKey : IComparable<TKey>
{
        Node root;
        Node sentinel;
        Node deleted;

        public AATree()
        {
                root = sentinel = new Node();
                deleted = null;
        }

We are ready for the methods that implement AA-tree algorithm as it presented in the Andersson's paper. They very closely match the original Pascal procedures.

        private void Skew(ref Node node)
        {
                if (node.level == node.left.level) {
                        // rotate right
                        Node left = node.left;
                        node.left = left.right;
                        left.right = node;
                        node = left;
                }
        }

        private void Split(ref Node node)
        {
                if (node.right.right.level == node.level) {
                        // rotate left
                        Node right = node.right;
                        node.right = right.left;
                        right.left = node;
                        node = right;
                        node.level++;
                }
        }

        private bool Insert(ref Node node, TKey key, TValue value)
        {
                if (node == sentinel) {
                        node = new Node(key, value, sentinel);
                        return true;
                }

                int compare = key.CompareTo(node.key);
                if (compare < 0) {
                        if (!Insert(ref node.left, key, value))
                                return false;
                } else if (compare > 0) {
                        if (!Insert(ref node.right, key, value))
                                return false;
                } else {
                        return false;
                }

                Skew(ref node);
                Split(ref node);

                return true;
        }

        private bool Delete(ref Node node, TKey key)
        {
                if (node == sentinel) {
                        return (deleted != null);
                }

                int compare = key.CompareTo(node.key);
                if (compare < 0) {
                        if (!Delete(ref node.left, key))
                                return false;
                } else {
                        if (compare == 0)
                                deleted = node;
                        if (!Delete(ref node.right, key))
                                return false;
                }

                if (deleted != null) {
                        deleted.key = node.key;
                        deleted.value = node.value;
                        deleted = null;
                        node = node.right;
                } else if (node.left.level < node.level - 1
                        || node.right.level < node.level - 1) {
                        --node.level;
                        if (node.right.level > node.level)
                                node.right.level = node.level;
                        Skew(ref node);
                        Skew(ref node.right);
                        Skew(ref node.right.right);
                        Split(ref node);
                        Split(ref node.right);
                }

                return true;
        }

The most notable difference is the Delete method uses three-way compare whereas the original paper used two-way compare. This is a possible performance disadvantage in exchange to using generic IComporable interface.

And at last the source code file: aatree.cs

References

Wikipedia - AA tree