B-tree

B-trees are tree data structures most commonly found in database and filesystem implementations. They keep data sorted and allow insertions and deletions in amortized logarithmic time. Conceptually, B-trees grow from the bottom up as elements are inserted, whereas most binary trees grow downward.

The idea behind B-trees is that inner nodes can have a variable number of child nodes within some pre-defined range. As a result, B-trees need re-balancing less frequently than self-balancing binary trees such as AVL trees. The lower and upper bounds on the number of child nodes are fixed for a particular implementation. For example, in a 2-3 B-tree (often simply called a 2-3 tree), each inner node may have only 2 or 3 child nodes. A node is considered to be in an illegal state if it has an invalid number of child nodes.

Inner node structures

Each inner node has separation values which divide its sub-trees. For example, if an inner node has 3 child nodes (or sub-trees) then it must have 2 separation values a1 and a2. All values less than a1 will be in the leftmost sub-tree, values between a1 and a2 will be in the middle sub-tree, and values greater than a2 will be in the rightmost sub-tree.

Generally speaking, the separation values can simply be values stored in the tree.
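The rule for choosing a sub-tree can be sketched in a few lines of Python. This is only an illustration: the function name child_index and the list-of-separators representation are assumptions, not part of any particular implementation. A value equal to a separation value is not covered by the rule above; the sketch returns the index of the sub-tree to its left and leaves it to the caller to check for equality first.

```python
from bisect import bisect_left

def child_index(separators, value):
    """Return the index of the sub-tree that would hold `value`.

    `separators` is the sorted list of separation values of one inner node.
    A node with k separation values has k + 1 sub-trees, indexed 0..k.
    """
    # bisect_left counts how many separators are strictly less than `value`,
    # which is exactly the index of the sub-tree to descend into.
    return bisect_left(separators, value)

separators = [10, 20]   # a1 = 10, a2 = 20, so the node has three sub-trees
for v in (5, 15, 25):
    print(v, "-> sub-tree", child_index(separators, v))
```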

Steps for Deletion

If, after the desired node is removed, no inner node is in an illegal state, then the process is finished.
If some inner node is in an illegal state, then there are two possible cases:
1. Its sibling node (a child of the same parent node) can transfer one of its child nodes to the current node, returning it to a legal state. If so, the operation ends once the separation values in the parent and the two siblings have been updated.
2. Its sibling has no child to spare because it is at the lower bound too. In that case the two nodes are merged into a single node and the action moves to the parent node, since it has had a child node removed.

The process continues until a parent node remains in a legal state or until the root node is reached.
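The two cases can be sketched for a single level of the tree as follows. This is a minimal illustration under stated assumptions, not a complete deletion routine: the Node class, the MIN_CHILDREN constant and the function fix_underflow are hypothetical names, separation values are assumed to be stored in a keys list alongside the children, and the upper bound is assumed to be large enough that a merged node is legal (at least 2L - 1 children for lower bound L).

```python
MIN_CHILDREN = 2  # lower bound L; 2 matches a 2-3 tree (assumption)

class Node:
    def __init__(self, keys, children=None):
        self.keys = keys                  # separation values, sorted
        self.children = children or []    # empty list for a leaf

def fix_underflow(parent, i):
    """Repair parent.children[i], an inner node with too few children.

    Returns "borrowed" (case 1) or "merged" (case 2). After a merge the
    parent has lost a child and may itself need repairing by the caller.
    """
    node = parent.children[i]
    # Case 1a: the left sibling has a child to spare.
    if i > 0 and len(parent.children[i - 1].children) > MIN_CHILDREN:
        left = parent.children[i - 1]
        node.children.insert(0, left.children.pop())
        node.keys.insert(0, parent.keys[i - 1])   # old separator moves down
        parent.keys[i - 1] = left.keys.pop()      # new separator moves up
        return "borrowed"
    # Case 1b: the right sibling has a child to spare.
    if i + 1 < len(parent.children) and len(parent.children[i + 1].children) > MIN_CHILDREN:
        right = parent.children[i + 1]
        node.children.append(right.children.pop(0))
        node.keys.append(parent.keys[i])
        parent.keys[i] = right.keys.pop(0)
        return "borrowed"
    # Case 2: no sibling can help, so merge children j and j + 1.
    j = i - 1 if i > 0 else i
    left, right = parent.children[j], parent.children[j + 1]
    left.keys += [parent.keys.pop(j)] + right.keys
    left.children += right.children
    del parent.children[j + 1]
    return "merged"
```

A full deletion would call such a routine on the way back up from the leaf, repeating it at the parent whenever a merge leaves the parent below its own lower bound.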

Steps for Insertion

If, after the node is inserted into the appropriate position, no inner node is in an illegal state, then the process is finished.
If some node has more than the maximum number of child nodes, then it is split into two nodes, each with at least the minimum number of child nodes. The action then continues recursively in the parent node.

The action stops when either the parent node is in a legal state or the root is split into two nodes and a new root is inserted.
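The split-and-propagate behaviour can be sketched as a recursive insert. The code below is an illustrative sketch, not a reference implementation: Node, MAX_CHILDREN, insert and _insert are hypothetical names, values are stored directly as separation values, a node is treated as full once it holds MAX_CHILDREN values (one more than a legal node may hold), and duplicate values are silently ignored.

```python
from bisect import bisect_left

MAX_CHILDREN = 3  # upper bound U; 3 matches a 2-3 tree (assumption)

class Node:
    def __init__(self, keys=None, children=None):
        self.keys = keys or []            # separation values, sorted
        self.children = children or []    # empty list for a leaf

def _insert(node, value):
    """Insert into the sub-tree at `node`.

    Returns None if `node` stays legal, otherwise (separator, new_right_node)
    describing the split that the parent has to absorb.
    """
    i = bisect_left(node.keys, value)
    if i < len(node.keys) and node.keys[i] == value:
        return None                       # duplicate: nothing to do
    if node.children:
        split = _insert(node.children[i], value)
        if split is None:
            return None
        separator, right = split          # child split: absorb it here
        node.keys.insert(i, separator)
        node.children.insert(i + 1, right)
    else:
        node.keys.insert(i, value)
    if len(node.keys) < MAX_CHILDREN:     # at most U - 1 values: legal
        return None
    mid = len(node.keys) // 2             # middle value moves up
    separator = node.keys[mid]
    right = Node(node.keys[mid + 1:], node.children[mid + 1:])
    node.keys = node.keys[:mid]
    node.children = node.children[:mid + 1]
    return separator, right

def insert(root, value):
    """Insert `value` and return the (possibly new) root."""
    if root is None:
        return Node([value])
    split = _insert(root, value)
    if split is None:
        return root
    separator, right = split              # root split: the tree grows upward
    return Node([separator], [root, right])

root = None
for v in range(1, 8):
    root = insert(root, v)
```

Only the last step of insert creates a new level, which is why the tree grows from the root rather than from the leaves.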

Searching

Searching is performed much like a binary tree search: the separation values are followed down the tree until the value is found or the end of the tree is reached.
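A search along these lines might look like the following sketch. The Node class and the search function are hypothetical, and the sketch assumes that the values of the tree double as separation values and that a leaf is a node with an empty child list.

```python
from bisect import bisect_left

class Node:
    def __init__(self, keys, children=None):
        self.keys = keys                  # separation values, sorted
        self.children = children or []    # empty list for a leaf

def search(node, value):
    """Return True if `value` is stored somewhere below `node`."""
    while node is not None:
        # Find the first separation value that is >= value.
        i = bisect_left(node.keys, value)
        if i < len(node.keys) and node.keys[i] == value:
            return True                   # found in this node
        if not node.children:
            return False                  # reached the end of the tree
        node = node.children[i]           # descend into the matching sub-tree
    return False

tree = Node([10, 20], [Node([3, 7]), Node([12, 15]), Node([25, 30])])
print(search(tree, 15), search(tree, 20), search(tree, 8))
```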

Notes

Suppose L is the smallest number of children a node is allowed to have, and U is the largest. Then each node will always have between L and U children, inclusive, with one exception: the root node may have anywhere from 2 to U children. In other words, the root is exempt from the lower bound L and instead has a lower bound of its own (2). This allows the tree to hold small numbers of elements. A root with only one child makes no sense, since the subtree attached to that child could simply be attached to the root. Giving the root no children is also unnecessary, since a tree with no elements is typically represented as having no root node.
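The bounds can be written as a small check. The function name is_legal and its parameters are assumptions made for illustration, and the check applies to inner nodes only.

```python
def is_legal(num_children, lower, upper, is_root=False):
    """Check the child-count rule for one inner node.

    `lower` and `upper` are L and U from the text; the root uses 2 as its
    lower bound instead of L.
    """
    minimum = 2 if is_root else lower
    return minimum <= num_children <= upper

# With L = 3 and U = 5, two children are legal for the root only.
print(is_legal(2, 3, 5, is_root=True), is_legal(2, 3, 5))
```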

Robert Tarjan proved that the amortized number of splits/merges is 2.

External Links