Main Page | See live article | Alphabetical index

B-trees are tree data structures that are most commonly found in databases and filesystem implementations. B-trees keep data sorted and allow amortized logarithmic time insertion and deletions. Conceptually speaking B-trees grow from the bottom up as elements are inserted, whereas most binary trees generally grow down.

The idea behind B-trees is that inner nodes can have a variable number of child nodes within some pre-defined range. This causes B-trees to not need re-balancing frequently, unlike AVL trees. The lower and upper bounds on the number of child nodes are fixed for a particluar implementation. For example, in a 2-3 B-tree (often simply 2-3 tree), each node may have only 2 or 3 child nodes. A node is considered to be in an illegal state if it has an invalid number of child nodes.

Inner node structures

Generally speaking, the "separation values" can simply be the values of the tree.

Each inner node has separation values which divide its sub-trees. For example, if an inner node has 3 child nodes (or sub-trees) then it must have 2 separation values a1 and a2. All values less than a1 will be in the leftmost sub-tree, values between a1 and a2 will be in the middle sub-tree, and values greater than a2 will be in the rightmost sub-tree.

Steps for Deletion

The process continues until the parent node remains in a legal state or until the root node is reached.

Steps for Insertion

The action stops when either the node is in a legal state or the root is split into two nodes and a new root is inserted.

Searching

Searching is performed very similar to a binary tree search, simply by following the separation values until the value is found or the end of the tree is reached.

Notes

Suppose L is the least number of children a node is allowed to have, while U is the most number. Then each node will always have between L and U children, inclusively, with one exception: the root node may have anywhere from 2 to U children inclusively, or in other words, it is exempt from the lower bound restriction, instead having a lower bound of its own (2). This allows the tree to hold small numbers of elements. The root having one child makes no sense, since the subtree attached to that child could simply be attached to the root. Giving the root no children is also unnecessary, since a tree with no elements is typically represented as having no root node.

Robert Tarjan proved that the amortized number of splits/merges is 2.