The BST invariant
A binary search tree is a binary tree that satisfies the ordering property at every node: all keys in the left subtree are strictly less than the node's key, and all keys in the right subtree are strictly greater. This must hold not just for immediate children — it must hold for the entire subtree.
The payoff: to find a value, you never need to look at both sides. At each node, compare your target to the current value. Too small? Go left. Too large? Go right. The comparison halves the remaining candidates each step — the same logic as binary search, but the structure is dynamic: you can insert and delete without rebuilding the entire thing.
Search
Search follows the invariant mechanically. Start at the root. If the current node is NULL, the value isn't in the tree. If it matches, done. Otherwise recurse into the appropriate child.
Insert
Insertion finds the correct leaf position and attaches a new node there. The recursive pattern returns the (possibly new) subtree root, which lets parent pointers stay correct without passing a pointer-to-pointer.
Delete
Deletion is the hardest operation. Three cases:
- Leaf node — just free it, return NULL to parent.
- One child — replace the node with its only child.
- Two children — find the in-order successor (smallest node in right subtree), copy its value into the current node, then delete the successor from the right subtree.
The degenerate case
If you insert sorted data into a plain BST — say, 1, 2, 3, 4, 5 — every new node goes to the right. The tree becomes a right-leaning chain, height n, and every operation costs O(n). You've built an expensive linked list.
Self-balancing trees
A self-balancing BST automatically restructures itself on insert and delete to keep height O(log n) regardless of input order. The two most common variants are:
- AVL tree — after every insert or delete, checks the height difference between left and right subtrees at each affected node. If the difference exceeds 1, it performs a rotation — a local pointer rearrangement that reduces height while preserving the BST invariant. AVL trees are strictly balanced, so lookups are fast.
-
Red-black tree — each node is colored red or black, and a set of
color rules enforces that no path from root to leaf is more than twice as long as any other.
Red-black trees are less strictly balanced than AVL trees but require fewer rotations
on insert and delete. They're what
std::mapin C++ and Java'sTreeMapuse under the hood.
BST vs hash table
Hash tables give O(1) average lookup. BSTs give O(log n) worst-case. Why would you ever use a BST?
- Ordered iteration — in-order traversal yields sorted keys. Hash tables have no ordering.
- Range queries — find all keys between 10 and 50. A BST can do this efficiently; a hash table cannot.
- Predecessor / successor — find the largest key smaller than x. O(log n) in a BST, O(n) in a hash table.
- No hash function required — any comparable type works directly.
A BST is O(log n) only while balanced — insert sorted data and it degenerates to O(n). Use a self-balancing variant in practice.