Algorithms & data structures · lesson 01

Big-O notation

Big-O is not about speed. It's about how the work scales as the input grows. A function that takes 1ms on 10 items might take 1 second on 10,000 — or 1 hour. The shape of growth is what matters.

in progress
10 min interactive

The question Big-O answers

You wrote two sorting programs. Both produce the correct output. On 100 elements, both finish in under a millisecond. On 100,000 elements, one takes 200ms and the other takes 3 minutes. Why?

The answer isn't the hardware. It isn't the language. It's the shape of growth. One algorithm does work proportional to n·log(n) — if the input doubles, the work barely more than doubles. The other does work proportional to n² — if the input doubles, the work quadruples. At 100 elements the difference is invisible. At 100,000 elements it's the difference between useful and unusable.

Big-O notation is the language we use to describe that shape precisely, independent of machine speed, language, or constant-factor optimizations.

The formal meaning

When we write f(n) = O(g(n)), we mean: there exist positive constants c and n₀ such that for all n > n₀, f(n) ≤ c·g(n). In plain terms: beyond some threshold input size, f(n) never exceeds a fixed multiple of g(n).

This is an upper bound. It says "grows no faster than." There are two companion notations:

  • O(g(n)) — upper bound. f grows no faster than g. "At most this bad."
  • Ω(g(n)) — lower bound. f grows no slower than g. "At least this much work."
  • Θ(g(n)) — tight bound. f is both O(g) and Ω(g). "Exactly this growth rate."

In practice, when engineers say "this algorithm is O(n log n)" they usually mean Θ(n log n) — tight, not just an upper bound. Context makes it clear.

💡
Every algorithm has a lower bound too. Comparison-based sorting has a proven Ω(n log n) lower bound — no matter how clever the algorithm, if it works by comparing elements, it can't do better than n log n comparisons in the worst case. That's why merge sort and heapsort are considered optimal.

Why constants are dropped

An algorithm doing 3n operations and one doing 100n operations are both O(n). This seems wrong at first — 100n is 33× more work than 3n. Why ignore that?

Because the constant is fixed, and n grows. At n = 1,000:

text
Algorithm A: 100n ops at n=1,000 → 100,000 ops Algorithm B: n² ops at n=1,000 → 1,000,000 ops Algorithm A is 10× faster here. But at n=1,000,000: Algorithm A: 100n → 100,000,000 ops (~0.1s at 1GHz) Algorithm B: n² → 1,000,000,000,000 ops (~16 minutes)

The constant-factor advantage of B over A (if B had a smaller constant) would never overcome the quadratic growth. Hardware and compiler optimizations can shrink constants — they cannot change the exponent. Pick the right complexity class first.

Lower-order terms are dropped for the same reason. 5n² + 3n + 100 is O(n²) because at large n, the 3n and 100 become negligible compared to 5n². At n = 10,000: 5n² = 500,000,000, 3n = 30,000, 100 = 100. The 3n is 0.006% of the total. It doesn't affect the shape.

The common complexity classes

text
O(1) constant — array index, hash table lookup, stack push/pop O(log n) logarithmic — binary search, balanced BST operations O(n) linear — single pass, linear search, array copy O(n log n) linearithmic — merge sort, heapsort, most fast sorts O(n²) quadratic — nested loops, bubble/insertion/selection sort O(n³) cubic — naive matrix multiplication O(2ⁿ) exponential — brute-force subsets, recursive Fibonacci O(n!) factorial — brute-force permutations, travelling salesman

Deriving complexity from code, step by step

The complexity usually follows from the loop structure. Here's the pattern for each case, with the reasoning spelled out.

c
// O(1) — fixed work regardless of n int first = arr[0]; // one array access int sum = a + b + c; // two additions // n doesn't appear anywhere — the work is constant // O(n) — work grows linearly with n for (int i = 0; i < n; i++) { sum += arr[i]; // body runs exactly n times } // O(n) still — two loops, but sequential, not nested for (int i = 0; i < n; i++) process(arr[i]); // n ops for (int i = 0; i < n; i++) print(arr[i]); // n ops // total: 2n → O(n). Drop the constant. // O(n²) — nested loops, both run n times for (int i = 0; i < n; i++) // n iterations for (int j = 0; j < n; j++) // n iterations each compare(arr[i], arr[j]); // body runs n×n = n² times // O(log n) — halving the search space each step while (lo <= hi) { int mid = lo + (hi - lo) / 2; if (arr[mid] == target) return mid; if (arr[mid] < target) lo = mid + 1; else hi = mid - 1; // after each iteration the range halves: // n → n/2 → n/4 → ... → 1 (log₂n steps total) }

Tricky cases that trip people up

The triangular loop

When the inner loop starts at i+1 instead of 0, the body doesn't run n² times — it runs fewer. But the complexity is still O(n²).

c
for (int i = 0; i < n; i++) for (int j = i + 1; j < n; j++) // inner starts at i+1 compare(arr[i], arr[j]); // Iterations: (n-1) + (n-2) + ... + 1 = n(n-1)/2 // That's n²/2 - n/2. Drop lower terms and constants: O(n²).

This pattern — comparing every pair exactly once — shows up in bubble sort. It looks "more efficient" than the full n² nested loop, and it does half the work, but it's in the same complexity class.

Logarithmic loops

A loop that divides its counter rather than subtracting runs in O(log n) time, even if it doesn't look like a search.

c
for (int i = n; i >= 1; i /= 2) { // i halves each iteration process(i); } // i = n, n/2, n/4, ..., 1 → log₂n iterations for (int i = 1; i <= n; i *= 2) { // i doubles each iteration process(i); } // i = 1, 2, 4, ..., n → log₂n iterations. Same shape.

Nested loops where one bound is fixed

c
for (int i = 0; i < n; i++) for (int j = 0; j < 32; j++) // fixed, not n process(arr[i][j]); // Inner loop runs 32 times always — it's a constant. // Total: 32n operations → O(n). Not O(n²).
⚠️
Two loops don't automatically mean O(n²). Sequential loops add (O(n) + O(n) = O(n)). Nested loops multiply (O(n) × O(n) = O(n²)). The question is: does the inner loop run n times per outer iteration, or a fixed number of times?

Best, average, and worst case

The same algorithm can have different complexities depending on the input. These are three distinct analyses — don't conflate them.

  • Worst case — the input that forces the maximum work. Linear search through n elements: the target is at the end (or missing) → O(n).
  • Best case — the input that minimizes work. Linear search: target is at index 0 → O(1). This is rarely useful information.
  • Average case — expected work over a random input distribution. Linear search with target equally likely at any position → O(n/2) = O(n).

Quicksort illustrates why this matters: its average case is O(n log n) — roughly as fast as possible for a comparison sort. Its worst case is O(n²) — as slow as bubble sort. When people say "quicksort is fast" they mean average case. When they say "use merge sort for guaranteed performance" they mean worst case.

💡
Big-O without qualification usually means worst case. "This algorithm is O(n²)" means worst-case O(n²) unless otherwise stated. When someone specifies "average case O(n log n)" they're being careful — pay attention.

Recursive complexity: the recursion tree

Recursion makes complexity harder to see because the work is spread across call frames. The most reliable approach: draw the recursion tree, then count total work across all levels.

Merge sort on an array of n elements:

text
Level 0: [n elements] — merge costs n Level 1: [n/2] [n/2] — two merges, each n/2 → n total Level 2: [n/4][n/4][n/4][n/4] — four merges, each n/4 → n total ... Level k: 2ᵏ arrays of n/2ᵏ elements → n total work per level How many levels? We stop when n/2ᵏ = 1 → k = log₂n levels. Total work: n operations × log n levels = O(n log n).

The key insight: at every level of the recursion tree, we do O(n) work in total. The number of levels is log n (because we halve each time). So the total is O(n log n).

Contrast with naive recursive Fibonacci:

c
int fib(int n) { if (n <= 1) return n; return fib(n - 1) + fib(n - 2); } // Each call spawns 2 more calls. The tree has ~2ⁿ nodes. // fib(50) makes ~2^50 ≈ 10¹⁵ calls. Completely impractical. // With memoization, it becomes O(n). Same math, better memory.

Space complexity

Big-O applies to memory too. Count the extra memory used beyond the input itself.

  • O(1) space — a fixed number of variables regardless of n. Bubble sort: two pointers, one swap variable. Input array is not "extra."
  • O(n) space — auxiliary storage proportional to input. Merge sort's merge step: needs a temporary array of size n to merge into.
  • O(log n) space — recursion stack depth. Quicksort (average case): log n recursive calls on the stack simultaneously. Each frame is O(1), so total stack space is O(log n).
  • O(n) space (recursion) — recursion stack depth times frame size. Merge sort: log n levels deep, each O(n) for the merge buffer → O(n) total.
⚠️
Deep recursion uses stack space. A recursive function that calls itself n times deep uses O(n) stack space, even if each frame is tiny. On most systems the stack is 1–8 MB. Recursive descent into a 100,000-element tree can overflow it.

Amortized complexity

Sometimes a single operation is expensive, but it's rare enough that the average cost per operation stays low. This is amortized analysis — spreading the cost of expensive operations across many cheap ones.

The canonical example is a dynamic array (C++'s vector, Python's list). Appending to it is usually O(1) — write one element. But when the array fills, it reallocates to 2× the capacity, copying all existing elements. That realloc is O(n).

Expensive? Yes. But how often does it happen? Starting at capacity 1 and doubling: reallocs happen at sizes 1, 2, 4, 8, ..., n. Total copying: 1 + 2 + 4 + ... + n = 2n. Spread that 2n across n append operations → 2 copies per append on average. The amortized cost is O(1) per append.

💡
Amortized ≠ average case. Average case assumes a random input distribution. Amortized analysis is a worst-case guarantee spread over a sequence of operations — no matter what inputs arrive, the total cost stays bounded.

The real numbers

text
n=100 n=10,000 n=1,000,000 n=1,000,000,000 O(1) 1 1 1 1 O(log n) 7 13 20 30 O(n) 100 10,000 1,000,000 1,000,000,000 O(n log n) 700 130,000 20,000,000 30,000,000,000 O(n²) 10,000 100,000,000 10¹² 10¹⁸ O(2ⁿ) 10²⁷ (skip) skip skip skip At 1 billion ops/second (1GHz processor): O(n log n) on 1M elements → ~20ms ✓ usable O(n²) on 1M elements → ~16 min ✗ unusable O(n²) on 1B elements → ~31 years ✗ absurd

When complexity class doesn't tell the whole story

Big-O tells you the shape. Constants and cache behavior fill in the rest.

Insertion sort is O(n²). For arrays smaller than ~20–50 elements, it's faster than O(n log n) merge sort. Why? Merge sort has overhead: recursive calls, auxiliary array allocation, indirect memory access. Insertion sort does simple adjacent swaps in a tight loop — hardware prefetchers love it, and branch predictors handle it well.

This is why qsort in most standard libraries isn't pure quicksort — it switches to insertion sort for small partitions. The complexity class is O(n log n) in practice because the O(n²) parts only handle tiny n.

Algorithm selection in practice:
1. Pick the right complexity class first — this is the most important decision.
2. Then consider constants and cache behavior for the actual input sizes you'll see.
3. Profile before micro-optimizing. Your assumptions about which constant is larger are often wrong.

Growth curves

The chart below plots all five main classes on the same axes (n = 0–20, y capped for readability). O(n log n) and O(n²) exit the top — that's the point. At scale, they dwarf everything below them.

one-line takeaway

Big-O measures how work scales with input size — drop the constants, keep the shape, and always ask which case the analysis covers.