Merge Sort — Algorithms & DS

Divide and conquer

Merge sort splits the array in half, recursively sorts each half, then merges the two sorted halves into one sorted result. The key insight is that merging two sorted arrays is O(n) — you walk both simultaneously, always taking the smaller front element.

c

void merge(int *arr, int lo, int mid, int hi) {
    int n = hi - lo;
    int *tmp = malloc(n * sizeof(int));
    int i = lo, j = mid, k = 0;
    while (i < mid && j < hi)
        tmp[k++] = arr[i] <= arr[j] ? arr[i++] : arr[j++];
    while (i < mid) tmp[k++] = arr[i++];
    while (j < hi)  tmp[k++] = arr[j++];
    memcpy(arr + lo, tmp, n * sizeof(int));
    free(tmp);
}

void merge_sort(int *arr, int lo, int hi) {
    if (hi - lo <= 1) return;  // base case: 0 or 1 element
    int mid = lo + (hi - lo) / 2;
    merge_sort(arr, lo, mid);
    merge_sort(arr, mid, hi);
    merge(arr, lo, mid, hi);
}
          

Why O(n log n)

The recursion splits the array in half each time, producing log₂(n) levels. At each level, the total work across all merges is O(n) — every element is visited once during merging. Total: O(n) work × O(log n) levels = O(n log n). This holds in best, average, and worst case — the split is always even.

The O(n) space cost

Merge sort needs auxiliary memory for the merge step — you can't merge two sorted subarrays in-place without O(n²) work. The tmp buffer above allocates and frees on every merge call, which is fine for correctness but expensive in practice. Real implementations allocate one auxiliary buffer of size n upfront and reuse it.

✅

Merge sort is stable. The merge uses <= when comparing, which means equal elements from the left half appear before equal elements from the right half — their original order is preserved. This is why databases and spreadsheets often use merge sort variants.

one-line takeaway

Merge sort guarantees O(n log n) by splitting in half and merging in linear time — at the cost of O(n) extra space.