Systems programming · lesson 01

The process memory map

When your program runs, the OS gives it the illusion of owning all of memory. That memory is divided into named regions — text, data, BSS, heap, stack — each with different rules about lifetime, growth direction, and access permissions.

One address space, many regions

A running process sees a large flat array of addresses — typically 48 bits on x86-64, giving 256 terabytes of virtual address space. But not all of that space is used the same way. The OS and linker carve it into regions, each with a specific purpose and protection flags.

You've been writing into these regions all along. Every global variable, every stack frame, every malloc call — each lands in a specific region. Understanding where things live explains why some memory errors crash immediately while others corrupt silently, and why some addresses are valid and others are not.

The text segment

The text segment (also called the code segment) contains the compiled machine instructions of your program. It's loaded from the executable at startup and is typically read-only and executable. Read-only because the program's code shouldn't change at runtime; executable because the CPU needs to fetch instructions from it.

Writing to the text segment causes a segfault. This is intentional — it prevents both accidental corruption and certain classes of code injection attacks.

The data and BSS segments

Global and static variables live in one of two segments depending on whether they have an initializer:

  • Data segment: initialized globals and statics, for example int x = 42; at file scope. The initial value is stored in the executable and copied in at load time.
  • BSS segment: globals and statics with no initializer (and, with most toolchains, those explicitly initialized to zero), for example int y; at file scope. The executable records only the size; the OS zero-fills this region at load time. This is how implementations satisfy C's guarantee that global and static variables start out zero-initialized.
💡
BSS stands for "Block Started by Symbol" — a name from 1950s assembly. Zero-filling saves executable size: instead of storing millions of zeros in the binary, the OS just maps zeroed pages when it loads the BSS section.

The heap

The heap is where dynamically allocated memory lives: everything you get from malloc, calloc, or realloc. Traditionally it starts just above the BSS segment (on modern Linux, address-space layout randomization inserts a random gap) and grows upward toward higher addresses as you allocate more.

The heap is managed by the C runtime library (libc), which itself uses the brk or mmap syscalls to request pages from the OS. malloc is not a syscall — it's a user-space library function that maintains its own free lists and only goes to the OS when it needs more pages.

⚠️
Heap memory has no automatic lifetime. Unlike stack frames that vanish when a function returns, heap memory stays allocated until you explicitly call free. Forget to free it: memory leak. Free it and keep a pointer: dangling pointer. Free it twice: undefined behavior that can corrupt the heap's internal bookkeeping.

The stack

The stack holds function call frames — local variables, function parameters, saved registers, and the return address. It starts at a high address and grows downward toward lower addresses with each function call. When the function returns, the frame is popped and that memory is available for the next call.

Allocating on the stack is as cheap as allocation gets: no allocation calls, no bookkeeping, just a decrement of the stack pointer. But the stack has a limited maximum size (the default limit is typically 8 MB on Linux). Exceed it with deep recursion or large local arrays and you get a stack overflow: the stack runs into an unmapped guard page and the OS sends SIGSEGV.

c
#include <stdlib.h>

int global_init = 42;          // data segment
int global_uninit;             // BSS segment (zero-initialized)
static int file_scoped = 7;    // data segment, internal linkage

int main(void) {
    int local = 10;            // stack
    int *heap = malloc(64);    // heap (the pointer itself is on the stack)
    free(heap);
    return 0;
}

Memory-mapped region and kernel space

Between the heap and the stack, the OS maps additional regions: shared libraries (libc, for example) are memory-mapped into your address space so that all processes can share the same physical pages for read-only code. The mmap syscall also lets your program map files or anonymous regions here directly.

The top of the virtual address space is reserved for the kernel. On x86-64 Linux with 48-bit addresses, the upper 128 TB is kernel space, and your process cannot access it directly. Attempting to read or write a kernel address causes an immediate segfault. Kernel code runs at a different privilege level; the only way for your program to request work across that boundary is through a syscall.

💡
The heap and stack grow toward each other. Traditionally, the heap grows upward from low addresses and the stack grows downward from high addresses. In a small enough address space they could collide, but with modern virtual memory this is extremely unlikely in practice: the address space is huge and the OS enforces limits on both.

Inspecting your own process map

On Linux, /proc/self/maps shows your process's current memory map at runtime. Each line shows an address range, permissions, and what it's mapped to:

shell
$ cat /proc/self/maps
55a3b2000000-55a3b2001000 r-xp ... /usr/bin/cat   ← text (r-x: read+execute)
55a3b2200000-55a3b2201000 r--p ... /usr/bin/cat   ← rodata
55a3b2201000-55a3b2202000 rw-p ... /usr/bin/cat   ← data+BSS
55a3b4000000-55a3b4021000 rw-p ... [heap]
7fff9a000000-7fff9a021000 rw-p ... [stack]
one-line takeaway

Every variable in your program lives in a named region — text, data, BSS, heap, or stack — and each region has rules about lifetime, growth direction, and who manages it.