Systems programming · lesson 08

Threads

A thread is an independent execution context within a process. All threads in a process share the same heap, globals, and file descriptors — but each has its own stack and registers. That shared heap is both the power and the danger.

Threads vs processes

Forking a process creates a separate address space whose pages are shared copy-on-write with the parent until one side writes to them. Communication between processes requires explicit IPC (pipes, sockets, shared memory). Creating a thread is cheaper: the new thread shares the same virtual address space, heap, and file descriptors. It gets its own stack (carved from the shared address space) and its own set of registers (saved and restored on each context switch).

The tradeoff: isolation vs. cost. Processes are isolated — a crash in one doesn't affect others. Threads are cheaper to create and communicate through shared memory, but one thread can corrupt data that other threads depend on.
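The isolation difference can be observed directly. The sketch below is illustrative and not part of the lesson's code: counter, bump, counter_after_thread, and counter_after_fork are hypothetical names. One helper increments a global from a second thread, the other from a forked child, and each reports what the caller sees afterwards.

```c
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int counter = 0;   /* a global: one copy per address space */

/* Thread entry point: increments the global in the caller's address space. */
static void *bump(void *arg) {
    (void)arg;
    counter++;
    return NULL;
}

/* Returns counter as the caller sees it after a thread increments it,
 * or -1 if the thread could not be created or joined. */
int counter_after_thread(void) {
    pthread_t t;
    counter = 0;
    if (pthread_create(&t, NULL, bump, NULL) != 0)
        return -1;
    if (pthread_join(t, NULL) != 0)
        return -1;
    return counter;       /* the thread's write is visible here */
}

/* Returns counter as the parent sees it after a forked child increments
 * its own copy, or -1 if fork or waitpid failed. */
int counter_after_fork(void) {
    counter = 0;
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        counter++;        /* touches only the child's copy-on-write page */
        _exit(0);
    }
    if (waitpid(pid, NULL, 0) < 0)
        return -1;
    return counter;       /* the parent's copy is untouched */
}
```

Under these assumptions counter_after_thread() returns 1, because the thread wrote to the caller's memory, while counter_after_fork() returns 0, because the child incremented its own private copy.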

Creating threads with pthreads

c
#include <pthread.h>
#include <stdio.h>

// Thread entry point: arg points to this worker's id.
void *worker(void *arg) {
    int id = *(int *)arg;
    printf("thread %d running\n", id);
    return NULL;
}

int main(void) {
    pthread_t t1, t2;
    int id1 = 1, id2 = 2;

    pthread_create(&t1, NULL, worker, &id1);
    pthread_create(&t2, NULL, worker, &id2);

    pthread_join(t1, NULL);  // wait for t1 to finish
    pthread_join(t2, NULL);  // wait for t2 to finish
    return 0;
}
// compile with: gcc -pthread prog.c

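pthread_create takes a pointer to a thread handle, optional attributes (NULL for the defaults), a function pointer, and a single void * argument. The function runs in the new thread. Like most pthread calls, pthread_create returns 0 on success and an error number on failure rather than setting errno. pthread_join blocks until that thread exits and, if its second argument is non-NULL, stores the thread's return value there. A joinable thread that is never joined or detached (pthread_detach) is a resource leak: its stack and exit status stay allocated after it finishes.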
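To make the return-value path concrete, here is a minimal sketch, assuming the result is passed back on the heap. The names square and square_in_thread are hypothetical, chosen for illustration. It also checks the error numbers that pthread_create and pthread_join return.

```c
#include <pthread.h>
#include <stdlib.h>

/* Thread entry point: squares its input and returns the result on the heap. */
static void *square(void *arg) {
    int n = *(int *)arg;
    int *result = malloc(sizeof *result);
    if (result != NULL)
        *result = n * n;
    return result;        /* handed to whoever calls pthread_join */
}

/* Runs square(n) in a new thread and stores the result in *out.
 * Returns 0 on success, -1 on any failure. */
int square_in_thread(int n, int *out) {
    pthread_t t;
    void *ret = NULL;

    if (pthread_create(&t, NULL, square, &n) != 0)
        return -1;
    /* Joining before we return also keeps &n valid for the thread. */
    if (pthread_join(t, &ret) != 0)
        return -1;
    if (ret == NULL)      /* malloc failed inside the thread */
        return -1;

    *out = *(int *)ret;
    free(ret);            /* the joiner owns the returned allocation */
    return 0;
}
```

Returning a heap pointer is one common choice; the thread must not return the address of one of its own locals, since its stack is gone once it exits.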

⚠️
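Don't pass a stack address to a thread that outlives the stack frame. In the example above, &id1 points to a local variable in main. This is safe because main joins both threads before returning, so id1 and id2 stay alive for as long as the workers can read them. The danger is in any other function: if it returned while its thread was still running, the local would be gone and the thread would read garbage. (Returning from main is a different case: it ends the whole process, and every thread with it.)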
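One common way to sidestep the lifetime problem is to give each thread its own heap-allocated argument and let the thread free it. The sketch below assumes that pattern; NTHREADS, seen, worker, and start_workers are hypothetical names. It also avoids the classic mistake of passing the address of a loop variable, which every thread would then share.

```c
#include <pthread.h>
#include <stdlib.h>

#define NTHREADS 4

static int seen[NTHREADS];   /* each thread writes only its own slot */

/* Thread entry point: takes ownership of a heap-allocated id. */
static void *worker(void *arg) {
    int id = *(int *)arg;
    free(arg);               /* the thread owns its argument */
    seen[id] = 1;
    return NULL;
}

/* Starts NTHREADS workers, each with a private copy of its id.
 * Returns how many workers recorded that they ran. */
int start_workers(void) {
    pthread_t threads[NTHREADS];
    int started = 0;

    for (int i = 0; i < NTHREADS; i++)
        seen[i] = 0;

    for (int i = 0; i < NTHREADS; i++) {
        int *id = malloc(sizeof *id);
        if (id == NULL)
            break;
        *id = i;             /* copy the value; do not pass &i */
        if (pthread_create(&threads[i], NULL, worker, id) != 0) {
            free(id);        /* no thread took ownership */
            break;
        }
        started++;
    }

    for (int i = 0; i < started; i++)
        pthread_join(threads[i], NULL);

    int count = 0;
    for (int i = 0; i < NTHREADS; i++)
        count += seen[i];
    return count;
}
```

Because the argument lives on the heap, it stays valid no matter which stack frames have returned by the time the thread reads it.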

What each thread owns vs shares

per-thread (private)     | shared across all threads
-------------------------|----------------------------
stack                    | heap (malloc/free)
registers (including PC) | global and static variables
errno                    | file descriptors
signal mask              | memory mappings

💡
errno is per-thread on POSIX systems. If it were a true global, two threads doing I/O simultaneously would overwrite each other's error codes. In practice errno is a macro that expands to a per-thread lvalue, typically backed by thread-local storage (TLS), so each thread gets its own errno.
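A small check can make this visible. The sketch below compares the address of errno in two threads; errno_location and errno_is_per_thread are hypothetical names used only for this illustration. The address is passed back as an integer so nothing dereferences it after the thread exits.

```c
#include <errno.h>
#include <pthread.h>
#include <stdint.h>

/* Thread entry point: records where this thread's errno lives. */
static void *errno_location(void *arg) {
    uintptr_t *out = arg;
    *out = (uintptr_t)&errno;   /* errno is an lvalue, so & is allowed */
    return NULL;
}

/* Returns 1 if a second thread's errno is a different object from the
 * caller's, 0 if it is the same object, -1 if the thread failed. */
int errno_is_per_thread(void) {
    pthread_t t;
    uintptr_t other = 0;
    uintptr_t mine = (uintptr_t)&errno;

    if (pthread_create(&t, NULL, errno_location, &other) != 0)
        return -1;
    if (pthread_join(t, NULL) != 0)
        return -1;
    return other != mine;
}
```

On a POSIX system the two addresses differ, so a failing call in one thread cannot overwrite the error code another thread is about to inspect.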

Thread-local storage

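__thread (a GCC/Clang extension) or _Thread_local (C11, also spelled thread_local via <threads.h>) declares a variable with a separate copy per thread. Each copy is initialized independently when its thread starts, and the initializer must be a constant expression. This is useful for per-thread caches or per-thread error state.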

c
static __thread int call_count = 0;

void do_work(void) {
    call_count++;   // each thread increments its own copy
}
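To see that the copies really are separate, the following sketch runs do_work from two threads with different workloads and has each thread report its own total. It uses the C11 spelling _Thread_local; run and count_per_thread are hypothetical helper names.

```c
#include <pthread.h>

static _Thread_local int call_count = 0;   /* one copy per thread */

static void do_work(void) {
    call_count++;        /* increments the calling thread's copy only */
}

/* Thread entry point: *arg holds how many times to call do_work();
 * on exit it is overwritten with this thread's final call_count. */
static void *run(void *arg) {
    int *n = arg;
    int iterations = *n;
    for (int i = 0; i < iterations; i++)
        do_work();
    *n = call_count;
    return NULL;
}

/* Runs two threads that call do_work() a and b times respectively and
 * stores each thread's own count. Returns 0 on success, -1 on failure. */
int count_per_thread(int a, int b, int *count_a, int *count_b) {
    pthread_t ta, tb;
    *count_a = a;
    *count_b = b;

    if (pthread_create(&ta, NULL, run, count_a) != 0)
        return -1;
    if (pthread_create(&tb, NULL, run, count_b) != 0) {
        pthread_join(ta, NULL);
        return -1;
    }
    pthread_join(ta, NULL);
    pthread_join(tb, NULL);
    return 0;
}
```

Each thread starts from its own zero-initialized call_count, so the two reported totals match the two workloads exactly. With an ordinary static variable the increments would land on one shared counter and race with each other.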

When to use threads vs processes

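Use threads when you need shared memory for performance: parallel computation over a large dataset, or a server that needs low-latency handoff between workers. Use processes when you need isolation, so that a crash or bug in one cannot corrupt the others. Apache lets you choose through its MPMs (prefork handles each connection in a separate process; worker and event run threads inside several processes), while Nginx runs a small set of single-threaded worker processes, with optional thread pools for blocking I/O. Chrome runs page renderers in separate processes, roughly one per tab or site, largely for crash and security isolation.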

one-line takeaway

Threads are cheap concurrent execution within a process: each has its own stack and registers, but all share the heap and globals, so unsynchronized access to shared state that any thread writes is a bug.