Building a Mini Shell — Systems

What a shell actually does

Strip away the fancy features (tab completion, history, job control) and a shell is this loop: read a line, parse it into a command and arguments, fork a child, exec the command in the child, wait for it to finish, repeat. Output redirection is just a dup2 before exec. A pipe is two forked children, each holding one end of the same pipe. Built-in commands like cd run in the shell process itself, because a forked child that called chdir would change only its own working directory, not the shell's.
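
The point about cd can be checked directly. The sketch below is illustrative only: the helper name child_chdir_leaves_parent_alone is hypothetical and not part of the shell built in this article. It forks, calls chdir in the child, and then compares the parent's working directory before and after.

```c
#include <limits.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef PATH_MAX
#define PATH_MAX 4096
#endif

/* Hypothetical helper (not from the article): chdir() inside a forked
 * child, then check whether the parent's working directory moved.
 * Returns 1 if the parent is unaffected, 0 if it changed, -1 on error. */
int child_chdir_leaves_parent_alone(const char *dir) {
    char before[PATH_MAX], after[PATH_MAX];

    if (!getcwd(before, sizeof before)) return -1;

    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        /* Child: this changes only the child's own working directory. */
        _exit(chdir(dir) == 0 ? 0 : 1);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0) return -1;
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;  /* chdir failed in the child */

    if (!getcwd(after, sizeof after)) return -1;
    return strcmp(before, after) == 0;
}
```

Calling child_chdir_leaves_parent_alone("/") should return 1: the child moved, the parent did not. That is why the shell has to call chdir itself rather than fork first.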

Part 1: the main loop

c

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/wait.h>

#define MAXARGS 64
#define MAXLINE 1024

void run_command(char *line);  // defined in Part 2

int main(void) {
    char line[MAXLINE];

    while (1) {
        write(STDOUT_FILENO, "$ ", 2);

        if (!fgets(line, MAXLINE, stdin)) break;  // EOF (Ctrl+D)

        size_t len = strlen(line);
        if (len > 0 && line[len-1] == '\n')
            line[len-1] = '\0';
        if (line[0] == '\0') continue;

        run_command(line);
    }
    return 0;
}
          

Part 2: parsing and running a simple command

c

void run_command(char *line) {
    char *argv[MAXARGS];
    int   argc = 0;

    // tokenize by whitespace
    char *tok = strtok(line, " \t");
    while (tok && argc < MAXARGS - 1) {
        argv[argc++] = tok;
        tok = strtok(NULL, " \t");
    }
    argv[argc] = NULL;
    if (!argc) return;

    // built-in: cd
    if (strcmp(argv[0], "cd") == 0) {
        if (argv[1] && chdir(argv[1]) == -1)
            perror("cd");
        return;
    }

    pid_t pid = fork();
    if (pid < 0) { perror("fork"); return; }

    if (pid == 0) {
        execvp(argv[0], argv);
        perror(argv[0]);
        _exit(127);
    }

    waitpid(pid, NULL, 0);
}
          

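This handles simple commands with whitespace-separated arguments. execvp searches the directories listed in PATH whenever the command name contains no slash, so ls resolves to /bin/ls (or wherever it lives on your system) without you specifying the full path. If execvp returns at all, it failed; the child then reports the error and exits with 127, the conventional "command not found" status.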
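
run_command passes NULL to waitpid and so discards the child's exit status. A minimal sketch of how the status could be collected instead is shown here; the function name run_and_get_status is an assumption for illustration, and mapping a signal death to 128 + signal number follows the usual shell convention rather than anything this article defines.

```c
#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical variant of the fork/exec/wait step that reports the
 * child's exit status instead of discarding it.
 * Returns the exit code, 128 + signal number if the child was killed
 * by a signal, or -1 if fork/waitpid failed. */
int run_and_get_status(char *const argv[]) {
    pid_t pid = fork();
    if (pid < 0) { perror("fork"); return -1; }

    if (pid == 0) {
        execvp(argv[0], argv);   /* returns only on failure */
        perror(argv[0]);
        _exit(127);              /* same "not found" status as run_command */
    }

    int status;
    if (waitpid(pid, &status, 0) < 0) { perror("waitpid"); return -1; }

    if (WIFEXITED(status))   return WEXITSTATUS(status);
    if (WIFSIGNALED(status)) return 128 + WTERMSIG(status);
    return -1;
}
```

A shell would typically store this value so that later commands can inspect it, but that bookkeeping is left out here.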

Part 3: output redirection

To handle cmd > file, scan the argument list for >, pull out the filename, open it, and dup2 it over fd 1 before calling exec.

c

// in the child, before execvp:
for (int i = 0; argv[i]; i++) {
    if (strcmp(argv[i], ">") == 0 && argv[i+1]) {
        int fd = open(argv[i+1],
                      O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0) { perror(argv[i+1]); _exit(1); }
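        if (dup2(fd, STDOUT_FILENO) < 0) { perror("dup2"); _exit(1); }
        close(fd);       // fd 1 now refers to the file; the extra fd is not needed
        argv[i] = NULL;  // end argv here so exec never sees > or the filename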
        break;
    }
}
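
To see the redirection work end to end, the same open/dup2/close sequence can be wrapped in a standalone function. This is a sketch under assumptions: run_with_stdout_to is a hypothetical name, and it takes the target path as a parameter instead of scanning argv for >.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv with its stdout redirected to `path`.
 * Returns the child's exit code, or -1 on error. */
int run_with_stdout_to(char *const argv[], const char *path) {
    pid_t pid = fork();
    if (pid < 0) { perror("fork"); return -1; }

    if (pid == 0) {
        /* Child: point fd 1 at the file, then exec. The new program
         * inherits fd 1 and writes to the file without knowing it. */
        int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0) { perror(path); _exit(1); }
        if (dup2(fd, STDOUT_FILENO) < 0) { perror("dup2"); _exit(1); }
        close(fd);
        execvp(argv[0], argv);
        perror(argv[0]);
        _exit(127);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Because the redirection happens in the child after fork, the shell's own stdout is untouched and the next prompt still goes to the terminal.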
          

Part 4: pipes

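For cmd1 | cmd2, split the line on |, create a pipe, fork twice. The left child writes to the pipe's write end (via dup2); the right child reads from the read end. The parent must close both ends after forking: as long as any process still holds the write end open, the right-hand command never sees end-of-file.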

c

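void run_pipe(char *left, char *right) {
    int pfd[2];
    if (pipe(pfd) < 0) { perror("pipe"); return; }

    pid_t p1 = fork();
    if (p1 < 0) perror("fork");
    if (p1 == 0) {
        dup2(pfd[1], STDOUT_FILENO);   // stdout -> pipe write end
        close(pfd[0]); close(pfd[1]);
        exec_line(left);               // parse + exec, no return
        _exit(127);                    // safety net if exec_line ever returns
    }

    pid_t p2 = fork();
    if (p2 < 0) perror("fork");
    if (p2 == 0) {
        dup2(pfd[0], STDIN_FILENO);    // stdin <- pipe read end
        close(pfd[0]); close(pfd[1]);
        exec_line(right);
        _exit(127);
    }

    // parent: close both ends so the reader sees EOF when the writer exits
    close(pfd[0]); close(pfd[1]);
    if (p1 > 0) waitpid(p1, NULL, 0);
    if (p2 > 0) waitpid(p2, NULL, 0);
}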
          

What this shell doesn't handle

Real shells are much more complex. This mini shell omits:

Job control — Ctrl+Z suspending a foreground job, fg/bg, managing process groups and terminal ownership.
Signal handling — the shell should ignore SIGINT in the parent so Ctrl+C only kills the foreground child, not the shell itself.
Multiple pipes — a | b | c requires chaining pipe fds across three processes.
Quoting and escaping — "hello world" as a single argument, \n in strings.
Environment variables, globbing, history — each of these is a project in itself.
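
Of the items above, signal handling is the smallest to add. A minimal sketch follows, with assumed function names (shell_ignore_sigint, run_foreground) that do not appear elsewhere in this article. The detail that is easy to miss is that an ignored signal stays ignored across exec, so the child has to restore the default action before exec-ing.

```c
#include <signal.h>
#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical: call once at shell startup so that Ctrl+C no longer
 * terminates the shell process itself. */
void shell_ignore_sigint(void) {
    signal(SIGINT, SIG_IGN);
}

/* Hypothetical launch step: like the fork/exec/wait in run_command,
 * but the child restores default SIGINT handling first.
 * Returns the raw wait status, or -1 if fork/waitpid failed. */
int run_foreground(char *const argv[]) {
    pid_t pid = fork();
    if (pid < 0) { perror("fork"); return -1; }

    if (pid == 0) {
        /* SIG_IGN is inherited across fork and exec; reset it so the
         * command can be interrupted with Ctrl+C. */
        signal(SIGINT, SIG_DFL);
        execvp(argv[0], argv);
        perror(argv[0]);
        _exit(127);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0) return -1;
    return status;
}
```

This covers only SIGINT for a single foreground child. Job control proper also needs process groups and terminal ownership, which this sketch does not attempt.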

💡

Reading real shell source code is instructive. dash (the Debian Almquist shell, used as /bin/sh on Ubuntu) is about 14,000 lines and implements the POSIX shell language, including most of the features listed above. It's well-organized and readable. Compare its forkshell() to your fork, and redirect() to your dup2: the concepts are identical, just with all the edge cases filled in.

One-line takeaway

A shell is a loop of read → parse → fork → exec → wait, with pipes and redirection implemented entirely through file descriptor manipulation before exec.