Graphs vs trees
A tree is a connected graph with no cycles — exactly one path between any two nodes. A general graph drops both constraints. Nodes can have any number of edges. Edges can form cycles. The graph may not even be connected (some nodes may be unreachable from others).
Graphs model the real world: cities connected by roads, web pages connected by links, functions connected by call relationships, tasks connected by dependencies. The two traversal strategies — BFS and DFS — are how you explore any of these.
Representing a graph: adjacency list
The most common representation is an adjacency list: for each node, store a list of its neighbors. In C, this is typically an array of linked lists or an array of arrays.
matrix[u][v] = 1 if there's
an edge. It's O(1) to check if two nodes are connected, but O(n²) space regardless
of edge count. Adjacency lists use O(V + E) space — much better for sparse graphs
(few edges relative to nodes), which is the common case.
The visited set — why it's mandatory
On a tree, you can recurse without tracking visited nodes — there are no cycles, so you can never revisit a node. On a general graph, a cycle means you'll re-enter a node you've already processed, looping forever (or until stack overflow).
The fix is a visited[] boolean array. Mark a node before processing it,
and skip any neighbor that's already marked.
Breadth-first search (BFS)
BFS explores level by level. Start at the source node, visit all its neighbors, then all their unvisited neighbors, and so on. The key data structure is a queue — FIFO order ensures you finish the current level before moving to the next.
BFS finds the shortest path (in number of edges) from the source to any reachable node, because it visits nodes in order of their distance from the source.
Depth-first search (DFS)
DFS goes as deep as possible down one path before backtracking to explore alternatives. It naturally maps to recursion — the call stack acts as the implicit stack. It can also be written iteratively using an explicit stack.
DFS doesn't guarantee shortest paths, but it's excellent for problems where you need to explore all possibilities: detecting cycles, topological sorting, finding connected components, and solving mazes.
BFS vs DFS — which to use
The choice depends entirely on what you're looking for:
| goal | use | why |
|---|---|---|
| shortest path (unweighted) | BFS | visits by distance from source |
| cycle detection | DFS | back edge → cycle |
| topological sort | DFS | post-order gives reverse topo order |
| connected components | either | run from each unvisited node |
| maze / path existence | DFS | explores fully before backtracking |
Complexity
Both BFS and DFS are O(V + E) where V is the number of vertices and E is the number of edges. Every vertex is visited once, and every edge is examined once. The difference is memory: BFS can hold up to O(V) nodes in the queue at once (the entire frontier), while DFS's call stack depth is bounded by the longest path, which is at most O(V).
Topological sort with DFS
A topological sort of a directed acyclic graph (DAG) orders vertices so that every edge points from an earlier vertex to a later one. This is the order you'd need to build a project: dependencies before the things that depend on them.
The DFS-based approach: run DFS, and when a node is fully processed (all its descendants visited), push it onto a stack. The final stack, read top to bottom, gives the topological order.
BFS uses a queue and finds shortest paths; DFS uses a stack (or recursion) and explores depth-first — both are O(V + E) and both require a visited set.