CS 201 · GRAPH TRAVERSAL · BFS · CORMEN §22.2 · WORKING LECTURE

Graph Algorithms — Working Lecture

Breadth-First Search:
What the Textbook Says and What It Looks Like Running

You have Cormen on your shelf. This article makes you dangerous before you open it.

For Dr. Homer — who knew what the queue was for before we did. The terminal remembers.

§ 0

Before We Begin

This lecture is anchored to a real, working implementation: the pathfinding function inside kuule, a hex-grid memory game in the kaiku project. Twenty-two lines of JavaScript. It runs every time the player taps a station on the board. It is correct. It is fast enough for its problem. It was written by a human who understood BFS well enough to write it in this form, not the naive form.

By the end of this lecture: why every line is there, how you might have arrived at it yourself, and what equivalent code looks like in Python and Java. Cormen is waiting when you want it.

Here is the complete implementation.

// infrastructure
const DIR = [
  {q:1,r:-1,s:0}, {q:1,r:0,s:-1}, {q:0,r:1,s:-1},
  {q:-1,r:1,s:0}, {q:-1,r:0,s:1}, {q:0,r:-1,s:1},
];
const hexKey       = h => `${h.q},${h.r},${h.s}`;
const hexAdd       = (a,b) => ({q:a.q+b.q, r:a.r+b.r, s:a.s+b.s});
const hexNeighbors = h => DIR.map(d => hexAdd(h, d));

// the function we are studying
function bfsPath(start, end, valid) {
  const sk = hexKey(start), ek = hexKey(end);
  if (sk === ek) return [start];
  const queue = [[start]], visited = new Set([sk]);
  while (queue.length) {
    const path = queue.shift();
    const cur  = path[path.length - 1];
    for (const n of hexNeighbors(cur)) {
      const nk = hexKey(n);
      if (nk === ek) return [...path, n];
      if (!visited.has(nk) && valid.has(nk)) {
        visited.add(nk);
        queue.push([...path, n]);
      }
    }
  }
  return [start, end];
}

§ 1

The Problem

You are standing on a hex. You want to get to another hex. What is the shortest path?

"Shortest" means fewest steps. Not fewest miles — fewest moves. Each step goes to one neighboring hex. All steps cost the same. A hex is either passable or it isn't.

This is the canonical BFS problem. Unweighted graph. Find the shortest path. BFS solves it exactly.

Key constraint BFS requires that all edges have equal cost. If some steps cost more than others, use Dijkstra. If some steps can have negative cost, use Bellman-Ford. BFS is the right tool here because a hex step is a hex step. Cormen §22.2, §24.3, §24.1 — in that order of complexity.

§ 2

The Intuition: Water on a Grid

Pour water onto your start hex. Water spreads outward: one step per unit time, equally in all directions, through any open hex.

At time 0: water covers only the start hex.
At time 1: water covers all hexes one step from start.
At time 2: water covers all hexes two steps from start.
At time k: water covers all hexes exactly k steps from start.

The first moment water reaches the destination — that is the shortest path. The water does not overshoot. It cannot arrive by a longer route first because longer routes take more time, and time is distance here.

BFS is this process, implemented with a queue. The queue is what enforces the "one layer at a time" behavior. This is the whole algorithm. The rest is bookkeeping.

§ 3

The Two Data Structures

The Queue

A queue is FIFO: first in, first out. You add to the back. You remove from the front. In JavaScript: push adds to the back, shift removes from the front.

This FIFO property is not incidental. It is the mechanism that makes BFS produce shortest paths.

Imagine you're exploring the grid. You find a hex two steps from start. You add its neighbors — three-step hexes — to the back of the queue. Meanwhile, the front of the queue still holds unprocessed two-step hexes. All two-step hexes get processed before any three-step hexes. Water spreading.

What happens if you use a stack instead? A stack is LIFO: last in, first out. You explore deep in one direction before backtracking. That is Depth-First Search — DFS. DFS finds a path. Not necessarily the shortest one. The difference between BFS and DFS is the data structure, not the loop. Change shift() to pop() and you have DFS.

The Visited Set

Without the visited set, BFS loops forever. Node A's neighbors include B. B's neighbors include A. You enqueue A, then enqueue B, then A again, then B again.

The visited set records: we have already decided to explore this node. A node goes into visited the moment we decide to visit it — not after we visit it. This distinction matters. If you add to visited only after processing, you may enqueue the same node multiple times before you process it. Still correct, but wasteful. Add to visited when you enqueue.

Cormen §22.2 The textbook uses color codes: WHITE (undiscovered), GRAY (in queue), BLACK (fully processed). The visited set here maps to WHITE vs. non-WHITE. Gray and black are collapsed — once discovered, we don't track which stage. That simplification is correct for path-finding because we don't need to know whether a node has been fully processed, only whether it's been seen. Cormen uses the full coloring to reason about correctness and runtime. When you open §22.2, the mapping should be legible.

§ 4

Reading the Implementation, Line by Line

function bfsPath(start, end, valid) {

Three parameters. start and end are hex objects: {q, r, s} — cubic hex coordinates. valid is a Set of hex keys representing the traversable grid.

  const sk = hexKey(start), ek = hexKey(end);
  if (sk === ek) return [start];

Serialize both endpoints to string keys immediately. Comparing objects with === in JavaScript compares identity, not value — two different objects {q:0,r:0,s:0} are not === even if they're equal. String keys solve this. Early return if start equals end: the path is just the start node.

  const queue = [[start]], visited = new Set([sk]);

The queue holds paths, not nodes. Each element of the queue is an array of hexes from start to the current position. The queue starts with one path: the path containing only the start hex.

The visited set starts with the start key already marked. We have already "decided to visit" the start — we're there.

  while (queue.length) {
    const path = queue.shift();      // dequeue from the front (FIFO)
    const cur  = path[path.length - 1]; // current node is last in path

shift() removes and returns the first element of the array. This is the FIFO operation. It is what makes this BFS. cur is the last hex in that path — the node we are currently expanding.

    for (const n of hexNeighbors(cur)) {
      const nk = hexKey(n);
      if (nk === ek) return [...path, n];    // found it — early return

For each of the six hex neighbors of the current position: serialize it. If it is the destination, we're done. Return the current path extended by this neighbor. The spread [...path, n] creates a new array. This is the earliest possible exit — we don't enqueue the destination, we return immediately.

      if (!visited.has(nk) && valid.has(nk)) {
        visited.add(nk);
        queue.push([...path, n]);
      }

If the neighbor has not been seen yet, and it's a valid hex: mark it visited, enqueue a new path that extends the current one to include it. Mark visited before enqueueing, not after processing.

Note the two Set operations: visited.has(nk) and valid.has(nk). Both are O(1). This is why these are Sets, not arrays. If valid were an array, valid.includes(nk) would be O(n) — a linear scan on every neighbor of every node. On a small grid it doesn't matter. On a large map it does.

  }
  return [start, end]; // no path found — return direct segment as fallback
}

If the queue empties without finding the destination, there is no path. The fallback returns [start, end]: a two-node path that skips all intermediate steps. This is a silent failure — the caller gets something that looks like a path but teleports. Acceptable for this game (the grid is always connected) but worth knowing is there.

§ 5

How You Would Write It Yourself: Three Versions

The implementation above is not where you start. You start with the question, then iterate toward the answer. Here is that iteration.

Version 1: Does a path exist?

Strip everything except the core. Can we reach end from start? No path reconstruction. Just a yes/no answer.

def can_reach(start, end, neighbors_fn):
    if start == end:
        return True
    queue   = [start]        # a list used as a queue (deque is faster — see below)
    visited = {start}        # a set
    while queue:
        cur = queue.pop(0)  # pop from front — O(n), but clear
        for n in neighbors_fn(cur):
            if n == end:
                return True
            if n not in visited:
                visited.add(n)
                queue.append(n)
    return False

This is BFS with no trimmings. The queue holds individual nodes, not paths. The visited set prevents cycles. pop(0) removes from the front of a Python list — this is FIFO. It is O(n) per operation (Python shifts the whole list). We will fix this in a moment.

Version 2: Return the path — parent map approach

Now we want to know the path, not just whether one exists. The standard technique: store where each node came from. At the end, walk backwards from the destination to reconstruct the path.

from collections import deque

def bfs_path_v2(start, end, valid, neighbors_fn):
    if start == end:
        return [start]
    queue   = deque([start])  # deque: O(1) popleft
    visited = {start}
    parent  = {start: None}  # maps node -> how we got here
    while queue:
        cur = queue.popleft()
        for n in neighbors_fn(cur):
            if n not in visited and n in valid:
                visited.add(n)
                parent[n] = cur
                if n == end:
                    # reconstruct: walk parent chain backwards
                    path, node = [], end
                    while node is not None:
                        path.append(node)
                        node = parent[node]
                    return list(reversed(path))
                queue.append(n)
    return [start, end]

Why deque? list.pop(0) in Python shifts every element left — O(n). deque.popleft() is O(1): the deque is a doubly-linked list internally. For BFS, always use collections.deque. In JavaScript, Array.shift() has the same O(n) problem — acceptable for small grids, not for large graphs. A proper queue implementation would use a linked list or a circular buffer.

Version 3: Paths in the queue — the kuule approach

Instead of a parent map, store the full path in the queue. Simpler code. More memory.

def bfs_path_v3(start, end, valid, neighbors_fn):
    if start == end:
        return [start]
    queue   = deque([[start]])  # queue of paths, not nodes
    visited = {start}
    while queue:
        path = queue.popleft()
        cur  = path[-1]
        for n in neighbors_fn(cur):
            if n == end:
                return path + [n]
            if n not in visited and n in valid:
                visited.add(n)
                queue.append(path + [n])
    return [start, end]

This is the direct Python translation of the kuule implementation.

The tradeoff between v2 and v3 V2 (parent map): memory is O(V) for the parent map, where V is the number of nodes visited. Path reconstruction is O(path length) at the end. One copy of each node.

V3 (paths in queue): every path in the queue stores the full path from start. At peak, the queue may hold O(V) paths, each of length O(V). Memory can be O(V²) in the worst case.

For the kuule grid — roughly 30 hexes — the difference is irrelevant. For a map with 100,000 nodes, use the parent map. The kuule author chose legibility over memory efficiency, correctly, for this problem size.

§ 6

The Same Function in Java

Java makes the types explicit. This is useful. You can see exactly what each data structure holds. The logic is identical to the Python and JavaScript versions.

import java.util.*;
import java.util.function.Function;

public static <T> List<T> bfsPath(
    T start, T end,
    Set<T> valid,
    Function<T, List<T>> neighborsFn
) {
    if (start.equals(end)) return List.of(start);

    Queue<List<T>> queue = new LinkedList<>();
    queue.add(new ArrayList<>(List.of(start)));

    Set<T> visited = new HashSet<>(Set.of(start));

    while (!queue.isEmpty()) {
        List<T> path = queue.poll();        // poll() = dequeue from front
        T cur = path.get(path.size() - 1);

        for (T n : neighborsFn.apply(cur)) {
            if (n.equals(end)) {
                List<T> result = new ArrayList<>(path);
                result.add(n);
                return result;
            }
            if (!visited.contains(n) && valid.contains(n)) {
                visited.add(n);
                List<T> newPath = new ArrayList<>(path);
                newPath.add(n);
                queue.add(newPath);
            }
        }
    }
    return List.of(start, end);
}

Queue<List<T>> — a queue of lists of T. LinkedList implements Queue with O(1) enqueue and dequeue. poll() removes and returns the head — this is FIFO. HashSet gives O(1) contains, same as JavaScript Set and Python set.

JavaScript

queue.shift()     // dequeue
queue.push(x)     // enqueue
visited.has(k)    // O(1) lookup
valid.has(k)      // O(1) lookup

Python

queue.popleft()   # dequeue
queue.append(x)   # enqueue
n in visited      # O(1) lookup
n in valid        # O(1) lookup

Java

queue.poll()      // dequeue
queue.add(x)      // enqueue
visited.contains(n) // O(1)
valid.contains(n)   // O(1)

CLRS pseudocode

DEQUEUE(Q)        // dequeue
ENQUEUE(Q, v)     // enqueue
u.color = GRAY    // mark visited
u.π = v           // record parent

§ 7

Complexity

BFS visits each node at most once and processes each edge at most once. Let V be the number of nodes, E the number of edges.

Time: O(V + E). Every node enters the queue once (visited prevents re-entry). Every edge is examined once per endpoint. Total work proportional to graph size.

Space (parent-map version): O(V). The visited set, the parent map, and the queue each hold at most V entries simultaneously.

Space (paths-in-queue version): O(V · L) where L is the length of the longest path found. In the worst case L = V, giving O(V²). For the kuule grid with V ≈ 30 and L ≤ 10, this is irrelevant.

Cormen Theorem 22.5 When BFS terminates, for every node v reachable from source s, d[v] equals the shortest-path distance δ(s, v). The proof uses the queue's FIFO property to show that d values are assigned in non-decreasing order. If you change the queue to a stack, this theorem no longer holds. The data structure is doing mathematical work, not just bookkeeping.

§ 8

What to Notice About the kuule Implementation

It helps to know what the function is actually doing. kuule is a Simon-says memory game played on a radius-2 hex grid. Eight named stations occupy the outer ring. A ninth hex — Hiljaisuus, the silence — sits at the center. The player has a position on the grid. When they tap a station, BFS traces the shortest path from their current position to the tapped hex. That path is rendered as a faint violet trail on the board, showing the implied movement rather than teleporting the cursor. The BFS runs on every tap. The grid never changes. The function is called for its side-effect on the trail, not to make a routing decision.

One design decision in the kuule code is worth naming explicitly. The graph is represented implicitly — there is no adjacency list, no adjacency matrix, no edge objects. The neighbors of any node are computed on demand by hexNeighbors, which applies the six direction offsets to the current position.

The valid set serves as the graph boundary. It contains Hiljaisuus, the eight stations, and all intermediate hexes in the radius-2 grid — roughly 30 nodes total. Not a node in valid? Not traversable. This is efficient and clean for a regular grid — you don't need to precompute anything.

The hexKey function converts a coordinate object to a string. This is necessary because JavaScript Sets use reference equality for objects. The key is a canonical form: "q,r,s". Any two hexes with the same coordinates produce the same key. The Set now behaves correctly.

There is one silent failure: if end is not in valid, it will never be found, and the function returns [start, end]. The caller is expected to pass a valid destination — in kuule, only the eight named stations are tappable, and all eight are in valid, so the fallback never fires. In a larger system, this would be an assertion.

One thing the BFS does not know: the stations encode flag semaphore letters A–H. The birch-branch glyph on each hex is the signaller's body and arms, the two drooping branches set at the semaphore angles for that letter. The player navigates a cipher they have not been told is a cipher. The pathfinding is indifferent to this. It moves through the grid. What the grid means is somebody else's problem.

Coda

Open Cormen Now

Section 22.2. The BFS procedure on page 595 (4th edition). You will see ENQUEUE and DEQUEUE where we wrote push and shift. You will see u.color = GRAY where we wrote visited.add(nk). You will see u.π = s where the kuule implementation stores the full path in the queue element.

The pseudocode is a formal description of the same procedure. The vocabulary is different. The structure is the same.

Dr. Homer knew the queue had to come first. He was right.

Breadth-First Search: What the Textbook Says and What It Looks Like Running

Before We Begin

The Problem

The Intuition: Water on a Grid

The Two Data Structures

The Queue

The Visited Set

Reading the Implementation, Line by Line

How You Would Write It Yourself: Three Versions

Version 1: Does a path exist?

Version 2: Return the path — parent map approach

Version 3: Paths in the queue — the kuule approach

The Same Function in Java

Complexity

What to Notice About the kuule Implementation

Open Cormen Now

Breadth-First Search:
What the Textbook Says and What It Looks Like Running