A directed acyclic graph (DAG) is a directed graph with no cycles.
A topological sort is an ordering of the nodes in which every edge goes from left to right: for each edge $(u, v)$, $u$ appears before $v$.
Theorem: Any DAG can be topologically sorted.
There are 2 algorithms to topologically sort a DAG.
Claim: In any DAG there exists a node with no incoming edges.
Proof: Suppose every node has at least one incoming edge. Start at any node and repeatedly step backwards along an incoming edge. Since the graph is finite, some node must eventually repeat, and the walk between its two visits is a cycle.
However, a DAG is acyclic, so this is a contradiction. Hence there exists some node that has no incoming edges.
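Algorithm to topologically sort a DAG: Find a node that has no incoming edges. Remove this node (and its outgoing edges) from the graph and append it to the topologically sorted list. Removing a node cannot create a cycle, so the remaining graph is still a DAG and, by the claim, again has at least one node with no incoming edges. Any node that newly has no incoming edges must be a child of the removed node. Repeat the process until the graph is empty.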
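The removal procedure above can be sketched in Python. This is a minimal sketch, not a definitive implementation: the function name `topological_sort` and the input format (a dict mapping each node to a list of its children) are assumptions made for illustration. Instead of physically deleting nodes, it tracks in-degrees and keeps a queue of nodes whose in-degree has dropped to zero.

```python
from collections import deque

def topological_sort(graph):
    """Return a topological order of `graph` (dict: node -> list of children).

    Raises ValueError if the graph has a cycle (i.e. is not a DAG).
    """
    # Count incoming edges for every node, including nodes that only
    # appear as children.
    indegree = {u: 0 for u in graph}
    for u in graph:
        for v in graph[u]:
            indegree[v] = indegree.get(v, 0) + 1

    # Nodes with no incoming edges can be output immediately.
    ready = deque(u for u in indegree if indegree[u] == 0)
    order = []
    while ready:
        u = ready.popleft()
        order.append(u)
        # "Removing" u deletes its outgoing edges.
        for v in graph.get(u, []):
            indegree[v] -= 1
            if indegree[v] == 0:
                ready.append(v)

    if len(order) != len(indegree):
        raise ValueError("graph contains a cycle")
    return order
```

For example, `topological_sort({"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []})` places `"a"` first and `"d"` last. Each node and edge is handled once, so the sketch runs in time linear in the size of the graph.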
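We can use a priority queue keyed on the number of incoming edges of each vertex: repeatedly extract the minimum (which has key 0 in a DAG) and decrease the keys of its children. We can implement the priority queue using a min heap.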
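As a rough sketch of the priority-queue idea, the version below uses Python's `heapq` module as the min heap. `heapq` has no decrease-key operation, so this sketch pushes a fresh `(in-degree, node)` entry whenever an in-degree changes and skips stale entries when they are popped. That lazy-deletion trick is an assumption of this sketch, not something the notes prescribe. The function name `topological_sort_heap` and the dict-of-lists input are likewise hypothetical, and nodes are assumed to be mutually comparable so that ties between tuples can be broken.

```python
import heapq

def topological_sort_heap(graph):
    """Topological sort using a min heap keyed on in-degree.

    `graph` maps each node to a list of its children.
    """
    indegree = {u: 0 for u in graph}
    for u in graph:
        for v in graph[u]:
            indegree[v] = indegree.get(v, 0) + 1

    heap = [(d, u) for u, d in indegree.items()]
    heapq.heapify(heap)

    done = set()
    order = []
    while heap:
        d, u = heapq.heappop(heap)
        if u in done or d != indegree[u]:
            continue  # stale entry left over from an earlier in-degree
        if d != 0:
            # The smallest live key is positive: every remaining node
            # has an incoming edge, so the graph has a cycle.
            raise ValueError("graph contains a cycle")
        done.add(u)
        order.append(u)
        for v in graph.get(u, []):
            indegree[v] -= 1
            heapq.heappush(heap, (indegree[v], v))  # stands in for decrease-key
    return order
```

With one push per edge, this costs $O((n + m)\log(n + m))$ for $n$ nodes and $m$ edges, which is slower than the plain queue-based approach but mirrors the priority-queue formulation.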
Definition: A heap is a nearly complete binary tree.
Grow the heap level by level, left to right.
We can store a heap in an array and move between nodes by index arithmetic. ($i$ is 1-indexed)
$\mathrm{parent}(i) = \left\lfloor\dfrac{i}{2}\right\rfloor$
$\mathrm{left\,child}(i) = 2i$
$\mathrm{right\,child}(i) = 2i + 1$
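The index formulas above translate directly into code. In this small sketch the helper names `parent`, `left_child` and `right_child` are chosen for illustration, and slot 0 of the list is left unused so that the indices stay 1-based as in the notes.

```python
def parent(i):
    return i // 2          # floor(i / 2)

def left_child(i):
    return 2 * i

def right_child(i):
    return 2 * i + 1

# A small min heap laid out level by level, left to right.
# Index 0 is a placeholder so that the root sits at index 1.
heap = [None, 1, 3, 2, 7, 4]

root = heap[1]
left_of_root = heap[left_child(1)]     # the element at index 2
parent_of_last = heap[parent(5)]       # the element at index 2
```

Here the node at index 2 has children at indices 4 and 5, and both of those map back to index 2 under `parent`.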
The only organizing principle in a min heap: each parent is no larger than its children, so the minimum element sits at the root.
Arithmetico-geometric series: $\sum_{n=0}^{\infty} \dfrac{n}{2^n} = 2$
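One way to see why the sum equals 2 is a shift-and-subtract argument. Let $S = \sum_{n=1}^{\infty} \dfrac{n}{2^n}$ (the $n = 0$ term is zero).
Then $\dfrac{S}{2} = \sum_{n=1}^{\infty} \dfrac{n}{2^{n+1}} = \sum_{n=2}^{\infty} \dfrac{n-1}{2^n}$.
Subtracting, $S - \dfrac{S}{2} = \dfrac{1}{2} + \sum_{n=2}^{\infty} \dfrac{n - (n-1)}{2^n} = \sum_{n=1}^{\infty} \dfrac{1}{2^n} = 1$, so $S = 2$.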
Work done to build a heap: at height $h$ there are at most $\dfrac{n}{2^h}$ nodes, and sifting one of them down costs $O(h)$. So, total work is $\sum_{h=0}^{\log n} \dfrac{n}{2^h} \cdot h = n\sum_{h=0}^{\log n} \dfrac{h}{2^h} \leq n\sum_{h=0}^{\infty} \dfrac{h}{2^h} = 2n = O(n)$
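The bound above matches the usual bottom-up heap construction, in which each internal node is sifted down and a node at height $h$ moves at most $h$ levels. The following is a minimal sketch under that assumption. The names `sift_down` and `build_min_heap` are illustrative, and the array is 1-indexed with slot 0 unused.

```python
def sift_down(a, i, n):
    """Restore the min-heap property below index i.

    `a` is 1-indexed (a[0] is unused) and holds n elements.
    """
    while 2 * i <= n:                      # while i has a left child
        child = 2 * i
        if child + 1 <= n and a[child + 1] < a[child]:
            child += 1                     # pick the smaller child
        if a[i] <= a[child]:
            break                          # parent already no larger
        a[i], a[child] = a[child], a[i]
        i = child

def build_min_heap(values):
    """Return a 1-indexed min heap containing `values`."""
    a = [None] + list(values)
    n = len(a) - 1
    # Leaves are already heaps; fix internal nodes from the bottom up.
    for i in range(n // 2, 0, -1):
        sift_down(a, i, n)
    return a
```

For example, `build_min_heap([5, 3, 8, 1, 9, 2])` returns a list whose element at index 1 is the minimum, `1`. Most nodes sit near the bottom and move very little, which is why the total is $O(n)$ rather than $O(n \log n)$.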