English Walking in Code

Topological Sort: Complete Interview Guide with DFS and Kahn's Algorithm

Master topological sort with interactive visualizations. Learn both DFS and Kahn's algorithm approaches, solve LeetCode problems, and ace your FAANG interviews.

#algorithm #graph #topological-sort #dfs #bfs #interview #leetcode #visualization

Topological Sort: Complete Interview Guide

What is Topological Sort?

Topological Sort is an algorithm that produces a linear ordering of vertices in a Directed Acyclic Graph (DAG) such that for every directed edge (u,v)(u, v), vertex uu comes before vertex vv in the ordering.

Why Interviewers Love It

  • Fundamental graph algorithm used in scheduling, dependency resolution, and build systems
  • Tests understanding of graph traversal (DFS/BFS)
  • Appears in FAANG interviews in various forms (course schedule, task ordering, build dependencies)
  • Combines multiple concepts: graphs, cycles detection, ordering
  • Has two distinct approaches (DFS and Kahn’s algorithm)

When to Use Topological Sort

Use topological sort when you have:

  • Dependency relationships between tasks or items
  • Need to find a valid ordering that respects dependencies
  • A directed acyclic graph (must verify no cycles)
  • Problems involving prerequisites, course scheduling, or build order

Classic examples:

  • Course prerequisites (take course A before course B)
  • Package dependency management
  • Task scheduling with constraints
  • Compilation order of modules

Visual Intuition

Imagine you’re planning your morning routine:

  • Can’t wear shoes before socks
  • Can’t eat breakfast before preparing it
  • Can’t leave home before getting dressed

These dependencies form a directed graph:

         wake up
        /       \
       /         \
   prepare    get dressed
   breakfast    /     \
       \      socks  shirt
        \        \    /
         \     shoes
          \      /
         leave home

Topological sort gives you a valid order: wake up → prepare breakfast → get dressed → socks → shirt → shoes → leave home

Key Properties

  1. Only works on DAGs (cycles make ordering impossible)
  2. Multiple valid orderings may exist
  3. Time complexity: O(V+E)O(V + E) where VV = vertices, EE = edges
  4. Two main approaches: DFS-based and BFS-based (Kahn’s algorithm)

Interactive Visualization: Kahn’s Algorithm

Kahn’s algorithm uses BFS and in-degree counting. It repeatedly finds nodes with no incoming edges (in-degree = 0) and processes them.

Kahn's Algorithm Topological Sort

Step 0 of 18Calculate in-degree for each node
0in: 21in: 22in: 13in: 14in: 05in: 0
Current
In Result
Visited

In-Degree

Node 0:2
Node 1:2
Node 2:1
Node 3:1
Node 4:0
Node 5:0

Topological Order

Empty

Visited

{}

How it works:

  1. Calculate in-degree (number of incoming edges) for each node
  2. Enqueue all nodes with in-degree 0
  3. While queue is not empty:
    • Dequeue a node and add to result
    • For each neighbor, reduce its in-degree by 1
    • If neighbor’s in-degree becomes 0, enqueue it
  4. If result contains all nodes → success; otherwise → cycle detected

Interactive Visualization: DFS Approach

DFS-based approach explores as deep as possible, then adds nodes to result in reverse post-order.

DFS-based Topological Sort

Step 0 of 14Initialize DFS-based topological sort
012345
Current
In Result
Finished
Visited

Stack

Empty

Topological Order

Empty

Visited

{}

How it works:

  1. Mark all nodes as unvisited
  2. For each unvisited node, run DFS
  3. After visiting all descendants of a node, push it to a stack
  4. Reverse the stack to get topological order

General Pattern: Identifying Topological Sort Problems

Problem Characteristics

Use topological sort when you see:

  • “Prerequisites” or “dependencies” mentioned
  • “Order” or “sequence” needs to be determined
  • Directed graph with constraints
  • “Course schedule”, “task scheduling”, “build order”
  • Need to detect if ordering is possible (cycle detection)

Solution Steps

  1. Build the graph from input (adjacency list)
  2. Choose algorithm: Kahn’s (easier to detect cycles) or DFS (more intuitive)
  3. Run the algorithm to get topological order
  4. Check validity: result should contain all nodes
  5. Return result based on problem requirements

Code Templates

Kahn’s Algorithm (BFS-based)

Loading...

DFS-based Topological Sort

Loading...

The three-color marking method is a more elegant DFS implementation that uses a single state array instead of two boolean arrays. This approach is widely used in graph algorithms and is particularly well-suited for cycle detection.

Three Color States:

  • White (0): Unvisited nodes
  • Gray (1): Nodes currently being visited (in the current DFS path)
  • Black (2): Nodes that have finished processing

Key Intuition:

  • If you encounter a gray node during DFS → back edge found → cycle exists
  • If you encounter a black node → node and its subtree fully processed → safe to skip
  • If you encounter a white node → continue DFS exploration

Advantages:

  • Cleaner code (only one state array needed)
  • Easier to understand and explain
  • Classic pattern in graph algorithms
  • More intuitive cycle detection logic
Loading...

Three-Color vs Two Boolean Arrays:

AspectThree-Color Methodvisited + inStack
Number of Arrays1 state array2 boolean arrays
Space UsageSlightly less (single array)Slightly more (two arrays)
Code Clarity⭐ More intuitiveRequires understanding two flags
Cycle DetectionCheck for gray nodesCheck inStack
Industry Standard⭐ Classic graph algorithm patternEqually valid

Recommendation: If you’re comfortable with the three-color method, use it in interviews—it demonstrates deeper knowledge. It’s the standard approach taught in graph algorithm textbooks.

LeetCode Problem Examples

Problem 1: Course Schedule II (LC 210)

Problem: There are numCourses courses labeled from 0 to numCourses - 1. You are given an array prerequisites where prerequisites[i] = [ai, bi] indicates that you must take course bi first if you want to take course ai. Return the ordering of courses you should take to finish all courses. If impossible, return empty array.

Example:

Input: numCourses = 4, prerequisites = [[1,0],[2,0],[3,1],[3,2]]
Output: [0,2,1,3] or [0,1,2,3]

Explanation:
  Course 0 → Course 1 → Course 3
  Course 0 → Course 2 ↗

Solution using Kahn’s Algorithm:

Loading...

Complexity Analysis:

  • Time: O(V+E)O(V + E) where VV = number of courses, EE = number of prerequisites
  • Space: O(V+E)O(V + E) for the graph and auxiliary data structures

Key Insights:

  • Graph direction: prerequisites[i] = [a, b] means b → a (take b before a)
  • Use in-degree to track how many prerequisites remain
  • If final result has fewer courses than total, a cycle exists

Problem 2: Alien Dictionary (LC 269)

Note: This is a LeetCode Premium problem. You need a subscription to submit.

Problem: Given a sorted dictionary of an alien language, derive the order of characters in this language.

Example:

Input: words = ["wrt","wrf","er","ett","rftt"]
Output: "wertf"

Explanation:
  From "wrt" and "wrf", we know 't' < 'f'
  From "wrt" and "er", we know 'w' < 'e'
  From "er" and "ett", we know 'r' < 't'
  From "ett" and "rftt", we know 'e' < 'r'
  
  So order is: w → e → r → t → f

Solution:

Loading...

Complexity Analysis:

  • Time: O(C)O(C) where CC = total length of all words
  • Space: O(1)O(1) or O(26)O(26) for English alphabet

Key Insights:

  • Compare only adjacent words in the sorted dictionary
  • Only the first different character creates an ordering constraint
  • Edge case: if word1 is a prefix of word2 but longer → invalid input
  • Multiple valid orderings possible; any valid one is acceptable

Problem 3: Minimum Height Trees (LC 310)

Problem: A tree is an undirected graph in which any two vertices are connected by exactly one path. Given such a tree with n nodes labeled from 0 to n - 1, find all root labels that result in minimum height trees.

Example:

Input: n = 6, edges = [[3,0],[3,1],[3,2],[3,4],[5,4]]
Output: [3,4]

Tree structure:
     0  1  2
      \ | /
        3
        |
        4
        |
        5

Solution (Topological Sorting from Leaves):

Loading...

Complexity Analysis:

  • Time: O(n)O(n) where nn = number of nodes
  • Space: O(n)O(n) for the graph

Key Insights:

  • This is a topological sorting variant on an undirected tree
  • Trim leaves layer by layer until 1 or 2 nodes remain
  • The remaining nodes are the centroids of the tree
  • At most 2 nodes can be roots of minimum height trees

Edge Cases and Common Mistakes

Edge Cases to Handle

  1. Empty graph: No nodes → return empty result
  2. Single node: Only one node → return that node
  3. Cycle detection: If cycle exists → no valid topological sort
  4. Disconnected graph: Handle multiple components if required
  5. Self-loops: Usually invalid for topological sort
  6. Multiple edges: Between same pair of nodes

Common Mistakes

Mistake 1: Forgetting to check for cycles

// Wrong: assumes graph is always acyclic
if (result.size() != numNodes) {
    // Forgot to handle this case!
}

Correct:

if (result.size() != numNodes) {
    return new int[0]; // or throw exception
}

Mistake 2: Wrong edge direction

// Wrong: reversed edge direction
graph[prerequisites[i][0]].add(prerequisites[i][1]);

Correct: If [a, b] means “take b before a”:

graph[prerequisites[i][1]].add(prerequisites[i][0]);

Mistake 3: Not handling multiple components

// Wrong: only starts DFS from node 0
dfs(0);

Correct:

for (int i = 0; i < n; i++) {
    if (!visited[i]) {
        dfs(i);
    }
}

Mistake 4: Using visited alone for cycle detection in DFS

// Wrong: can't detect cycles properly
if (visited[neighbor]) return true; // This is wrong!

Correct: Use both visited and inStack:

if (!visited[neighbor]) {
    if (hasCycle(neighbor)) return true;
} else if (inStack[neighbor]) {
    return true; // Back edge = cycle
}

Interview Tips

How to Explain Topological Sort

Framework for explanation:

  1. Define the problem: “Topological sort orders nodes in a DAG so dependencies come first”
  2. State assumptions: “The graph must be directed and acyclic”
  3. Choose approach: “I’ll use Kahn’s algorithm with in-degree counting” or “I’ll use DFS with post-order traversal”
  4. Explain cycle detection: “If we can’t process all nodes, a cycle exists”
  5. Analyze complexity:O(V+E)O(V + E) time, O(V+E)O(V + E) space”

When Interviewer Asks for Optimizations

Common follow-ups:

  1. “Can you detect the cycle and return the cycle path?”

    • Use DFS with recursion stack tracking
    • When back edge found, reconstruct path from stack
  2. “What if there are multiple valid orderings? Return all of them.”

    • Use backtracking with DFS
    • Explore all possible orderings (exponential time)
  3. “What if the graph is very large and doesn’t fit in memory?”

    • Use external sorting techniques
    • Process graph in chunks
    • Use disk-based graph databases
  4. “Can you find the lexicographically smallest topological order?”

    • Use priority queue (min-heap) instead of regular queue in Kahn’s algorithm
    • Time becomes O(VlogV+E)O(V \log V + E)

Time & Space Complexity

Kahn’s Algorithm (BFS-based)

Time Complexity: O(V+E)O(V + E)

  • Building graph: O(E)O(E)
  • Calculating in-degrees: O(E)O(E)
  • Processing each vertex once: O(V)O(V)
  • Processing each edge once: O(E)O(E)
  • Total: O(V+E)O(V + E)

Space Complexity: O(V+E)O(V + E)

  • Adjacency list: O(E)O(E)
  • In-degree array: O(V)O(V)
  • Queue: O(V)O(V) worst case
  • Result array: O(V)O(V)
  • Total: O(V+E)O(V + E)

DFS-based Approach

Time Complexity: O(V+E)O(V + E)

  • DFS visits each vertex once: O(V)O(V)
  • DFS explores each edge once: O(E)O(E)
  • Total: O(V+E)O(V + E)

Space Complexity: O(V+E)O(V + E)

  • Adjacency list: O(E)O(E)
  • Visited array: O(V)O(V)
  • Recursion stack: O(V)O(V) in worst case
  • Result stack: O(V)O(V)
  • Total: O(V+E)O(V + E)

Best, Average, Worst Cases

Both algorithms have:

  • Best case: O(V+E)O(V + E) - linear chain of dependencies
  • Average case: O(V+E)O(V + E) - typical DAG structure
  • Worst case: O(V+E)O(V + E) - fully connected DAG

Note: Time complexity is always O(V+E)O(V + E) regardless of graph structure, making topological sort very efficient!

When NOT to Use Topological Sort

❌ Don’t use topological sort when:

  1. Graph has cycles

    • Topological sort requires a DAG
    • Use cycle detection algorithms instead
    • Or find strongly connected components (Tarjan’s/Kosaraju’s algorithm)
  2. Undirected graph

    • Topological sort requires directed edges
    • Consider: BFS, DFS, or spanning tree algorithms
  3. Need shortest path

    • Topological sort gives ordering, not distances
    • Use: Dijkstra’s, Bellman-Ford, or A*
    • Exception: Can use topological sort + DP for shortest path in DAG
  4. Need all possible orderings

    • Standard topological sort gives one ordering
    • Use: Backtracking with DFS (exponential time)
  5. Real-time updates to graph

    • Topological sort is not incremental
    • Consider: Dynamic topological ordering algorithms

Better Alternatives

Your NeedInstead of Topo SortUse This
Find cyclesTopological sortDFS with back edge detection
Shortest pathTopo sort + BFSDijkstra’s algorithm
All orderingsSingle topo sortBacktracking
Strongly connected componentsTopo sortTarjan’s or Kosaraju’s
Undirected graph traversalTopo sortBFS or DFS

Kahn’s vs DFS: Which to Choose?

AspectKahn’s AlgorithmDFS-based
ApproachBFS with in-degreePost-order DFS
IntuitionRemove nodes with no dependenciesFinish children before parents
Cycle DetectionEasier (count processed nodes)Requires extra inStack array
SpaceO(V+E)O(V + E)O(V+E)O(V + E) + recursion stack
Iterative vs RecursiveIterative (queue)Recursive (can be iterative)
Lexicographically smallestEasy (use min-heap)Harder
Interview preference⭐ More intuitive for beginners⭐ More elegant, less code

Recommendation:

  • Kahn’s algorithm if you need to detect cycles explicitly or want iterative solution
  • DFS-based if you’re comfortable with recursion and want cleaner code

Summary: Key Takeaways for Interviews

Pattern Recognition

Use topological sort when you see:

  • Dependencies, prerequisites, or ordering constraints
  • “Course schedule”, “task order”, “build dependencies”
  • Need to detect if valid ordering exists (cycle detection)
  • Directed acyclic graph (DAG) structure

Template to Memorize

Kahn’s Algorithm (4 steps):

  1. Build graph + calculate in-degrees
  2. Enqueue all nodes with in-degree 0
  3. While queue not empty: dequeue, add to result, reduce neighbors’ in-degrees
  4. Check if result.length == numNodes (cycle detection)

DFS Approach (4 steps):

  1. Build graph
  2. DFS from each unvisited node
  3. After visiting all descendants, push node to stack
  4. Reverse stack = topological order

Key Interview Points

  1. Always check for cycles - topological sort only works on DAGs
  2. Clarify edge direction - [a, b] means what? (a → b or b → a?)
  3. Handle multiple components - DFS from all unvisited nodes
  4. Time complexity: O(V+E)O(V + E) - very efficient!
  5. Space complexity: O(V+E)O(V + E) - need to store graph

Before You Leave the Interview

Make sure you’ve covered:

  • ✅ Built the graph correctly (check edge direction!)
  • ✅ Handled cycle detection
  • ✅ Considered edge cases (empty graph, single node, disconnected components)
  • ✅ Analyzed time and space complexity
  • ✅ Tested with a small example

Final tip: Topological sort appears in 5-10% of FAANG graph interviews. Master both approaches (Kahn’s and DFS), and you’ll be ready for any variation!