Walking in Code

Heap Sort Algorithm: The Complete Visual Guide to Mastering Heaps

Master Heap Sort through interactive visualizations: from binary heap fundamentals to MAX-HEAPIFY, BUILD-MAX-HEAP, and complete sorting. Includes step-by-step animations and code in JavaScript, Java, Python.

#algorithm #sorting #heap #data-structure #visualization #binary-tree #priority-queue

Why Learn Heap Sort?

Picture this: you need a sorting algorithm that never slows down—not for sorted data, not for reverse-sorted data, not ever.

That’s Heap Sort.

Invented by J.W.J. Williams in 1964, heap sort achieves something remarkable: guaranteed O(n log n) performance in every scenario. No degradation. No worst-case surprises.

But heap sort offers more than just sorting. Master it, and you’ll unlock:

  • Priority queues — the backbone of task schedulers
  • Graph algorithms — Dijkstra, Prim, and beyond
  • Top-K problems — finding needles in massive haystacks
  • Real-time systems — where predictable performance matters

⚡ At a Glance

| Metric | Value |
|---|---|
| Time Complexity | O(n log n) — always |
| Space Complexity | O(1) — in-place sorting |
| Stability | Unstable |
| Best For | Guaranteed performance, memory-constrained systems |

The secret behind heap sort? A beautiful data structure called the binary heap.

Let’s build our understanding from the ground up.


The Heap Data Structure

What Makes a Heap Special?

A heap is a complete binary tree with a twist: every node follows a specific ordering rule with its children.

🌳 Complete Binary Tree

A complete binary tree fills each level left-to-right before moving to the next level.

This structure is perfect for array storage—no pointers needed!

Height guarantee: O(log n)

The Array Trick

Here’s the elegant part. We can store a heap in a simple array using index math:

For any node at index i (0-based):
├── Parent:      ⌊(i - 1) / 2⌋
├── Left child:  2i + 1
└── Right child: 2i + 2

Visual example: Array [16, 14, 10, 8, 7, 9, 3, 2, 4, 1]

              [16]          ← index 0 (root)
             /    \
          [14]    [10]      ← indices 1, 2
          /  \    /  \
        [8]  [7] [9] [3]    ← indices 3, 4, 5, 6
        / \   |
      [2][4] [1]            ← indices 7, 8, 9

No wasted space. No pointer overhead. Just pure efficiency.
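These index formulas can be sanity-checked in a few lines of Python (a quick sketch using the example array above):

```python
# Example array from above, stored as a 0-based heap
heap = [16, 14, 10, 8, 7, 9, 3, 2, 4, 1]

def parent(i):
    return (i - 1) // 2

def left(i):
    return 2 * i + 1

def right(i):
    return 2 * i + 2

# Node at index 4 holds 7; its parent at index 1 holds 14
print(heap[parent(4)])               # 14
print(heap[left(1)], heap[right(1)])  # 8 7

# Every non-root node satisfies the max-heap property
assert all(heap[parent(i)] >= heap[i] for i in range(1, len(heap)))
```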


Max-Heap vs Min-Heap

Heaps come in two flavors, defined by their ordering rule.

Max-Heap: Largest on Top

Every parent is greater than or equal to its children:

A[parent(i)] ≥ A[i]

Result: The maximum element sits at the root. Always.

Min-Heap: Smallest on Top

Every parent is less than or equal to its children:

A[parent(i)] ≤ A[i]

Result: The minimum element sits at the root.

💡 For ascending sort, we use a max-heap. Each extraction removes the current maximum and places it at the end.
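As a quick illustrative sketch: Python's built-in heapq module implements a min-heap, and a common trick to simulate a max-heap is to negate the values.

```python
import heapq

data = [4, 14, 10, 8, 7]

# Min-heap: smallest on top
min_heap = list(data)
heapq.heapify(min_heap)
print(min_heap[0])  # 4 — the minimum sits at the root

# Max-heap via negation: largest on top
max_heap = [-x for x in data]
heapq.heapify(max_heap)
print(-max_heap[0])  # 14 — the maximum sits at the root
```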


MAX-HEAPIFY: The Core Operation

Before we can sort, we need to understand the fundamental operation: MAX-HEAPIFY.

The Problem

Imagine a node that’s smaller than one of its children. The heap property is violated!

MAX-HEAPIFY fixes this by sinking the offending node down to its rightful place.

The Algorithm

MAX-HEAPIFY(A, heap_size, i):
  1. Find the largest among: node i, left child, right child
  2. If the largest isn't node i:
     a. Swap node i with the largest
     b. Recurse on the swapped position

Implementation

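A straightforward Python translation of the pseudocode above might look like this (a sketch; names follow the pseudocode):

```python
def max_heapify(A, heap_size, i):
    """Sink A[i] until the subtree rooted at i is a valid max-heap.

    Precondition: both subtrees of i are already valid max-heaps.
    """
    left = 2 * i + 1
    right = 2 * i + 2
    largest = i
    # Step 1: find the largest among node i and its children (within the heap)
    if left < heap_size and A[left] > A[largest]:
        largest = left
    if right < heap_size and A[right] > A[largest]:
        largest = right
    # Step 2: if node i isn't the largest, swap and recurse on the swapped position
    if largest != i:
        A[i], A[largest] = A[largest], A[i]
        max_heapify(A, heap_size, largest)

# Repair the demo array: only the root (4) violates the heap property
A = [4, 14, 10, 8, 7, 9, 3, 2, 1]
max_heapify(A, len(A), 0)
print(A)  # [14, 8, 10, 4, 7, 9, 3, 2, 1]
```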

Watch It in Action

The visualization below shows MAX-HEAPIFY repairing the array [4, 14, 10, 8, 7, 9, 3, 2, 1].

Notice: only the root (4) violates the heap property. Everything below is already a valid max-heap. Watch how 4 “sinks” to find its correct position:

MAX-HEAPIFY Operation Demo

Example: Array [4, 14, 10, 8, 7, 9, 3, 2, 1], run MAX-HEAPIFY on root


Step-by-step breakdown:

  1. Compare: 4 vs children 14 and 10 → 14 wins
  2. Swap: 4 ↔ 14, node sinks to position 1
  3. Compare: 4 vs children 8 and 7 → 8 wins
  4. Swap: 4 ↔ 8, node sinks to position 3
  5. Compare: 4 vs children 2 and 1 → 4 wins
  6. Done! Heap property restored

Why Sinking Works (The Deep Dive)

A natural question: “When we sink a node, don’t we break something else?”

The answer lies in a critical precondition.

The Precondition

MAX-HEAPIFY assumes both subtrees are already valid max-heaps.

Only the root node might be out of place.

Why Nothing Breaks

When we swap with the largest child:

  1. The parent position becomes valid — we chose the largest value
  2. The untouched subtree stays valid — we never modified it
  3. The touched subtree maintains the precondition — only its root might violate, so we recurse

Before:              After swap:
    [4] ← bad            [14] ← now valid!
   /   \                /    \
[14]   [10]          [4]    [10] ← untouched
                      ↑
                 recurse here

This elegant design ensures correctness at every step.

Time Complexity

O(log n) — we sink at most to a leaf, and tree height is log n.


Building a Heap from Scratch

Now the magic: transforming any array into a valid max-heap.

The Key Insight

Leaf nodes are already valid heaps! A single element trivially satisfies the heap property.

So we only need to fix the non-leaf nodes—and we do it bottom-up.

The Algorithm

BUILD-MAX-HEAP(A):
  FOR i FROM ⌊n/2⌋ - 1 DOWNTO 0:
      MAX-HEAPIFY(A, n, i)

Why This Works

Q1: Why start from ⌊n/2⌋ - 1?

Nodes at indices ⌊n/2⌋ to n-1 are all leaves.

Starting from ⌊n/2⌋ - 1 (the last non-leaf) skips half the array—instant 50% savings!

Array with n = 10:
Index:  0   1   2   3   4   5   6   7   8   9
        └───non-leaf───┘   └─────leaf─────┘
                           (already valid!)

Q2: Why iterate backwards?

This is crucial for correctness!

Backwards: When processing node i, its children (at 2i+1 and 2i+2) have larger indices, so they’re already processed. The precondition is satisfied! ✅

Forwards: When processing node i, its children haven’t been touched yet. The precondition fails! ❌

Q3: Why is the result guaranteed correct?

Loop invariant: After processing index i, every node with index ≥ i roots a valid max-heap.

By induction:

  • Base case: Leaves are trivially valid
  • Inductive step: When we process i, both children are valid heaps, so MAX-HEAPIFY produces a valid heap rooted at i
  • Termination: After processing index 0, the entire array is a valid max-heap

Implementation

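A Python sketch of BUILD-MAX-HEAP (max_heapify is repeated from the earlier section so the snippet is self-contained):

```python
def max_heapify(A, heap_size, i):
    """Sink A[i] until the subtree rooted at i is a valid max-heap."""
    left, right, largest = 2 * i + 1, 2 * i + 2, i
    if left < heap_size and A[left] > A[largest]:
        largest = left
    if right < heap_size and A[right] > A[largest]:
        largest = right
    if largest != i:
        A[i], A[largest] = A[largest], A[i]
        max_heapify(A, heap_size, largest)

def build_max_heap(A):
    n = len(A)
    # Indices n//2 .. n-1 are leaves; start at the last non-leaf, work backwards
    for i in range(n // 2 - 1, -1, -1):
        max_heapify(A, n, i)

A = [4, 1, 3, 2, 16, 9, 10, 14, 8, 7]
build_max_heap(A)
print(A)  # [16, 14, 10, 8, 7, 9, 3, 2, 4, 1] — the example heap from earlier
```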

Surprising Complexity: Why O(n), Not O(n log n)?

At first glance, you might think: “n/2 nodes × O(log n) each = O(n log n)”

But the actual complexity is O(n)! 🎉

This is one of the most elegant results in algorithm analysis. Let’s prove it.

The Naive Analysis (Wrong)

The naive approach assumes:

  • We call MAX-HEAPIFY on n/2 non-leaf nodes
  • Each call costs O(log n) (tree height)
  • Total = $\frac{n}{2} \times O(\log n) = O(n \log n)$

But this overestimates because not all nodes have the same sinking distance!

The Tight Analysis (Correct)

The key insight: nodes near the bottom have less distance to sink.

In a complete binary tree with height $h = \lfloor \log_2 n \rfloor$:

| Level (from bottom) | Number of Nodes | Max Sinking Distance |
|---|---|---|
| 0 (leaves) | ≤ n/2 | 0 (skipped!) |
| 1 | ≤ n/4 | 1 |
| 2 | ≤ n/8 | 2 |
| k | ≤ n/2^(k+1) | k |
| h (root) | 1 | h |

The Mathematical Proof

Total work = sum over all levels of (nodes at level $k$) × (max sinking distance $k$):

$$T(n) = \sum_{k=0}^{h} \left\lceil \frac{n}{2^{k+1}} \right\rceil \cdot k \leq \sum_{k=0}^{h} \frac{n}{2^{k+1}} \cdot k = \frac{n}{2} \sum_{k=0}^{h} \frac{k}{2^k}$$

The infinite series $\sum_{k=0}^{\infty} \frac{k}{2^k}$ converges to 2:

$$\sum_{k=0}^{\infty} \frac{k}{2^k} = \frac{1}{2} + \frac{2}{4} + \frac{3}{8} + \frac{4}{16} + \cdots = 2$$

Therefore:

$$T(n) \leq \frac{n}{2} \times 2 = n = O(n)$$
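This bound can be sanity-checked numerically (a small sketch; the function sums the worst-case sinking distances level by level, exactly as in the formula above):

```python
import math

def worst_case_sink_steps(n):
    """Upper bound on total sinking distance in BUILD-MAX-HEAP for n elements."""
    h = int(math.log2(n))  # tree height
    # At distance k from the bottom there are at most ceil(n / 2^(k+1)) nodes,
    # each sinking at most k levels
    return sum(math.ceil(n / 2 ** (k + 1)) * k for k in range(h + 1))

for n in [10, 1_000, 1_000_000]:
    total = worst_case_sink_steps(n)
    print(n, total, total / n)  # the ratio stays bounded by a small constant

# Total work never exceeds ~2n, confirming O(n) rather than O(n log n)
assert all(worst_case_sink_steps(n) <= 2 * n for n in [10, 1_000, 1_000_000])
```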

Visual Intuition

                    [1]           ← 1 node, sinks up to h levels
                   /   \
                 [2]   [3]        ← 2 nodes, sink up to h-1 levels  
                / \    / \
              [4] [5] [6] [7]     ← 4 nodes, sink up to h-2 levels
              /\  /\  /\  /\
            [... n/4 nodes ...]   ← sink at most 1 level
            [... n/2 leaves ...]  ← 0 work (skipped!)

The bottom-heavy structure saves us:

  • Half the nodes are leaves → 0 work
  • Quarter of the nodes sink at most 1 level → $\frac{n}{4}$ work
  • Only 1 node (the root) sinks $\log n$ levels → $\log n$ work

Most work is on nodes near the bottom, which have tiny heights!


The Sorting Algorithm

With BUILD-MAX-HEAP and MAX-HEAPIFY in hand, the sorting algorithm is beautifully simple.

Two-Phase Strategy

HEAPSORT(A):
  // Phase 1: Transform array into max-heap
  BUILD-MAX-HEAP(A)
  
  // Phase 2: Extract maximum repeatedly
  FOR i FROM n-1 DOWNTO 1:
      SWAP A[0] and A[i]      // Max goes to sorted region
      MAX-HEAPIFY(A, i, 0)    // Restore heap on remainder

How It Works

  1. Build the heap — maximum element rises to the root
  2. Swap root with last element — maximum moves to its final position
  3. Shrink the heap — sorted elements are excluded from future operations
  4. Re-heapify — restore heap property for the remaining elements
  5. Repeat — until only one element remains

The sorted array emerges from right to left!


Interactive Visualization

See heap sort come alive. Watch both phases unfold:

  1. Building the max-heap from an unsorted array
  2. Extracting maximums to produce the sorted result

Toggle between tree view and array view to understand how the heap structure maps to array indices.

[Interactive demo: Heap Structure (Complete Binary Tree) alongside its Array Representation]

Complete Implementation

Here’s the full heap sort algorithm:

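The original page loads JavaScript, Java, and Python versions interactively; as a stand-in, here is a self-contained Python sketch combining both phases (with an iterative max_heapify to avoid recursion overhead):

```python
def max_heapify(A, heap_size, i):
    """Sink A[i] until the subtree rooted at i is a valid max-heap."""
    while True:
        left, right, largest = 2 * i + 1, 2 * i + 2, i
        if left < heap_size and A[left] > A[largest]:
            largest = left
        if right < heap_size and A[right] > A[largest]:
            largest = right
        if largest == i:
            return
        A[i], A[largest] = A[largest], A[i]
        i = largest  # continue sinking from the swapped position

def build_max_heap(A):
    # Start at the last non-leaf and work backwards to the root
    for i in range(len(A) // 2 - 1, -1, -1):
        max_heapify(A, len(A), i)

def heap_sort(A):
    build_max_heap(A)                    # Phase 1: transform into a max-heap
    for i in range(len(A) - 1, 0, -1):   # Phase 2: extract maximum repeatedly
        A[0], A[i] = A[i], A[0]          # max goes to the sorted region
        max_heapify(A, i, 0)             # restore heap on A[0..i-1]

data = [4, 1, 3, 2, 16, 9, 10, 14, 8, 7]
heap_sort(data)
print(data)  # [1, 2, 3, 4, 7, 8, 9, 10, 14, 16]
```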

Performance Analysis

Time Complexity Breakdown

| Operation | Complexity | Notes |
|---|---|---|
| BUILD-MAX-HEAP | O(n) | Linear despite appearances |
| Single MAX-HEAPIFY | O(log n) | Height of tree |
| Extraction loop | O(n log n) | n-1 extractions × O(log n) each |
| Total | O(n log n) | Asymptotically optimal for comparison sorts |

Space Complexity

O(1) — Heap sort is an in-place algorithm. Only a constant number of variables needed.

How Does It Compare?

| Algorithm | Best | Average | Worst | Space | Stable? |
|---|---|---|---|---|---|
| Heap Sort | O(n log n) | O(n log n) | O(n log n) | O(1) | No |
| Quick Sort | O(n log n) | O(n log n) | O(n²) | O(log n) | No |
| Merge Sort | O(n log n) | O(n log n) | O(n log n) | O(n) | Yes |
| Insertion Sort | O(n) | O(n²) | O(n²) | O(1) | Yes |

Key insight: Heap sort’s worst case matches its best case. No degradation, ever.


Practice Problems

Heaps unlock elegant solutions to many classic problems. Here are two essentials:

Problem 1: Merge k Sorted Lists (LC 23)

Problem: Merge k sorted linked lists into one sorted list.

Approach: Use a min-heap to track the smallest element across all k list heads. Pop, append, advance. Repeat.

| Metric | Value |
|---|---|
| Time | O(N log k) where N = total nodes |
| Space | O(k) |
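A Python sketch of this approach, using heapq and a minimal ListNode class (the index i in each heap tuple is a tie-breaker, since list nodes themselves aren't comparable):

```python
import heapq

class ListNode:
    def __init__(self, val=0, next=None):
        self.val = val
        self.next = next

def merge_k_lists(lists):
    """Merge k sorted linked lists using a min-heap of the current heads."""
    heap = []
    for i, node in enumerate(lists):
        if node:
            heapq.heappush(heap, (node.val, i, node))  # i breaks value ties
    dummy = tail = ListNode()
    while heap:
        _, i, node = heapq.heappop(heap)   # pop the smallest head
        tail.next = node                   # append it to the result
        tail = node
        if node.next:                      # advance that list
            heapq.heappush(heap, (node.next.val, i, node.next))
    return dummy.next

def from_list(vals):
    dummy = tail = ListNode()
    for v in vals:
        tail.next = ListNode(v)
        tail = tail.next
    return dummy.next

merged = merge_k_lists([from_list([1, 4, 5]), from_list([1, 3, 4]), from_list([2, 6])])
out = []
while merged:
    out.append(merged.val)
    merged = merged.next
print(out)  # [1, 1, 2, 3, 4, 4, 5, 6]
```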

Problem 2: Kth Largest Element (LC 215)

Problem: Find the kth largest element in an unsorted array.

Approach: Self-Built Min-Heap

Instead of using built-in PriorityQueue or heapq, we implement the heap from scratch — perfect practice for the concepts learned above!

Key insight: Maintain a min-heap of size k. As we process each element:

  • If heap size < k: insert the element
  • If element > heap top: replace the top (smallest of the k largest)

After processing all elements, the heap root is the kth largest!

| Metric | Value |
|---|---|
| Time | O(n log k) |
| Space | O(k) |
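A Python sketch of the self-built min-heap approach (sift_up and sift_down correspond to the two operations described in the next section):

```python
def sift_up(heap, i):
    """Bubble heap[i] up until its parent is no larger (min-heap)."""
    while i > 0:
        parent = (i - 1) // 2
        if heap[i] >= heap[parent]:
            break
        heap[i], heap[parent] = heap[parent], heap[i]
        i = parent

def sift_down(heap, i):
    """Sink heap[i] down, always swapping with the smaller child (min-heap)."""
    n = len(heap)
    while True:
        smallest = i
        for child in (2 * i + 1, 2 * i + 2):
            if child < n and heap[child] < heap[smallest]:
                smallest = child
        if smallest == i:
            break
        heap[i], heap[smallest] = heap[smallest], heap[i]
        i = smallest

def find_kth_largest(nums, k):
    heap = []  # min-heap holding the k largest elements seen so far
    for x in nums:
        if len(heap) < k:
            heap.append(x)
            sift_up(heap, len(heap) - 1)
        elif x > heap[0]:
            heap[0] = x        # replace the smallest of the k largest
            sift_down(heap, 0)
    return heap[0]             # root = kth largest

print(find_kth_largest([3, 2, 1, 5, 6, 4], 2))  # 5
```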

Watch Sift Up & Sift Down in Action

The visualization below demonstrates the two core heap operations:

Sift Up (↑) — When inserting a new element:

  1. Add the element at the end of the heap
  2. Compare with parent; if smaller (min-heap), swap
  3. Repeat until heap property is restored

Sift Down (↓) — When replacing the root:

  1. Replace root with the new element
  2. Compare with children; swap with the smaller child
  3. Repeat until heap property is restored

Kth Largest Element - Sift Up / Sift Down Demo

Array [3, 2, 1, 5, 6, 4], find the 2nd largest



Key Takeaways

When to Choose Heap Sort

| ✅ Advantages | ❌ Trade-offs |
|---|---|
| Guaranteed O(n log n) — no worst-case surprises | Not stable — equal elements may reorder |
| O(1) space — truly in-place | Cache-unfriendly — scattered memory access |
| No stack overflow risk — bounded recursion | Slower constants — usually loses to tuned quicksort |

Real-World Applications

Beyond sorting, heap mastery enables:

  • Priority Queues — the standard implementation uses heaps
  • Top-K Problems — find the k largest/smallest efficiently
  • OS Schedulers — process priority management
  • Graph Algorithms — Dijkstra’s shortest path, Prim’s MST
  • External Sorting — multi-way merge of sorted files

The Bottom Line

💡 Heap sort trades raw speed for reliability. When you need guaranteed performance—not just average-case performance—heap sort delivers. And understanding heaps opens doors to priority queues, graph algorithms, and beyond.

Master the heap. It’s one of computing’s most versatile data structures.