Tri-Color Marking Algorithm: Complete Guide to Garbage Collection Marking
Master the tri-color marking algorithm with interactive visualizations. Learn how garbage collectors mark reachable objects using white, gray, and black colors with step-by-step animations.
Tri-Color Marking Algorithm: Complete Guide to Garbage Collection Marking
The tri-color marking algorithm is a fundamental technique in modern garbage collectors (GC). It elegantly solves the problem of identifying which objects in memory are still reachable and which can be safely collected. This algorithm is used in many production systems, from the Go runtime to the V8 JavaScript engine.
Drawing from my 11+ years of experience at Microsoft, Booking.com, and Alibaba working with high-performance systems, I’ll walk you through this algorithm with interactive visualizations that make the abstract concept concrete.
Interactive Visualization
Let’s start by seeing the algorithm in action. Try playing with different examples to understand how objects transition through the three colors:
What is the Tri-Color Marking Algorithm?
The tri-color marking algorithm is a marking strategy used during the mark phase of garbage collection. It categorizes objects into three sets using colors:
- White Set: Objects that haven’t been visited yet. After marking completes, white objects are considered garbage.
- Gray Set: Objects that have been discovered but whose references haven’t been fully explored yet.
- Black Set: Objects that have been fully explored—they are reachable, and all their references have been processed.
Key Invariant
The algorithm maintains a crucial invariant throughout execution:
No black object can point to a white object directly.
This invariant ensures correctness: if a black object (already scanned) pointed to a white object (not yet scanned), that white object might be incorrectly collected as garbage.
How the Algorithm Works
Phase 1: Initialization
All objects in the heap start in the white set. This represents the “innocent until proven guilty” approach—every object is potentially garbage until we prove it’s reachable.
// Pseudo-code for initialization
function initialize(heap) {
for (object in heap) {
object.color = WHITE;
}
}
Phase 2: Mark Roots
The root set contains all directly accessible objects—typically:
- Global variables
- Stack variables
- CPU registers
These roots are marked gray because we know they’re reachable, but we haven’t scanned their references yet.
function markRoots(roots) {
for (root in roots) {
root.color = GRAY;
graySet.add(root);
}
}
Phase 3: Process Gray Objects
This is the main loop of the algorithm. We repeatedly:
- Pick a gray object from the gray set
- For each reference in that object:
- If the referenced object is white, mark it gray
- Mark the current object black (fully processed)
function processGrayObjects() {
while (graySet.isNotEmpty()) {
let current = graySet.removeAny();
// Scan all references
for (reference in current.references) {
if (reference.color == WHITE) {
reference.color = GRAY;
graySet.add(reference);
}
}
// Mark current as fully processed
current.color = BLACK;
blackSet.add(current);
}
}
Phase 4: Sweep
After marking completes, all white objects are unreachable and can be collected:
function sweep(heap) {
for (object in heap) {
if (object.color == WHITE) {
// This object is garbage
freeMemory(object);
} else {
// Reset color for next GC cycle
object.color = WHITE;
}
}
}
Step-by-Step Example
Let’s trace through a simple example with objects A → B → C and an isolated cycle D ↔ E:
Initial State:
Root → A → B → C
D ↔ E (isolated cycle)
All objects: WHITE
Step 1: Mark root A as GRAY
A: GRAY
B, C, D, E: WHITE
Gray Set: [A]
Step 2: Process A, discover B
A: BLACK (scanned)
B: GRAY (discovered)
C, D, E: WHITE
Gray Set: [B]
Step 3: Process B, discover C
A: BLACK
B: BLACK (scanned)
C: GRAY (discovered)
D, E: WHITE
Gray Set: [C]
Step 4: Process C
A, B, C: BLACK
D, E: WHITE (unreachable!)
Gray Set: [] (empty)
Result: D and E remain white—they form an isolated cycle unreachable from roots. They can be collected as garbage.
Why Use Three Colors?
You might wonder: why not just use two colors (reachable/unreachable)?
The three-color scheme provides several advantages:
1. Incremental Collection
The gray set represents the “wavefront” of the marking process. We can:
- Process a few gray objects
- Pause to let the program (mutator) run
- Resume processing later
This enables incremental GC that doesn’t pause the program for long periods.
2. Concurrent Collection
With proper synchronization, the GC can run concurrently with the program:
- The program might create new objects or change references
- The gray set helps track what still needs scanning
- The invariant prevents collecting reachable objects
3. Clear Progress Tracking
The three sets clearly show:
- White: Not yet explored
- Gray: Currently exploring (work in progress)
- Black: Fully explored (work completed)
Handling Concurrent Mutations
When the GC runs concurrently with the program (mutator), the mutator might:
- Create new objects: New objects start white or gray
- Update references: This could violate our invariant!
The Problem
Consider this scenario:
Initial:
Black → Gray → White
Mutator updates:
Black → White (now points to white!)
Gray → null (removes original path)
Now a black object points to a white object, violating our invariant! The white object might be incorrectly collected.
Solutions
Two main approaches fix this:
Write Barrier (Dijkstra’s Algorithm)
When a black object gains a reference to a white object, mark the white object gray:
function writeBarrier(blackObject, newReference) {
if (blackObject.color == BLACK && newReference.color == WHITE) {
newReference.color = GRAY;
graySet.add(newReference);
}
}
Read Barrier (Yuasa’s Algorithm)
When a reference is removed, if it points to a white object, mark that object gray:
function readBarrier(object, oldReference) {
if (oldReference.color == WHITE) {
oldReference.color = GRAY;
graySet.add(oldReference);
}
}
Real-World Implementations
Go Runtime
Go’s garbage collector uses tri-color marking with:
- Concurrent marking (GC runs alongside goroutines)
- Write barriers to maintain invariants
- Pacer that adjusts GC frequency based on allocation rate
// Simplified Go GC pseudocode
func gcMark() {
// Stop the world briefly
stopTheWorld()
// Mark roots
scanRoots()
// Start the world, mark concurrently
startTheWorld()
// Process gray objects concurrently
for hasGrayObjects() {
processGrayObjects()
}
}
V8 JavaScript Engine
V8 uses tri-color marking in its:
- Major GC (Mark-Compact)
- Incremental marking
- Concurrent marking
Java HotSpot
Some Java collectors like G1 and ZGC use tri-color marking for concurrent and incremental collection.
Complexity Analysis
Time Complexity
- Marking phase: O(R + E) where:
- R = number of reachable objects
- E = number of references between reachable objects
Each object is visited once, each reference is traversed once.
Space Complexity
- Gray set: O(W) where W is the maximum “width” of the object graph
- In the worst case (a long chain), W = 1
- In the best case (a star pattern), W = R - 1
Advantages and Disadvantages
Advantages ✅
- Incremental: Can pause and resume marking
- Concurrent: Can run alongside the program
- Precise: Correctly identifies all reachable objects
- Scalable: Time proportional to reachable objects (not total heap)
Disadvantages ❌
- Overhead: Requires tracking three sets
- Barriers: Write/read barriers add execution cost
- Complexity: More complex than stop-the-world approaches
- Floating garbage: Concurrent marking might miss some garbage (collected next cycle)
Common Interview Questions
Q1: Why can’t we just use two colors?
Two colors (marked/unmarked) don’t provide enough information for incremental or concurrent collection. The gray color represents the “frontier” of objects we’re actively exploring, allowing us to pause and resume.
Q2: How does this handle cycles?
The algorithm naturally handles cycles. Even if objects reference each other, they’ll only become gray if reachable from roots. Isolated cycles remain white and are collected.
Q3: What’s the difference from reference counting?
Reference counting:
- Tracks number of references to each object
- Cannot handle cycles
- Immediate collection
Tri-color marking:
- Traces reachability from roots
- Handles cycles naturally
- Batch collection
Practice Problems
Problem 1: Trace the Algorithm
Given this object graph:
Root1 → A → C
Root2 → B → C
D → E → D (cycle)
Trace the algorithm step by step. Which objects are collected?
Solution
Initial: All WHITE
Step 1: Mark roots
A: GRAY, B: GRAY
Others: WHITE
Step 2: Process A
A: BLACK, C: GRAY (discovered from A)
B: GRAY, D, E: WHITE
Step 3: Process B
A, B: BLACK
C: already GRAY
D, E: WHITE
Step 4: Process C
A, B, C: BLACK
D, E: WHITE
Result: D and E are garbage (isolated cycle)Problem 2: Design a Write Barrier
Implement a write barrier for the tri-color marking algorithm that maintains the invariant “no black object points to a white object.”
Solution
function writeBarrier(
sourceObj: Object,
field: string,
targetObj: Object
) {
// Perform the write
sourceObj[field] = targetObj;
// If source is black and target is white,
// mark target gray to maintain invariant
if (sourceObj.color === 'BLACK' && targetObj.color === 'WHITE') {
targetObj.color = 'GRAY';
graySet.add(targetObj);
}
}Multi-Language Implementations
JavaScript/TypeScript
class TriColorMarking {
private whiteSet = new Set<number>();
private graySet = new Set<number>();
private blackSet = new Set<number>();
mark(roots: number[], graph: Map<number, number[]>): Set<number> {
// Initialize all as white
for (const node of graph.keys()) {
this.whiteSet.add(node);
}
// Mark roots gray
for (const root of roots) {
this.whiteSet.delete(root);
this.graySet.add(root);
}
// Process gray objects
while (this.graySet.size > 0) {
const current = this.graySet.values().next().value;
this.graySet.delete(current);
// Mark references
const refs = graph.get(current) || [];
for (const ref of refs) {
if (this.whiteSet.has(ref)) {
this.whiteSet.delete(ref);
this.graySet.add(ref);
}
}
// Mark current black
this.blackSet.add(current);
}
// Return garbage (remaining white objects)
return this.whiteSet;
}
}
Python
class TriColorMarking:
def __init__(self):
self.white_set = set()
self.gray_set = set()
self.black_set = set()
def mark(self, roots, graph):
"""
Mark reachable objects using tri-color algorithm.
Returns set of garbage (white) objects.
"""
# Initialize all as white
self.white_set = set(graph.keys())
# Mark roots gray
for root in roots:
if root in self.white_set:
self.white_set.remove(root)
self.gray_set.add(root)
# Process gray objects
while self.gray_set:
current = self.gray_set.pop()
# Mark references
for ref in graph.get(current, []):
if ref in self.white_set:
self.white_set.remove(ref)
self.gray_set.add(ref)
# Mark current black
self.black_set.add(current)
# Return garbage
return self.white_set
Java
public class TriColorMarking {
private Set<Integer> whiteSet = new HashSet<>();
private Set<Integer> graySet = new HashSet<>();
private Set<Integer> blackSet = new HashSet<>();
public Set<Integer> mark(
List<Integer> roots,
Map<Integer, List<Integer>> graph
) {
// Initialize all as white
whiteSet.addAll(graph.keySet());
// Mark roots gray
for (int root : roots) {
if (whiteSet.contains(root)) {
whiteSet.remove(root);
graySet.add(root);
}
}
// Process gray objects
while (!graySet.isEmpty()) {
int current = graySet.iterator().next();
graySet.remove(current);
// Mark references
List<Integer> refs = graph.getOrDefault(
current,
Collections.emptyList()
);
for (int ref : refs) {
if (whiteSet.contains(ref)) {
whiteSet.remove(ref);
graySet.add(ref);
}
}
// Mark current black
blackSet.add(current);
}
// Return garbage
return whiteSet;
}
}
Performance Optimization Tips
1. Gray Set Implementation
Use a work-stealing deque for the gray set in parallel collectors:
class WorkStealingDeque<T> {
private items: T[] = [];
pushBottom(item: T) {
this.items.push(item);
}
popBottom(): T | undefined {
return this.items.pop();
}
popTop(): T | undefined {
return this.items.shift();
}
}
2. Prefetching
Prefetch object data when processing gray objects to reduce cache misses:
// C++ example
while (!graySet.empty()) {
auto current = graySet.front();
graySet.pop();
// Prefetch next objects
for (auto& ref : current->references) {
__builtin_prefetch(ref);
}
// Process current
processObject(current);
}
3. Parallel Marking
Multiple threads can mark concurrently using thread-local gray sets:
function parallelMark(roots: number[], graph: Map<number, number[]>) {
const workers = [];
const localGraySets = [];
// Distribute roots among workers
for (let i = 0; i < NUM_WORKERS; i++) {
localGraySets[i] = new Set();
workers[i] = new Worker(() => {
while (hasWork()) {
processLocalGraySet(localGraySets[i]);
}
});
}
}
Conclusion
The tri-color marking algorithm is a elegant and powerful technique for garbage collection. Its ability to run incrementally and concurrently makes it ideal for modern systems that can’t tolerate long GC pauses.
Key takeaways:
- Three colors track marking progress: white (unvisited), gray (frontier), black (completed)
- Invariant: No black → white pointers prevents collecting reachable objects
- Barriers (write/read) maintain correctness during concurrent execution
- Widely used in Go, V8, Java, and many other production systems
The interactive visualization at the top of this article lets you explore different scenarios. Try creating your own object graphs to deepen your understanding!
Further Reading
- The Garbage Collection Handbook - Comprehensive GC reference
- Go GC Design - Real-world tri-color implementation
- V8 Blog: Concurrent Marking - How V8 uses tri-color marking
Want to explore more algorithm visualizations? Check out our other interactive tutorials on topological sort, binary search, and graph traversal!