Back to DSA Guide Catalog

Data Structureextra-hard

Disjoint Set Union (Union-Find) Complete Guide

Union-Find (Disjoint Set Union) tracks which items are in the same connected component.

It is extremely efficient for repeated connectivity checks with dynamic unions.

Path compression + union by rank/size makes operations nearly constant in practice.

Typical Complexity Baseline

MetricValue
Complexity Note 1Near O(1) amortized

Video Explainer

Prefer video learning? This explainer gives a quick visual walkthrough of the core idea before you dive into the detailed sections below.

Core Concepts

Learn the core building blocks and terminology in one place before comparisons, so the mechanics are clear and duplicates are removed.

Parent array

What it is: Points each node to representative parent.

Why it matters: Parent array is a required building block for understanding how Disjoint Set Union (Union-Find) stays correct and performant on large inputs.

Find

What it is: Resolve component representative.

Why it matters: Find is a required building block for understanding how Disjoint Set Union (Union-Find) stays correct and performant on large inputs.

Union

What it is: Merge two components.

Why it matters: Union is a required building block for understanding how Disjoint Set Union (Union-Find) stays correct and performant on large inputs.

Rank/size

What it is: Heuristic for shallow trees.

Why it matters: Rank/size is a required building block for understanding how Disjoint Set Union (Union-Find) stays correct and performant on large inputs.

Path compression

What it is: Flattens trees during find operations. Rewire node directly to root during find.

Why it matters: Path compression is a required building block for understanding how Disjoint Set Union (Union-Find) stays correct and performant on large inputs. Rewire node directly to root during find. In Disjoint Set Union (Union-Find), this definition helps you reason about correctness and complexity when inputs scale.

Representative

What it is: Canonical root id for component.

Why it matters: Canonical root id for component. In Disjoint Set Union (Union-Find), this definition helps you reason about correctness and complexity when inputs scale.

Union by rank

What it is: Attach smaller-depth tree under larger-depth root.

Why it matters: Attach smaller-depth tree under larger-depth root. In Disjoint Set Union (Union-Find), this definition helps you reason about correctness and complexity when inputs scale.

Amortized complexity

What it is: Average cost across operation sequence.

Why it matters: Average cost across operation sequence. In Disjoint Set Union (Union-Find), this definition helps you reason about correctness and complexity when inputs scale.

Disjoint sets

What it is: Non-overlapping groups of elements.

Why it matters: Non-overlapping groups of elements. In Disjoint Set Union (Union-Find), this definition helps you reason about correctness and complexity when inputs scale.

Putting It All Together

This walkthrough connects the core concepts of Disjoint Set Union (Union-Find) into one end-to-end execution flow.

Step 1

Parent array

Points each node to representative parent.

Before

parent=

012345
  • , rank=[0..0]
  • Nodes: 0..5

After

  • Disjoint sets ready
  • Union operations can begin

Transition

Each node starts in its own set
find(x)=x initially

Why this step matters: Parent array is a required building block for understanding how Disjoint Set Union (Union-Find) stays correct and performant on large inputs.

Step 2

Find

Resolve component representative.

Before

  • Union(1,2)
  • Roots: 1 and 2

After

parent

2
  • =1
  • Set {1,2} formed

Transition

Attach lower-rank root to higher-rank root
Update rank if needed

Why this step matters: Find is a required building block for understanding how Disjoint Set Union (Union-Find) stays correct and performant on large inputs.

Step 3

Union

Merge two components.

Before

  • Union(2,3)
  • find(2) follows parent to root 1

After

parent

2
  • =1, parent[3]=1
  • Future finds are faster

Transition

Path compression flattens chain
Attach root 3 under root 1

Why this step matters: Union is a required building block for understanding how Disjoint Set Union (Union-Find) stays correct and performant on large inputs.

Step 4

Rank/size

Heuristic for shallow trees.

Before

  • Need connectivity check for 1 and 3
  • Call find on both

After

  • Connected = true
  • Same representative root

Transition

find(1)=1
find(3)=1

Why this step matters: Rank/size is a required building block for understanding how Disjoint Set Union (Union-Find) stays correct and performant on large inputs.

How It Compares

vs Graph traversal per query

When to choose this: Choose Union-Find for many connectivity checks over many unions.

Tradeoff: Traversal better when full path details are required.

vs Adjacency matrix

When to choose this: Choose Union-Find for sparse dynamic edge streams.

Tradeoff: Matrix offers O(1) direct edge existence lookup but high memory.

vs Topological sort

When to choose this: Choose Union-Find for undirected connectivity grouping.

Tradeoff: Topological sort is for directed acyclic dependency ordering.

Real-World Stories and Company Examples

Networking platforms

Union-Find style component grouping helps reason about cluster and segment connectivity.

Takeaway: Connectivity grouping is a common infrastructure primitive.

GIS tools

Spatial region merging and connectivity analyses use disjoint-set logic.

Takeaway: Union-Find is practical for geometric grouping workflows.

Data quality pipelines

Entity resolution systems merge records into clusters of same real-world entity.

Takeaway: Union operations model transitive equivalence elegantly.

Implementation Guide

Maintain parent + rank arrays and compress path on every find.

Complexity: Amortized almost O(1)

Union-Find with path compression

class UnionFind {
  private readonly parentByNode: number[]
  private readonly rankByNode: number[]

  constructor(nodeCount: number) {
    this.parentByNode = Array.from({ length: nodeCount }, (_, index) => index)
    this.rankByNode = new Array<number>(nodeCount).fill(0)
  }

  find(nodeId: number): number {
    if (this.parentByNode[nodeId] !== nodeId) {
      this.parentByNode[nodeId] = this.find(this.parentByNode[nodeId])
    }
    return this.parentByNode[nodeId]
  }
}

Common Problems and Failure Modes

  • Skipping path compression and losing near-constant behavior.
  • Not handling out-of-range node ids.
  • Using recursion for find without stack consideration in some languages.
  • Confusing rank and size semantics during union.
  • Expecting Union-Find to return explicit path, not just connectivity.

Tips and Tricks

  • Use union-find when repeated connectivity checks occur in dynamic grouping problems.
  • Always implement path compression and union by rank/size for near-constant performance.
  • Initialize parent pointers clearly for each node/index domain.
  • Great for cycle detection in undirected graph edge streams.

When to Use

Use these signals to decide if this data structure/algorithm is the right fit before implementation.

Real-system usage signals

  • Fast connectivity queries in evolving graphs.
  • Simple implementation with strong practical performance.
  • Core building block for Kruskal MST and cycle detection.

LeetCode-specific tips (including pattern-identification signals)

  • Identification signal: repeated connectivity queries interleaved with merge/union operations.
  • Identification signal: you need to know whether two nodes belong to same component quickly many times.
  • Dynamic graph connectivity with incremental edges is a classic Union-Find trigger.
  • For Disjoint Set Union (Union-Find) questions, start by naming the core invariant before writing code.
  • Use the constraint section to set time/space target first, then pick the data structure/algorithm.
  • Solve one tiny example by hand and map each transition to your variables before implementing.
  • Run adversarial cases: empty input, duplicates, max-size input, and sorted/reverse patterns when relevant.
  • During interviews, explain why your approach is the right pattern for this prompt, not just why the code works.

LeetCode Progression (Easy to Hard)

Back to DSA Guide Catalog
#ProblemDifficultyTypical Complexity
1Number of ProvincesMediumO(n^2)
2Redundant ConnectionMediumO(n alpha(n))
3Accounts MergeMediumO(n alpha(n))
4Satisfiability of Equality EquationsMediumO(n alpha(n))
5Number of Connected Components in an Undirected GraphMediumO(V+E)
6Most Stones Removed with Same Row or ColumnMediumO(n alpha(n))
7Graph Valid TreeMediumO(V+E)
8Min Cost to Connect All PointsMediumO(n^2 log n)
9Number of Islands IIHardO(k alpha(n))
10Swim in Rising WaterHardO(n^2 log n)