Data Structures Mastery

Heaps & Advanced Structures

3 min read

Heaps, Tries, and Union-Find appear less frequently than arrays or trees but are crucial for specific problem categories. Knowing when to use them gives you a significant edge.

Heaps (Priority Queues)

A heap is a specialized tree that maintains the minimum (min-heap) or maximum (max-heap) element at the root. Python's heapq module provides a min-heap.

When to Use a Heap:

  • Find the K largest/smallest elements
  • Merge K sorted lists
  • Find the median of a stream
  • Task scheduling by priority
import heapq

# Top K frequent elements
def top_k_frequent(nums, k):
    freq = {}
    for num in nums:
        freq[num] = freq.get(num, 0) + 1
    # Use a min-heap of size k
    return heapq.nlargest(k, freq.keys(), key=freq.get)

# Merge K sorted lists
def merge_k_lists(lists):
    heap = []
    for i, lst in enumerate(lists):
        if lst:
            heapq.heappush(heap, (lst[0].val, i, lst[0]))

    dummy = ListNode(0)
    current = dummy
    while heap:
        val, i, node = heapq.heappop(heap)
        current.next = node
        current = current.next
        if node.next:
            heapq.heappush(heap, (node.next.val, i, node.next))

    return dummy.next
Operation Time Complexity
Insert O(log n)
Extract min/max O(log n)
Peek min/max O(1)
Build heap O(n)

Trie (Prefix Tree)

A Trie is a tree-like structure for storing strings, optimized for prefix-based operations.

When to Use a Trie:

  • Autocomplete/typeahead
  • Word search in a board
  • Spell checking
  • IP routing (longest prefix match)
class TrieNode:
    def __init__(self):
        self.children = {}
        self.is_end = False

class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):
        node = self.root
        for char in word:
            if char not in node.children:
                node.children[char] = TrieNode()
            node = node.children[char]
        node.is_end = True

    def search(self, word):
        node = self.root
        for char in word:
            if char not in node.children:
                return False
            node = node.children[char]
        return node.is_end

    def starts_with(self, prefix):
        node = self.root
        for char in prefix:
            if char not in node.children:
                return False
            node = node.children[char]
        return True

Union-Find (Disjoint Set)

Union-Find tracks connected components efficiently. It uses two optimizations: path compression and union by rank.

When to Use Union-Find:

  • Connected components in a graph
  • Cycle detection in undirected graphs
  • Kruskal's minimum spanning tree
  • Number of islands (alternative to DFS)
class UnionFind:
    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n

    def find(self, x):
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])  # Path compression
        return self.parent[x]

    def union(self, x, y):
        px, py = self.find(x), self.find(y)
        if px == py:
            return False  # Already connected
        # Union by rank
        if self.rank[px] < self.rank[py]:
            px, py = py, px
        self.parent[py] = px
        if self.rank[px] == self.rank[py]:
            self.rank[px] += 1
        return True

Quick Decision Guide

Problem Signal Data Structure Example Problem
"K largest" or "K smallest" Heap Top K frequent elements
"Prefix" or "autocomplete" Trie Implement autocomplete
"Connected" or "groups" Union-Find Number of provinces
"Median" or "stream" Two Heaps Find median from data stream
"Word dictionary" or "board" Trie + DFS Word search II

Interview Tip: If you can identify the right data structure in the first 2 minutes of a problem, you're already halfway to the solution.


Excellent! With data structures mastered, let's move on to the algorithm patterns that tie everything together. :::

Quiz

Module 2 Quiz: Data Structures Mastery

Take Quiz