Data Structures Mastery
Heaps & Advanced Structures
Heaps, Tries, and Union-Find appear less frequently than arrays or trees but are crucial for specific problem categories. Knowing when to use them gives you a significant edge.
Heaps (Priority Queues)
A heap is a specialized tree that maintains the minimum (min-heap) or maximum (max-heap) element at the root. Python's heapq module provides a min-heap.
When to Use a Heap:
- Find the K largest/smallest elements
- Merge K sorted lists
- Find the median of a stream
- Task scheduling by priority
import heapq
# Top K frequent elements
def top_k_frequent(nums, k):
freq = {}
for num in nums:
freq[num] = freq.get(num, 0) + 1
# Use a min-heap of size k
return heapq.nlargest(k, freq.keys(), key=freq.get)
# Merge K sorted lists
def merge_k_lists(lists):
heap = []
for i, lst in enumerate(lists):
if lst:
heapq.heappush(heap, (lst[0].val, i, lst[0]))
dummy = ListNode(0)
current = dummy
while heap:
val, i, node = heapq.heappop(heap)
current.next = node
current = current.next
if node.next:
heapq.heappush(heap, (node.next.val, i, node.next))
return dummy.next
| Operation | Time Complexity |
|---|---|
| Insert | O(log n) |
| Extract min/max | O(log n) |
| Peek min/max | O(1) |
| Build heap | O(n) |
Trie (Prefix Tree)
A Trie is a tree-like structure for storing strings, optimized for prefix-based operations.
When to Use a Trie:
- Autocomplete/typeahead
- Word search in a board
- Spell checking
- IP routing (longest prefix match)
class TrieNode:
def __init__(self):
self.children = {}
self.is_end = False
class Trie:
def __init__(self):
self.root = TrieNode()
def insert(self, word):
node = self.root
for char in word:
if char not in node.children:
node.children[char] = TrieNode()
node = node.children[char]
node.is_end = True
def search(self, word):
node = self.root
for char in word:
if char not in node.children:
return False
node = node.children[char]
return node.is_end
def starts_with(self, prefix):
node = self.root
for char in prefix:
if char not in node.children:
return False
node = node.children[char]
return True
Union-Find (Disjoint Set)
Union-Find tracks connected components efficiently. It uses two optimizations: path compression and union by rank.
When to Use Union-Find:
- Connected components in a graph
- Cycle detection in undirected graphs
- Kruskal's minimum spanning tree
- Number of islands (alternative to DFS)
class UnionFind:
def __init__(self, n):
self.parent = list(range(n))
self.rank = [0] * n
def find(self, x):
if self.parent[x] != x:
self.parent[x] = self.find(self.parent[x]) # Path compression
return self.parent[x]
def union(self, x, y):
px, py = self.find(x), self.find(y)
if px == py:
return False # Already connected
# Union by rank
if self.rank[px] < self.rank[py]:
px, py = py, px
self.parent[py] = px
if self.rank[px] == self.rank[py]:
self.rank[px] += 1
return True
Quick Decision Guide
| Problem Signal | Data Structure | Example Problem |
|---|---|---|
| "K largest" or "K smallest" | Heap | Top K frequent elements |
| "Prefix" or "autocomplete" | Trie | Implement autocomplete |
| "Connected" or "groups" | Union-Find | Number of provinces |
| "Median" or "stream" | Two Heaps | Find median from data stream |
| "Word dictionary" or "board" | Trie + DFS | Word search II |
Interview Tip: If you can identify the right data structure in the first 2 minutes of a problem, you're already halfway to the solution.
Excellent! With data structures mastered, let's move on to the algorithm patterns that tie everything together. :::