Heap / Priority Queue Fundamentals

Core Idea: A heap is a complete binary tree stored as an array that maintains an ordering property. It gives you O(1) access to the min (or max) and O(log n) insertion and removal.

Min-Heap vs Max-Heap:

Key Operations:

Push(val) → O(log n)  add element, bubble up
Pop()     → O(log n)  remove root, bubble down
Peek/Top  → O(1)      read root (h[0])
Init(h)   → O(n)      heapify an existing slice

Go's container/heap Interface:

type IntHeap []int

func (h IntHeap) Len() int           { return len(h) }
func (h IntHeap) Less(i, j int) bool { return h[i] < h[j] } // min-heap
func (h IntHeap) Swap(i, j int)      { h[i], h[j] = h[j], h[i] }

func (h *IntHeap) Push(val any) { *h = append(*h, val.(int)) }

func (h *IntHeap) Pop() any {
    old := *h
    n := len(old)
    x := old[n-1]
    *h = old[:n-1]
    return x
}

Always use heap.Push/heap.Pop:

Call the heap package functions, not the methods directly. The package functions handle the bubbling; the methods are just the interface contract.

When to Use a Heap

Use a Heap When:

Min-Heap vs Max-Heap Choice:

Want the k LARGEST?  → use a MIN-heap of size k
  Root = smallest of the k largest = the kth largest
Want the k SMALLEST? → use a MAX-heap of size k
  Root = largest of the k smallest = the kth smallest

Why the opposite type? You use the heap as a gatekeeper. For "k largest": if a new element is bigger than the min-heap root, it belongs in the top k — push it in, pop the old root out. The root is always the answer.
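The gatekeeper pattern in a few lines of Go (the function name and sample values are mine; IntHeap is the min-heap from the interface section):

```go
package main

import (
	"container/heap"
	"fmt"
)

type IntHeap []int

func (h IntHeap) Len() int           { return len(h) }
func (h IntHeap) Less(i, j int) bool { return h[i] < h[j] } // min-heap
func (h IntHeap) Swap(i, j int)      { h[i], h[j] = h[j], h[i] }
func (h *IntHeap) Push(v any)        { *h = append(*h, v.(int)) }
func (h *IntHeap) Pop() any {
	old := *h
	x := old[len(old)-1]
	*h = old[:len(old)-1]
	return x
}

// kthLargest keeps only the k largest values seen; the min-heap root is the answer.
func kthLargest(nums []int, k int) int {
	h := &IntHeap{}
	for _, n := range nums {
		heap.Push(h, n)
		if h.Len() > k {
			heap.Pop(h) // gatekeeper: evict the smallest of the k+1
		}
	}
	return (*h)[0]
}

func main() {
	fmt.Println(kthLargest([]int{3, 1, 5, 2, 4}, 2)) // 4 (2nd largest)
}
```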

Heap vs Sort:

Kth Largest Element in a Stream

O(n log k) Init / O(log k) Add

Goal: Maintain a stream of integers. After each addition, return the kth largest element.

What "kth largest" Means:

Sort all numbers from largest to smallest. The number at position k is the answer. Duplicates count as separate positions.

[4, 5, 8, 2] sorted desc → [8, 5, 4, 2]
  1st largest = 8
  2nd largest = 5
  3rd largest = 4

[5, 5, 4] sorted desc → [5, 5, 4]
  1st largest = 5
  2nd largest = 5 (duplicate still counts)
  3rd largest = 4
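The definition can be sanity-checked with a plain descending sort (an O(n log n) brute-force reference, not the heap solution; the function name is mine):

```go
package main

import (
	"fmt"
	"sort"
)

// kthLargestBySort: brute-force reference for the definition of "kth largest".
func kthLargestBySort(nums []int, k int) int {
	s := append([]int(nil), nums...) // copy so the input isn't mutated
	sort.Sort(sort.Reverse(sort.IntSlice(s)))
	return s[k-1] // position k in descending order
}

func main() {
	fmt.Println(kthLargestBySort([]int{4, 5, 8, 2}, 3)) // 4
	fmt.Println(kthLargestBySort([]int{5, 5, 4}, 2))    // 5 (duplicate counts)
}
```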

Technique: Min-Heap of Size k

Key Insight: Keep only the k largest elements seen so far in a min-heap. The root of the heap is always the kth largest because every other element in the heap is larger than it.

Why a MIN-heap for the LARGEST? The heap holds the k largest elements. The root is the smallest of those k elements. That smallest-of-the-k-largest IS the kth largest.

Example: k=3, stream so far = [1, 2, 3, 5, 8]
  Heap holds the 3 largest: [3, 5, 8] (min-heap)
  Root = 3 = 3rd largest ✓
  Everything NOT in the heap (1, 2) is smaller than the root, so it's irrelevant.

How Add Works:

Add(val):
  1. Push val onto the heap
  2. If heap size > k, pop the root (the smallest) → this evicts the (k+1)th largest
  3. Return heap root = kth largest

Example: k=3, heap = [1, 2, 3] (3rd largest = 1)
  Add(4): push → [1, 2, 3, 4], size > k → pop 1
          heap = [2, 3, 4], root = 2 ✓ (3rd largest)
  Add(0): push → [0, 2, 3, 4], size > k → pop 0
          heap = [2, 3, 4], root = 2 ✓ (0 was too small)

Code:

type KthLargest struct {
    h IntHeap
    k int
}

func Constructor(k int, nums []int) KthLargest {
    kl := KthLargest{h: IntHeap{}, k: k}
    heap.Init(&kl.h)
    for _, n := range nums {
        heap.Push(&kl.h, n)
        if kl.h.Len() > k {
            heap.Pop(&kl.h)
        }
    }
    return kl
}

func (this *KthLargest) Add(val int) int {
    heap.Push(&this.h, val)
    if this.h.Len() > this.k {
        heap.Pop(&this.h)
    }
    return this.h[0] // root of min-heap = kth largest
}

Why Not Max-Heap?

A max-heap gives the 1st largest in O(1), but getting the kth largest requires popping k-1 elements and pushing them back = O(k log n) per query. A min-heap of size k gives the kth largest in O(1) with O(log k) maintenance.

Heap Size Cap:

Never let the heap grow past k elements. Always pop after pushing if size exceeds k.

Last Stone Weight

O(n log n) Time / O(n) Space

Goal: Repeatedly smash the two heaviest stones. If equal, both destroyed. If not, the lighter is destroyed and the heavier loses weight equal to the lighter. Return the last remaining stone (or 0).

Technique: Max-Heap Simulation

Why Max-Heap? Each step requires the two heaviest stones. A max-heap gives the heaviest in O(1) and the second heaviest after one O(log n) pop. Sorting would require re-sorting after every smash.

The Algorithm:

1. Push all stones into a max-heap
2. While heap has more than 1 stone:
   a. Pop twice → stone1, stone2 (the two heaviest)
   b. If equal → both destroyed (do nothing)
   c. If not → push |stone1 - stone2| back
3. Return heap root if 1 stone left, else 0

Worked Example:

stones = [2, 3, 6, 2, 4]
Max-heap: [6, 4, 3, 2, 2]

Round 1: pop 6, pop 4 → 6 != 4 → push 2
  heap: [3, 2, 2, 2]
Round 2: pop 3, pop 2 → 3 != 2 → push 1
  heap: [2, 2, 1]
Round 3: pop 2, pop 2 → equal → both destroyed
  heap: [1]

Return 1 ✓

Code:

func lastStoneWeight(stones []int) int {
    h := &IntHeapMax{}
    heap.Init(h)
    for _, s := range stones {
        heap.Push(h, s)
    }
    for h.Len() > 1 {
        s1 := heap.Pop(h).(int)
        s2 := heap.Pop(h).(int)
        if s1 != s2 {
            heap.Push(h, s1-s2) // s1 >= s2: max-heap pops the heavier first
        }
    }
    if h.Len() == 0 {
        return 0
    }
    return (*h)[0]
}

Why Not Min-Heap?

We need the two heaviest stones each round, not the lightest. Max-heap gives direct access to the largest element.

Zero Remaining:

If all stones cancel out (e.g. [3,3]), the heap is empty. Check h.Len() == 0 before accessing the root.

K Closest Points to Origin

O(n log k) Time / O(k) Space

Goal: Given n points, return the k closest to the origin (0, 0).

Technique: Max-Heap of Size k

Key Insight: This is "k smallest" (by distance). Same gatekeeper pattern as Kth Largest, but flipped:

Kth Largest → keep k largest → MIN-heap of size k
  root = smallest of the k largest → pop it to evict
K Smallest → keep k smallest → MAX-heap of size k
  root = largest of the k smallest → pop it to evict

The "opposite type" rule: the heap type matches what you want to EVICT, not what you want to KEEP.

Why Max-Heap for Closest (Smallest)?

We want to keep the k closest (smallest distance). The max-heap root is the FARTHEST point in our set.

When a new point arrives:
  - Push it in (heap now has k+1 elements)
  - Pop the root = the farthest point in the set
  - What remains: the k closest points ✓

If we used a min-heap, the root would be the closest point — popping it would evict the BEST point, which is the opposite of what we want.

Worked Example:

points = [[0,2],[2,2],[2,0],[3,3]], k = 2
Distances: [0,2]=2, [2,2]=2.83, [2,0]=2, [3,3]=4.24

Max-heap ordered by distance (root = farthest):
  Push [0,2] (dist 2):    heap = [[0,2]]
  Push [2,2] (dist 2.83): heap = [[2,2], [0,2]]
    size 2 = k, no pop
  Push [2,0] (dist 2):    heap = [[2,2], [0,2], [2,0]]
    size 3 > k → pop root [2,2] (farthest)
    heap = [[2,0], [0,2]]
  Push [3,3] (dist 4.24): heap = [[3,3], [0,2], [2,0]]
    size 3 > k → pop root [3,3] (farthest)
    heap = [[2,0], [0,2]]

Result: [[0,2], [2,0]] ✓

Custom Less — Ordering by Distance:

// Max-heap: "less" means FARTHER distance comes first
// (assuming PriorityQueue is [][]int of {x, y} points)
func (pq PriorityQueue) Less(i, j int) bool {
    dist1 := pq[i][0]*pq[i][0] + pq[i][1]*pq[i][1] // skip sqrt — doesn't
    dist2 := pq[j][0]*pq[j][0] + pq[j][1]*pq[j][1] // change the ordering
    return dist1 > dist2 // > for max-heap
}

Skip the sqrt:

Comparing x²+y² directly gives the same ordering as comparing sqrt(x²+y²). Avoids floating-point cost and precision issues.

Code:

func kClosest(points [][]int, k int) [][]int {
    pq := &PriorityQueue{}
    heap.Init(pq)
    for _, point := range points {
        heap.Push(pq, point)
        if pq.Len() > k {
            heap.Pop(pq) // evict the farthest
        }
    }
    return *pq
}

Same Pattern as Kth Largest:

Push, cap at k, the heap root is always the one to evict. Only difference is heap type (max vs min) and what you're comparing (distance vs value).

Kth Largest Element in an Array

O(n log k) Time / O(k) Space

Goal: Given an unsorted array, return the kth largest element. Duplicates count as separate positions.

Technique: Min-Heap of Size k (Same as Stream)

Identical pattern to Kth Largest in a Stream. The only difference is that all values are available upfront instead of arriving one at a time. The algorithm is the same: push each element, pop when size exceeds k, and the root is the answer.

func findKthLargest(nums []int, k int) int {
    h := &IntHeapMin{}
    heap.Init(h)
    for _, num := range nums {
        heap.Push(h, num)
        if h.Len() > k {
            heap.Pop(h)
        }
    }
    return (*h)[0] // root = kth largest
}

Follow-Up:

Can you solve it without sorting? See QuickSelect below for an O(n) average-case approach.

QuickSelect Algorithm

O(n) Avg / O(n²) Worst / O(1) Space

Core Idea: A partial QuickSort. Instead of sorting both halves, only recurse into the half that contains the kth element. On average, this halves the work each time: n + n/2 + n/4 + ... = O(n).

Step 1: Understand the Goal

After a full sort (descending): [8, 5, 4, 3, 2, 1]
                                 ^     ^
                                k=1   k=3

"kth largest" = element at index k-1 in desc order
              = element at index n-k in ASC order

QuickSelect finds the element that WOULD be at a target index, without fully sorting the array.

Step 2: The Partition (Lomuto Scheme)

Pick a pivot. Rearrange the array so that:

[ elements <= pivot | PIVOT | elements >= pivot ]
                       ^
                       pivot's final index

After partition, the pivot is in its CORRECT sorted position. We know exactly where it ranks.

How partition works step by step:

partition(nums, left, right):
  pivot = nums[right]   // choose rightmost as pivot
  storeIdx = left       // where to put next small element
  Scan i from left to right-1:
    if nums[i] <= pivot:
      swap nums[i] with nums[storeIdx]
      storeIdx++
  Swap pivot (nums[right]) with nums[storeIdx]
  return storeIdx       // pivot's final position

Example: partition([3, 1, 5, 2, 4], left=0, right=4)
  pivot = 4, storeIdx = 0
  i=0: nums[0]=3 <= 4 → swap(0,0), storeIdx=1   [3, 1, 5, 2, 4]
  i=1: nums[1]=1 <= 4 → swap(1,1), storeIdx=2   [3, 1, 5, 2, 4]
  i=2: nums[2]=5 > 4  → skip
  i=3: nums[3]=2 <= 4 → swap(2,3), storeIdx=3   [3, 1, 2, 5, 4]
  Swap pivot into storeIdx: swap(3,4)           [3, 1, 2, 4, 5]
                                                          ^
                                   pivot at index 3 (correct position!)
  Everything left of 4 is <= 4 ✓
  Everything right of 4 is >= 4 ✓

Step 3: The Selection

After partition, compare pivot's index with the target:

target = n - k   // convert "kth largest" to ascending index
pivotIdx = partition(nums, left, right)

if pivotIdx == target: Found it! Return nums[pivotIdx]
if pivotIdx < target:  Answer is in the RIGHT half → recurse right only
if pivotIdx > target:  Answer is in the LEFT half → recurse left only

Key: we DISCARD one half entirely each time, unlike QuickSort, which recurses into both halves.

Full Worked Example:

nums = [3, 1, 5, 2, 4], k = 2
target = 5 - 2 = 3 (index 3 in ascending order)

Round 1: partition([3,1,5,2,4], 0, 4)
  pivot = 4, result: [3,1,2, 4, 5], pivotIdx = 3
  pivotIdx == target → FOUND! Return 4 ✓

Another example: nums = [3,1,5,2,4], k = 4
target = 5 - 4 = 1 (index 1 in ascending order)

Round 1: partition → [3,1,2, 4, 5], pivotIdx = 3
  pivotIdx 3 > target 1 → recurse LEFT [3,1,2]
Round 2: partition([3,1,2], 0, 2), pivot = 2
  result: [1, 2, 3], pivotIdx = 1
  pivotIdx == target → FOUND! Return 2 ✓

Code:

func findKthLargest(nums []int, k int) int {
    target := len(nums) - k
    left, right := 0, len(nums)-1
    for left <= right {
        pivotIdx := partition(nums, left, right)
        if pivotIdx == target {
            return nums[pivotIdx]
        } else if pivotIdx < target {
            left = pivotIdx + 1 // search right half
        } else {
            right = pivotIdx - 1 // search left half
        }
    }
    return -1 // unreachable for valid k
}

func partition(nums []int, left, right int) int {
    pivot := nums[right]
    storeIdx := left
    for i := left; i < right; i++ {
        if nums[i] <= pivot {
            nums[i], nums[storeIdx] = nums[storeIdx], nums[i]
            storeIdx++
        }
    }
    nums[storeIdx], nums[right] = nums[right], nums[storeIdx]
    return storeIdx
}

Why O(n) Average?

Each round, we expect the pivot to land roughly in the middle, halving the search space:
  Round 1: scan n elements
  Round 2: scan ~n/2 elements
  Round 3: scan ~n/4 elements
  ...
  Total: n + n/2 + n/4 + ... = 2n = O(n)

Compare to QuickSort, which recurses into BOTH halves:
  n + 2*(n/2) + 4*(n/4) + ... = n * log(n)

Why O(n²) Worst Case?

If the pivot is always the min or max (e.g. already sorted array with rightmost pivot), we only shrink by 1 each time:
  n + (n-1) + (n-2) + ... = O(n²)

Fix: randomize the pivot choice.
  randIdx := left + rand.Intn(right - left + 1)
  swap nums[randIdx] with nums[right] // then partition as usual
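The randomized-pivot fix folded into a complete QuickSelect, as a self-contained sketch (the pivot swap at the top of the loop is the only change from the deterministic version; the result is the same regardless of which pivot the random draw picks):

```go
package main

import (
	"fmt"
	"math/rand"
)

func findKthLargest(nums []int, k int) int {
	target := len(nums) - k
	left, right := 0, len(nums)-1
	for {
		// randomize: move a random element into the pivot slot first
		r := left + rand.Intn(right-left+1)
		nums[r], nums[right] = nums[right], nums[r]

		pivotIdx := partition(nums, left, right)
		switch {
		case pivotIdx == target:
			return nums[pivotIdx]
		case pivotIdx < target:
			left = pivotIdx + 1 // answer is in the right half
		default:
			right = pivotIdx - 1 // answer is in the left half
		}
	}
}

// Lomuto partition: rightmost element as pivot.
func partition(nums []int, left, right int) int {
	pivot := nums[right]
	storeIdx := left
	for i := left; i < right; i++ {
		if nums[i] <= pivot {
			nums[i], nums[storeIdx] = nums[storeIdx], nums[i]
			storeIdx++
		}
	}
	nums[storeIdx], nums[right] = nums[right], nums[storeIdx]
	return storeIdx
}

func main() {
	fmt.Println(findKthLargest([]int{3, 1, 5, 2, 4}, 2)) // 4
	fmt.Println(findKthLargest([]int{3, 1, 5, 2, 4}, 4)) // 2
}
```

Note that the function still mutates its input; pass a copy if the caller needs the original order.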

Kth Largest: Heap vs QuickSelect

                Heap          QuickSelect
Time (avg):     O(n log k)    O(n)
Time (worst):   O(n log k)    O(n²)
Space:          O(k)          O(1) in-place
Mutates input?  No            Yes (reorders array)
Streaming?      Yes           No (needs all data)
Predictable?    Always        Depends on pivot

Use Heap when: data arrives as a stream, you need stability, or k is small relative to n.

Use QuickSelect when: all data is available, mutating the array is OK, and you want O(n) average with no extra space.

Task Scheduler

O(n log 26) Time / O(1) Space

Goal: Schedule CPU tasks with a cooldown of n cycles between identical tasks. Minimise total cycles (including idle).

Technique: Max-Heap + Cooldown Queue

Greedy Insight: Always run the task with the highest remaining frequency. High-frequency tasks are the bottleneck — if you delay them, you create unnecessary idle time later. Running them first leaves more room to interleave other tasks during their cooldown.

Two Data Structures, Two Jobs:

Max-Heap (by freq):    "Among AVAILABLE tasks, which should I run?"
                       → The one with highest frequency.
Cooldown Queue (FIFO): "Which tasks are on cooldown, and when do they become available?"
                       → Tasks come off in the order they went on.

Algorithm:

1. Count frequencies, push all into max-heap
2. While heap or cooldown queue is non-empty:
   a. cycles++
   b. If heap empty → fast-forward to next cooldown expiry (all idle until then)
   c. Else → pop highest-freq task, execute it
      - If freq > 0 after decrement, enqueue to cooldown with sched = cycles + n
   d. If front of cooldown queue has sched == cycles → task is available again, push back to heap

Worked Example:

tasks = [A,A,A, B,C], n = 3
Freqs: A=3, B=1, C=1. Heap: [3,1,1].

C1: pop 3(A). freq=2, sched=4. Heap:[1,1]
C2: pop 1(B). freq=0. Heap:[1]
C3: pop 1(C). freq=0. Heap:[]
C4: heap empty → fast-forward=4. A(sched=4) ready → push 2. Heap:[2]
C5: pop 2(A). freq=1, sched=8. Heap:[]
C6-8: fast-forward to 8. A(sched=8) ready → push 1. Heap:[1]
C9: pop 1(A). Done.

Total: 9. Schedule: A B C _ A _ _ _ A

Code:

func leastInterval(tasks []byte, n int) int {
    freqs := map[byte]int{}
    for _, t := range tasks {
        freqs[t]++
    }
    h := &IntHeapMax{}
    heap.Init(h)
    for _, f := range freqs {
        heap.Push(h, f)
    }

    type Task struct{ freq, sched int }
    q := []*Task{} // cooldown queue (FIFO)
    cycles := 0
    for h.Len() > 0 || len(q) > 0 {
        cycles++
        if h.Len() == 0 {
            cycles = q[0].sched // fast-forward past idle
        } else {
            freq := heap.Pop(h).(int)
            freq--
            if freq > 0 {
                q = append(q, &Task{freq, cycles + n})
            }
        }
        if len(q) > 0 && q[0].sched == cycles {
            heap.Push(h, q[0].freq)
            q = q[1:]
        }
    }
    return cycles
}

Only Frequencies Matter:

We don't need task identifiers. The heap just holds frequency counts — the greedy choice depends solely on "how many are left," not which letter it is.

Why Not a Single Combined Heap?

A natural first instinct: put everything in one heap as Task{freq, sched} and let the comparator handle it all. This doesn't work, and understanding why reveals what a heap is actually good at.

What Was Tried:

type Task struct {
    freq  int
    sched int
}

// Single heap holding ALL tasks (available + on cooldown)
// Comparator: sort by sched ascending, then freq descending
// At each cycle: pop root, check if sched <= cycles,
// execute if available, otherwise idle.

Why Every Comparator Fails:

Option 1: Sort by sched first, freq as tiebreaker
  Problem: Two tasks both past cooldown still get ordered by their OLD, stale sched values instead of by frequency. Picks the wrong task.

Option 2: Sort by freq first, sched as tiebreaker
  Problem: Picks the highest-freq task even if it's still on cooldown and can't run yet.

Option 3: Any hybrid weighting
  Problem: Same fundamental issue — a static comparator can't adapt to changing time.

The Root Cause: Filtering ≠ Ordering

This problem requires a two-step decision each cycle:

Step 1 — FILTER: "Which tasks are available?"
  A yes/no question based on current time. It changes every cycle as cooldowns expire.

Step 2 — ORDER: "Among available tasks, which first?"
  A ranking question based on frequency.

A heap does ORDERING. It gives a single total ordering over ALL elements using a static comparator. It cannot FILTER — it has no concept of "current time" and doesn't re-sort when external state changes.

A cooldown queue does FILTERING. Tasks enter when they go on cooldown and exit when their time is up. It doesn't rank — it just answers "ready or not."

Combining both into one heap forces you to collapse filter + order into a single comparator. But they are fundamentally different operations that depend on different inputs (time vs frequency).

Counterexample:

tasks = [A,A,A,A, B,B,B, C,C, D], n = 2
Freqs: A=4, B=3, C=2, D=1

Combined heap (sched asc, freq desc tiebreaker):
  C1: A(4,sched=0) ✓ → picks A (correct)
  C2: B(3,sched=0) ✓ → picks B (correct)
  C3: C(2,sched=0) ✓ → picks C (correct)
  C4: D(1,sched=0) and A(3,sched=4) BOTH available
      Heap picks D because sched 0 < 4
      But A has freq=3 — should be prioritised!

This wrong choice delays A, which cascades:
  → Combined heap: 11 cycles (1 unnecessary idle)
  → Correct approach: 10 cycles (no idle)

The stale sched=0 on D outranked A's sched=4, even though "available since cycle 0" vs "available since cycle 4" is irrelevant — both are available NOW. Only frequency should decide.

The Lesson:

A heap's job is to order things, not to decide when things become eligible. When a problem involves eligibility (time-based filtering) AND priority (value-based ordering), use separate data structures for each concern. The cooldown queue handles "when," the heap handles "which."

Design Twitter

O(n * 10 * log 10) GetNewsFeed / O(1) other ops

Goal: Support postTweet, getNewsFeed (10 most recent from self + followees), follow, and unfollow.

The Core Insight: "Top 10 Most Recent" = Min-Heap of Size 10

Each user might follow many people, each with many tweets. We need the 10 most recent across all of them. This is the same "k largest" pattern — keep a min-heap of size 10 ordered by timestamp, evict the oldest when size exceeds 10.

Data Model:

Tweet { tweetID, tweetIdx }
  tweetIdx = global counter that increases with each post.
  Acts as a timestamp for ordering.

Twitter {
  tweetCounter int64                    // global clock
  tweets       map[UserID][]*Tweet      // per-user tweet list
  following    map[UserID]FollowingSet  // who they follow
}

Key Design Decisions:

1. SELF-FOLLOW TRICK
   On initUser, add the user to their own following set. This means GetNewsFeed doesn't need a special case for "also include my own tweets" — iterating over followings already includes self.

2. CAP TWEETS AT 10 PER USER
   We only ever need the 10 most recent total, so no single user can contribute more than 10. On PostTweet, if a user has > 10 tweets, trim the oldest: tweets = tweets[1:]
   This bounds the inner loop to 10 per followee.

3. MIN-HEAP BY tweetIdx (oldest at root)
   Less = h[i].tweetIdx < h[j].tweetIdx
   Root is the OLDEST tweet in the heap. When size > 10, pop the root = evict oldest. What remains: the 10 most recent. Same gatekeeper pattern as Kth Largest.

GetNewsFeed — How It Works:

User 1 follows User 2 and User 3 (and self).
User 1 tweets: [(A,1), (B,5), (C,9)]
User 2 tweets: [(D,2), (E,6)]
User 3 tweets: [(F,3), (G,7), (H,8)]

Min-heap (cap 10), push all tweets from all followees:
  Push A(1), D(2), F(3), B(5), E(6), G(7), H(8), C(9)
  All fit within 10, no evictions needed.

Pop all from heap (oldest first):
  A(1), D(2), F(3), B(5), E(6), G(7), H(8), C(9)

But we need most-recent-first! So fill the result array from the BACK:
  result[7] = pop → A(1)   // oldest
  result[6] = pop → D(2)
  ...
  result[0] = pop → C(9)   // most recent

Result: [C, H, G, E, B, F, D, A] (most recent first)

Code:

// assuming: type UserID = int; type FollowingSet = map[UserID]bool;
// TweetHeap = min-heap of *Tweet ordered by tweetIdx

func (this *Twitter) PostTweet(userId, tweetId int) {
    this.initUser(userId)
    list := append(this.tweets[userId], &Tweet{tweetId, this.tweetCounter})
    if len(list) > 10 {
        list = list[1:] // keep only the last 10
    }
    this.tweets[userId] = list
    this.tweetCounter++
}

func (this *Twitter) GetNewsFeed(userId int) []int {
    h := &TweetHeap{} // min-heap by tweetIdx
    for followeeID := range this.following[userId] {
        for _, tweet := range this.tweets[followeeID] {
            heap.Push(h, tweet)
            if h.Len() > 10 {
                heap.Pop(h) // evict oldest
            }
        }
    }
    result := make([]int, h.Len())
    for i := h.Len() - 1; i >= 0; i-- {
        result[i] = heap.Pop(h).(*Tweet).TweetID
    }
    return result
}

func (this *Twitter) Follow(followerId, followeeId int) {
    this.initUser(followerId)
    this.initUser(followeeId)
    this.following[followerId][followeeId] = true
}

func (this *Twitter) Unfollow(followerId, followeeId int) {
    if followerId == followeeId {
        return // can't unfollow self
    }
    delete(this.following[followerId], followeeId)
}

func (this *Twitter) initUser(uid int) {
    if _, ok := this.following[uid]; ok {
        return // already exists
    }
    this.tweets[uid] = nil
    this.following[uid] = FollowingSet{uid: true} // self-follow
}

Complexity Breakdown:

Let n = number of followees. Each followee has at most 10 tweets (capped). Heap is capped at size 10.

GetNewsFeed: O(n * 10 * log 10) = O(n)
  For each followee, push up to 10 tweets into a size-10 heap. Each push/pop is O(log 10) = O(1).

Without the cap, it would be O(n * m * log(n*m)) where m = tweets per user. The cap removes m.

Pattern Recognition:

"Top k from multiple sorted lists" → min-heap of size k as gatekeeper. Same as Kth Largest / K Closest, just applied to Tweet objects instead of integers.

Unfollow Self:

Since initUser adds self to following set, Unfollow must guard against removing self — otherwise a user stops seeing their own tweets.

A Heap is NOT a Sorted Array

A common misconception: "if I push numbers into a min-heap, the backing slice is sorted, so I can index into the middle for the median." This is wrong.

What a Heap Actually Guarantees:

Sorted array: every earlier element <= every later element
Min-heap:     every PARENT <= its CHILDREN

These are very different guarantees. A heap is "partially ordered," not "fully sorted."

Concrete Example:

Insert 7, 3, 5, 1, 6, 2, 4 into a min-heap:

Final heap array: [1, 3, 2, 7, 6, 5, 4]
Sorted would be:  [1, 2, 3, 4, 5, 6, 7]

As a tree:
        1
      /   \
     3     2
    / \   / \
   7   6 5   4

Valid min-heap? Yes: 1<=3, 1<=2, 3<=7, 3<=6, 2<=5, 2<=4 ✓
Sorted? No: 3 before 2, 7 before 6, 6 before 5 ✗

heap[3] = 7, but the 4th smallest is 4. Indexing into the middle gives the WRONG answer.

Only the top is trustworthy.

A min-heap lets you trust h[0] is the smallest. After that, the rest is just organized enough for efficient insert/remove — not globally ordered.

Find Median from Data Stream

O(log n) AddNum / O(1) FindMedian

Goal: Maintain a stream of integers. After each addition, return the median of all elements seen so far.

Why One Heap Fails:

Naive idea: use a single min-heap and index into the middle of its backing slice for the median.

Problem: a heap is NOT a sorted array. h[mid] is just some node in the tree's level-order storage, not the middle value. (See box above.)

To get the true median from a single heap, you'd have to pop n/2 elements each time → O(n log n) per FindMedian call. Way too slow.

Key Insight: The Median Lives at a Boundary

If the data were sorted:
  [ lower half | upper half ]
              ^
              median is HERE

The median depends on exactly TWO values:
  - The largest value in the lower half
  - The smallest value in the upper half

What gives O(1) access to the largest?  → Max-heap
What gives O(1) access to the smallest? → Min-heap

So use TWO heaps, one for each half:

Technique: Two Heaps (Max-Heap + Min-Heap)

leftSide  = Max-Heap (lower half of numbers)
            → root = LARGEST of the lower half
rightSide = Min-Heap (upper half of numbers)
            → root = SMALLEST of the upper half

These two roots sit right at the boundary. The median is either one of them or their average.

Example: stream = [5, 2, 8, 3]
  Sorted: [2, 3, 5, 8]
  leftSide (max-heap):  [3, 2] → root = 3
  rightSide (min-heap): [5, 8] → root = 5
  Median = (3 + 5) / 2 = 4.0 ✓

Why Not Just Track Global Min and Max?

The median is about the CENTER, not the extremes.
  [1, 2, 100]  → min=1, max=100, median=2
  [1, 50, 100] → min=1, max=100, median=50
Same bounds, very different medians. You need the boundary between the two HALVES.

AddNum — Placement + Rebalancing:

AddNum(num):
  1. PLACE: Compare num to rightSide's root.
     - If num > rightSide root → belongs in upper half → push to rightSide (min-heap)
     - Otherwise → belongs in lower half → push to leftSide (max-heap)
     - If rightSide is empty, default to leftSide.
  2. REBALANCE: The halves must differ by at most 1.
     - If leftSide has 2+ more than rightSide:
       → pop leftSide root, push to rightSide
     - If rightSide has 2+ more than leftSide:
       → pop rightSide root, push to leftSide

Why rebalance? If one side grows too large, the roots no longer sit at the true middle boundary.

FindMedian — O(1) Peek:

FindMedian():
  If sizes are equal (even total):
    → median = (leftSide root + rightSide root) / 2
  If leftSide has one more (odd total):
    → median = leftSide root
  If rightSide has one more (odd total):
    → median = rightSide root

The roots are always the two middle candidates. No iteration, no indexing — just peek at the tops.

Worked Example:

Stream: 1, 3, 2

Add 1: rightSide empty → push to left
  left=[1] right=[]
  median = 1.0

Add 3: rightSide empty → default push to left
  left=[3,1] right=[] → left has 2 more → REBALANCE
  pop 3 from left, push to right
  left=[1] right=[3]
  median = (1+3)/2 = 2.0

Add 2: 2 < right root (3) → push to left
  left=[2,1] right=[3]
  Sizes differ by 1, OK.
  median = 2.0 ✓

left (max)   right (min)
root = 2     root = 3

Sorted: [1, 2 | 3]
            ^
            median = 2

Code:

type MedianFinder struct {
    leftSide  *IntHeapMax // lower half
    rightSide *IntHeapMin // upper half
}

func (this *MedianFinder) AddNum(num int) {
    if this.rightSide.Len() > 0 && num > (*this.rightSide)[0] {
        heap.Push(this.rightSide, num)
    } else {
        heap.Push(this.leftSide, num)
    }
    // rebalance if sizes differ by more than 1
    if this.leftSide.Len() > this.rightSide.Len()+1 {
        heap.Push(this.rightSide, heap.Pop(this.leftSide))
    } else if this.rightSide.Len() > this.leftSide.Len()+1 {
        heap.Push(this.leftSide, heap.Pop(this.rightSide))
    }
}

func (this *MedianFinder) FindMedian() float64 {
    if this.leftSide.Len() > this.rightSide.Len() {
        return float64((*this.leftSide)[0])
    }
    if this.rightSide.Len() > this.leftSide.Len() {
        return float64((*this.rightSide)[0])
    }
    return float64((*this.leftSide)[0]+(*this.rightSide)[0]) / 2.0
}

Pattern: Two Heaps = Split at the Middle

Whenever you need O(1) access to the middle of a dynamic dataset, consider splitting it into two heaps. The max-heap holds the lower half, the min-heap holds the upper half, and their roots give you the boundary.

Always Rebalance:

After every push, check that the heap sizes differ by at most 1. Without rebalancing, the roots drift away from the true median.

How Heap Push & Pop Work Internally

A heap is a complete binary tree stored as an array. The tree structure is implicit from the indices:

For node at index i:
  parent      = (i - 1) / 2
  left child  = 2*i + 1
  right child = 2*i + 2

Array: [1, 3, 2, 7, 5, 4, 6]

Tree:
        1          index 0
      /   \
     3     2       index 1, 2
    / \   / \
   7   5 4   6     index 3, 4, 5, 6
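The index arithmetic is easy to check in code (the helper names are mine):

```go
package main

import "fmt"

func parent(i int) int { return (i - 1) / 2 }
func left(i int) int   { return 2*i + 1 }
func right(i int) int  { return 2*i + 2 }

func main() {
	arr := []int{1, 3, 2, 7, 5, 4, 6}
	// node 3 sits at index 1: its parent is index 0, children are indices 3 and 4
	fmt.Println(arr[parent(1)], arr[left(1)], arr[right(1)]) // 1 7 5
	// and the relations are inverses: parent(left(i)) == i
	fmt.Println(parent(left(2)), parent(right(2))) // 2 2
}
```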

Push — Bubble Up (Sift Up):

Add to the end, then swap upward until the heap property is restored.

Push(0) onto min-heap [1, 3, 2, 7, 5]:

Step 1: Append to end → [1, 3, 2, 7, 5, 0]
        1
      /   \
     3     2
    / \   /
   7   5 0   ← new element at index 5

Step 2: Compare with parent (index (5-1)/2 = 2)
  0 < 2? Yes → swap → [1, 3, 0, 7, 5, 2]
        1
      /   \
     3     0   ← swapped up
    / \   /
   7   5 2

Step 3: Compare with parent (index (2-1)/2 = 0)
  0 < 1? Yes → swap → [0, 3, 1, 7, 5, 2]
        0   ← now the root
      /   \
     3     1
    / \   /
   7   5 2

Step 4: At root, done. O(log n) swaps max.

Pop — Bubble Down (Sift Down):

Swap root with last element, remove last, then swap the new root downward with its smaller child until restored.

Pop() from min-heap [1, 3, 2, 7, 5, 4]:

Step 1: Swap root with last element → [4, 3, 2, 7, 5, 1]
  Remove last → return value 1 → [4, 3, 2, 7, 5]
        4   ← out of place
      /   \
     3     2
    / \
   7   5

Step 2: Compare with children (index 1 = 3, index 2 = 2)
  Smaller child is 2 (index 2). 4 > 2? Yes → swap → [2, 3, 4, 7, 5]
        2
      /   \
     3     4   ← swapped down
    / \
   7   5

Step 3: Children of index 2 would be at indices 5, 6 → none
  No children, done. O(log n) swaps max.

Why Swap With Smaller Child?

In a min-heap, swapping with the smaller child guarantees the new parent is ≤ both children. If you swapped with the larger child, the smaller child would be less than the new parent, violating the heap property.

What heap.Push/heap.Pop Actually Do:

// heap.Push (simplified):
func Push(h Interface, x any) {
    h.Push(x)        // YOUR Push: append to slice
    up(h, h.Len()-1) // LIBRARY: bubble up from last
}

// heap.Pop (simplified):
func Pop(h Interface) any {
    n := h.Len() - 1
    h.Swap(0, n)   // swap root with last
    down(h, 0, n)  // bubble down new root
    return h.Pop() // YOUR Pop: remove & return last
}

Your methods vs library:

Your Push/Pop just append/remove the last element. The library's up/down functions handle all the swapping to maintain heap order. That's why you must call heap.Push/heap.Pop, not the methods directly.

Why *IntHeap for Push/Pop but Not Swap?

A Go slice is a small struct with three fields: pointer to the underlying array, length, and capacity. When you call a method, the receiver gets a copy of that struct.

Value Receiver (IntHeap) — Swap, Len, Less:

func (h IntHeap) Swap(i, j int) { h[i], h[j] = h[j], h[i] }

Swap modifies ELEMENTS in the underlying array. Both the caller's slice and the method's copy point to the SAME array → writes to h[i] are visible to the caller. The slice header (ptr, len, cap) doesn't change, so a value receiver (a copy of the header) is fine.

Pointer Receiver (*IntHeap) — Push, Pop:

func (h *IntHeap) Push(x any) { *h = append(*h, x.(int)) }

Push calls append → may allocate a NEW array (changing the pointer) and ALWAYS changes the length. These are slice header changes. If this were a value receiver, the new header would be a local copy → the caller would never see the change. *IntHeap lets us modify the caller's actual slice header.

The Rule:

Modifying elements (h[i] = x) → value receiver OK (shared underlying array)
Modifying the slice itself    → pointer receiver required (append, reslice h[:n-1])
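A small demonstration of the rule (the type and method names are made up for the example):

```go
package main

import "fmt"

type S []int

// Value receiver: append grows a LOCAL copy of the header; the caller's slice
// keeps its old length, so the appended element is lost.
func (s S) badAppend(v int) { s = append(s, v) }

// Pointer receiver: the caller's header is updated in place.
func (s *S) goodAppend(v int) { *s = append(*s, v) }

// Value receiver is fine for element writes: both headers share the array.
func (s S) set(i, v int) { s[i] = v }

func main() {
	s := S{1, 2, 3}
	s.badAppend(4)
	fmt.Println(len(s)) // 3 — the append was lost
	s.goodAppend(4)
	fmt.Println(len(s)) // 4
	s.set(0, 99)
	fmt.Println(s[0]) // 99 — element write visible through a value receiver
}
```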

Common Mistakes

1. Wrong Heap Type

"kth largest" → min-heap. "kth smallest" → max-heap. The opposite type acts as the gatekeeper.

2. Calling Methods Instead of Package Functions

// WRONG: skips bubbling
h.Push(val)

// RIGHT: maintains heap property
heap.Push(&h, val)

3. Forgetting to Cap Heap Size

If you push without popping when size > k, the root stops being the kth largest.

4. Peeking After Pop

After heap.Pop, the root has changed. If you need the answer, peek before popping or peek at the new root.

5. Init vs Building One-by-One

// Option A: heap.Init on full slice → O(n)
//   But then you still need to trim to size k
// Option B: Push one-by-one, pop when > k → O(n log k)
//   Simpler and directly gives you a size-k heap