Heaps

Christmas Heap (https://xkcd.com/221/)

A heap is an important data structure used for any operations involving prioritization. Examples include accessing or removing the largest or smallest priority object. Heaps will also support the insertion of new objects into the heap which maintain their behavior.

Learning Objectives

Github Classroom

You can clone the starter files for this lab at this GitHub Classroom link. Be sure to open the assignment in VSCode and setup up your virtual environment following these instructions.

Implementation

Heaps are commonly implemented as a binary tree, as you have seen in CSCI 0200. In this lab, you will first implement an array based heap, where the tree is embedded in an array. Below is an image of a heap and how a heap is stored within an array.

Example of a heap

MinHeap

In this lab you are going to implement a MinHeap, where the element at the root (index 0) has the lowest priority. Our implementation of MinHeap will store PriorityItems, which have an integer priority and an associated value of any type. This means that the smallest priority within the heap at any time will be at the root.

Example of a larger array based min-heap

Note that the above example is a max-heap, but the array embedding is similar for the min-heap that you will implement in this lab.

Task: Fill out the stencil code to complete the first few functions in array_heap.py:

Checkout the docstrings for more details.

Now that we can build the heap and index elements within the heap, we need to work on supporting functions that help us maintain heap property.

Task: Implement the following heap functions in array_heap.py:

Hint: The following pseudocode may help:

function sift_up(node):
    if node is root:
        return

    if node is greater than its parent:
        swap node with its parent
        recursively call sift_up on the parent

function sift_down(node):
    if node is a leaf:
        return

    if node is greater than at least one of its children:
        swap node with its smallest child
        recursively call sift_down on the child

Task: Implement the following core heap functions:

Self Checkpoint: Run the tests in test_heap.py to ensure that your implementation is correct.

Note: This also includes tests for the PointerHeap, so you can ignore those for now.

TA Checkpoint: Show a TA that all of the ArrayHeap tests in test_heap.py pass!

Pointer-Based Heaps

A heap can also be represented as a binary tree where each element is a node, and has access to its parent and children.

Task: Check out the Node class in pointer_heap.py. This class will be used to represent the nodes in the heap.

The pseudocode for most heap operations remain the same from the ArrayHeap, but now we need to be able to find the last node in the heap. To do this, we can look at the binary representation of the current length of the heap, and use this to tree-walk to the last node added,
which will become the root when we extract a node off the heap.

We can do this in the code using the find_last_node_at_index method of PointerHeap.

Task: Implement the find_last_node_at_index method in pointer_heap.py. The below pseudocode should help with find_last_node_at_index():

function find_last_node_at_index(index):
    let b = binary representation of index (number of nodes)
    Remove the most significant bit of b
    Start at the root node
    for each bit in b:
        if bit is 0:
            go to the left child
        else:
            go to the right child

    return the node you end up at after traversal

Task: Implement the sift_up and sift_down methods in pointer_heap.py.

These should follow a very similar structure to your implementations in array_heap.py, but with pointer semantics.

Task: Implement the insert, extract_min, and peek methods in pointer_heap.py.

These should also follow a very similar structure to your implementations in array_heap.py, but with pointer semantics.

Self Checkpoint: Run the tests in test_heap.py to ensure that your implementation is correct.

Note that now all of the tests in test_heap.py should pass, including the tests for the PointerHeap.

TA Checkpoint: Show a TA that all of the tests in test_heap.py pass!

Additional Info

Heaps / PriorityQueues in Python are contained in a built-in module in Python! Its documentation can be found here for future use so you don’t have to reimplement heaps yourself.

Bucket Heaps (Optional)

So far we have implemented the traditional MinHeap two different ways. Similar to different types of sorting like BucketSort/RadixSort, in certain cases we can achieve better performance by straying from conventional Heap mechanisms if we know something about our input.

Within this next part, we will implement a Bucket MinHeap which is a data structure where we restrict insertions to the heap to be monotonically increasing. You will further explore the uses of bucket-heaps later during Djikstra’s algorithm.

Consider the basic idea of a bucket heap:

Task: Implement the insert, extract_min, and peek methods of BucketHeap in bucket_heap.py!

Self Checkpoint: Run the tests in test_bucket_heap.py to ensure that your implementation is correct.

Final Checkpoint: Show a TA that all of the tests in test_bucket_heap.py pass!

If you have any feedback, you can fill out this anonymous Google Form!