Practical Guide to the Heap Data Structure

When it comes to understanding tree data structures, one of the most straightforward tree types is the Heap data structure. I like heaps for a number of reasons, namely:

It's typically clear when heaps should be used in a program.
You can spot in an instant if a data structure is a heap or not by following a few rules.

Heap Data Structure Requirements

When it comes to working with Heaps there are two main Heap types:

Max heap
Min heap

We'll be using Max heaps for our examples in this guide. In order to use a Min heap you would simply reverse the order of the node values.

In Max heap, the parent nodes are always greater than their child nodes. To create a Min heap the parent nodes are always less than their child nodes.

Looking at a Proper Heap

So what rules do heaps have to live by in order to be considered legitimate? Let's start by looking at a basic heap:

large

As you can see, a Heap is a binary tree. Remember that this means that a parent node can have no more than 2 child nodes.

Additionally, parent nodes in a Heap are always larger than their child nodes. This means that finding the greatest value in a Heap is incredibly easy because it's always the root node.

Lastly, also note that Heaps need to be balanced. Notice how no leaf nodes have a height difference great than 1?

Heaps Behaving Badly

To help reinforce the Heap rules, let's look at a couple of examples of Heaps that aren't properly constructed.

large

In this example, the 21 node is greater than it's parent. This breaks the tree since algorithms that work with Heaps will expect parent nodes to always be larger than any child nodes.

large

This second example shows a tree that is not balanced. In this tree the 21 leaf node is two levels higher than the 1 leaf node. Anytime you have a binary tree with leaves that have height values greater than 1, the tree is considered unbalanced.

Practical Usage of Heaps

So all this knowledge of Heaps is great, but how can we use them in real programs? I'm glad you asked! Heaps have unique behavior compared with other data structures that perfectly fit the requirements for creating Priority Queues.

You are already familiar with the Queue data structure. A Priority Queue is very similar, except that elements are given a priority. So instead of the default behavior of the first elements entering a queue being the first elements removed, a priority queue allows for elements to be removed based on their priority.

Real World Example of Priority Queues

During our discussion of the Queue data structure I gave the example that Queues worked like a line at Disneyland. In this example individuals would get into a ride line, and they would get on the ride in the same order that they got in line.

However this approach isn't the way lines at Disneyland work. Disneyland rides actually leverage a priority queue. For example, there are ways to jump ahead of others in a ride line:

If you have a FastPass ticket you can bypass the entire line.
Handicapped individuals are taken to the front of the line.
And single riders are able to get on rides faster than families because they can take up empty seats on a ride.

Notice how each of these circumstances gives a priority to the individuals vs other individuals in the ride line? This is essentially how the Priority Queue data structure operates in programming. The structure gives the ability to establish a priority to specific elements as opposed to pulling items out in order.

Code Example of Priority Queues

For our code example of implementing a Priority Queue we'll use the Scala programming language because I like how explicit the PriorityQueue class is in Scala. On the back end of the PriorityQueue class, Scala leverages the Heap data structure in order to accomplish its priority driven behavior.

In our program let's build a line simulator for Disneyland. This is actually something that I was hired to build by a client a few years back and I found this implementation worked nicely.

In this code we're going to first import the PriorityQueue class. From there we'll create a class called Line that will take in a priority and a rationale. The rationale is simply a text description of why an individual in line is given a specific priority.

Inside of the class we'll have a method called compare that will compare the priority values.

import scala.collection.mutable.PriorityQueue

case class Line(priority: Int, rationale: String) extends Ordered[Line] {
  def compare(that: Line)=that.priority compare this.priority
}

To feed this program with data we'll declare a new var called line and add in the individuals participating in the line simulation:

var line = PriorityQueue[Line]() ++ Seq(Line(4, "Regular"), Line(1, "Handicapped"), Line(4, "Regular"), Line(3, "Single Rider"), Line(2, "Fast Pass"), Line(1, "Handicapped"), Line(3, "Single Rider"), Line(2, "Fast Pass"))

Lastly we can run this program by iterating through the line collection and print the items out using the dequeue method.

while(line.nonEmpty) println(line dequeue)

After running the program you'll see that it outputs the following data:

Line(1,Handicapped)
Line(1,Handicapped)
Line(2,Fast Pass)
Line(2,Fast Pass)
Line(3,Single Rider)
Line(3,Single Rider)
Line(4,Regular)
Line(4,Regular)

As you see this program based the dequeue method order on the priority supplied in the data as opposed to the order of the items, such as a traditional queue.