CMSC 341 Spring 2004
Project 5

Assigned

Wednesday, April 21, 2004

Due

11:59 pm on Sunday May 9, 2004

Correction
26 April
"or" in algorithm insert(x), first line in step 2, should be "and".

 

Objectives

The purpose of this project is to give you exposure to different kinds of heap data structures. In particular, you are asked to implement one known as Interval Heap, which extends binary heap so that both findMin and findMax operations can be done efficiently in O(1). All other basic operations (insert, deleteMin, deletMax) are O(log N), and a Interval Heap can be built in O(N) time. The definition of the interval heap and its operations are given in the Description Section below. To fully understand this data structure and its advantages, you will need to read the original paper by J. Van Leeuwen and D. Wood (especially the first two sections) that introduced this data structure in 1987. 


Description

Let X be a totally ordered domain of values (e.g., a set of integers that specify the priorities, with the smaller value representing a higher priority). We are interested in a data structure that supports the following five operations over this domain:

  1. findMin – Determine the smallest value.
  2. findMax – Determine the largest value.
  3. insert(x) – Add a value x in X to the data structure.
  4. deleteMin – Remove the smallest value.
  5. deleteMax – Remove the largest value.

Many data structures have been developed in order to accomplish the five tasks effectively, most of which generalize the binary heap. As discussed in the class, binary heaps support only operations (1), (3) and (4), or, alternatively, the operations (2), (3) and (5) from the preceding list. Min-Max Heap was proposed by Atkinson to support all of the five operations, which was organized like a heap, except that a different ordering relation was employed (see pp. 247 -248 of the text for a brief description of Min-Max Heap). Another data structure that generalizes heaps in a more consistent way to support these operations is the Interval Heap.

 

An interval heap is very much like a binary heap, it can be seen as a complete binary tree (CBT), and it can be stored in an array. There are two main differences. 1) Each node in CBT, except perhaps the last one, stores two values rather than one. These two values, say a, b in X (and a <= b), represent an interval [a, b] in X. The last node, called the left-end-node, or simply L-node, may store either a single or a pair of values, depending on whether the total number of keys stored is odd or even. Therefore, an interval heap storing N values would have ceiling(N/2) nodes. 2) A different heap order is required: the interval at each node contains the intervals of its children.

 

In what follows, we give the definition of the interval heap and the algorithms (in pseudo code) of the five operations. Illustrative examples, in the form of CBT, are then given to help you to understand this data structure. We denote the interval specified by a node v as I(v).

 

Definition.

An interval heap is a heap structure that is either empty or satisfied the following three conditions (heap properties):

  1. For each node v different from the L-node, I(v) is an interval [a, b] (a, b in X).
  2. For the L-node v, I(v) is either a single value a in X, or an interval [a, b] (a, b in X).
  3. For all nodes v, w, if v is a child of w, then I(v) is a subset of I(w).

 

What we can see immediately from the definition is that, if I(root) = [a, b], then all values currently stored in the heap are within this interval, and a is thus the Min, b the Max among these values. Therefore, findMin and findMax would have O(1) time performance.

Note: as with binary heap, duplicate values are allowed for interval heaps.

 

Operations.

The pseudo code for the algorithm of each of the five operations is given next. If node v with I(v) = [a, b], we call a the lower endpoint and b the higher endpoint of the interval. Operations for interval heaps are similar to those for binary heaps, except 1) you need to determine which of the two endpoints is to be swapped in each step of percolateUp and percolateDown, and 2) special care must be given to the L-node since it may store either a single or a pair of values. 

 

findMin:

   if the heap is empty then throw exception;

   if root is the L-node and contains a single key a, return a;

      else return a where a is the lower endpoint of root.

 

findMax( ): Analogous to findMin.

 

deleteMin( )

   if the heap is empty then throw exception;

   if root is the L-node {

       if  it contains a single key a, delete a and remove L-node; /* heap becomes empty */

       else delete a where a is the lower endpoint of root;   /* L-node now contains a single key */

       }

   else /* root contains two keys: I(root) = [a, b] */

   {

       remove a and replace it with a hole;

       percolateDown: contunue to swap the hole with the smaller of the two lower endpoints of its two children until a leaf v is reached;

       if v is not L-node, swap hole with any value in L-node and order the two values now in v ;

       /* now the hole is in the L-node */

       if the L-node now contains no value, remove it from the heap;

   }

 

Note: percolateDown outlined here is slightly different from the author’s code for binary heaps. Here, the hole holds no value, but for binary heap, the hole is initialized with the value of the last node, and the percolateDown may stop before a leaf is reached. You may implement deleteMin by either following this algorithm or modifying the author's code. However, be aware the complications when directly modifying the author’s code (the value percolated down may become the lower endpoint in some nodes, and higher endpoint in others).

 

deleteMax( ): Analogous to deleteMin. PercolateDown now is with respect to the higher endpoint in each child node.

 

insert(x):

   if the heap is empty

      create a new node containing a single value x and insert this node into heap as its L-node;

   else {

      /*  step 1:  modify or create a new L-node */

      if L-node contain a single key y

           {insert x into L-node, the new L-node then has interval [x, y] if x <= y, or [y, x], other wise;}

      else

           {create a new node containing a single value x and insert this node into heap as its new L-node;}

      /*  step 2: percolateUp x */

     while x is not within [a, b], the interval of its parent, and x is not at root

        {if x < a, swap x with a; else swap x with b;}

     }

 

Note: Although you are not asked to implement the method to heapify an array of random intervals, it can be done by modifying the buildHeap() method for binary heaps.


Examples

Examples illustrate how these operations work are given below.

 

insert: successively insert the following keys 10, 50, 70, 30, 5, 60, 20, 80 into a previously empty interval heap.

 

         (1) [10]     (2) [10, 50]     (3) [10, 50]             [10, 70]

                                                         /           =>        /   

                                                    [70]                  [50]

                                   (new L-node created)        (swap 70 and 50)       

 

         (4)  [10, 70]          (5)     [10, 70]                    [5, 70]

                 /                               /       \         ==>        /     \  

          [30, 50]                 [30, 50]      [5]          [30, 50]   [10]

  (L-node modified)                            (swap 5 and 10)

 

(6)  [5, 70]            (7)           [5, 70]                               [5, 70]             

         /     \                              /     \                                   /     \  

       [30, 50]   [10, 60]            [30, 50]   [10, 60]   ==>    [20, 50]   [10, 60]  

                                                  /                                         /

                                             [20]                                   [30]

                                                        (swap 20 and 30)    

 

         (8)            [5, 70]                            [5, 80]             

                           /     \                               /     \  

                [20, 50]   [10, 60]           [20, 70]   [10, 60]  

                     /                                     /

            [30, 80]                          [30, 50]  

                     (swap 80 first with 50, then with 70)

 

findMin: returns 5.

findmax: returns 80.

 

printHeap: [5, 80]
                   [20, 70] [10, 60]
                   [30, 50].

 

deleteMin:

                  [__, 80]                         [10, 80]                            [10, 80]             

                   /     \                               /     \                                  /     \  

         [20, 70]   [10, 60]           [20, 70]   [__, 60]  ]           [20, 70]   [30, 60]  

             /                                     /                                          /

  [30, 50]                            [30, 50]                                [50]

  a hole created at root       hole is percolated down          hole is filled by 30 from

  after 5 is removed                                                          L-node

 

deleteMax:

                  [10, __]                            [10, 70]                         [10, 70]            

                     /     \                               /     \                               /     \

          [20, 70]   [30, 60]           [20, __]   [30, 60]           [20, 50]   [30, 60]  

               /                                    /

         [50]                                [50]

  a hole created at root            50 is percolated down         hole is filled by 50 from L-node,

  after 80 is removed                                                          the old L-node is removed from the heap.


Your Tasks

findMin()

findMax()

deleteMin()

deleteMax()

insert(x)

printHeap()

You may implement these functions according to the algorithms outlined earlier in this description or with your own algorithms. In any case, the heap order must be maintained.

/afs/umbc.edu/users/y/p/ypeng/pub/CMSC341/Proj5/341-Spring04-p5_questions.txt

These questions are worth 10% of the project grade.


Command Line and Command File

Project 5 will be invoked with a command line that consists of one argument - the name of a command file. Your program should validate the command line, ensure that the specified file can be opened, then read and process each of the commands in the file. The commands will specify operations to be performed. All commands read from a command file are executed on a single interval heap which is initially empty.

Each line in the command file gives one command together with its argument(s). The commands you are ask to execute and their formats are listed below:

·                INSERT X – Insert the integer X into the heap. All integers inserted will be between 0 and 99, inclusive.

·                FINDMIN – Find and print the minimum value in the heap.

·                FINDMAX – Find and print the maximum value in the heap.

·                DELETEMIN – remove the minimum value from the heap.

·                DELETEMAX – remove the maximum value from the heap.

·                PRINT – Print the heap in level-order of its CBT. Your print must start with a new line with each level, and print 8 nodes/intervals per line if there are more than 8 nodes at a given level. (See sample output below for the format of our output.)

Commands in the command file are case sensitive (they should all be in upper case). Your output should echo each command read from the command file before printing whatever is required by the command.


Example Command File and Sample Out put

The command file may look like this (following the examples given earlier):

PRINT

INSERT 10

INSERT 50

INSERT 70

INSERT 30

INSERT 5

INSERT 60

INSERT 20

INSERT 80

PRINT

FINDMIN

FINDMAX

DELETEMIN

PRINT

DELETEMAX

PRINT

 

The output for this example command file should look like this:

 

PROJECT 5: INTERVAL HEAPS

 

COMMAND: PRINT

     The heap is empty

 

COMMAND: INSERT 10

 

COMMAND: INSERT 50

 

COMMAND: INSERT 70

 

COMMAND: INSERT 30

 

COMMAND: INSERT 5

 

COMMAND: INSERT 60

 

COMMAND: INSERT 20

 

COMMAND: INSERT 80

 

COMMAND: PRINT

     The current heap content:

     [5, 80]
     [20, 70] [10, 60]
     [30, 50]

 

COMMAND: FINDMIN

     The minimum value in the heap is 5.

 

COMMAND: FINDMAX

     The maximum value in the heap is 80.

 

COMMAND: DELETEMIN

 

COMMAND: PRINT

     The current heap content:

     [10, 80]
     [20, 70] [30, 60]
     [50]

 

COMMAND: DELETEMAX

 

COMMAND: PRINT

     The current heap content:

     [10, 70]
     [20, 50] [30, 60]

 


Program Requirements

1.             Your code must follow all project guidelines established for this course. See the project organization and project policies web pages.  You are not required to modify the author's code to follow these guidelines.

2.             You are free to name your main program file, but your executable must be named Proj5.

3.             Your code must handle (try/throw/catch) any exceptions thrown by the author's code or by your own classes.


Files To Be Submitted

Submit any files which you have written and modified:

·                file that contains your main( );

·                files for your IntervalHeap class;

·                any other files you write for this project;

·                your makefile -- which must be named makefile or Makefile;

·                the question file with your answers inserted.

Submit the files using the procedure given to you for your section of the course.
If your makefile is set up correctly, you should be able to execute the command make submit.


Submit Tools

There are a number of tools available to you to check on your submittal. It is your responsibility to ensure that the submittal is correct and will result in a successful compilation of your project. Do not wait till the last minute to submit your files. Give yourself enough time to check that your submittal is correct.

If you don't submit a project correctly, you will not get credit for it. Why throw away all that hard work you did to write the project? Check . Make sure they work. Do this before the due date.

Documentation for the submit program is on the web at http://www.gl.umbc.edu/submit/ . One of the tools provided by the submit program is
submitls. It lists the names of the files you have submitted.

Additionally, there are two programs for use only by CMSC-341 students (not part of the UCS submit program). They are in the directory
/afs/umbc.edu/users/d/e/dennis/pub/CMSC341/ and are named submitmake and submitrun. You can use these programs to make or run your submitted projects.

The syntax is similar to that for submit:

submitmake <class> <project>

Example:  submitmake cs341 Proj4

This makes the project, and shows you the report from the make utility. It cleans up the directory after making the project (removes .o and ii_files), but leaves the
executable in place.

submitrun <class> <project> [command-line args]

Example:  submitrun cs341 Proj5 test.txt

This runs the project, assuming there is an executable (i.e. submitmake was run successfully).


Grading and Academic Integrity

Your project will be tested using a variety of command lines, some of which will be invalid.
Your project will also be tested using a variety of command files which will test various conditions which your code should handle.

Project grading is described in the Project Policy handout.
Cheating in any form will not be tolerated. Please re-read the Project Policy handout for further details on honesty in doing projects for this course.

Remember, the due date is firm. Submittals made after midnight of the due date will not be accepted. Do not submit any files after that time.