CMSC-341 Fall 2006

Project 4
Leftist Heap Construction Experiment

Assigned Wed Nov 8, 2006
Due Wed Nov 22, 2006 by 11:57PM
Updates


Background

In class, we discussed ways to build Leftist Heaps (LHs) from a set of data. The naive, brute force approach is to create a LH from each data item and merge it into the final LH. The more sophisticated approach is to create a LH from each data item and place the LHs on a queue. Pairs of LHs are dequeued, merged, and the resulting LH is enqueued. This process continues until only one LH remains.

In this project, you will create a small experiment to gather performance data for each method of building the LH. You will output your experiment's data, then analyze the data to determines if the experimental data matches the theoretical expected performance.


Description

Your project will read integers from several files, building two LHs for each file. One will be built by the "insertion method", the other by the "pair-wise merging method".

Your project will print one line of data for each file of integers consisting of seven (7) columns. See sample output below.

  1. The number of nodes in the LHs (call this N)
  2. The number times merge( ) is called for the "insertion based method" (call this M1)
  3. The value of M1/N with 3 decimal places.
  4. The value of M1/N*lgN with 3 decimal places.
  5. The number of times merge( ) is called for the "pair-wise merging method" (call this M2)
  6. The value of M2/N with 3 decimal places.
  7. The value of M2/N*lgN with 3 decimal places.
Note that the C++ math library (#include <cmath>) provides the function log10(double x) that returns the log of X, base 10. To compute lgX (the log of X, base 2), note that log X base 2 = log X base 10 / log 2 base 10.

Your project will be invoked with a single command line argument which is the name of a file. That file contains the names of other files (separated by whitespace) which contain the unique random integers (separated by whitespace) from which the LHs are built. You should run your experiment and report the data for each file of integers.

Mr. Frey's public directory for this project is /afs/umbc.edu/users/d/e/dennis/pub/CMSC341/Proj4.

Your primary tasks are

  1. Copy the file LHeap.h from Mr. Frey's public directory and edit this file as necessary.
  2. Write main( ) in Proj4.cpp which builds the LHs and reports the experimental data from each file of integers. You may assume that no file will contain more than 30000 values.
  3. Copy the question file 341-Fall06-proj4_questions.txt from Mr. Frey's public directory to your local directory and answer the questions. For this project, questions count 20% of your project grade. The questions ask you to analyze your experimental data. Your answers must include a copy of your project's output. You should capture your project's output into a file using UNIX redirection, then insert this file with the answers to the questions.
  4. Write and submit a makefile for this project.
The author's array-based Queue.h is available in Mr. Frey's public directory. This file will not require any changes, and so should not be submitted. Your makefile should reference this file.

Sample Output

This sample is an example of the required format and gives you some idea as to the range of the data values. linux2[265]% Proj4 files.dat ------- Insert Based ------ ------ PairWise ----------- Nodes(N) Merges(M1) M1/N M1/N*lgN Merges(M2) M2/N M2/N*lgN -------- ---------------------------- ---------------------------- 500 xxxxx x.yyy x.yyy xxxxx x.yyy x.yyy 1000 xxxxx x.yyy x.yyy xxxxx x.yyy x.yyy 1500 xxxxx x.yyy x.yyy xxxxx x.yyy x.yyy 2000 xxxxx x.yyy x.yyy xxxxx x.yyy x.yyy 2500 xxxxx x.yyy x.yyy xxxxx x.yyy x.yyy 3000 xxxxx x.yyy x.yyy xxxxx x.yyy x.yyy 3500 xxxxx x.yyy x.yyy xxxxx x.yyy x.yyy 4000 xxxxx x.yyy x.yyy xxxxx x.yyy x.yyy 4500 xxxxx x.yyy x.yyy xxxxx x.yyy x.yyy 5000 xxxxx x.yyy x.yyy xxxxx x.yyy x.yyy 5500 xxxxx x.yyy x.yyy xxxxx x.yyy x.yyy 6000 xxxxx x.yyy x.yyy xxxxx x.yyy x.yyy 6500 xxxxx x.yyy x.yyy xxxxx x.yyy x.yyy 7000 xxxxx x.yyy x.yyy xxxxx x.yyy x.yyy 7500 xxxxx x.yyy x.yyy xxxxx x.yyy x.yyy 8000 xxxxx x.yyy x.yyy xxxxx x.yyy x.yyy 8500 xxxxx x.yyy x.yyy xxxxx x.yyy x.yyy 9000 xxxxx x.yyy x.yyy xxxxx x.yyy x.yyy 9500 xxxxx x.yyy x.yyy xxxxx x.yyy x.yyy 10000 xxxxxx x.yyy x.yyy xxxxx x.yyy x.yyy ..... 20000 xxxxxx x.yyy x.yyy xxxxx x.yyy x.yyy

Test Data Files

Test data files are available in Mr. Frey's public directory. Use these files to test your program's functionality, then to create output to use when answering the project questions. In that directory you will find the following files
  1. files.dat -- this file contains the list of filenames which contain random integers from which the LHs are built. The name of this file (files.dat) is the one and only command line parameter for this project.
  2. 500.ints, 1000.ints, 1500.ints, ... 20000.ints
    Each file contains random unique integers used to construct the LHs. The number of integers in each file is specified by its filename (e.g. 3500.ints contains 3500 unique random integers). These filenames appear in files.dat, one per line.
NOTE -- Even with a very large number of integers the time to construct the LHs should be very short.

Project Submission

Because you are submitting your own makefile, you are free to name your files whatever you like, except that the name of your executable MUST BE Proj4 (upper-case 'P'). The submitrun program and the scripts that we use to grade your files assume that you follow this convention for naming your executable.

DO NOT submit any .h file (e.g. Queue.h) provided for you unless you have modified the file. If you have not modified the file, then your makefile should reference the file(s) as in previous projects.

Submit Tools

There are a number of tools available to you to check on your submittal. It is your responsibility to ensure that the submittal is correct and will result in a successful compilation of your project. Do not wait till the last minute to submit your files. Give yourself enough time to check that your submittal is correct.

If you don't submit a project correctly, you will not get credit for it. Why throw away all that hard work you did to write the project? Check your submittals. Make sure they work. Do this before the due date.

Documentation for the submit program is on the web at http://www.gl.umbc.edu/submit/. One of the tools provided by the submit program is submitls. It lists the names of the files you have submitted.

Additionally, there are two programs for use only by CMSC-341 students (not part of the UCS submit program). They are in the directory
/afs/umbc.edu/users/d/e/dennis/pub/CMSC341/ and are named submitmake and submitrun. You can use these programs to make or run your submitted projects.

The syntax is similar to that for submit:

submitmake <class> <project>

Example:  submitmake cs341 Proj2

This makes the project, and shows you the report from the make utility. It cleans up the directory after making the project (removes .o files), but leaves the executable in place.

submitrun <class> <project> [command-line args]

Example:  submitrun cs341 Proj2 test.dat

This runs the project, assuming there is an executable (i.e. submitmake was run successfully).


Grading and Academic Integrity

Your project will be tested using a variety of command lines, some of which will be invalid.
Your project will also be tested using a variety of command files which will test various conditions which your code should handle.

Project grading is described in the Project Policy handout.

Your answers to 341-fall06-proj4_questions.txt are worth 20% of your project grade.

Cheating in any form will not be tolerated. Please re-read the Project Policy handout for further details on honesty in doing projects for this course.

Remember, the due date is firm. Submittals made after the due date will not be accepted. Do not submit any files after that time.