CMSC 341 Project 5

Assigned Wednesday, May 2, 2007
Due Tuesday May 15, 2007 at 11:59pm


Description

A SkipList is a data structure which deals with sorted data in logarithmic time. It's a less complex alternative to Red-Black trees.

In this project you will implement the SkipList class and run some tests to determine if the theoretical performance discussed in class is realized in practice.

An important part of this project is for you to develop an understanding of SkipLists by experimenting with the maximum node size and probability parameters of the list.

You will also use an auxiliary class known as a Comparator to assist in capturing the test data.
Mr. Frey's public directory for this project is /afs/umbc.edu/users/f/r/frey/pub/CMSC341/Proj5

Details of the Project

Here are the tasks you must complete for this project:
  1. Read and understand the SkipList material in the lecture notes, and in Dr. Pugh's paper.
  2. Implement a suitable SkipList ADT. You may start with the author's List class (provided in Mr. Frey's public directory) or implement your own SkipList. Pseudo code for insert and find is provided in lecture notes and in Dr. Pugh's paper.
  3. Write a main function that produces the type of output shown in 341-Spring07-p5-sample_output.txt in Mr. Frey's public directory for this project.

    The main function takes the following command-line arguments in this order:

    1. a seed value for the drand48 random number generator, a positive int
    2. the maximum node size to be allowed in the list, a positive int
    3. the probability associated with the list, a double between 0.0 and 1.0
    4. the number of insertions to do (i.e., the size of the list), a positive int
    5. the name of a data file that contains a sufficient number of unique random integers to insert, a string with no spaces
    6. the maximum number of elements to print, a positive int

    Notes on the sample output:

    1. All floating point values should be output with 4 decimal places
    2. Since the "Expected" column under "Distribution of node sizes" is printed as an integer, there is some truncation (especially in the higher node sizes). Therefore the sum of the "Expected" column is DOES NOT equal the number of inserts.

    A code fragment is given below that shows one way to insert integers read from a file into your SkipList and then reread the file to find them.

  4. Run your program to collect lots of fascinating data on the effect of probability, maximum node size, etc. on performance. Study your data to see if it makes sense in terms of your understanding of the theory.
  5. Answer the questions posed in the file 341-Spring07-proj5_questions.txt  in Mr. Frey's public directory.

    Copy the file to your own directory and edit it to provide your answers to the questions. Do not change the name of the file since the grading script looks for it under that name. The questions count 20% of your project grade.


Definition of the SkipList ADT

The ADT for SkipList is similar to the textbook's List ADT in Chapter 3.

Some differences are:

  1. Your SkipList may have some methods that are not in List. Check the class notes for these methods. You may implement any of these methods you feel are necessary for this project.
  2. Since you must report the number of nodes of each size, you will probably want some kind of accessor(s) to retrieve this information.
  3. The author's List ADT does not support the find( ) method.
  4. Since the SkipList is sorted, many List methods (e.g. push_font( ), push_back( ), etc) are not relevant for SkipList. Feel free to remove any List methods which are not required.
  5. You will need a new (or modified) print function that prints only the first N elements of the SkipList with their node sizes in the format [ <value>, <size> ].
Some potential differences in the implementation of the SkipList and the implemenation of the List are
  1. For this project, the SkipList need not be doubly linked.
  2. Iterators are not required for this project (i.e. main( ) can be implemented without them).
  3. Since you must keep track of the number of data comparisons performed for inserting and finding items in your SkipList, a Comparator object will be passed to your SkipList constructor. Be sure to store a reference to the comparator, not a copy of it.
  4. In a SkipList, the tail node actually contains meaningful data (see Dr. Pugh's paper). The data in the tail is an object that is "larger" than any data that will be inserted into the SkipList. This "max object" should be passed to your SkipList constructor.
  5. Since the SkipList uses a random number generator, the seed for the RNG should be passed to the SkipList constructor.
  6. Objects stored in List need not support any comparison operators, but objects stored in the SkipList must support operator<

Code fragment showing insertion of integers read from a file followed by find operations for each element of the list. Alternatively (without iterators), you could reread the file and perform find on each integer as it's read. To reread a stream, you may close the stream and then reopen the stream.
  Comparator<int> comparator;
  SkipList<int> sl(maxNodeSize, probability, INT_MAX, comparator, seed);
  int val;

  ifstream inFile( argv[5] );

  // do the insertions
  for (i = 0; i < nrInserts; i++)
    {
      inFile >> val;
      sl.insert(val);
    }
  inFile.close();
  // count the avg. number of comparisons made to build the list
  int avgBuildCompares =  static_cast<double>(compartor.GetCompares())/nrInserts;

  // reset the number of comparisons, then re-open the file
  comparator.Reset();
  inFile.open( argv[5]);
  for (i = 0; i < nrInserts; i++)
    {
      inFile >> val;
      sl.Find(val);
    }
  inFile.close();

  // count the avg. number of comparisons made doing the find ops
  avgFindCompares = static_cast<double>(comparator.GetCompares())/nrInserts;

The Comparator ADT

As the name implies, a Comparator class is a class template used to compare two objects of the same type. The Comparator is also used to keep appropriate statistics. In this project we will use the Comparator to count the number of times any data comparison is made.

Using a comparator object is advantageous because

  1. It's reusable
  2. It relieves the SkipList from data collection creating a clear separation of responsibilities
  3. It can be changed to collect other data (e.g. how many times a was less than b) without affecting the SkipList code
  4. It makes it easy to count comparisons in complex code.
    For example, in this code if (ptr != NULL && ptr->data < x) it's difficult to increment a counter in the right place because the comparison is not always executed (when p == NULL). Incrementing a counter either before or after the 'if' statement yields an incorrect count.

    Using an comparator, we would rewrite the code above as

    if (ptr != NULL && (comparator.IsLessThan(ptr->data, x)) Now we are assured of an accurate count of the number of times the comparison is actually performed.
The Comparator class has a single data member which is the number of comparisons performed. The operations on a Comparator are Recall that objects stored in the SkipList need only support operator<

Miscellaneous


Grading and Academic Integrity

Project grading is described in the Project Policy handout.

Your answers to 341-Spring07-proj5_questions.txt are worth 20% of your project grade. Write your name on this file. Don't change the name of the file.

Cheating in any form will not be tolerated. Please re-read the Project Policy handout for further details on honesty in doing projects for this course.

Remember, the due date is firm. Submittals made after midnight of the due date will not be accepted. Do not submit any files after that time.


Files To Be Submitted

You should submit only the files you have written, a makefile, and the file containing your answers to the questions. Make sure you submit all the files needed to successfully compile your project.

Do not change the name of the questions file. Make sure you have put your name on it.

Your makefile should successfully compile your program when the grader types make Proj5.   Please remember that the grader does not do this manually; it is done by an automatic script. Therefore, make sure your makefile will proceed successfully without human intervention after it is started.

Use submitls to check on the files you have submitted. Leave yourself enough time to do it right - don't wait till the last minute to do the submittals. A good practice is to have only the files in your directory that you will be submitting. If this is the case, then if you can successfully make the project, the grader will be able to also. Just clean up the directory (remove executables, .o files and data files) and submit everything else.

Sorry! We could not find what you were looking for

Try going back to the main page here

 
4
4
0
0
0
0