- CMSC 341: Project 4

Project 4

Assigned Oct. 30, 2000
Due Nov. 20, 2000
Preview Nov. 1, and Nov. 2, 2000


Background

Your mission in this assignment it to revisit the concordance you implemented in Project 2 and reimplement it using trees. As before, the user interface for your digital concordance will include functions such as:

Objectives

This project will deepen your skills in coding both unbalanced and balanced binary search trees. In addition, you will experiment with the performance of trees on various output to gain a deeper understanding of the incentives and complications for tree balancing.

What to do

Implement a Tree class with the standard operations. In addition, you will find that you need an inorder iterator. There should be some way for your tree class to take an argument which indicates whether the tree will be balanced (using the RedBlack method) or unbalanced (standard binary search tree). The constructor is one option, but due to scoping issues, you may elect to write a method that allows you to toggle this behavior after the tree is created. You can start from the BST and RedBlackTree ADTs as described in Chapters 4 and 12 to implement these classes. Be sure to acknowledge any code that you don't generate yourself.

The main function
Your main function should do these tasks:
  1. read the command line to obtain the name of the text file for which it will build the concordance
  2. build the concordance
  3. display the concordance on cout. The display should be in word order.
  4. for each word, on how many lines does it appear?
  5. for the most frequently appearing word, print all lines on which it appears. This is called "word in context." Exclude the following overly common words: the, a, an, and, or.
  6. Perform measurements that show:
    1. The number of nodes in the tree
    2. The depth of the deepest node
    3. The average depth of a node
  7. Your program should take two additional command line arguments -Proj2 that runs it in "Proj2 compatibility mode" and -unbalanced that runs your program using unbalanced trees rather than balanced ones. If you try the sample program, you should get a feel for the use of these arguments.
The Concordance class
The concordance is a binary search tree of entries. Like other binary search trees, it has the property that each element in the compares compares > the elements in the subtree rooted at its left child at compares < the elements in the subtree rooted at its right child. Your class should allow trees to be either balanced or unbalanced, depending on arguments to the constructor. The implementation of your class should include all necessary balancing operations where requested. Design your own class, including the interface contained in the relevent .H file.
The ConcordanceEntry class
This class is unchanged from Proj2 (i.e., it's still a type of sorted list). If you did not get your ConcordanceEntry class working in Proj2, you can use the one provided in /afs/umbc.edu/users/r/h/rheingan/pub/CMSC341/Proj4. The SortedList class from which ConcordanceEntry is derived is likewise unchanged and also available.

Grading and Academic Integrity

Project grading is described in the Project Policy handout. Please remember that you must provide good documentation as described in the handout.

Cheating in any form will not be tolerated. Please re-read the Project Policy handout for further details on honesty in doing projects for this course.

Remember, the due date is firm. Submittals made after midnight of the due date will not be accepted. Do not submit any files after that time.


Files To Be Submitted

You should submit only the files you have written or modified, and a makefile. Include the answers to the written questions, found in /afs/umbc.edu/users/r/h/rheingan/pub/CMSC341/Proj4/341-Fall00-p4-questions.txt and submitted in a file of the same name. Please do not submit any of the files provided to you unless you have modified them.

Submit the files using the procedure given to you for your section of the course.


Sample Output

Sample output is available for your study at
/afs/umbc.edu/users/r/h/rheingan/pub/CMSC341/Proj4/sample_output.txt
It was obtained by executing Proj4 on umbc9 with a command-line argument of fox_frag.txt.

A copy of the executable is available for you to try out. It is

/afs/umbc.edu/users/r/h/rheingan/pub/CMSC341/Proj4/Proj4
Other sample input files can be found in
/afs/umbc.edu/users/r/h/rheingan/pub/CMSC341/Proj4/*.txt