UMBC CMSC 202

CMSC 202 Fall 2003
Project 3

Spellchecking

Assigned Monday Oct 20, 2003
Design Due Sunday Oct 26, 2003 11:59pm
Program Due Sunday Nov 2, 2003 11:59pm
Updates Monday Oct 27th
One of the TAs noted that the dictionary file /usr/share/dict/words is not in strict alphabetical order. Since many students are using this file to test their programs and we are not at liberty to change it, you should NOT assume that the dictionary file is in alphabetical order.

Objectives


Project Description

In this project you will design and implement a number of classes that together form a basic document spell checker. Some classes are related via the "has a" relationship. Other classes exhibit the "uses a" relationship. A command file will require you to print various information about the document.

Your program will implement the following classes which are described in more detail below. Proj3.cpp will contain main( ). Proj3Aux.cpp should include any functions called from main( ) and their prototypes should be in Proj3Aux.h.

  1. A Document class -- a document has zero or more pages and uses a dictionary for spell checking.
  2. A Page class -- a page has zero or more lines of text.
  3. A Line of text class -- a line of text has zero or more words.
  4. A Dictionary class -- a dictionary has zero or more words.

Your program will have three command line arguments in this order

  1. The name of a text file from which a Document is created.
  2. The name of a text file from which a Dictionary is created.
  3. The name of a command file.
Your program will create a Document and a Dictionary from the specified text files and then process the commands in the command file. The files are described below.
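The argument handling described above might be sketched as follows. This is only an illustration: HaveThreeArguments and OpenInput are hypothetical helper names (the kind of function that could live in Proj3Aux.cpp), not names required by the project.

```cpp
#include <fstream>
#include <iostream>
#include <string>

// Hypothetical helper: true only when exactly three file names
// follow the program name on the command line.
bool HaveThreeArguments(int argc)
{
    return argc == 4;   // argv[0] is the program name itself
}

// Hypothetical helper: opens fileName for reading and reports a failure.
// The stream is passed by reference because streams cannot be copied.
bool OpenInput(std::ifstream& in, const std::string& fileName)
{
    in.open(fileName.c_str());
    if (!in)
    {
        std::cerr << "Unable to open " << fileName << std::endl;
        return false;
    }
    return true;
}
```

In main( ), argv[1], argv[2] and argv[3] would then be the text file, the dictionary file and the command file, in that order.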

The classes

The information and operations of each class required for this project are listed below. All data in all classes must be private. All aspects of the functions are left to you. You will specify all function names, parameter lists, return types, pre- and post-conditions in your design assignment. Your design must be in compliance with course coding standards. The functions described for each class are the only public methods permitted. You may create any private methods you see fit.

Some notes on class design and implementation

The Document class

The Document class contains the following information.
  1. The title of the document
  2. The author of the document
  3. Zero or more pages of text
  4. A pointer to a Dictionary

    and supports the following operations

  5. A default constructor
  6. Other constructors that you deem necessary
  7. A mutator to store the Dictionary pointer
  8. Accessors for
    1. title
    2. author
    3. number of pages in the document
    4. the I-th page of the document
  9. The overloaded operator+= that appends a new page to the document.
  10. The ability to print the page numbers and line numbers on which a given word occurs.
    Word searching is case-sensitive -- "For" and "for" are different for this function.
  11. The ability to print the I-th word on the J-th line of the K-th page.
  12. The ability to print all misspelled words indicating each word and where it occurs (page and line number). If the same misspelled word appears multiple times, each occurrence is listed separately. Spellchecking is not case sensitive. "For" and "for" are the same for this function.
  13. The ability to print itself to the standard output stream.
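The case-sensitive word search in operation 10 can be illustrated with a small sketch. To keep the example self-contained, nested std::vector typedefs stand in for the Line, Page and Document classes. That stand-in, the function name PrintOccurrences, and the output wording are all assumptions; your own design will use your class accessors instead.

```cpp
#include <cstddef>
#include <iostream>
#include <string>
#include <vector>

typedef std::vector<std::string> WordList;  // stand-in for a Line
typedef std::vector<WordList> LineList;     // stand-in for a Page
typedef std::vector<LineList> PageList;     // stand-in for a Document

// Prints the 1-based page and line number of every occurrence of word
// and returns how many occurrences were found. The comparison uses ==,
// so "For" and "for" are treated as different words.
int PrintOccurrences(const PageList& pages, const std::string& word,
                     std::ostream& out)
{
    int found = 0;
    for (std::size_t p = 0; p < pages.size(); ++p)
    {
        for (std::size_t l = 0; l < pages[p].size(); ++l)
        {
            for (std::size_t w = 0; w < pages[p][l].size(); ++w)
            {
                if (pages[p][l][w] == word)
                {
                    // Indices are 0-based internally, 1-based for the user.
                    out << "Page: " << (p + 1)
                        << " Line " << (l + 1) << '\n';
                    ++found;
                }
            }
        }
    }
    return found;
}
```

A word that occurs several times on one line is reported once per occurrence, and an empty page or line simply contributes nothing to the loops.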

The Page class

The Page class contains the following information
  1. the lines of text

    and supports the following operations

  2. A default constructor
  3. Other constructors that you deem necessary
  4. Accessors for
    1. number of lines on the page
    2. number of words on the page
    3. the I-th line on the page
  5. The overloaded operator+= that appends a line of text to the page.
  6. The ability to print itself to the standard output stream.

The Line-of-Text class

The Line of Text class contains the following information
  1. Zero or more words that make up the line of text

    and supports the following operations

  2. A default constructor
  3. Other constructors which you deem necessary
  4. Accessors for
    1. The number of words on the line
    2. The K-th word on the line
  5. The ability to print itself to the standard output stream.
  6. The overloaded operator+= that appends a word to the line of text.
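As a rough sketch of how the Line-of-Text operations could fit together, consider the class below. The member names, return types, the m_ prefix and the use of std::vector are assumptions made for illustration; the actual names and signatures are yours to specify in the design.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Sketch only: names, return types and std::vector are assumptions.
class Line
{
public:
    // Default constructor: a line with zero words.
    Line() {}

    // Appends one word to the line. Returning *this allows chaining.
    Line& operator+=(const std::string& word)
    {
        m_words.push_back(word);
        return *this;
    }

    // Number of words on the line.
    std::size_t GetNrWords() const { return m_words.size(); }

    // The k-th word, where k is 1-based as in the command file.
    // In this sketch an out-of-range k yields an empty string.
    std::string GetWord(std::size_t k) const
    {
        if (k < 1 || k > m_words.size())
        {
            return "";
        }
        return m_words[k - 1];
    }

private:
    std::vector<std::string> m_words;   // all data is private
};
```

With this shape, a statement such as line += "party"; appends a word, and the Page and Document classes can follow the same pattern for appending lines and pages.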

The Dictionary class

The Dictionary class contains the following information
  1. A list of properly spelled words (which may be empty)

    and supports the following operations

  2. A default constructor
  3. Other constructors which you deem necessary
  4. The ability to determine if a given word is misspelled.
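A minimal Dictionary sketch is shown below, assuming the words are held in a std::vector and that the method is called IsMisspelled (both are illustrative choices, not requirements). It uses a linear search with std::find, which works whether or not the word list is sorted; a binary search would only be valid on sorted data.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Sketch only: names and the std::vector storage are assumptions.
class Dictionary
{
public:
    // Default constructor: an empty dictionary.
    Dictionary() {}

    // Builds a dictionary from an existing list of words.
    explicit Dictionary(const std::vector<std::string>& words)
        : m_words(words) {}

    // True if word does not appear in the dictionary exactly as given.
    // std::find scans every entry, so word order does not matter.
    bool IsMisspelled(const std::string& word) const
    {
        return std::find(m_words.begin(), m_words.end(), word)
               == m_words.end();
    }

private:
    std::vector<std::string> m_words;
};
```

Note that an empty dictionary reports every word as misspelled without any special-case code.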

Sample Output

This sample output is shown as an acceptable format. Other, well formatted, readable formats are acceptable, but all data shown must be present. This sample output was created using the text file textfile.dat and command file commands.dat in Mr. Frey's public directory /afs/umbc.edu/users/d/e/dennis/pub/CMSC202/p3 and the dictionary /usr/share/dict/words

    linux3[5]% Proj3 textfile.dat /usr/share/dict/words commands.dat
    COMMAND: PRINT
    Title: NOW IS THE TIME
    Author: Bob Smith
    Nr Pages: 5
    Page 1 ( 2 lines, 9 words )
    -------------------------------
    Now is the time
    for all Good men now
    Page 2 ( 0 lines, 0 words )
    -------------------------------
    Page 3 ( 2 lines, 12 words )
    -------------------------------
    Now is the time to come to the
    aid of their party
    Page 4 ( 1 lines, 6 words )
    -------------------------------
    aid your party now now now
    Page 5 ( 1 lines, 2 words )
    -------------------------------
    xyzzy plugh
    COMMAND: FIND Now
    "Now" found on
    Page: 1 Line 1
    Page: 3 Line 1
    COMMAND: FIND now
    "now" found on
    Page: 1 Line 2
    Page: 4 Line 1
    Page: 4 Line 1
    Page: 4 Line 1
    COMMAND: PAGE 4
    aid your party now now now
    COMMAND: LINE 1 2
    for all Good men now
    COMMAND: WORD 3 2 4
    party
    COMMAND: SPELLCHECK
    "xyzzy" is misspelled on page 5, line 1
    "plugh" is misspelled on page 5, line 1

  1. Commands and their parameters must be displayed as they are being processed.
  2. See the command file description for required output for each command.

The Files

The text file

The dictionary file

The command file

The command file consists of one command per line. The format of the commands and the action taken by your program in response to each command is given below. You may assume that all non-blank lines in the file are formatted properly and the command names are valid, but the command parameters may not be valid for the document. Each command and its parameters must be displayed when processed.
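One possible shape for the command loop is sketched below. The function name EchoCommands is hypothetical, and the sketch only echoes each command and skips blank lines; dispatching on the command name and validating the parameters against the document is left out.

```cpp
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper: reads one command per line, skips blank lines,
// echoes each command with its parameters, and returns the number of
// commands seen. Streams are passed by reference.
int EchoCommands(std::istream& in, std::ostream& out)
{
    int count = 0;
    std::string line;
    // Testing the result of getline stops the loop cleanly at end of file.
    while (std::getline(in, line))
    {
        std::istringstream fields(line);
        std::string name;
        if (!(fields >> name))
        {
            continue;   // blank line: nothing to process
        }
        out << "COMMAND: " << line << '\n';
        ++count;
        // A real program would now branch on name (for example "FIND")
        // and read that command's parameters from fields.
    }
    return count;
}
```

Reading the parameters from the same std::istringstream makes it easy to detect a parameter that is missing or not a number before touching the document.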

Free Advice and Information

  1. Be sure to check that all files are opened properly.
  2. Be sure to correctly check for EOF when reading the text file so that blank lines at the end do not cause you to reprocess the last real data line.
  3. It's easiest if operator+= is overloaded as a member function for the Document, Page and Line classes.
  4. If you pass a stream to a function, it must be passed by reference.
  5. Copy the makefile from project 2 and modify it for project 3.
  6. Develop your classes one at a time and test them. Then try to put them together.
  7. After all your classes are done, implement and test the commands one at a time.
  8. Don't forget that from the user's point of view (in the command file and in the output), page numbers, line numbers and word numbers are 1-based. That means that some loops in your code may run from 1 to N rather than the usual 0 to N-1.
  9. Code for some commands (PAGE, LINE, WORD) is very similar. Get one working, then use it as a model for the others.
  10. Code for some methods (Document::Find and Document::SpellCheck) is also similar.
  11. Don't forget that variables can be declared locally inside of loops and "if" statements with braces. That will make your code cleaner and even allows you to reuse the same variable name with a different type. Self-sufficient "if" statements are easily converted to functions by cutting and pasting the code.
  12. Since the document and the dictionary have capitalized words, spellchecking is NOT case sensitive, as described above. Basically this means that if you find a misspelled word, convert all its characters to lowercase and check again. Only if this check also fails is the word truly misspelled. Use the tolower() function to change the case of a character (there is no function to change an entire word all at once). To use tolower(), #include <cctype>.
  13. Well structured code and classes will handle empty lines, pages, documents and dictionaries with little or no extra coding.
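Several of the tips above (checking for EOF, passing streams by reference, and lowercasing a word with tolower()) can be sketched in a few small functions. ToLower, SplitWords and ReadLines are hypothetical helper names, and treating a word as any run of non-whitespace characters is an assumption about the text file format.

```cpp
#include <cctype>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Returns a lowercase copy of word, converting one character at a time.
std::string ToLower(const std::string& word)
{
    std::string result = word;
    for (std::string::size_type i = 0; i < result.size(); ++i)
    {
        // The cast to unsigned char keeps tolower() well defined
        // for characters with negative values.
        result[i] = static_cast<char>(
            std::tolower(static_cast<unsigned char>(result[i])));
    }
    return result;
}

// Splits one line into words, assuming words are separated by whitespace.
std::vector<std::string> SplitWords(const std::string& line)
{
    std::vector<std::string> words;
    std::istringstream in(line);
    std::string word;
    while (in >> word)
    {
        words.push_back(word);
    }
    return words;
}

// Reads every line from a stream passed by reference. Because the loop
// tests the result of getline itself, the last line is never processed
// twice when the read at end of file fails.
std::vector<std::string> ReadLines(std::istream& in)
{
    std::vector<std::string> lines;
    std::string line;
    while (std::getline(in, line))
    {
        lines.push_back(line);
    }
    return lines;
}
```

For spellchecking, a word would first be looked up as written; only if that lookup fails would ToLower(word) be looked up as well.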

Project Design Assignment

Your project design document for project 3 must be named p3design.txt. Be sure to read the design specification carefully. Submit your design in the usual way:

submit cs202 Proj3 p3design.txt

Project Makefile

For this project, you will be responsible for providing your own makefile. Typing "make" should compile and link all files for your project. Your makefile should also support the commands "make clean" and "make cleanest". If you start with the makefile for project 2, the changes for project 3 are straightforward.


Grading

The grade for this project will be broken down as follows. A more detailed breakdown will be provided in the grade form you receive with your project grade.

85% - Correctness

This list may not be comprehensive, but everything on this list will be verified by the graders.

15% - Coding Standards

Your code adheres to the
CMSC 202 coding standards as discussed and reviewed in class.

Project Submission

For this project, you will create and submit the following files
  1. Proj3.cpp that contains main()
  2. Proj3Aux.cpp that contains any functions used by main()
  3. Proj3Aux.h that contains the prototypes for the functions found in Proj3Aux.cpp
  4. Document.h -- the Document class definition
  5. Document.cpp -- the Document class implementation
  6. Page.h -- the Page class definition
  7. Page.cpp -- the Page class implementation
  8. Line.h -- the Line of text class definition
  9. Line.cpp -- the Line of text class implementation
  10. Dictionary.h -- the Dictionary class definition
  11. Dictionary.cpp -- the Dictionary class implementation
  12. your makefile
To submit your files, use the usual command

submit cs202 Proj3 <list of files>

The order in which the files are listed doesn't matter and not all files must be submitted at the same time. However, you must make sure that all files necessary to compile your project (using the makefile) are submitted before the project deadline.

You can check to see what files you have submitted by typing

submitls cs202 Proj3

More complete documentation for submit and related commands can be found here.

Remember -- if you make any change to your program, no matter how insignificant it may seem, you should recompile and retest your program before submitting it. Even the smallest typo can cause compiler errors and a reduction in your grade.

Avoid unpleasant surprises!
Be sure to use the submitmake and submitrun utilities provided for you to compile, link and run your program after you've submitted it.


Last Modified: Saturday, 01-Nov-2003 13:05:41 EST