What's In a Design Document?
- Use emacs to create a design1.txt file in your
projects/proj1 directory.
- For each of the files you will have (in this case proj1.py, stack.py,
and queue.py), the design file must contain:
- a complete file header comment
- a function header comment for each function you plan to have in
that file
- the function headers for each function, showing the parameter names
- The portion of the design document that deals with the proj1.py file
must also include:
- all of the constants you plan to use.
- either paragraphs or a pseudocode outline that explains how your
project will work. This should be in sufficient detail that I
could draw a design diagram from your explanation.
- Here's an example design document written for
your HW8.
- Do NOT submit design diagrams. The design document must be a
plain text document.
- The design document is a great starting point for your proj1.py file.
Just copy it, change the function headers to stubs, and change the
pseudocode outline to the code for main(). The only documentation to
be added are the in-line comments.
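As a rough illustration of the copy-and-stub step, the sketch below shows what
two design entries might look like once their function headers have been turned
into stubs. The header comment layout, the function checkBalance(), and its
parameter are hypothetical and are not a required format; only printGreeting()
and main() are named in this assignment.

```python
# printGreeting() prints a short description of the program.
# Input:  none
# Output: none (prints to the screen)
def printGreeting():
    pass    # stub: body to be written later


# checkBalance() is a hypothetical function name used for illustration.
# Input:  tags, a queue of tag strings
# Output: True if the tags are balanced, False otherwise
def checkBalance(tags):
    return True    # stub: placeholder return value


def main():
    # Pseudocode outline copied from the design, one comment per step:
    #   greet the user
    printGreeting()
    #   check the tags (no tags yet, so pass an empty queue)
    balanced = checkBalance([])
    return balanced


main()
```

A file like this runs from the start, so each stub can be replaced with real
code and tested one function at a time.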
Background
XHTML
Hypertext Markup Language (HTML) is a notation used to describe how to
display the contents of web pages and to focus on how the page looks.
Extensible Markup Language (XML) is a notation used to describe data on
web pages and to focus on what the data mean. A fairly recent HTML standard
is XHTML, which combines both HTML and XML. Web browsers read HTML or XHTML
to determine how web pages should be displayed. Other software can be used
to read the same webpages and interpret the information contained there,
if XML tags that program recognizes have been used in the document. HTML
tags are enclosed in angle brackets (<>). In XHTML, tags generally
appear in start-tag, end-tag combinations.
A start tag has the form: <name attributes>. The matching end tag
contains that name preceded by a "/". For example, a paragraph of text
might be formatted as follows:
<p>This is a paragraph. </p>
XHTML also allows self-closing tags of the form: <name attributes />.
In a proper XHTML file, the tags will occur in properly nested pairs. Each
start tag is matched by a corresponding end tag, and one structure may be
embedded inside another, but they cannot overlap. This is like the
matching of curly braces in a Java or C program.
For example:
<p>...<ol>...</ol>...</p> is OK, but
<p>...<ol>...</p>...</ol> is not.
A self-closing tag acts as a self-contained start-end pair.
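The three kinds of tags described above can be told apart by looking at where
the "/" sits. The following is a minimal sketch, assuming each tag arrives as a
complete string that starts with < and ends with >; the function name
classifyTag() is hypothetical.

```python
def classifyTag(tag):
    """Return 'start', 'end', or 'self-closing' for a tag such as '<p>'."""
    inner = tag[1:-1].strip()       # drop the surrounding < and >
    if inner.startswith("/"):       # </name>
        return "end"
    if inner.endswith("/"):         # <name />
        return "self-closing"
    return "start"                  # <name>


for tag in ["<p>", "</p>", "<br />"]:
    print(tag, "is a", classifyTag(tag), "tag")
```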
The Task
You should become familiar with both HTML and XHTML first. You can use the
w3schools.com website for both the HTML and XHTML tutorials, or search
for other tutorials that you like better.
Your task is to write a program that checks HTML files (web pages) to
determine whether the embedded XHTML tags are balanced. Balanced tags are
necessary for the file to be a valid XHTML file as explained above.
Your program will read from the file and print an analysis of the file to
the screen. The sequence of tags in the file will be echoed to the output;
any other text in the file is to be ignored. If there is a tag balancing
error or the program reaches the end of the file while in the middle of a
tag, the program must quit and print an error message. If the end of file
is reached without any errors, print a message to that effect.
Program Requirements
- Your program must use command line arguments. At the command line the
user must enter (in this order):
- the name of the executable file,
- the name of the html file to be checked
- You must store the tags read in from the file in a queue and then use
a stack to check the balancing. Since you're required to use both a
queue and a stack, you MUST write the functions enqueue() and
dequeue() as well as push() and pop().
- A stack can be used to check for balancing by pushing each
start tag onto the stack. When an end tag is encountered, a start tag
is popped off the stack and the two are compared. If the tags are
matching start/end tags, processing continues; otherwise, the program
should issue an error message and stop.
- In addition to your proj1.py file, which contains main() and
printGreeting(), minimally, you must also have a queue.py file and
a stack.py file. The queue.py file should have generic queue code that
will work with queues of anything. The stack.py file can be the one
shown in the ADTs lecture or your own, but it should have generic
stack code that will work with stacks of anything. You will need to
import this code into your proj1.py file.
- You MUST close any file that you have opened.
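The command line and file-closing requirements might be handled along the
lines of the sketch below. It assumes the program is started with something
like "python proj1.py xhtml.dat", so that sys.argv holds the script name
followed by the data file name. The function readFile(), the argv parameter of
main(), and the wording of the messages are hypothetical.

```python
import sys


def readFile(filename):
    """Return the whole contents of filename, closing the file afterwards."""
    infile = open(filename, "r")
    try:
        return infile.read()
    finally:
        infile.close()      # runs even if read() raises an error


def main(argv):
    # argv[0] is the script name; argv[1] should be the html file to check
    if len(argv) != 2:
        print("Usage: python proj1.py <html file>")
        return 1
    text = readFile(argv[1])
    print("Read", len(text), "characters from", argv[1])
    return 0

# In proj1.py the last line would be a call such as: main(sys.argv)
```

Passing the argument list into main() is only one possible design; it makes
the argument check easy to try out without retyping the command line.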
The Phases
You must use both a stack and a queue to solve this problem. To divide up
the effort of this problem, consider the program as consisting of two
phases.
- The first phase takes care of reading the file and finding all of
the tags from the file. The first phase will print out the tags as
it finds them and also save them into a queue of tags. If the
input file ends in the middle of a tag, the program will report
the error and end.
- The second phase of the program takes the queue of tags generated
in phase one and analyzes the sequence of tags to make sure that
they are properly balanced. This phase will make use of the
queue and a stack, as described above. As the tags are matched,
you should print the matching tags. If the tag is self-closing,
you should print it and note that it is self-closing.
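The two phases could be sketched as follows. This is one possible outline
under stated assumptions, not the required solution: the queue and stack are
plain Python lists, the function names findTags() and checkBalance() and the
message wording are hypothetical, and everything between a < and the next > is
treated as a tag. In the project itself, enqueue() and dequeue() would live in
queue.py and push() and pop() in stack.py.

```python
def enqueue(queue, item):
    """Add item to the back of the queue."""
    queue.append(item)


def dequeue(queue):
    """Remove and return the item at the front of the queue."""
    return queue.pop(0)


def push(stack, item):
    """Put item on top of the stack."""
    stack.append(item)


def pop(stack):
    """Remove and return the item on top of the stack."""
    return stack.pop()


def findTags(text):
    """Phase one: echo each tag and return a queue of the tags found."""
    queue = []
    pos = text.find("<")
    while pos != -1:
        end = text.find(">", pos)
        if end == -1:
            raise ValueError("end of file reached in the middle of a tag")
        tag = text[pos:end + 1]
        print(tag)
        enqueue(queue, tag)
        pos = text.find("<", end + 1)
    return queue


def checkBalance(queue):
    """Phase two: return True if the queued tags are properly nested."""
    stack = []
    while len(queue) > 0:
        tag = dequeue(queue)
        name = tag[1:-1].strip().lower()    # text between < and >
        if name.endswith("/"):
            print(tag, "is self-closing")
        elif name.startswith("/"):
            if len(stack) == 0:
                print("Error:", tag, "has no matching start tag")
                return False
            start = pop(stack)
            if start[1:-1].strip().lower() != name[1:].strip():
                print("Error:", start, "does not match", tag)
                return False
            print(start, "matches", tag)
        else:
            push(stack, tag)
    if len(stack) > 0:
        print("Error: no end tag for", stack[-1])
        return False
    return True
```

For example, checkBalance(findTags("<p>...<ol>...</ol>...</p>")) returns True,
while the overlapping sequence "<p>...<ol>...</p>...</ol>" returns False
because the tag popped for </p> is <ol>.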
More Details
- Although XHTML tags may have attributes, your project does not
have to handle them. I guarantee there will be no attributes in
the tags in any of the files used to test your project.
- All tags used in this project will be 80 characters or fewer.
Although a tag pair may span lines, the beginning of a tag, the <
sign, and the ending of a tag, the > sign, will always be on the
same line.
- Although strict XHTML does not allow uppercase characters in tags,
transitional XHTML does allow this and your program should accept them.
- There will be no extraneous '<' or '>' signs mixed into the text. They will only
indicate the beginning and ending of tags.
- Similarly, we will NOT include a < followed by a < followed
by a > in any test file.
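One way to accept uppercase tags is to compare tag names in lowercase. The
helper names tagName() and tagsMatch() below are hypothetical, and the sketch
relies on the guarantee above that tags carry no attributes.

```python
def tagName(tag):
    """Return the lowercase name inside a tag, without <, >, /, or spaces."""
    return tag.strip("<>/ ").lower()


def tagsMatch(startTag, endTag):
    """Return True if the two tags have the same name, ignoring case."""
    return tagName(startTag) == tagName(endTag)


print(tagsMatch("<P>", "</p>"))
print(tagsMatch("<p>", "</ol>"))
```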
Data File & Sample Run
The data file used to create the sample output for this project is called
xhtml.dat. You may get a copy of this file from my pub
directory. Do NOT cut and paste from this link.
Here's the data file
This file doesn't really test your program very well; it's just to give
you an idea of what the output should look like. Your program should
properly handle files with mismatched tags, EOF before a complete tag, etc.,
as described above.
Here's the sample run
As always, your output need not look exactly like mine, but it should
contain all of the same information.
Submitting your work
You must be logged into your account and in the same directory
as the file you are trying to submit.
To submit your design, type the following at the Linux prompt:
submit cs201 Proj1 design1.txt
To submit your project, type the following at the Linux prompt:
submit cs201 Proj1 proj1.py queue.py stack.py
To verify that your file was submitted, you can execute the
following command at the Unix prompt. It will show all files that
you submitted in a format similar to the Unix 'ls' command.
submitls cs201 Proj1