Homework 2

Describing Syntax

out: 9/19, due: 9/28

You should prepare your assignment electronically as a single PDF file and submit it as hw2.pdf using the submit system before the end of Wednesday, September 28. If you do not have easy access to a document preparation system that can produce PDF output, you might try the free web-based Google Docs applications.

For this homework and future ones, we will accept homeworks up to three days late with a penalty of 10% for each day of lateness. This will allow us to get graded homework back sooner.

(1) Recognizing simple languages (30 points)

For each of the following grammars, briefly describe the language it defines in a sentence or two. Assume that the start symbol is S for each and that any symbol found only on the right-hand side of a production is a terminal symbol. We've done the first one for you as an example. Hint: if it's not obvious by inspection, try writing down sentences in the language until you can see the patterns emerging. If it is obvious by inspection, write down some sentences generated by the grammar to verify that they match your expectations and fit your English description. Please be as precise as possible in describing the language.

(example) 0 points.

S -> a S a
S -> b
Answer: This grammar defines the language consisting of strings of N a's (where N >= 0) followed by one b followed by N a's.
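
The hint above (write down sentences until a pattern emerges) can also be mechanized. The following is a minimal sketch in Python, applied only to the example grammar. The dictionary encoding, the function name sentences, and the max_len cutoff are assumptions made for illustration; they are not part of the assignment. The length-based pruning is only valid for grammars with no empty right-hand sides.

```python
from collections import deque

# The example grammar S -> a S a | b, with spaces dropped.
# Keys are nonterminals; every other character is a terminal.
EXAMPLE_GRAMMAR = {"S": ["aSa", "b"]}

def sentences(grammar, start="S", max_len=7):
    """Return all sentences of length <= max_len, shortest first."""
    found = set()
    seen = {start}
    queue = deque([start])
    while queue:
        form = queue.popleft()
        # Locate the leftmost nonterminal in the sentential form, if any.
        idx = next((i for i, ch in enumerate(form) if ch in grammar), None)
        if idx is None:
            found.add(form)  # only terminals remain, so this is a sentence
            continue
        for rhs in grammar[form[idx]]:
            new_form = form[:idx] + rhs + form[idx + 1:]
            # Forms never shrink when no right-hand side is empty,
            # so anything longer than max_len can be discarded.
            if len(new_form) <= max_len and new_form not in seen:
                seen.add(new_form)
                queue.append(new_form)
    return sorted(found, key=lambda s: (len(s), s))

print(sentences(EXAMPLE_GRAMMAR))
# prints ['b', 'aba', 'aabaa', 'aaabaaa']
```

The printed sentences agree with the description in the answer: N a's, one b, then N a's.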

(1a) 10 points.

S -> a X
X -> S b
X -> b

(1b) 10 points.

S -> A B C
A -> a
A -> aA
B -> Bb
B -> bc
C -> cCc
C -> cc

(1c) 10 points.

S -> A B C D
A -> a | aA
B -> Bb | b
C -> bC
C -> b
D -> a
D -> aD

(2) Derivation and parse tree (55 points)

prog -> assign | expr
assign -> id = expr
id -> A | B | C
expr -> expr + term | expr - term | term
term -> factor | factor * term
factor -> ( expr ) | id | 0 | 1 | 2 | 3

(a) What is the associativity of the * operator? (5 points)

(b) What is the associativity of the + operator? (5 points)

(c) Do the two operators have the same precedence, does the * operator have greater precedence than +, or does + have greater precedence than *? (5 points)

(d) Using this grammar, show a leftmost derivation and a parse tree for the strings in d.1 and d.2. Show the parse tree in two forms: as an indented tree and as a bracketed expression. For the bracketed tree notation, a leaf is represented as its symbol and a non-leaf node is represented as an open bracket, the symbol for the node, a sequence of bracketed representations of the node's children, and a close bracket. We've done the first one for you as an example of the desired output format. (20 points)

(d.0) 1 + 2

prog => expr
=> expr + term
=> term + term
=> factor + term
=> 1 + term
=> 1 + factor
=> 1 + 2

prog            # note we corrected a problem
..expr # in this example 9/20
....expr # you can see a graphical version
......term # here.
........factor
..........1
....+
....term
......factor
........2

[prog [expr [expr [term [factor 1]]] + [term [factor 2]]]]
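
The two output formats are mechanical renderings of the same tree. The following is a minimal sketch in Python that produces both from the d.0 example. The tuple encoding of nodes and the function names indented and bracketed are assumptions chosen for illustration, not something the assignment requires.

```python
# A non-leaf node is a (symbol, children) tuple; a leaf is a plain string.
TREE = ("prog", [
    ("expr", [
        ("expr", [("term", [("factor", ["1"])])]),
        "+",
        ("term", [("factor", ["2"])]),
    ]),
])

def indented(node, depth=0):
    """Return one line per symbol, prefixed by two dots per level of depth."""
    prefix = ".." * depth
    if isinstance(node, str):
        return [prefix + node]
    symbol, children = node
    lines = [prefix + symbol]
    for child in children:
        lines.extend(indented(child, depth + 1))
    return lines

def bracketed(node):
    """Return the bracketed form: a leaf is its symbol, a node is [symbol children...]."""
    if isinstance(node, str):
        return node
    symbol, children = node
    return "[" + " ".join([symbol] + [bracketed(c) for c in children]) + "]"

print("\n".join(indented(TREE)))
print(bracketed(TREE))
# last line printed: [prog [expr [expr [term [factor 1]]] + [term [factor 2]]]]
```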

(d.1) A + B * 2

(d.2) A = (3 + B) * C

(e) Modify the grammar to add two new operators as follows. (20 points)

  • A unary minus operator (-) that has precedence higher than either * or +.
  • A binary exponentiation operator (**) that has precedence higher than unary minus and is right associative.
For example, the string "1 - 4 * - 3 ** - 2" would be interpreted as 1 - (4 * (-(3 ** (-2)))) which would evaluate to 1.44444444... Note that to make this example work, we have modified the grammar to add a binary - operator. This does not affect how you should modify the grammar to include a unary minus operator and an exponentiation operator.
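
As a sanity check on the intended interpretation, Python's own expression syntax happens to give these operators the same relative precedence and associativity (** binds tighter than a unary minus on its left and is right associative). The following sketch relies on that coincidence; it only checks the arithmetic and says nothing about how the modified grammar should be written.

```python
# The unparenthesized string from the example, written as a Python expression.
implicit = 1 - 4 * - 3 ** - 2

# The fully parenthesized interpretation given in the text.
explicit = 1 - (4 * (-(3 ** (-2))))

print(implicit == explicit)  # both groupings evaluate identically
print(explicit)              # approximately 1.4444

# Right associativity of **: 2 ** 3 ** 2 groups as 2 ** (3 ** 2).
print(2 ** 3 ** 2 == 2 ** (3 ** 2))
```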

(3) EBNF to BNF (15 points)

The following EBNF grammar defines a language of signed decimal numbers. In this notation, inline alternatives can be delimited by parentheses and separated by vertical bars, optional elements are in square brackets, and a sequence in curly braces can be repeated any number of times (including zero) or, if immediately followed by a + symbol, one or more times.

S -> [(-|+)] {D}+ [ . {D}+ ] 
D -> 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

Give a regular BNF grammar for this language, i.e., one that does not use the extra notation for inline alternatives, optional elements, or repetitions.
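
One way to test a candidate BNF grammar is to compare the strings it generates against an independent membership check for the EBNF language. The following is a minimal sketch in Python using a regular expression that mirrors the EBNF rule for S (optional sign, one or more digits, optional fraction with one or more digits). The names SIGNED_DECIMAL and is_signed_decimal are hypothetical helpers, and the sample strings are chosen only for illustration.

```python
import re

# [(-|+)] {D}+ [ . {D}+ ] written as a regular expression over ASCII digits.
SIGNED_DECIMAL = re.compile(r"[-+]?[0-9]+(\.[0-9]+)?")

def is_signed_decimal(text):
    """Return True if the whole string is in the EBNF language."""
    return SIGNED_DECIMAL.fullmatch(text) is not None

for sample in ["42", "-3.14", "+0.5", ".5", "5.", "", "1.2.3", "--1"]:
    print(repr(sample), is_signed_decimal(sample))
```

The first three samples are accepted and the rest are rejected, since the grammar requires at least one digit on each side of the decimal point and allows at most one sign.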