# Chapter 6: Syntax

### 6.1 Lexical Structure of Programming Languages

• Token categories
• reserved words (keywords)
• literals or constants
• special symbols
• identifiers
• principle of longest substring takes the longest possible substring of nonblank characters as a single token.
• Regular expression
• + mean one or more
• ? mean one or none
• * mean none or more

### 6.2 Context-Free Grammars and BNF’s

• Elements of CFG
• start symbol is nonterminal that stands for the entire, top-level phrase being defined.
• nonterminals are broken down into further phrase structures
• terminals cannot be broken down into further phrase structures
• productions are rules use for the derivation.
• CFG defines a language.

### 6.3 Parse Trees and Abstract Syntax Trees

• A Parse Tree is labeled by nonterminal at interior nodes and terminals at leaves. The structure of a parse tree is completely specified by the grammar rules of the language and a derivation of a particular sequence of terminals.
• Abstract syntax tree abstract the essential structure of the parse tree.
Example: arithmetic expression grammar
expr   → expr + expr | expr * expr | (expr) | number
number → number digit | digit
digit  → 0 | 1 | 2 | 3 | 4| 5 | 6 | 7 | 8 | 9

Example: parse tree for 234
number
/     \
number    digit
/     \      |
number  digit   4
|       |
digit     3
|
2

Example: abstract syntax tree for 234
4
/
3
/
2

### 6.4 Ambiguity, Associativity, and Precedence

• A grammar which two distinct parse trees are possible for the same string is ambiguous.
• left-recursive rule for an operation causes it to left-associate, right-recursive causes right-associate.
3 + 4 + 5

left-recursive/associate
expr → expr + term

expr
/ | \
expr  +  term
/ | \      |
expr  +  term  5
|        |
3        4

right-recursive/associate
expr → term + expr

expr
/ | \
term  +  expr
|       / | \
3   term  +  expr
|        |
4        5

### 6.5 EBNFs and Syntax Diagrams

• EBNF
• {} stands for “zero or more repetition of”
• [] indicate optional part of the structure

### 6.6 Parsing Techniques and Tools

• Bottom-up parser match an input with the right-hand sides of the grammar rules, when match occurs, the right-hand side is replaced by, or reduced to, the nonterminal on the left.
• Top-down parsers expand the nonterminal to match incoming tokens and directly construct a derivation.
• Recursive-decent parser turns the nonterminals into a group of mutually recursive procedures whose actions are based on the right-hand sides of the BNF’s.
• predictive parser requirement
• able to distinguish between choices in a grammar rule in term of First sets or the First sets of any two choices must not have any token in common (where First($\alpha$) to be the set of tokens that can begin the string $\alpha$).
• for any optional part, no token beginning the optional part an also come after the optional part.