I’m working on a software project (more soon) that involves a notation that is interpreted by computers. As a way of specifying the language formally, I’m trying out parsing expression grammars, a relatively new alternative to the methods that have been traditionally used to define the syntax of programming languages, like context-free grammars. I’ve been reading the original paper in which Bryan Ford introduces PEGs, and something struck me about the way in which it builds up to the mathematical definition of the idea. The paper begins with an “informal” explanation that starts with an example of a PEG written in ASCII text, like you would use as the input to a program:
# Hierarchical syntax Grammar <- Spacing Definition+ EndOfFile Definition <- Identifier LEFTARROW Expression Expression <- Sequence (SLASH Sequence)* Sequence <- Prefix* Prefix <- (AND / NOT)? Suffix Suffix <- Primary (QUESTION / STAR / PLUS)? Primary <- Identifier !LEFTARROW / OPEN Expression CLOSE / Literal / Class / DOT[…]
Although the paper explains what it means in a very prosaic way, placing it in historical context and comparing PEG’s practical implications with those of other types of grammar, this bit of ASCII text seems intuitively like the most formal thing in the paper. The mathematical definition of the construct is set off much less from the text of the article than the ASCII example, which is in a fixed-width font and embedded as “Figure 1.” The definition begins:
Definition: A parsing expression grammar (PEG) is a 4-tuple G=(VN, VT, R, eS), where VN is a finite set of nonterminal symbols, VT is a finite set of terminal symbols, R is a finite set of rules, eS is a parsing expression termed the start expression, and VN ∩ VT = ∅. Each rule r ∈ R is a pair (A, e), which we write A ← e, where A ∈ VN and e is a parsing expression. For any nonterminal A, there is exactly one e such that A ← e ∈ R. R is therefore a function from nonterminals to expressions, and we write R(A) to denote the unique expression e such that A ← e ∈ R.
One reason why the informal explanation is informal in comparison with this is that it describes the syntax of PEGs using a PEG, making it a circular definition. But two things jump out at me.
- The mathematical notations in the formal definition are interpolated into a paragraph of written English, while the informal definition describes the syntax of the system in a way that a computer could understand.
- It would be much harder to see what the formal definition is doing without reading the informal one first. If the paper had started talking about 4-tuples right off the bat, it would be unclear in what sense the objects it defines could be considered “rules” and “parsing expressions.” There is something, a sort of mathematical anamnesis, that the reader takes away from the circular definition at the beginning of the paper that makes it possible to see the meaning of the more rigorous math that follows, in a sense of the word “meaning” that is not yet clear to me.