Parsing with YACC
Volume Number: 6>replyPtr Issue Number: 4>replyPtr$ !Column Tag: Language Translation>replyPtrC *Introduction to Parsing and Inside YACC >replyPtrV &By Clifford Story, Mount Prospect, ILÄy U Note: Source code files accompanying article are located on MacTech CD-ROM or>replyPtr source code disks.>replyPtr Part I. Intoduction to Parsing>replyPtr A. Introduction>replyPtr$UThis is the first of a series of articles on language translation. This is commonly>replyPtr [called “parsing”, and I will no doubt lapse from time to time into this usage, but it is>replyPtr‚ 9better to reserve the word for one phase of translation.>replyPtr $QThe techniques I will discuss are used in compilers and interpreters, of course,>replyPtr˙ Qbut also in many more-common circumstances. For example, when you add a formula>replyPtr Lto a speadsheet, the application must translate the formula into a form it can use. >fixBtnName NMany database programs let you specify formats for input fields -- these must>fixBtnName Utranslate your format, and then use the result to translate the input. Consider the>replyPtr Pexpression search in MPW. I guarantee you there’s a translator operating there. And,>fixBtnName6 Mof course, my own program Idealiner uses a translator when you specify topic>replyPtr numbers.>replyPtr $PAll the code examples in this series of articles will be MPW tools, to minimize>fixBtnNameZ [interface considerations. I doubt if you’ll see a single Macintosh trap before the tenth>replyPtr Iarticle. I will focus instead on the internals of language translation.>replyPtr $KLanguage translation can be divided into three stages: first, the LEXICAL>replyPtr WANALYZER (also called the “scanner”) divides the input text into individual symbols,>replyPtr Qpieces like identifiers, operators and constants; then the PARSER interprets the>replyPtr Nsequence of symbols according to the language’s grammar; and lastly the CODE>fixBtnName NGENERATOR responds in an appropriate fashion to the statements the parser has>fixBtnName Vprocessed. These three phases operate simultaneously, the parser calling the lexical>fixBtnName Danalyzer for input, and then calling the code generator for output.>fixBtnName A(1). Parsing>replyPtr $;The first three parts in the series will focus on parsing.>replyPtr $MThe first part of the article (this very one) will introduce parsing and the>replyPtr Oparser-generator YACC (Yet Another Compiler Compiler). YACC converts a set of>replyPtr SGRAMMAR RULES into C source for a parser that will accept that grammar. You still>replyPtr Thave to write the lexical analyzer and the code generator (although YACC provides aapplication! Xframework for code generation) but YACC takes care of the most tedious part of the job.application-$NThe second part will go beneath the surface and explore the inner workings ofapplication9 VYACC. This will be somewhat technical, and may seem unnecessary, but it will provideapplicationE Xan essential foundation for later topics. The article will cover deriving parse tablesapplicationQ Tfrom YACC output, parsing expressions by hand (using the parse tables), and YACC’sapplication] debug output.applicationi$PThe next article will discuss ambiguities, association and error detection. Anapplicationu Uunderstanding of the first two topics are essential to writing correct and efficient>replyPtr Qgrammars. The third covers detecting and reporting errors by type and location.>replyPtr A(2). Lexical Analysisapplication $RWith the third article, I will return to the first stage of language translation,application Tlexical analysis. The article will begin w
scussion of state machines, thenapplication 6present a simple but effective table-driven analyzer.application $PThe fourth article will be an excursion into a seemingly unrelated field: hashapplication‹ Ytables and binary trees. The idea is to develop some tools to increase the power of the>replyPtrË +lexical analyzer in the following article.>replyPtrÙ$SThe fifth article will extend the analyzer by adding a symbol table. The routines>replyPtrcustomGetFileWdeveloped in the fifth article will give us a way to save symbols; the example program>replyPtr 3will be an improved version of the MPW Canon tool.>replyPtr&