The Generalized LR parsing algorithm (some call it "Tomita’s algorithm") was originally developed in 1985 as a part of my Ph.D thesis at Carnegie Mellon University. When I was a graduate student at CMU, I tried to build a couple of natural language systems based on existing parsing methods. Their parsing speed, however, always bothered me. I sometimes wondered whether it was ever possible to build a natural language parser that could parse reasonably long sentences in a reasonable time without help from large mainframe machines. At the same time, I was always amazed by the speed of programming language compilers, because they can parse very long sentences (i.e., programs) very quickly even on workstations. There are two reasons. First, programming languages are considerably simpler than natural languages. And secondly, they have very efficient parsing methods, most notably LR. The LR parsing algorithm first precompiles a grammar into an LR parsing table, and at the actual parsing time, it performs shift-reduce parsing guided deterministically by the parsing table. So, the key to the LR efficiency is the grammar precompilation; something that had never been tried for natural languages in 1985. Of course, there was a good reason why LR had never been applied for natural languages; it was simply impossible. If your context-free grammar is sufficiently more complex than programming languages, its LR parsing table will have multiple actions, and deterministic parsing will be no longer possible.
Masaru Tomita
Generalized LR Parsing [PDF ebook]
Generalized LR Parsing [PDF ebook]
Cumpărați această carte electronică și primiți încă 1 GRATUIT!
Limba Engleză ● Format PDF ● ISBN 9781461540342 ● Editor Masaru Tomita ● Editura Springer US ● Publicat 2012 ● Descărcabil 3 ori ● Valută EUR ● ID 4602132 ● Protecție împotriva copiilor Adobe DRM
Necesită un cititor de ebook capabil de DRM