On Thu, Feb 28, 2013 at 04:09:11PM -0500, Tom Lane wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
> > A whole lot of those state transitions are attributable to states
> > which have separate transitions for each of many keywords.
>
> Yeah, that's no surprise.
>
> The idea that's been in the back of my mind for awhile is to try to
> solve the problem at the lexer level not the parser level: that is,
> have the lexer return IDENT whenever a keyword appears in a context
> where it should be interpreted as unreserved. You suggest somehow
> driving that off mid-rule actions, but that seems to me to be probably
> a nonstarter from a code complexity and maintainability standpoint.
>
> I believe however that it's possible to extract an idea of which
> tokens the parser believes it can see next at any given parse state.
> (I've seen code for this somewhere on the net, but am too lazy to go
> searching for it again right now.) So we could imagine a rule along
I believe tokenizing of typedefs requires the lexer to peek at the
parser state:
http://calculist.blogspot.com/2009/02/c-typedef-parsing-problem.html
The well-known "typedef problem" with parsing C is that the standard C
grammar is ambiguous unless the lexer distinguishes identifiers bound by
typedef and other identifiers as two separate lexical classes. This
means that the parser needs to feed scope information to the lexer
during parsing. One upshot is that lexing must be done concurrently
with parsing.
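
To make the feedback loop concrete, here is a rough Python sketch of the
idea, not code from any real C compiler or from our grammar. The names
TOKEN_RE, classify, and parse are made up for illustration, and the
"grammar" is a toy that only knows "typedef", "int", "*" and ";". The
point is just that the same spelling comes back as IDENT or TYPE_NAME
depending on state the parser has recorded so far.

```python
import re

# Hypothetical toy tokenizer: identifiers plus the two punctuators we need.
TOKEN_RE = re.compile(r"\s*([A-Za-z_]\w*|[*;])")


def classify(word, typedef_names):
    """Lexer side: pick a token class, consulting parser-owned state."""
    if word in ("typedef", "int"):
        return "KEYWORD"
    if word in ("*", ";"):
        return "PUNCT"
    # The feedback step: the class depends on what the parser has seen.
    return "TYPE_NAME" if word in typedef_names else "IDENT"


def parse(source):
    """Parser side: tokens are pulled one at a time, so each typedef
    is registered before the lexer classifies the following tokens."""
    typedef_names = set()   # scope information owned by the parser
    tokens = []             # every (class, text) pair, in order
    statement = []          # tokens of the statement being read
    pos = 0
    while True:
        match = TOKEN_RE.match(source, pos)
        if match is None:
            break
        pos = match.end()
        word = match.group(1)
        token = (classify(word, typedef_names), word)
        tokens.append(token)
        statement.append(token)
        if word == ";":
            # Toy rule: in "typedef <type> NAME ;" the declared name
            # is the identifier just before the semicolon.
            if (len(statement) >= 3
                    and statement[0][1] == "typedef"
                    and statement[-2][0] == "IDENT"):
                typedef_names.add(statement[-2][1])
            statement = []
    return tokens


if __name__ == "__main__":
    for token in parse("typedef int T; T * x; y * x;"):
        print(token)
```

In this sketch "T * x;" comes out starting with a TYPE_NAME (so it reads
as a pointer declaration), while "y * x;" starts with an IDENT (so it
reads as a multiplication). Lexing the whole input up front would not
work, because the first "T" has to be classified before the typedef is
registered and the second one after.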
-- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB
http://enterprisedb.com
+ It's impossible for everything to be true. +