Re: Parser Cruft in gram.y - Mailing list pgsql-hackers

From Robert Haas
Subject Re: Parser Cruft in gram.y
Msg-id CA+TgmoYORJbk+bKs=n834u2bQncQ=gnrKNGtWYLpkePJrowekA@mail.gmail.com
In response to Re: Parser Cruft in gram.y  (Dimitri Fontaine <dimitri@2ndQuadrant.fr>)
List pgsql-hackers
On Tue, Dec 18, 2012 at 4:33 AM, Dimitri Fontaine
<dimitri@2ndquadrant.fr> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> And on the other hand, if you could get a clean split between the two
>> grammars, then regardless of exactly what the split was, it might seem
>> a win.  But it seemed to me when I looked at this that you'd have to
>> duplicate a lot of stuff and the small parser still wouldn't end up
>> being very small, which I found hard to get excited about.
>
> I think the goal is not so much about getting a much smaller parser, but
> more about have a separate parser that you don't care about the "bloat"
> of, so that you can improve DDL without fearing about main parser
> performance regressions.

Well that would be nice, but the problem is that I see no way to
implement it.  If, with a unified parser, the parser is 14% of our
source code, then splitting it in two will probably crank that number
up well over 20%, because there will be duplication between the two.
That seems double-plus un-good.

I can't help but suspect that the way we handle keywords today is
monumentally inefficient.  The unreserved_keyword productions, et al.,
just seem somehow badly wrong-headed.  We take the trouble to
distinguish all of those cases so that we can turn around and not
distinguish them.  I feel like there ought to be some way to use lexer
states to handle this - if we're in a context where an unreserved
keyword will be treated as an IDENT, then have the lexer return IDENT
when it sees an unreserved keyword.  I might be wrong, but it seems
like that would eliminate a whole lot of parser state transitions.
However, even if I'm right, I have no idea how to implement it.  It
just seems very wasteful that we have so many parser states that have
no purpose other than (effectively) to convert an unreserved_keyword
into an IDENT when the lexer could do the same thing much more cheaply
given a bit more context.
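
To make the idea concrete, here is a minimal sketch of a context-sensitive
lexer, written in Python purely for illustration.  Everything in it is an
assumption: the names (Tok, classify, lex, ident_context) are hypothetical,
the keyword sets are a tiny illustrative sample, and none of this reflects
how the actual scanner or gram.y is structured.  It only shows the lexer
folding an unreserved keyword into IDENT when told that the context allows
it, so the grammar would never need a rule to do that conversion.

```python
import re
from enum import Enum, auto


class Tok(Enum):
    IDENT = auto()
    KEYWORD = auto()


# Illustrative sample only; not the real keyword list.
UNRESERVED = {"abort", "begin", "commit"}
RESERVED = {"select", "from", "where"}

WORD = re.compile(r"[A-Za-z_][A-Za-z_0-9]*")


def classify(word, ident_context):
    """Return (token type, lower-cased text) for one word.

    ident_context is a hypothetical flag meaning "an unreserved keyword
    here would be treated as a plain identifier anyway".
    """
    lower = word.lower()
    if lower in RESERVED:
        # Reserved keywords are never folded into IDENT.
        return (Tok.KEYWORD, lower)
    if lower in UNRESERVED and not ident_context:
        # The keyword meaning is only kept where the context needs it.
        return (Tok.KEYWORD, lower)
    return (Tok.IDENT, lower)


def lex(text, ident_context):
    """Tokenize every word in text under a single, fixed context flag."""
    return [classify(m.group(0), ident_context) for m in WORD.finditer(text)]
```

The sketch sidesteps the hard part, which is the one I don't know how to
solve: here the caller simply passes ident_context in, whereas a real
lexer would have to track that context itself (flex start conditions
being one candidate mechanism) and stay in sync with the parser.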

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
