Thread: Yacc / Bison difficulties

Yacc / Bison difficulties

From

Mark Butler

Date:

11 April 2001, 21:55:16

I was trying to make a minor change today to the gram.y file to make
PostgreSQL recognize "DOUBLE" as a data type the way DB2 does.  I ran into
reduce / reduce conflicts using both of the methods I tried.  

Having fought extensively with  Bison before on a SQL oriented language
translation project, I am amazed that you were able to get a grammar as
complex as PostgreSQL to work without major difficulty.

I was wondering about what the sense of the list would be to someday accepting
a rewrite using a hand-coded LL(k) recursive descent parser.  Anyone?
- Mark Butler

Re: Yacc / Bison difficulties

From

Bruce Momjian

Date:

11 April 2001, 23:31:05

> 
> I was trying to make a minor change today to the gram.y file to make
> PostgreSQL recognize "DOUBLE" as a data type the way DB2 does.  I ran into
> reduce / reduce conflicts using both of the methods I tried.  
> 
> Having fought extensively with  Bison before on a SQL oriented language
> translation project, I am amazed that you were able to get a grammar as
> complex as PostgreSQL to work without major difficulty.
> 
> I was wondering about what the sense of the list would be to someday accepting
> a rewrite using a hand-coded LL(k) recursive descent parser.  Anyone?

Interesting.  What advantages would there be?

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026

Re: Hand written parsers

From

Mark Butler

Date:

12 April 2001, 00:55:09

Bruce Momjian wrote:

> Interesting.  What advantages would there be?

As any one who has ever attempted to build a C++ parser using Yacc or Bison
can attest, it is very difficult to get an LALR based parser to correctly
parse a sophisticated grammar.  The advantages of using a hand written
recursive descent parser lie in four areas:

1) ease of implementing grammar changes 
2) ease of debugging
3) ability to handle unusual cases
4) ability to support context sensitive grammars

Context sensitivity is useful for handling things like embedded programming
languages without having to escape whole procedures as string literals, for
example.  We could support procedural language plugins without the current
limitations on syntax.

Another nice capability is the ability to enable and disable grammar rules at
run time - you could add run time options to disable all non SQL-92 grammar
rules for application portability testing or emulate Oracle's outer join
syntax, as a couple of examples.
- Mark Butler

Re: Re: Hand written parsers

From

Ian Lance Taylor

Date:

12 April 2001, 01:45:25

Mark Butler <butlerm@middle.net> writes:

> Bruce Momjian wrote:
> 
> > Interesting.  What advantages would there be?
> 
> As any one who has ever attempted to build a C++ parser using Yacc or Bison
> can attest, it is very difficult to get an LALR based parser to correctly
> parse a sophisticated grammar.  The advantages of using a hand written
> recursive descent parser lie in four areas:
> 
> 1) ease of implementing grammar changes 
> 2) ease of debugging
> 3) ability to handle unusual cases
> 4) ability to support context sensitive grammars
> 
> Context sensitivity is useful for handling things like embedded programming
> languages without having to escape whole procedures as string literals, for
> example.  We could support procedural language plugins without the current
> limitations on syntax.
> 
> Another nice capability is the ability to enable and disable grammar rules at
> run time - you could add run time options to disable all non SQL-92 grammar
> rules for application portability testing or emulate Oracle's outer join
> syntax, as a couple of examples.

On the other hand, recursive descent parsers tend to be more ad hoc,
they tend to be harder to maintain, and they tend to be less
efficient.

I believe that yacc based parsers support context sensitivity just as
well as recursive descent parsers.

I'm not sure that C++ is a fair example.  The problem with parsing C++
is the result of a convoluted declaration syntax, so much so that some
constructs are completely syntactically ambiguous no matter how you
look at them and must be decided by semantic rules.  It's true that in
such a case recursive descent is easier, because it is easier to mix
semantic rules and syntax rules.  But the SQL syntax is much simpler;
is this really a problem with SQL?  And I note that despite the
difficulties, the g++ parser is yacc based.

Ian

---------------------------(end of broadcast)---------------------------
TIP 462: A mushroom cloud has no silver lining.

Re: Re: Hand written parsers

From

ncm@zembu.com (Nathan Myers)

Date:

12 April 2001, 04:50:16

On Wed, Apr 11, 2001 at 10:44:59PM -0700, Ian Lance Taylor wrote:
> Mark Butler <butlerm@middle.net> writes:
> > ...
> > The advantages of using a hand written recursive descent parser lie in
> > 1) ease of implementing grammar changes 
> > 2) ease of debugging
> > 3) ability to handle unusual cases
> > 4) ability to support context sensitive grammars
> > ...
> > Another nice capability is the ability to enable and disable grammar
> > rules at run time ...
>
> On the other hand, recursive descent parsers tend to be more ad hoc,
> they tend to be harder to maintain, and they tend to be less
> efficient.  ...  And I note that despite the
> difficulties, the g++ parser is yacc based.

Yacc and yacc-like programs are most useful when the target grammar (or 
your understanding of it) is not very stable.  With Yacc you can make 
sweeping changes much more easily; big changes can be a lot of work in 
a hand-coded parser.  Once your grammar stabilizes, though, hand coding 
can provide flexibility that is inconceivable in a parser generator, 
albeit at some cost in speed and compact description.  (I doubt parser 
speed is an issue for PG.)

G++ has flirted seriously with switching to a recursive-descent parser,
largely to be able to offer meaningful error messages and to recover
better from errors, as well as to be able to parse some problematic
but conformant (if unlikely) programs.

Note that the choice is not just between Yacc and a hand-coded parser.
Since Yacc, many more powerful parser generators have been released,
one of which might be just right for PG.

Nathan Myers
ncm@zembu.com

Re: Yacc / Bison difficulties

From

Peter Eisentraut

Date:

12 April 2001, 14:12:17

Mark Butler writes:

> I was trying to make a minor change today to the gram.y file to make
> PostgreSQL recognize "DOUBLE" as a data type the way DB2 does.  I ran into
> reduce / reduce conflicts using both of the methods I tried.

See attached patch.

-- 
Peter Eisentraut      peter_e@gmx.net       http://yi.org/peter-e/

From

"Howard Williams"

Date:

12 April 2001, 14:35:47

set nomail

Re: Yacc / Bison difficulties

From

Mark Butler

Date:

12 April 2001, 15:17:02

Thanks. I didn't realize the need to move the DOUBLE token from the TokenId to
the ColId production.  Will this patch be integrated into the head branch?
- Mark Butler

Peter Eisentraut wrote:
> 
> Mark Butler writes:
> 
> > I was trying to make a minor change today to the gram.y file to make
> > PostgreSQL recognize "DOUBLE" as a data type the way DB2 does.  I ran into
> > reduce / reduce conflicts using both of the methods I tried.
> 
> See attached patch.

Re: Yacc / Bison difficulties

From

Peter Eisentraut

Date:

12 April 2001, 18:20:28

Mark Butler writes:

> Thanks. I didn't realize the need to move the DOUBLE token from the TokenId to
> the ColId production.  Will this patch be integrated into the head branch?

Not sure.  It's not a standard type, but at least two other RDBMS have it
and the name does make sense.  Any comments?

>
>  - Mark Butler
>
> Peter Eisentraut wrote:
> >
> > Mark Butler writes:
> >
> > > I was trying to make a minor change today to the gram.y file to make
> > > PostgreSQL recognize "DOUBLE" as a data type the way DB2 does.  I ran into
> > > reduce / reduce conflicts using both of the methods I tried.
> >
> > See attached patch.
>
>

-- 
Peter Eisentraut      peter_e@gmx.net       http://yi.org/peter-e/

Re: Yacc / Bison difficulties

From

Bruce Momjian

Date:

12 April 2001, 21:19:03

> Mark Butler writes:
> 
> > Thanks. I didn't realize the need to move the DOUBLE token from the TokenId to
> > the ColId production.  Will this patch be integrated into the head branch?
> 
> Not sure.  It's not a standard type, but at least two other RDBMS have it
> and the name does make sense.  Any comments?

That's a tough call.  We already have some duplicate type symbols, but
this is not a standard SQL type.  I would see if we can get others to
say it is a good idea.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026

Re: DOUBLE synonym for DOUBLE PRECISION

From

Mark Butler

Date:

13 April 2001, 01:25:17

Bruce Momjian wrote:

> That's a tough call.  We already have some duplicate type symbols, but
> this is not a standard SQL type.  I would see if we can get others to
> say it is a good idea.

But the DOUBLE keyword is already reserved by ANSI for use in the "DOUBLE
PRECISION" type, so a "DOUBLE" synonym for it shouldn't make much of a
difference. Right?
- Mark Butler

Re: Yacc / Bison difficulties

From

Tom Lane

Date:

13 April 2001, 02:35:52

Peter Eisentraut <peter_e@gmx.net> writes:
> Mark Butler writes:
>> Thanks. I didn't realize the need to move the DOUBLE token from the TokenId to
>> the ColId production.  Will this patch be integrated into the head branch?

> Not sure.  It's not a standard type, but at least two other RDBMS have it
> and the name does make sense.  Any comments?

Seems like a reasonable change (for 7.2, not now).
        regards, tom lane

Re: Re: Hand written parsers

From

Tom Lane

Date:

13 April 2001, 07:08:32

ncm@zembu.com (Nathan Myers) writes:
> Yacc and yacc-like programs are most useful when the target grammar (or 
> your understanding of it) is not very stable.  With Yacc you can make 
> sweeping changes much more easily; big changes can be a lot of work in 
> a hand-coded parser.

And, in fact, this is precisely the killer reason why we will not switch
to a handwritten parser anytime in the foreseeable future.  Postgres'
grammar is NOT stable.  Compare the gram.y files for any two recent
releases.  I foresee changes at least as large in upcoming releases,
btw, as we implement more of SQL92/99 and drop ancient PostQuel-isms.

I have some interest in proposals to switch to a better parser-generator
tool than yacc ...  but yacc has the advantages of being widely
available and widely understood.  You'd need a pretty significant
improvement over yacc to make it worth switching.
        regards, tom lane

Re: Re: Hand written parsers

From

Bruce Guenter

Date:

15 April 2001, 09:35:54

On Fri, Apr 13, 2001 at 03:12:57AM -0400, Tom Lane wrote:
> I have some interest in proposals to switch to a better parser-generator
> tool than yacc ...

There are tools to produce efficient top-down parsers as well.  Those
doing such parsers "by hand" may be interested in looking at PCCTS
(http://www.polhode.com/pccts.html) or ANTLR (http://www.antlr.org/).
--
Bruce Guenter <bruceg@em.ca>                       http://em.ca/~bruceg/