Thread: Re: [GENERAL] A rare error

Re: [GENERAL] A rare error

From
"Kevin O'Gorman"
Date:
pgsql-hackers-owner@hub.org wrote:

I've been looking into this.  I thought it would be easy, but it
doesn't want an easy fix, because it's worse than it at first
appeared.

Were you aware that this is legal: (select avg(a),b from dummy group by b) order by b;
but this is not: (select avg(a),b from dummy) group by b order by b;
and if we just allowed lots of nested parens, we would allow: ((((select avg(a),b from dummy group by b)))) order by
b;
which just seems silly.

I for one don't like any of these, although I prefer the
one that is currently disallowed -- and it appears to me
that the parens being allowed in select_clause are being
introduced at the wrong level.  I'm going to experiment
some with the concept of a "select_term" which will be the
thing that can be parenthesized (in some contexts).

BTW: yacc accepts LALR grammars, which are fairly restricted.
Thus the shift/reduce complaints and such don't mean it's
ambiguous, just that it's pushing the envelope of the LALR
paradigm.  A lot of yacc grammars do just that, and work
just fine, but of course you have to know what you're doing.
I don't, at least not to that level of wizardry, so I'll stay
away from shift/reduce, etc.  I mention it just to say that
we're not treading on ambiguity here, just the limits of yacc.

Finally, I hereby solicit comments.  Because of yacc's limits,
it may turn out to be difficult to do things like have
unlimited parens around a "select_term" and also have
constructs like     NOT IN (select_term)
because yacc might not know how to parse the outer parens.
OTOH, maybe we don't want NOT IN (((SELECT foo FROM bar))).
OTOOH, maybe we do because there could be program-generated
SQL out there that would like that freedom.  What do the
readers think?

I don't know yet where the problems could be.  I may need
help figuring out what's important to provide and what is
less so.  Does anyone know if there are a lot of parens in
the regression tests?

++ kevin


> 
> Subject: Re: [GENERAL] A rare error
> Date: Wed, 25 Oct 2000 12:28:35 -0400
> From: Tom Lane <tgl@sss.pgh.pa.us>
> To: DaVinci <bombadil@wanadoo.es>
> CC: Lista PostgreSql <pgsql-general@postgresql.org>,
>      pgsql-hackers@postgresql.org
> References: <20001025104758.A7643@fangorn.net>
> 
> DaVinci <bombadil@wanadoo.es> writes:
> > An extrange behavior with PostgreSql 7.0.2:
> >       select * from foo where exists
> >       (select * from foo)
> > works fine. But:
> >       select * from foo where exists
> >       ((select * from foo))
> > shows an error:
> >       ERROR: parser: parse error at or near "("
> > Is this a bug?
> 
> I was fooling around with exactly that point a couple weeks ago.  You'd
> think it would be easy to allow extra parentheses around a sub-select,
> but I couldn't figure out any way to do it that didn't provoke shift/
> reduce conflicts or worse.
> 
> The main problem is that if parentheses are both part of the expression
> grammar (as they'd better be ;-)) and part of the SELECT grammar then
> for a construct like
>         select (((select count(foo) from bar)));
> it's ambiguous whether the extra parens are expression parens or part
> of the inner SELECT statement.  You may not care, but yacc does: it does
> not like ambiguous grammars.  AFAICS the only solution is not to allow
> parentheses at the very top level of a SELECT structure.  Then the above
> is not ambiguous because all the extra parens are expression parens.
> 
> This solution leads directly to your complaint: the syntax is
>         EXISTS ( SELECT ... )
> and you don't get to insert any unnecessary levels of parenthesis.
> 
> We could maybe hack something for EXISTS in particular (since we know
> a parenthesized SELECT must follow it) but in the general case there
> doesn't seem to be a way to make it work.  For example, in current
> sources this is OK:
>         select * from foo where exists
>         ((select * from foo) union (select * from bar));
> but not this:
>         select * from foo where exists
>         ((select * from foo) union ((select * from bar)));
>         ERROR:  parser: parse error at or near ")"
> 
> If there are any yacc hackers out there who think they can improve on
> this, please grab gram.y from current CVS and have at it.  It'd be nice
> not to have an artificial restriction against redundant parentheses in
> SELECT structures.
> 
>                         regards, tom lane
> 
-- 
Kevin O'Gorman  (805) 650-6274  mailto:kogorman@pacbell.net
Permanent e-mail forwarder:  mailto:Kevin.O'Gorman.64@Alum.Dartmouth.org
At school: mailto:kogorman@cs.ucsb.edu
Web: http://www.cs.ucsb.edu/~kogorman/index.html
Web: http://trixie.kosman.via.ayuda.com/~kevin/index.html

"There is a freedom lying beyond circumstance,
derived from the direct intuition that life can
be grounded upon its absorption in what is
changeless amid change"   -- Alfred North Whitehead




Re: Re: [GENERAL] A rare error

From
Tom Lane
Date:
"Kevin O'Gorman" <kogorman@pacbell.net> writes:
> Were you aware that this is legal:
>   (select avg(a),b from dummy group by b) order by b;
> but this is not:
>   (select avg(a),b from dummy) group by b order by b;

The reason for that is that SQL doesn't think that "order by" should
be allowed in subqueries, only in a top-level SELECT.

That restriction makes sense in pure SQL, since tuple order is
explicitly *not* part of the computational model.  In the eyes of the
SQL spec, the only reason ORDER BY exists at all is for prettification
of final output.

However, once you add the LIMIT clause, queries likeSELECT * FROM foo ORDER BY bar LIMIT 1
suddenly become quite interesting and useful as subqueries
(this query gives you the whole row associated with the minimum
value of bar, which is something you can't easily get in pure SQL).

As the sources stand tonight, you can have such a query as a subquery,
but only if you hide the ORDER/LIMIT inside a view definition.  You'll
get a syntax error if you try to write it in-line as a subquery.
There is no longer any good implementation reason for that; it is
solely a grammar restriction.

So I'm coming around to the idea that we should abandon the SQL
restriction and allow ORDER + LIMIT in subqueries.  The trouble is
how to do it without confusing yacc.

> BTW: yacc accepts LALR grammars, which are fairly restricted.
> Thus the shift/reduce complaints and such don't mean it's
> ambiguous, just that it's pushing the envelope of the LALR
> paradigm.  A lot of yacc grammars do just that, and work
> just fine, but of course you have to know what you're doing.

Right.  Also, I believe it's possible that such a grammar will behave
differently depending on which yacc you process it with, which would be
bad.  (We have not yet taken the step of insisting that pgsql's grammar
is bison-only, and I don't want to.)  So ensuring that we get no shift/
reduce conflicts has been a shop rule around here all along.

Anyway, the bottom line of all this rambling is that if you can get
rid of the distinction between SelectStmt and select_clause altogether,
that would be fine with me.  You might consider looking at whether you
can write two nonterminals: a SELECT construct that has no outer parens,
and then an additional construct
subselect: SelectStmt | '(' subselect ')'

which would be used for all the sub-select nonterminals in SelectStmt
itself.

> OTOH, maybe we don't want NOT IN (((SELECT foo FROM bar))).

If we can't do that then we're still going to get complaints, I think.
The original bug report in this thread was specifically that the thing
didn't like redundant parentheses; we should try to remove that
restriction in all contexts not just some.
        regards, tom lane


Re: Re: [GENERAL] A rare error

From
Ian Lance Taylor
Date:
Date: Thu, 26 Oct 2000 20:49:22 -0400  From: Tom Lane <tgl@sss.pgh.pa.us>
  Right.  Also, I believe it's possible that such a grammar will behave  differently depending on which yacc you
processit with, which would be  bad.  (We have not yet taken the step of insisting that pgsql's grammar  is bison-only,
andI don't want to.)  So ensuring that we get no shift/  reduce conflicts has been a shop rule around here all along.
 

Actually, even the earliest version of yacc had very simple rules,
which are inherited by all versions.  In a shift/reduce conflict,
always shift.  In a reduce/reduce conflict, always reduce by the rule
which appears first in the grammar file.  shift/shift conflicts
indicate a grammer which is not LALR(1).

I'm pretty sure that all versions of yacc also support %left, %right,
and %nonassoc, which are simply techniques to eliminate shift/reduce
conflicts in arithmetic and other expressions.

I believe it is always possible to rewrite a grammer to eliminate all
conflicts.  But the rewrite can require an explosion in the number of
rules.

Reduce/reduce conflicts can be risky because it is easy to
accidentally change the ordering of the rules while editing.  But
shift/reduce conflicts are not risky.  The C parser in gcc, for
example, written and maintained by parser experts, has 53 shift/reduce
conflicts.

Ian