Re: pg_get_viewdef 7.4 et al - Mailing list pgsql-hackers

From Tom Lane
Subject Re: pg_get_viewdef 7.4 et al
Date
Msg-id 6908.1049900408@sss.pgh.pa.us
Whole thread Raw
In response to pg_get_viewdef 7.4 et al  (Andreas Pflug <Andreas.Pflug@web.de>)
Responses pg_dump and indexes  ("John Liu" <johnl@emrx.com>)
List pgsql-hackers
Andreas Pflug <Andreas.Pflug@web.de> writes:
> I believe we have to discuss this a little more in-depth. 
> Parenthese-usage needs theroretical proof, since not all cases can be 
> tested. So I list the  assumptions I made.

Quite aside from any errors in this analysis, the thing that is
bothering me about this approach is its fragility.  You've got a lot
of case-by-case considerations here that could fall apart after any
minor rearrangement of the parsetree representation.  Who's going to
think to re-examine the entire ruleutils logic every time they add a
parse node type?

Also, the penalty for errors seems dire.  If ruleutils fails to
parenthesize something that should be parenthesized, when are we going
to find out about it?  Not until someone's rule or view malfunctions
after being dumped and reloaded; which will probably mean that the error
has been out in the field for a full release cycle.

I'm not really eager to take such risks just to make the output a little
prettier ...


> T_Var, T_Const, T_Param, T_Aggref, T_ArrayRef, T_FuncExpr, 
> T_DistinctExpr, T_SubLink, T_SubPlan, T_FieldSelect, T_NullIfExpr, 
> T_NullTest, T_BooleanTest, T_CoerceToDomainValue can be handled as a 
> simple argument.

Let's see, how many mistakes?

T_DistinctExpr may need to be parenthesized, since IS has lower precedence
than some operators.  For example, assuming someone had defined a +
operator for booleans:bool_var + (a IS DISTINCT FROM b)
Omitting the parens would cause this to parse as(bool_var + a) IS DISTINCT FROM b
which is wrong.  (But, if a and b are themselves boolean, the mistake
would pass undetected until someone tracks down why their application is
malfunctioning.)

NullTest and BooleanTest have the identical issue.

Treating CoerceToDomainValue as a unit is risky because the coercion may
not show up in the output at all; you might get just the bare
subexpression which is not necessarily unitary.  It might be okay if you
further check that the coercion type is EXPLICIT.

ArrayRef is not always a simple argument.  As a counterexample consider
the following (which only started working as of CVS tip, but it works):

regression=# create table t1 (p point[]);
CREATE TABLE
regression=# insert into t1 values(array['(3,4)'::point,'(4,5)','(6,7)']);
INSERT 1409638 1
regression=# select p[2] from t1;  p
-------(4,5)
(1 row)

regression=# select (p[2])[0] from t1;p
---4
(1 row)

The parentheses are required here because p[2][0] means something quite
different, viz subscripting in a multi-dimensional point array.

> - T_CoerceToDomain and T_RelabelType:
> Both only take simple arguments, no complex expressions and thus don't 
> need parentheses.

Wrong, you simply haven't tried hard enough.

> - T_OperExpr:
> Arguments only need parentheses if they are of type T_OperExpr or 
> T_BoolExpr.

I don't think this is entirely correct either.  IN subexpressions, for
example, would need to be parenthesized.

The general point here is that getting this right requires extremely
close analysis of the grammar's precedence rules, and any small change
in the grammar could break it.
        regards, tom lane



pgsql-hackers by date:

Previous
From: Stephan Szabo
Date:
Subject: Question about simple function folding optimization
Next
From: Shridhar Daithankar
Date:
Subject: [OT][Announce] Availability of OAS Server pakcages