Thread: Misleading errors with column references in default expressions andpartition bounds

Hi all,

I have just committed a fix for a crash with the handling of partition
bounds using column references which has been discussed here:
https://www.postgresql.org/message-id/15668-0377b1981aa1a393@postgresql.org

And while discussing on the matter with Amit, the point has been
raised that default expressions with column references can lead to
some funny error messages with the context.  For example, take that
with an undefined column:
=# create table foo (a int default (a.a));
ERROR:  42P01: missing FROM-clause entry for table "a"
LINE 1: create table foo (a int default (a.a));

This confusion is old I think, and reproduces down to 9.4 and older.
If using directly a reference from a column's table then things get
correct:
=# create table foo (a int default (foo.a));
ERROR:  42P10: cannot use column references in default expression
LOCATION:  cookDefault, heap.c:2948
=# create table foo (a int default (a));
ERROR:  42P10: cannot use column references in default expression
LOCATION:  cookDefault, heap.c:2948

We have the same problem for partition bounds actually, which is new
as v12 as partition bound expressions now use the common expression
machinery for transformation:
=# CREATE TABLE list_parted (a int) PARTITION BY LIST (a);
CREATE TABLE
=# CREATE TABLE part_list_crash PARTITION OF list_parted
     FOR VALUES IN (somename.somename);
ERROR:  42P01: missing FROM-clause entry for table "somename"
LINE 2:   FOR VALUES IN (somename.somename)

One idea which came from Amit, and it seems to me that it is a good
idea, would be to have more context-related error messages directly in
transformColumnRef(), so as we can discard at an early stage column
references which are part of contexts where there is no meaning to
have them.  The inconsistent part in HEAD is that cookDefault() and
transformPartitionBoundValue() already discard column references, so
if we move those checks at transformation phase we can simplify the
error handling post-transformation.  This would make the whole thing
more consistent.

While this takes care of the RTE issues, this has a downside though.
Take for example this case using an expression with an aggregate and
a column reference:
=# CREATE TABLE part_bogus_expr_fail PARTITION OF list_parted
   FOR VALUES IN (sum(a));
-ERROR:  aggregate functions are not allowed in partition bound
+ERROR:  cannot use column reference in partition bound expression

So this would mean that we would first complain of the most inner
parts of the expression, which is more intuitive actually in my
opinion.  The difference can be seen using the patch attached for
partition bounds, as I have added more test coverage with a previous
commit.  We also don't have much tests in the code for default
expression patterns, so I have added some.

The docs of CREATE TABLE also look incorrect to me when it comes to
default expressions.  It says the following: "other columns in the
current table are not allowed".  However *all* columns are not
authorized, including the column which uses the expression.

Thoughts?
--
Michael

Attachment
Michael Paquier <michael@paquier.xyz> writes:
> One idea which came from Amit, and it seems to me that it is a good
> idea, would be to have more context-related error messages directly in
> transformColumnRef(), so as we can discard at an early stage column
> references which are part of contexts where there is no meaning to
> have them.

+1 for the general idea, but I find the switch a bit overly verbose.
Do we really need to force every new EXPR_KIND to visit this spot,
when so few of them have a need to do anything?  I'd be a bit inclined
to simplify it to

    switch (pstate->p_expr_kind)
    {
        case EXPR_KIND_COLUMN_DEFAULT:
            ereport(...);
            break;
        case EXPR_KIND_PARTITION_BOUND:
            ereport(...);
            break;
        default:
            break;
    }

That's just a nitpick though.

> While this takes care of the RTE issues, this has a downside though.
> Take for example this case using an expression with an aggregate and
> a column reference: 
> =# CREATE TABLE part_bogus_expr_fail PARTITION OF list_parted
>    FOR VALUES IN (sum(a));
> -ERROR:  aggregate functions are not allowed in partition bound
> +ERROR:  cannot use column reference in partition bound expression

I don't see that as an issue.

> The docs of CREATE TABLE also look incorrect to me when it comes to
> default expressions.  It says the following: "other columns in the
> current table are not allowed".  However *all* columns are not
> authorized, including the column which uses the expression.

I think the idea is that trying to reference another column is something
that people might well try to do, whereas referencing the DEFAULT's
own column is obviously silly.  In particular the use of "cross-reference"
implies that another column is what is being referenced.  If we dumb it
down to just "references to columns in the current table", then it's
consistent, but it's also completely redundant with the main part of the
sentence.  It doesn't help that somebody decided to jam the independent
issue of subqueries into the same sentence.  In short, maybe it'd be
better like this:

       ... The value
       is any variable-free expression (in particular, cross-references
       to other columns in the current table are not allowed).  Subqueries
       are not allowed either.

            regards, tom lane


On Wed, Mar 27, 2019 at 12:13:16PM +0900, Michael Paquier wrote:
> ParseExprKind is an enum, so listing all the options without the
> default has the advantage to generate a warning if somebody adds a
> value.  This way anybody changing this code will need to think about
> it.

A bit late, but committed without the case/default.

>>        ... The value
>>        is any variable-free expression (in particular, cross-references
>>        to other columns in the current table are not allowed).  Subqueries
>>        are not allowed either.
>
> Okay, I think I see your point here.  That sounds sensible.

And I have used this suggestion from Tom as well for the docs.
--
Michael

Attachment
On 2019/03/28 21:14, Michael Paquier wrote:
> On Wed, Mar 27, 2019 at 12:13:16PM +0900, Michael Paquier wrote:
>> ParseExprKind is an enum, so listing all the options without the
>> default has the advantage to generate a warning if somebody adds a
>> value.  This way anybody changing this code will need to think about
>> it.
> 
> A bit late, but committed without the case/default.
> 
>>>        ... The value
>>>        is any variable-free expression (in particular, cross-references
>>>        to other columns in the current table are not allowed).  Subqueries
>>>        are not allowed either.
>>
>> Okay, I think I see your point here.  That sounds sensible.
> 
> And I have used this suggestion from Tom as well for the docs.

Thanks Michael for taking care of this.

Regards,
Amit