Re: Typmod associated with multi-row VALUES constructs - Mailing list pgsql-hackers

From David G. Johnston
Subject Re: Typmod associated with multi-row VALUES constructs
Date
Msg-id CAKFQuwb=rjcXhzgBVRaXj5nLw2TVR09yYuqpe5BeTWJDWY8L8Q@mail.gmail.com
Whole thread Raw
In response to Typmod associated with multi-row VALUES constructs  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Typmod associated with multi-row VALUES constructs
Re: Typmod associated with multi-row VALUES constructs
List pgsql-hackers
On Mon, Dec 5, 2016 at 1:08 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
I looked into the issue reported in bug #14448,
https://www.postgresql.org/message-id/20161205143037.4377.60754%40wrigleys.postgresql.org

The core of it seems to be that expandRTE() will report the type and
typmod of a column of a VALUES construct as being exprType() and
exprTypmod() of the corresponding expression in the first row of
the VALUES.  It's okay to handle data type that way, because we've
coerced all the expressions for the column to the same type; but
we have *not* coerced them to the same typmod.  So some of the values
from later rows may fail to meet the claimed typmod.  This is not good.

In order to fix this, we first have to decide what the semantics ought
to be.  I think there are two plausible definitions:

1. If all the expressions in the VALUES column share the same typmod,
use that typmod, else use -1.

2. Use -1 whenever there is more than one VALUES row.

#1 is what we do for some comparable cases such as UNION and CASE.
However, it's potentially quite expensive for large VALUES constructs.
#2 would be a lot cheaper, and given that this is the first complaint
we've gotten in all the years we've had multi-row-VALUES support, it's
not clear that deriving a precise typmod is really all that useful
for VALUES.

I have no strong preference between these two, but I think whatever
we do needs to be back-patched.  The behavior described in the bug
report is definitely broken.

Thoughts?

​Can we be precise enough to perform #2 if the top-level (or immediate parent) command is an INSERT - the existing table is going to enforce its own typemod anyway, otherwise go with #1?

​Lacking that possibility I'd say that documenting that our treatment of typemod in VALUES is similar to our treatment of typemod in function arguments would be acceptable. This suggests a #3 - simply use "-1" regardless of the number of rows in the VALUES expression.

David J.

pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: Cache Hash Index meta page.
Next
From: Pantelis Theodosiou
Date:
Subject: Re: missing optimization - column <> column