Re: Typmod associated with multi-row VALUES constructs - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Typmod associated with multi-row VALUES constructs
Date
Msg-id 26883.1481140987@sss.pgh.pa.us
Whole thread Raw
In response to Re: Typmod associated with multi-row VALUES constructs  (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>)
Responses Re: Typmod associated with multi-row VALUES constructs
Re: Typmod associated with multi-row VALUES constructs
List pgsql-hackers
Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> writes:
> At Mon, 5 Dec 2016 21:54:08 -0700, "David G. Johnston" <david.g.johnston@gmail.com> wrote in
<CAKFQuwYOeesXJ1bH31b2MKx1UStEzrpYe=tSAO2-eg1Ai4=Eww@mail.gmail.com>
>> If every row passes you can retain the typemod.  That is arguably the
>> ​perfect solution.  The concern is that "scan every row" could be very
>> expensive - though in writing this I'm thinking that you'd quickly find a

> I found that it already scans the whole list. select_common_type
> does that. And it returns the type that is found to be applicable
> on all values. Handling typmod there has a small
> impact. (Though, I don't assert that we should do this.)

The problem is that there is not where the impact is.  It would be really
easy for transformValuesClause to check for a common typmod while it's
transforming the construct, but it has noplace to put the information.
The place where the info is needed is in expandRTE and
get_rte_attribute_type (which is called by make_var).  So basically,
parsing of each Var reference to a VALUES subquery would potentially have
to scan all rows of the VALUES list to identify the right typmod to give
to the Var.  The repetitiveness of that is what's bothering me.

In HEAD, we could change the RTE data structure so that
transformValuesClause could save the typmod information in the RTE,
keeping the lookups cheap.  But that option is not available to us in
released branches.

On the other hand, we might be worrying over nothing.  I think that
in typical use, expandRTE() will be done just once per query, and it
could amortize the cost by considering all the rows in one scan.
get_rte_attribute_type would be a problem except that I believe it's
rarely used for VALUES RTEs --- our code coverage report shows that that
switch branch isn't reached at all in the standard regression tests.
(The reason for the above statements is that we convert a bare VALUES to
SELECT * FROM VALUES, and the * is expanded by expandRTE, not by retail
applications of make_var.)

> Ok, I think all of the #1 to #5 are the change of behavior in
> this criteria. What we shold resolve here is the inconsistent
> table generated by create table as select from (values...

Well, that's one symptom, but not the only one; consider

# select x::varchar(4) from (values ('z'::varchar(4)), ('0123456789')) v(x);    x
------------z0123456789
(2 rows)

The Var representing v.x is (mis)labeled as already having the right typmod
for varchar(4), so transformTypeCast concludes that no run-time work is
needed, and you get clearly-wrong answers out.

Still, things have been like this since 8.2 when we implemented multi-row
VALUES, and nobody's noticed up to now.  Maybe the right answer is to
change the data structure in HEAD and decree that we won't fix it in back
branches.  I don't really like that answer though ...
        regards, tom lane



pgsql-hackers by date:

Previous
From: Alvaro Herrera
Date:
Subject: Re: patch: function xmltable
Next
From: Tom Lane
Date:
Subject: Re: [GENERAL] Select works only when connected from login postgres