Re: Typmod associated with multi-row VALUES constructs - Mailing list pgsql-hackers

From Kyotaro HORIGUCHI
Subject Re: Typmod associated with multi-row VALUES constructs
Date
Msg-id 20161207.173134.190084221.horiguchi.kyotaro@lab.ntt.co.jp
Whole thread Raw
In response to Re: Typmod associated with multi-row VALUES constructs  ("David G. Johnston" <david.g.johnston@gmail.com>)
Responses Re: Typmod associated with multi-row VALUES constructs
List pgsql-hackers
Hello,

At Mon, 5 Dec 2016 21:54:08 -0700, "David G. Johnston" <david.g.johnston@gmail.com> wrote in
<CAKFQuwYOeesXJ1bH31b2MKx1UStEzrpYe=tSAO2-eg1Ai4=Eww@mail.gmail.com>
> feel free to s/typemod/typmod/ ... my fingers don't want to drop the "e"
> 
> On Mon, Dec 5, 2016 at 9:17 PM, Kyotaro HORIGUCHI <
> horiguchi.kyotaro@lab.ntt.co.jp> wrote:
> 
> > Hello,
> >
> > At Mon, 5 Dec 2016 18:59:42 -0700, "David G. Johnston" <
> > david.g.johnston@gmail.com> wrote in <CAKFQuwYXzBQNpH5L=AHJzOjOZCZS
> > zRvF9qiA0wwt_QZmAuYmEA@mail.gmail.com>
> > > On Mon, Dec 5, 2016 at 6:36 PM, Kyotaro HORIGUCHI <
> > > horiguchi.kyotaro@lab.ntt.co.jp> wrote:
> > >
> >
> > (It's not typo but my poor wording... Sorry.)
> > Mmm. I think the typemod of a row should not be applied onto
> > other rows, and the same can be said for types. But type of the
> > first row is applied to all of the rest rows, though... Does it
> > make sense?
> >
> 
> Yes.  ​All rows in a given relation must end up with the exact same type
> and it isn't friendly to fail when a implicit conversion from unknown is
> possible.

I'm not sure how clearly users are conscious of the type unkown,
my feeling is opposed to implicity applying of the type of the
first value in VALUES to all other values of the type "unknown".

On the other hand, such behavior doesn't harm anything when we
don't explicitly type the values in VALUES. (But it would be a
new feature as you wrote below)

Anyway the real behavior of the current parser for VALUES is
scanning whole the list and extract common-coerceable type, then
applying the type on the whole list. (transformValuesClause)

> > But what I wanted to say was not that but the something like the
> > following.
> >
> > select pg_typeof('12345'::varchar(1));
> >      pg_typeof
> > -------------------
> >  character varying
> >
> > A value seemingly doesn't have typmod. So it seems reasonable
> > that VALUES cannot handle typmod. It seems reasonable regardless
> > of how it is acutually implemented.
> >
> 
> ​This is an artifact of functions - the typemod associated with the value
> '12345' is lost when that value is passed into the function pg_typeof.

It seems to me what is occuring in VALUES.

> Thus it is impossible to write a SQL query the reports the typemod of a
> literal or column reference.  Nonetheless it is there in reality.  Just see
> the original CREATE TABLE AS example for proof.  The created table's column
> is shown (using direct catalog queries) to contain typemod value of ​20 -
> which it could only have gotten from the first values rows which contained
> a casted literal.

Ok, I found myself have read the original problem wrongly.

=# create table t2 (a) as select * from (values ('abcdee'::varchar(3)), ('defghij'::varchar(5))) x;
postgres=# \d t2;            Table "public.t2"Column |         Type         | Modifiers 
--------+----------------------+-----------a      | character varying(3) | 
postgres=# select * from t2;  a   
-------abcdefgh
(2 rows)

The problem to be resolved here seems to be that CREATE TABLE AS
creates a broken in-a-sense table. Not a coercion of VALUES.

#6 - raise an error if a subquery's result doesn't fit the newly    created table. Or, create a new table so that all
thevalue    given from subqery fit to it. (I havent' understand how the    source typmod affects the new table and I
dont'seehow to do    that, though)
 


> > But I undetstand what we want to solve here is how to change
> > VALUES's behavior to perfectly coerce the row values to the types
> > explicity applied in the first row. Right?
> >
> ​
> ​Actually, no, since it is not possible to coerce "perfectly".  Since any
> attempt at doing so could fail it is only possible to scan every row and
> compare its typemod to the first row's typemod.  Ideally (but maybe not in
> reality) as soon as a discrepancy is found stop and discard the typemod.


> If every row passes you can retain the typemod.  That is arguably the
> ​perfect solution.  The concern is that "scan every row" could be very
> expensive - though in writing this I'm thinking that you'd quickly find a

I found that it already scans the whole list. select_common_type
does that. And it returns the type that is found to be applicable
on all values. Handling typmod there has a small
impact. (Though, I don't assert that we should do this.)

Even the value itself doesn't convey typmod, VALUES can take
expressions. (It is obvious because the first value was already a
coercing expression). They are scanned already.

> non-match even in a large dataset - and so a less perfect but still valid
> solution is to simply discard the typemod if there is more than one row.
> My thought was that if you are going to discard typemod in the n > 1 case
> for consistency you should discard the typemod in the n = 1 case as well.
> 
> That is, in a nutshell, options 1, 2, and 3 in order.
> 
> The "fault" in #1 that #4 attempted to fix was that VALUES are often hand
> entered and so the inexperienced would like to only type in a cast one the
> first row and have it will apply to all subsequent rows. That would be a
> feature request, though.  Your capability to add type information to any
> FROM alias is likewise a feature request - solving the overall problem by
> giving the author a place to specify the desired type explicitly without
> having to pollute the query with excessive casts or subqueries.

Ok, I think all of the #1 to #5 are the change of behavior in
this criteria. What we shold resolve here is the inconsistent
table generated by create table as select from (values...

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

pgsql-hackers by date:

Previous
From: Masahiko Sawada
Date:
Subject: Re: Quorum commit for multiple synchronous replication.
Next
From: Magnus Hagander
Date:
Subject: Re: Back-patch use of unnamed POSIX semaphores for Linux?