Thread: select_common_type()'s behavior doesn't match the documentation
In our fine manual, at
http://www.postgresql.org/docs/devel/static/typeconv-union-case.html
it's claimed that the nontrivial parts of UNION type resolution
work like this:

4. Choose the first non-unknown input type which is a preferred type in
that category, if there is one.

5. Otherwise, choose the last non-unknown input type that allows all the
preceding non-unknown inputs to be implicitly converted to it. (There
always is such a type, since at least the first type in the list must
satisfy this condition.)

This appears to have only the vaguest of resemblances to what
select_common_type() actually does, which is to make a single
pass over the inputs in which it does this:

    /*
     * take new type if can coerce to it implicitly but not the
     * other way; but if we have a preferred type, stay on it.
     */

Thus for example there's a surprising inconsistency between
these cases:

regression=# select pg_typeof(t) from (select 'a'::text union select 'b'::char(1)) s(t);
 pg_typeof
-----------
 text
 text
(2 rows)

regression=# select pg_typeof(t) from (select 'a'::char(1) union select 'b'::text) s(t);
 pg_typeof
-----------
 character
 character
(2 rows)

I think that at the very least, we ought to prefer preferred types,
the way the manual claims. I'm less certain about whether step 5
is ideal as written.

This came up because some of my Salesforce colleagues were griping about
the fact that UNION isn't commutative. They argue that the type
resolution behavior ought not be sensitive at all to the ordering of the
inputs. I'm not sure we can achieve that in general, but the current
approach certainly seems more order-sensitive than it oughta be.

Some trolling in the git history says that the last actual change in
this area was in my commit b26dfb95222fddd25322bdddf3a5a58d3392d8b1 of
2002-09-18, though it appears the documentation has been rewritten more
recently. It's a bit scary to be proposing to change behavior that's
been stable for eleven years, but ...

Thoughts?

			regards, tom lane
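[Editorial sketch, not part of the original message.] The gap between the documented steps 4 and 5 and the quoted source comment can be reproduced with a toy model. This is only an illustration in Python, not the PostgreSQL C code: the names ToyType, IMPLICIT, can_coerce, single_pass and documented are all hypothetical, the two-type catalog is hand-written, and the cast table assumes that text and character can each be implicitly converted to the other (an assumption chosen to be consistent with the psql output above). Inputs are assumed to be non-unknown and already known to share one category.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyType:
    """Hypothetical stand-in for a type: name, category, preferred flag."""
    name: str
    category: str
    preferred: bool


# Toy catalog (assumption): text is the preferred string type, character is not.
TEXT = ToyType("text", "S", True)
BPCHAR = ToyType("character", "S", False)

# Assumed implicit casts as (source, target) pairs; both directions allowed.
IMPLICIT = {(TEXT, BPCHAR), (BPCHAR, TEXT)}


def can_coerce(src, dst):
    """True if src is dst or an implicit cast src -> dst is listed."""
    return src == dst or (src, dst) in IMPLICIT


def single_pass(types):
    """One pass, following the quoted comment: take the new type only if we
    can coerce to it implicitly but not the other way, and never leave a
    preferred type."""
    current = types[0]
    for new in types[1:]:
        if new == current:
            continue
        if (not current.preferred
                and can_coerce(current, new)
                and not can_coerce(new, current)):
            current = new
    return current


def documented(types):
    """Steps 4 and 5 as the manual words them."""
    # Step 4: the first input that is a preferred type wins.
    for t in types:
        if t.preferred:
            return t
    # Step 5: otherwise the last input to which all preceding inputs
    # can be implicitly converted.
    chosen = types[0]
    for i, t in enumerate(types):
        if all(can_coerce(p, t) for p in types[:i]):
            chosen = t
    return chosen


if __name__ == "__main__":
    for order in ([TEXT, BPCHAR], [BPCHAR, TEXT]):
        names = [t.name for t in order]
        print(names, single_pass(order).name, documented(order).name)
```

In this toy model the single-pass rule returns text for (text, character) but character for (character, text), because the reverse cast is also implicit and so the "but not the other way" test never fires. The documented rule returns text in both orders via step 4.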
Tom Lane <tgl@sss.pgh.pa.us> wrote:

> This came up because some of my Salesforce colleagues were griping about
> the fact that UNION isn't commutative. They argue that the type
> resolution behavior ought not be sensitive at all to the ordering of the
> inputs. I'm not sure we can achieve that in general, but the current
> approach certainly seems more order-sensitive than it oughta be.
>
> Some trolling in the git history says that the last actual change in
> this area was in my commit b26dfb95222fddd25322bdddf3a5a58d3392d8b1 of
> 2002-09-18, though it appears the documentation has been rewritten more
> recently. It's a bit scary to be proposing to change behavior that's
> been stable for eleven years, but ...
>
> Thoughts?

The current behavior is bad enough to merit changing it.

Not for back-patch, of course.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Any progress on this?

---------------------------------------------------------------------------

On Sat, Nov 30, 2013 at 12:43:39PM -0500, Tom Lane wrote:
> In our fine manual, at
> http://www.postgresql.org/docs/devel/static/typeconv-union-case.html
> it's claimed that the nontrivial parts of UNION type resolution
> work like this:
>
> 4. Choose the first non-unknown input type which is a preferred type in
> that category, if there is one.
>
> 5. Otherwise, choose the last non-unknown input type that allows all the
> preceding non-unknown inputs to be implicitly converted to it. (There
> always is such a type, since at least the first type in the list must
> satisfy this condition.)
>
> This appears to have only the vaguest of resemblances to what
> select_common_type() actually does, which is to make a single
> pass over the inputs in which it does this:
>
>     /*
>      * take new type if can coerce to it implicitly but not the
>      * other way; but if we have a preferred type, stay on it.
>      */
>
> Thus for example there's a surprising inconsistency between
> these cases:
>
> regression=# select pg_typeof(t) from (select 'a'::text union select 'b'::char(1)) s(t);
>  pg_typeof
> -----------
>  text
>  text
> (2 rows)
>
> regression=# select pg_typeof(t) from (select 'a'::char(1) union select 'b'::text) s(t);
>  pg_typeof
> -----------
>  character
>  character
> (2 rows)
>
> I think that at the very least, we ought to prefer preferred types,
> the way the manual claims. I'm less certain about whether step 5
> is ideal as written.
>
> This came up because some of my Salesforce colleagues were griping about
> the fact that UNION isn't commutative. They argue that the type
> resolution behavior ought not be sensitive at all to the ordering of the
> inputs. I'm not sure we can achieve that in general, but the current
> approach certainly seems more order-sensitive than it oughta be.
>
> Some trolling in the git history says that the last actual change in
> this area was in my commit b26dfb95222fddd25322bdddf3a5a58d3392d8b1 of
> 2002-09-18, though it appears the documentation has been rewritten more
> recently. It's a bit scary to be proposing to change behavior that's been
> stable for eleven years, but ...
>
> Thoughts?
>
> 			regards, tom lane

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + Everyone has their own god. +
Tom Lane-2 wrote
> In our fine manual, at
> http://www.postgresql.org/docs/devel/static/typeconv-union-case.html
> it's claimed that the nontrivial parts of UNION type resolution
> work like this:
>
> 4. Choose the first non-unknown input type which is a preferred type in
> that category, if there is one.
>
> 5. Otherwise, choose the last non-unknown input type that allows all the
> preceding non-unknown inputs to be implicitly converted to it. (There
> always is such a type, since at least the first type in the list must
> satisfy this condition.)
>
> This came up because some of my Salesforce colleagues were griping about
> the fact that UNION isn't commutative. They argue that the type
> resolution behavior ought not be sensitive at all to the ordering of the
> inputs. I'm not sure we can achieve that in general, but the current
> approach certainly seems more order-sensitive than it oughta be.

4. Use the preferred type for whatever category all inputs share (per 3).
Per 1 this is only used if at least one input does not agree.

5. No longer needed

6. Stays the same

It is possible for a result type to not match any of the input types, but
if you want to be commutative this would have to be allowed.

You could add a "majority rules" condition before 4 and punt if there is
no one dominant type.

Should #1 repeat after flattening domains to their base types?

I would probably logically place 2 before 1, since if everything is unknown
nothing else matters.

David J.
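[Editorial sketch, not part of the original message.] The revised step 4 proposed above ("use the preferred type for whatever category all inputs share") can be sketched as a toy model in Python. Everything here is hypothetical: CATEGORY, PREFERRED and resolve_by_category are illustrative names, the catalog is hand-written, and the sketch assumes exactly one preferred type per category, which nothing in the real catalog guarantees. Unknown-type handling and domain flattening are left out.

```python
from itertools import permutations

# Toy catalog (assumption): type name -> category code.
CATEGORY = {
    "text": "S",
    "character": "S",
    "character varying": "S",
}

# Assumption: one preferred type per category.
PREFERRED = {"S": "text"}


def resolve_by_category(types):
    """Order-insensitive resolution following the proposal in this message."""
    distinct = set(types)
    # Proposed step 1: if every input already agrees, keep that type.
    if len(distinct) == 1:
        return types[0]
    # Step 3: all inputs must share a single category.
    categories = {CATEGORY[t] for t in distinct}
    if len(categories) != 1:
        raise ValueError("inputs are not all in one type category")
    # Proposed step 4: use the category's preferred type, even when it is
    # not among the inputs.
    return PREFERRED[categories.pop()]


if __name__ == "__main__":
    inputs = ["character", "character varying"]
    for order in permutations(inputs):
        print(list(order), "->", resolve_by_category(list(order)))
```

The result depends only on the set of input types, so every ordering gives the same answer. The cost is the one noted in the message: for character and character varying the toy model returns text, a type that matches neither input.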