In our fine manual, at
http://www.postgresql.org/docs/devel/static/typeconv-union-case.html
it's claimed that the nontrivial parts of UNION type resolution
work like this:
4. Choose the first non-unknown input type which is a preferred type in that category, if there is one.
5. Otherwise, choose the last non-unknown input type that allows all the preceding non-unknown inputs to be implicitly
convertedto it. (There always is such a type, since at least the first type in the list must satisfy this condition.)
This appears to have only the vaguest of resemblances to what
select_common_type() actually does, which is to make a single
pass over the inputs in which it does this:
/* * take new type if can coerce to it implicitly but not the * other way; but if we have a
preferredtype, stay on it. */
Thus for example there's a surprising inconsistency between
these cases:
regression=# select pg_typeof(t) from (select 'a'::text union select 'b'::char(1)) s(t);pg_typeof
-----------texttext
(2 rows)
regression=# select pg_typeof(t) from (select 'a'::char(1) union select 'b'::text) s(t);pg_typeof
-----------charactercharacter
(2 rows)
I think that at the very least, we ought to prefer preferred types,
the way the manual claims. I'm less certain about whether step 5
is ideal as written.
This came up because some of my Salesforce colleagues were griping about
the fact that UNION isn't commutative. They argue that the type
resolution behavior ought not be sensitive at all to the ordering of the
inputs. I'm not sure we can achieve that in general, but the current
approach certainly seems more order-sensitive than it oughta be.
Some trolling in the git history says that the last actual change in
this area was in my commit b26dfb95222fddd25322bdddf3a5a58d3392d8b1 of
2002-09-18, though it appears the documentation has been rewritten more
recently. It's a bit scary to be proposing to change behavior that's been
stable for eleven years, but ...
Thoughts?
regards, tom lane