Thread: BUG #15129: Problem with UNION/UNION ALL type setting when severalNULL values before defining the proper type

The following bug has been logged on the website:

Bug reference:      15129
Logged by:          Philippe BEAUDOIN
Email address:      phb07@apra.asso.fr
PostgreSQL version: 10.3
Operating system:   linux
Description:

While migrating a view from another RDBMS,I reached something that looks lie
a bug in postgres.
It concerns all supported postgres versions.
After a lot of simplification steps, I have a very simple test case that
reproduces the issue.

-- This statement works as expected: the type of the result column is
determined by the type of the first not null value
SELECT NULL
  UNION
SELECT 1::INT;

-- But with an additional NULL value, the statement fails with a message:
"UNION types text and integer cannot be matched"
SELECT NULL
  UNION
SELECT NULL
  UNION
SELECT 1::INT;

-- Adding an explicit cast on the first occurrence fixes the issue (I used
this trick as a workaround)
SELECT NULL::INT
  UNION
SELECT NULL
  UNION
SELECT 1::INT;


>>>>> "PG" == PG Bug reporting form <noreply@postgresql.org> writes:

 PG> While migrating a view from another RDBMS,I reached something that
 PG> looks lie a bug in postgres.

It's not a bug in the code, though perhaps you could point out a place
where the documentation could be improved?

 PG> -- This statement works as expected: the type of the result column is
 PG> determined by the type of the first not null value
 PG> SELECT NULL
 PG>   UNION
 PG> SELECT 1::INT;

 PG> -- But with an additional NULL value, the statement fails with a message:
 PG> "UNION types text and integer cannot be matched"
 PG> SELECT NULL
 PG>   UNION
 PG> SELECT NULL
 PG>   UNION
 PG> SELECT 1::INT;

This fails because the UNION is processed pairwise; that is, it's
treated as if it were (select null union select null) union select 1::int

For each union of two queries, the result types have to be resolved to
some non-unknown type. If both types are unknown the result is assumed
to be text (since the usual reason for unknown types is the use of a
string literal); if one type is known and the other unknown, the known
type is selected as the common type; otherwise unification via implicit
casts is tried, otherwise the query fails.

So in this case the (select null union select null) is resolved as
having an output of one text column, and then trying to union against an
integer column fails (since we don't implictly cast to text).

(The SQL standard is of no particular help here since it does not allow
NULL to appear "bare", except in contextually typed expressions.)

-- 
Andrew (irc:RhodiumToad)


Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
> "PG" == PG Bug reporting form <noreply@postgresql.org> writes:
>  PG> While migrating a view from another RDBMS,I reached something that
>  PG> looks lie a bug in postgres.

> It's not a bug in the code, though perhaps you could point out a place
> where the documentation could be improved?

It strikes me that section 10.5 doesn't say explicitly that multiple
UNIONs are resolved pairwise.  Someone who expected "x union y union z"
to be resolved holistically, like a 3-way CASE would be, is not going
to be enlightened by that section.  Perhaps an additional example
using this exact situation would be helpful.

> (The SQL standard is of no particular help here since it does not allow
> NULL to appear "bare", except in contextually typed expressions.)

I think the SQL spec does offer considerable support for our pairwise
resolution approach --- nowhere does it suggest that "x union y union z"
is a construct that should be considered as a whole rather than as a
nest of two independent union operations.

            regards, tom lane


Le 25/03/2018 à 10:50, Andrew Gierth a écrit :
>>>>>> "PG" == PG Bug reporting form <noreply@postgresql.org> writes:
>   PG> While migrating a view from another RDBMS,I reached something that
>   PG> looks lie a bug in postgres.
>
> It's not a bug in the code, though perhaps you could point out a place
> where the documentation could be improved?
>
>   PG> -- This statement works as expected: the type of the result column is
>   PG> determined by the type of the first not null value
>   PG> SELECT NULL
>   PG>   UNION
>   PG> SELECT 1::INT;
>
>   PG> -- But with an additional NULL value, the statement fails with a message:
>   PG> "UNION types text and integer cannot be matched"
>   PG> SELECT NULL
>   PG>   UNION
>   PG> SELECT NULL
>   PG>   UNION
>   PG> SELECT 1::INT;
>
> This fails because the UNION is processed pairwise; that is, it's
> treated as if it were (select null union select null) union select 1::int
>
> For each union of two queries, the result types have to be resolved to
> some non-unknown type. If both types are unknown the result is assumed
> to be text (since the usual reason for unknown types is the use of a
> string literal); if one type is known and the other unknown, the known
> type is selected as the common type; otherwise unification via implicit
> casts is tried, otherwise the query fails.
>
> So in this case the (select null union select null) is resolved as
> having an output of one text column, and then trying to union against an
> integer column fails (since we don't implictly cast to text).
>
> (The SQL standard is of no particular help here since it does not allow
> NULL to appear "bare", except in contextually typed expressions.)
>
Thanks Andrew and Tom (sorry can't find the last mail of the thread in 
my mailbox :-()
So the postgres behaviour is clear, once the rules of the game are 
known, eventhough it is not very intuitive for end-user and is different 
from at least another RDBMS (namely SQL-Server).
So yes, an explanation in the documentation would be useful.
Thanks again. Philippe.