Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Well, the case that is bothering me is stuff like
>
> (select '1' union select '1 ') union all select 2;
>
> The first union produces 1 row if you resolve the constants as
> integers, but 2 rows if you resolve as text, which I think is what
> the spec would expect here.
The way I have read it, the spec would make those first two literals
char(1) and char(2), and the trailing space would be ignored in an
equality comparison between those. But you could make your point
with a leading space, I think.
> Now in this particular case we'd fail with "UNION types text and
> integer cannot be matched" so you never get as far as noticing
> what the runtime behavior is.
Right, which makes it OK to provide something which *does* work here
as an extension.
> [ experiments a bit... ] You can show a difference in results
> with this:
>
> regression=# (select '1' union select '1 ') union all select
> '2'::bpchar;
> ?column?
> ----------
> 1
> 1
> 2
> (3 rows)
>
> This produces 3 rows because the UNION resolves as text, but what
> we're discussing here would allow it to resolve as bpchar, which
> would have different behavior:
>
> regression=# (select '1' union select '1 '::bpchar) union all
> select '2'::bpchar;
> ?column?
> ----------
> 1
> 2
> (2 rows)
Which would be the right answer according to the spec, although that
seems to be sort of an accident here.
> It's debatable about how important this is, and one could also say
> that the behavior of our bpchar is not entirely standards
> compliant in the first place, so maybe this isn't a compelling
> example. But I'm worried that there may be related cases where
> it's a bigger deal.
We are in territory where the choice to treat literals as type
unknown where the spec requires bpchar will probably lead to *some*
corner cases where behavior is nonstandard no matter what we do. I
think the best we can do here is (in what I think is order of
importance):
(1) Try not to break anything which works for current PostgreSQL
queries.
(2) Try not to add any additional behavioral differences from the
standard where a query now runs without error with
standard-conforming results.
(3) Try to maintain some coherent handling for unknown values. I
think that's currently lacking when the first of these fails and the
others work:
select null union select null union select 1;
select null union (select null union select 1);
select null union select 1 union select null;
Likewise, the first of these fails and the others don't:
select '1' union select '1 ' union select 1;
select '1' union (select '1 ' union select 1);
select '1' union select 1 union select '1 ';
Explaining that could be tough. I'm arguing that the first line
should be made to work like the others in terms of type resolution.
Since that case now throws and error under both the standard and
current PostgreSQL releases, it's OK as an extension.
-Kevin