Thread: BUG #12458: Comparison with CHAR is inconsistent between string types

BUG #12458: Comparison with CHAR is inconsistent between string types

From
tneumann@users.sourceforge.net
Date:
The following bug has been logged on the website:

Bug reference:      12458
Logged by:          Thomas Neumann
Email address:      tneumann@users.sourceforge.net
PostgreSQL version: 9.4.0
Operating system:   Linux 3.16
Description:

The following two queries return different results:

select 'A'::char(1)='A '::text;
select 'A'::char(1)='A '::varchar(10);

I would expect both queries to return "true", and this is also suggested by
the documentation (section 8.3.), but the text comparison returns false.

Re: BUG #12458: Comparison with CHAR is inconsistent between string types

From
Amit Langote
Date:
On Thu, Jan 8, 2015 at 6:39 AM,  <tneumann@users.sourceforge.net> wrote:
> The following two queries return different results:
>
> select 'A'::char(1)='A '::text;
> select 'A'::char(1)='A '::varchar(10);
>
> I would expect both queries to return "true", and this is also suggested by
> the documentation (section 8.3.), but the text comparison returns false.
>

Quick inspection of pg_operator suggests there is no '=' operator that
takes varchar as either of the operands. Not sure why that is so. It
appears the comparison proceeds using =(char, char) operator with the
varchar operand appropriately coerced. Hence the result.
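
For anyone who wants to repeat that inspection, a query along these lines lists the candidate operators. This is only a sketch: the column names come from the pg_operator catalog, the type list is my own choice, and the rows returned may differ between PostgreSQL versions.

```sql
-- List "=" operators with char (bpchar), varchar, or text on either side.
-- Sketch only: the set of types filtered on here is an assumption.
SELECT o.oprleft::regtype  AS left_type,
       o.oprright::regtype AS right_type,
       o.oprcode           AS implementing_function
FROM pg_operator AS o
WHERE o.oprname = '='
  AND (o.oprleft  IN ('bpchar'::regtype, 'varchar'::regtype, 'text'::regtype)
   OR  o.oprright IN ('bpchar'::regtype, 'varchar'::regtype, 'text'::regtype))
ORDER BY o.oid;
```

No row with varchar as an operand type should show up if varchar really has no "=" operator of its own.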

Thanks,
Amit

Re: BUG #12458: Comparison with CHAR is inconsistent between string types

From
Tom Lane
Date:
Amit Langote <amitlangote09@gmail.com> writes:
> On Thu, Jan 8, 2015 at 6:39 AM,  <tneumann@users.sourceforge.net> wrote:
>> The following two queries return different results:
>>
>> select 'A'::char(1)='A '::text;
>> select 'A'::char(1)='A '::varchar(10);
>>
>> I would expect both queries to return "true", and this is also suggested by
>> the documentation (section 8.3.), but the text comparison returns false.

I see nothing in 8.3 addressing the question of which type has precedence
for cross-type comparisons.

> Quick inspection of pg_operator suggests there is no '=' operator that
> takes varchar as either of the operands. Not sure why that is so. It
> appears the comparison proceeds using =(char, char) operator with the
> varchar operand appropriately coerced. Hence the result.

The available operators are char=char and text=text (varchar has no
operators of its own, it just uses text's operators).  For an input
of the form char=text, the text=text operator wins on the grounds of
text being a preferred type; see
http://www.postgresql.org/docs/9.4/static/typeconv-oper.html
rule 3d.  So we coerce char to text and compare using text's rules,
in which trailing spaces are significant.  But if the input is
char=varchar, only the char=char operator has any exact matches, so it
wins at rule 3c.  Then we'll coerce varchar to char and compare using
char's rules, in which trailing spaces are not significant.
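
A small sketch of how to pick the semantics explicitly rather than relying on operator resolution: cast both operands to the same type so that only one operator is an exact match. The first two statements are the ones from the report; the last two are added here for illustration and were not part of it.

```sql
-- As reported: text=text is chosen, trailing spaces are significant.
SELECT 'A'::char(1) = 'A '::text;          -- false
-- As reported: char=char is chosen, trailing spaces are ignored.
SELECT 'A'::char(1) = 'A '::varchar(10);   -- true

-- Force char=char by casting the text operand to char.
SELECT 'A'::char(1) = 'A '::text::char(2);             -- true
-- Force text=text by casting both operands to text.
SELECT 'A'::char(1)::text = 'A '::varchar(10)::text;   -- false
```

With both sides cast to the same type, the result no longer depends on whether the other operand happens to be declared as text or varchar.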

So it's behaving as expected and documented.  Whether this behavior is
desirable is a different question of course.  I read the SQL spec as
requiring us to use PAD SPACE semantics for char=varchar comparisons,
so that case is forced.  char=text is outside the bounds of the spec
(since text isn't in the spec) so we don't have to do it the same way;
and we choose not to because text's semantics are generally more natural.

            regards, tom lane

Re: BUG #12458: Comparison with CHAR is inconsistent between string types

From
Thomas Neumann
Date:
>>> select 'A'::char(1)='A '::text;
>>> select 'A'::char(1)='A '::varchar(10);
>>>
>>> I would expect both queries to return "true", and this is also suggested by
>>> the documentation (section 8.3.), but the text comparison returns false.
>
> I see nothing in 8.3 addressing the question of which type has precedence
> for cross-type comparisons.

You are right, I did not read 8.3 carefully enough. It says that "trailing
spaces are treated as semantically insignificant and disregarded when
comparing two values of type character", but says nothing about mixed types.

The SQL standard, section 8.2 <comparison predicate>, Rule 3, b) says:

""
If the length in characters of X is not equal to the length in
characters of Y, then the shorter string is effectively replaced, for
the purposes of comparison, with a copy of itself that has been extended
to the length of the longer string by concatenation on the right of one
or more pad characters, where the pad character is chosen based on CS.
If CS has the NO PAD characteristic, then the pad character is an
implementation-dependent character different from any character in the
character set of X and Y that collates less than any string under CS.
Otherwise, the pad character is a <space>.
""

which would seem to suggest that spaces have to be ignored when comparing
CHAR with anything else (as the other string type is free to choose any
pad character, but CHAR will always pad with spaces as needed).
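
To make the quoted rule concrete, the PAD SPACE case can be emulated on plain text values by padding the shorter string with spaces before comparing. This is only an illustration of the rule as quoted, using rpad (which pads with spaces by default); it is not meant to describe how PostgreSQL implements char comparison, and the names v, x, y are made up for the example.

```sql
-- Emulate PAD SPACE: extend both strings to the longer length, then compare.
WITH v(x, y) AS (VALUES ('A'::text, 'A '::text))
SELECT x = y AS plain_text_equal,                        -- false
       rpad(x, greatest(length(x), length(y)))
         = rpad(y, greatest(length(x), length(y)))
         AS pad_space_equal                              -- true
FROM v;
```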

But of course you can deviate from that if you wish. I just found it
surprising that text behaved differently from varchar, which can lead to
very subtle bugs.

Thomas