Home > mailing lists

Re: String comparison and the SQL standard - Mailing list pgsql-general

From	Albe Laurenz
Subject	Re: String comparison and the SQL standard
Date	January 18, 2013 13:03:49
Msg-id	A737B7A37273E048B164557ADEF4A58B0579B0AD@ntex2010a.host.magwien.gv.at Whole thread Raw
In response to	Re: String comparison and the SQL standard (Tom Lane <tgl@sss.pgh.pa.us>)
Responses	Re: String comparison and the SQL standard
List	pgsql-general

Tree view

Tom Lane wrote:
>>    b) If the length in characters of X is not equal to the length
>>       in characters of Y, then the shorter string is effectively
>>       replaced, for the purposes of comparison, with a copy of itself
>>       that has been extended to the length of the longer string by
>>       concatenation on the right of one or more pad characters,
>>       where the pad character is chosen based on CS.
>>       If CS has the NO PAD characteristic, then the pad character is
>>       an implementation-dependent character different from
>>       any character in the character set of X and Y that collates
>>       less than any string under CS.
>>       Otherwise, the pad character is a <space>.
>
> The PAD case is specifying the way that CHAR(n) comparison should work.
> (We don't expose the PAD/NO PAD distinction in any other way than
> CHAR vs VARCHAR/TEXT types.)
>
> AFAICS, the NO PAD case is ignorable BS: [...]

> In any case, the most significant word in that whole paragraph is
> "effectively", which means you can do it however you want as long
> as you get an equivalent comparison result.
>
>> That would effectively mean that 'a'='a   ' is TRUE for
>> all character string types.
>
> In the PAD case, yes.  Else no.

Thanks for the clarification.

>> Of the DBMS I tested, Microsoft SQL Server and MySQL gave me
>> that very result, while PostgreSQL and Oracle gave me FALSE.
>
> This probably has more to do with what these systems think the
> data type of an undecorated literal is, than with whether they do
> trailing-space-insensitive comparison all the time.

I tested not only with string literals, but also comparing
table columns of the respective types.

I came up with the following table of semantics used for
comparisons:

           | CHAR(n)=CHAR(n) | VARCHAR(n)=VARCHAR(n) | CHAR(n)=VARCHAR(n) |
-----------+-----------------+-----------------------+--------------------+
Oracle     |    PAD SPACE    |        NO PAD         |      NO PAD        |
-----------+-----------------+-----------------------+--------------------+
PostgreSQL |    PAD SPACE    |        NO PAD         |     PAD SPACE      |
-----------+-----------------+-----------------------+--------------------+
MySQL      |    PAD SPACE    |       PAD SPACE       |     PAD SPACE      |
-----------+-----------------+-----------------------+--------------------+
SQL Server |    PAD SPACE    |       PAD SPACE       |     PAD SPACE      |
-----------+-----------------+-----------------------+--------------------+

Yours,
Laurenz Albe

pgsql-general by date:

From: Jasen Betts
Date: 18 January 2013, 13:00:13
Subject: Re: Case insensitive collation

From: salah jubeh
Date: 18 January 2013, 14:05:24
Subject: Re: SELECT DISTINCT

Re: String comparison and the SQL standard - Mailing list pgsql-general

Previous

Next