Re: String comparison and the SQL standard - Mailing list pgsql-general
| From | Albe Laurenz |
|---|---|
| Subject | Re: String comparison and the SQL standard |
| Date | |
| Msg-id | A737B7A37273E048B164557ADEF4A58B0579B0AD@ntex2010a.host.magwien.gv.at Whole thread |
| In response to | Re: String comparison and the SQL standard (Tom Lane <tgl@sss.pgh.pa.us>) |
| Responses |
Re: String comparison and the SQL standard
|
| List | pgsql-general |
Tom Lane wrote:
>> b) If the length in characters of X is not equal to the length
>> in characters of Y, then the shorter string is effectively
>> replaced, for the purposes of comparison, with a copy of itself
>> that has been extended to the length of the longer string by
>> concatenation on the right of one or more pad characters,
>> where the pad character is chosen based on CS.
>> If CS has the NO PAD characteristic, then the pad character is
>> an implementation-dependent character different from
>> any character in the character set of X and Y that collates
>> less than any string under CS.
>> Otherwise, the pad character is a <space>.
>
> The PAD case is specifying the way that CHAR(n) comparison should work.
> (We don't expose the PAD/NO PAD distinction in any other way than
> CHAR vs VARCHAR/TEXT types.)
>
> AFAICS, the NO PAD case is ignorable BS: [...]
> In any case, the most significant word in that whole paragraph is
> "effectively", which means you can do it however you want as long
> as you get an equivalent comparison result.
>
>> That would effectively mean that 'a'='a ' is TRUE for
>> all character string types.
>
> In the PAD case, yes. Else no.
Thanks for the clarification.
>> Of the DBMS I tested, Microsoft SQL Server and MySQL gave me
>> that very result, while PostgreSQL and Oracle gave me FALSE.
>
> This probably has more to do with what these systems think the
> data type of an undecorated literal is, than with whether they do
> trailing-space-insensitive comparison all the time.
I tested not only with string literals, but also comparing
table columns of the respective types.
I came up with the following table of semantics used for
comparisons:
| CHAR(n)=CHAR(n) | VARCHAR(n)=VARCHAR(n) | CHAR(n)=VARCHAR(n) |
-----------+-----------------+-----------------------+--------------------+
Oracle | PAD SPACE | NO PAD | NO PAD |
-----------+-----------------+-----------------------+--------------------+
PostgreSQL | PAD SPACE | NO PAD | PAD SPACE |
-----------+-----------------+-----------------------+--------------------+
MySQL | PAD SPACE | PAD SPACE | PAD SPACE |
-----------+-----------------+-----------------------+--------------------+
SQL Server | PAD SPACE | PAD SPACE | PAD SPACE |
-----------+-----------------+-----------------------+--------------------+
Yours,
Laurenz Albe
pgsql-general by date: