Re: String comparison and the SQL standard - Mailing list pgsql-general
From | Albe Laurenz |
---|---|
Subject | Re: String comparison and the SQL standard |
Date | |
Msg-id | A737B7A37273E048B164557ADEF4A58B0579B0AD@ntex2010a.host.magwien.gv.at Whole thread Raw |
In response to | Re: String comparison and the SQL standard (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: String comparison and the SQL standard
|
List | pgsql-general |
Tom Lane wrote: >> b) If the length in characters of X is not equal to the length >> in characters of Y, then the shorter string is effectively >> replaced, for the purposes of comparison, with a copy of itself >> that has been extended to the length of the longer string by >> concatenation on the right of one or more pad characters, >> where the pad character is chosen based on CS. >> If CS has the NO PAD characteristic, then the pad character is >> an implementation-dependent character different from >> any character in the character set of X and Y that collates >> less than any string under CS. >> Otherwise, the pad character is a <space>. > > The PAD case is specifying the way that CHAR(n) comparison should work. > (We don't expose the PAD/NO PAD distinction in any other way than > CHAR vs VARCHAR/TEXT types.) > > AFAICS, the NO PAD case is ignorable BS: [...] > In any case, the most significant word in that whole paragraph is > "effectively", which means you can do it however you want as long > as you get an equivalent comparison result. > >> That would effectively mean that 'a'='a ' is TRUE for >> all character string types. > > In the PAD case, yes. Else no. Thanks for the clarification. >> Of the DBMS I tested, Microsoft SQL Server and MySQL gave me >> that very result, while PostgreSQL and Oracle gave me FALSE. > > This probably has more to do with what these systems think the > data type of an undecorated literal is, than with whether they do > trailing-space-insensitive comparison all the time. I tested not only with string literals, but also comparing table columns of the respective types. I came up with the following table of semantics used for comparisons: | CHAR(n)=CHAR(n) | VARCHAR(n)=VARCHAR(n) | CHAR(n)=VARCHAR(n) | -----------+-----------------+-----------------------+--------------------+ Oracle | PAD SPACE | NO PAD | NO PAD | -----------+-----------------+-----------------------+--------------------+ PostgreSQL | PAD SPACE | NO PAD | PAD SPACE | -----------+-----------------+-----------------------+--------------------+ MySQL | PAD SPACE | PAD SPACE | PAD SPACE | -----------+-----------------+-----------------------+--------------------+ SQL Server | PAD SPACE | PAD SPACE | PAD SPACE | -----------+-----------------+-----------------------+--------------------+ Yours, Laurenz Albe
pgsql-general by date: