Thread: String Comparision Weirdness
We had major problems after migrating the DB to a more powerful server; we managed to locate the problem to a type conversion bug in our software. Never the less, this thing puzzles us a lot: NBTEST2=# select '-1'>'0'; ?column? ---------- t (1 row) We've tried this query on several servers with different versions of postgresql and different versions of glibc - some returns true, others returns false - and it seems neither to be related to the postgresql version nor the glibc version. At all servers we tested, strcmp("-1","0") returned negative - at some -3 and at others -1, and not related to postgresql. The correct result above should be false, since ascii('-')=45 while ascii('0')=48. Can the character set in use be significant? -- Notice of Confidentiality: This email is sent unencrypted over the network, and may be stored on several email servers; it can be read by third parties as easy as a postcard. Do not rely on email for confidential information.
[Tobias Brox - Mon at 12:17:33PM +0200] > At all servers we tested, strcmp("-1","0") returned > negative - at some -3 and at others -1, and not related to postgresql. Ehr, uncorrelated to the result of the evaluation of '-1'>'0' on postgresql, I mean. -- Notice of Confidentiality: This email is sent unencrypted over the network, and may be stored on several email servers; it can be read by third parties as easy as a postcard. Do not rely on email for confidential information.
On Mon, 26 Sep 2005, Tobias Brox wrote: > We had major problems after migrating the DB to a more powerful server; we > managed to locate the problem to a type conversion bug in our software. > Never the less, this thing puzzles us a lot: > > NBTEST2=# select '-1'>'0'; > ?column? > ---------- > t > (1 row) > > We've tried this query on several servers with different versions of > postgresql and different versions of glibc - some returns true, others > returns false - and it seems neither to be related to the postgresql version > nor the glibc version. At all servers we tested, strcmp("-1","0") returned > negative - at some -3 and at others -1, and not related to postgresql. > > The correct result above should be false, since ascii('-')=45 while > ascii('0')=48. > > Can the character set in use be significant? It's more likely to be the locale in use. For example, on my machine, given a file with -1 and 0. LANG="C" sort file -1 0 LANG="en_US" sort file 0 -1 Many locales do a more complicated comparison than ascii values (like strcmp). For example, symbols and spaces may only be used as tiebreakers after effectively comparing the strings without them.
[Stephan Szabo - Mon at 06:07:48AM -0700] > It's more likely to be the locale in use. For example, on my machine, > given a file with -1 and 0. (...) > LANG="en_US" sort file > 0 > -1 Hah, those Americans don't even know how to count ;-) I find this weird, but it's clearly not a problem with postgresql at least. We should obviously check up the locale and stick to "C" on the servers. Thanks for the effort. -- Notice of Confidentiality: This email is sent unencrypted over the network, and may be stored on several email servers; it can be read by third parties as easy as a postcard. Do not rely on email for confidential information.