Thread: String Comparision Weirdness

String Comparision Weirdness

From
Tobias Brox
Date:
We had major problems after migrating the DB to a more powerful server; we
managed to locate the problem to a type conversion bug in our software.
Never the less, this thing puzzles us a lot:

NBTEST2=# select '-1'>'0';
 ?column?
----------
 t
(1 row)

We've tried this query on several servers with different versions of
postgresql and different versions of glibc - some returns true, others
returns false - and it seems neither to be related to the postgresql version
nor the glibc version.  At all servers we tested, strcmp("-1","0") returned
negative - at some -3 and at others -1, and not related to postgresql.

The correct result above should be false, since ascii('-')=45 while
ascii('0')=48.

Can the character set in use be significant?

--
Notice of Confidentiality: This email is sent unencrypted over the network,
and may be stored on several email servers; it can be read by third parties
as easy as a postcard.  Do not rely on email for confidential information.

Re: String Comparision Weirdness

From
Tobias Brox
Date:
[Tobias Brox - Mon at 12:17:33PM +0200]
> At all servers we tested, strcmp("-1","0") returned
> negative - at some -3 and at others -1, and not related to postgresql.

Ehr, uncorrelated to the result of the evaluation of '-1'>'0' on postgresql,
I mean.

--
Notice of Confidentiality: This email is sent unencrypted over the network,
and may be stored on several email servers; it can be read by third parties
as easy as a postcard.  Do not rely on email for confidential information.

Re: String Comparision Weirdness

From
Stephan Szabo
Date:
On Mon, 26 Sep 2005, Tobias Brox wrote:

> We had major problems after migrating the DB to a more powerful server; we
> managed to locate the problem to a type conversion bug in our software.
> Never the less, this thing puzzles us a lot:
>
> NBTEST2=# select '-1'>'0';
>  ?column?
> ----------
>  t
> (1 row)
>
> We've tried this query on several servers with different versions of
> postgresql and different versions of glibc - some returns true, others
> returns false - and it seems neither to be related to the postgresql version
> nor the glibc version.  At all servers we tested, strcmp("-1","0") returned
> negative - at some -3 and at others -1, and not related to postgresql.
>
> The correct result above should be false, since ascii('-')=45 while
> ascii('0')=48.
>
> Can the character set in use be significant?

It's more likely to be the locale in use.  For example, on my machine,
given a file with -1 and 0.

LANG="C" sort file
-1
0

LANG="en_US" sort file
0
-1

Many locales do a more complicated comparison than ascii values (like
strcmp).  For example, symbols and spaces may only be used as tiebreakers
after effectively comparing the strings without them.

Re: String Comparision Weirdness

From
Tobias Brox
Date:
[Stephan Szabo - Mon at 06:07:48AM -0700]
> It's more likely to be the locale in use.  For example, on my machine,
> given a file with -1 and 0.
(...)
> LANG="en_US" sort file
> 0
> -1

Hah, those Americans don't even know how to count ;-)

I find this weird, but it's clearly not a problem with postgresql at least.
We should obviously check up the locale and stick to "C" on the servers.

Thanks for the effort.

--
Notice of Confidentiality: This email is sent unencrypted over the network,
and may be stored on several email servers; it can be read by third parties
as easy as a postcard.  Do not rely on email for confidential information.