Re: BUG #4999: select 'a' < 'A' is true, but should be false . . . - Mailing list pgsql-bugs

From Brian Ceccarelli
Subject Re: BUG #4999: select 'a' < 'A' is true, but should be false . . .
Date
Msg-id 000901ca25ed$ebf09590$c3d1c0b0$@com
Whole thread Raw
In response to BUG #4999: select 'a' < 'A' is true, but should be false . . .  ("Brian Ceccarelli" <ceccareb@talussoftware.com>)
Responses Re: BUG #4999: select 'a' < 'A' is true, but should be false . . .
Re: BUG #4999: select 'a' < 'A' is true, but should be false . . .
Re: BUG #4999: select 'a' < 'A' is true, but should be false . . .
List pgsql-bugs
Actually Greg . . .

     After a long and careful study of this, this is a Postgres bug.
Postgres is boogering up the sort.   The operators < >, and the "order by"
in the select statement are both case-sensitive and case-insensitive.    It
is inconsistent.   It does not matter what my collating sequence is.    This
is the behavior I am seeing:

If the words are the same words but letters have different case, then the
operator is case-sensitive.
If the words are not the same words, then the operator is case-insensitive
until the operator reaches the character position in both strings where the
letters become different.

It is a combination of both case-sensitive and case-insenstive.   Completely
bogus.    It does not matter what computer I am using.    I get the same
bogus behavior on all my operating systems and Postgres versions.



Brian Ceccarelli
Talus Software
4605 Woodmill Run
Apex, NC  27539







-----Original Message-----
From: gsstark@gmail.com [mailto:gsstark@gmail.com] On Behalf Of Greg Stark
Sent: Thursday, August 20, 2009 5:50 PM
To: Brian Ceccarelli
Subject: Re: BUG #4999: select 'a' < 'A' is true, but should be false . . .

On Thu, Aug 20, 2009 at 9:24 PM, Brian
Ceccarelli<ceccareb@talussoftware.com> wrote:

> The operator equals is case sensitive.
> The operator < and > are case-insensitive.
> This is not consistent.
>
> Most of the time, operator < and > are case-insenstive, until you compare
> the upper and lower cases of the same letter.

This is all down to your collation locale. en_US and most other non-C
locales sort case insensitively. You'll find the Unix command "sort"
and other utilities behave similarly. You can set your lc_collate to C
if you want strict ascii binary byte order. In 8.3 and previous this
was a cluster-wide setting but in 8.4 you can set it per-database.

If you can find any cases that are actually inconsistent (the above
examples are not -- you need to find three values where a < b < c but
a >= c) then this is a serious problem and you should not use that
locale with postgres or it will result in corrupt indexes.
--
greg
http://mit.edu/~gsstark/resume.pdf

pgsql-bugs by date:

Previous
From: "WANGRUNGVICHAISRI, SHIVESH"
Date:
Subject: Re: BUG #4996: postgres.exe memory consumption keeps going up
Next
From: Tom Lane
Date:
Subject: Re: BUG #4999: select 'a' < 'A' is true, but should be false . . .