Thread: "ORDER BY" issue - is this a bug?
Greetings, I've been using postgresql for some time and currently am using it in three different environments (specific details below) w/ the first two being 6.5.x releases and the third 7.0.2-2: I've come across what I consider an odd sorting anomaly in 7.0.2-2 where running the following query: select headline from headlines where headline like 'Alb%' order by 1 ; The results in the 7.0.2-2 install come back case-insensitive and oblivious to punctuation. This seems to me to be a bug; case-insensitive ordering can be achieved with the use of UPPER() or LOWER() functions, otherwise why have the functions. Also, I've created three test databases on the 7.0.2-2 system each with a different specified encoding - List of databases Database | Owner | Encoding ------------+----------+----------- headlines | deckard | WIN headlines2 | deckard | SQL_ASCII headlines3 | deckard | LATIN1 to see if the results are different. Each of the encodings produces the same anomalous results. Any help in this regard as well as resolving it would be appreciated. Details of systems and schema are appended below. -- Max Pyziur BRAMA - Gateway Ukraine pyz@brama.com http://www.brama.com/ The Details: On the first two systems the results are as follows: headline ------------------------------------------------------------- Albright Arrives in Ukraine to Boost Reforms Albright Calls on Ukraine to Announce Chernobyl Closure Albright Calls on Ukraine to Speed up Military Reforms Albright Hopeful on Chernobyl Cover Albright Meets Ukraine Officials Albright Reschedules Ukraine Visit Albright Throws Weight Behind Ukraine's Kuchma Albright To Hold Talks in Ukraine Albright plans to show support for reform in visit to Ukraine Albright plans to show support for reform in visit to Ukraine Albright rushes to kyiv ahead of putin Albright says U.S. still backs Kuchma Albright to perform balancing act in Central Asia Albright, in Kiev, Hails Russian START-2 Vote (14 rows) ##################################################################### On the 7.0.2-2 system it is: headline --------------------------------------------------------------- Albright Arrives in Ukraine to Boost Reforms Albright Calls on Ukraine to Announce Chernobyl Closure Albright Calls on Ukraine to Speed up Military Reforms Albright Hopeful on Chernobyl Cover Albright, in Kiev, Hails Russian START-2 Vote Albright Meets Ukraine Officials Albright plans to show support for reform in visit to Ukraine Albright plans to show support for reform in visit to Ukraine Albright Reschedules Ukraine Visit Albright rushes to kyiv ahead of putin Albright says U.S. still backs Kuchma Albright Throws Weight Behind Ukraine's Kuchma Albright To Hold Talks in Ukraine Albright to perform balancing act in Central Asia (14 rows) ##################################################################### 1 - Production - Solaris 2.6 running Postgresql 6.5.2 on a small Sparc 2 compiled from source 2 - Development - RH5.2 Linux 2.0.36 running Postgresql 6.5.3 installed from rpms 3 - Development - RH6.2 Linux 2.2.14-5.0smp running Postgresql 7.0.2-2 installed from stock rpms Database table schema: headlines=> \d headlines Table = headlines +----------------------------------+----------------------------------+-------+ | Field | Type | Length| +----------------------------------+----------------------------------+-------+ | headline_id | int4 not null default nextval ( | 4 | | headline | text | var | | url | text | var | | postdate | date | 4 | | source | text | var | | flags | text | var | | posttime | timestamp | 4 | +----------------------------------+----------------------------------+-------+ Indices: headlines_headline_id_key hl_pdate hl_s_pd hl_srce -- Max Pyziur BRAMA - Gateway Ukraine pyz@brama.com http://www.brama.com/
Max Pyziur <pyz@panix.com> writes: > I've come across what I consider an odd sorting anomaly in 7.0.2-2 where running > the following query: > select headline from headlines where headline like 'Alb%' order by 1 ; > The results in the 7.0.2-2 install come back case-insensitive and > oblivious to punctuation. That's pretty bizarre (not to say difficult to believe). What LOCALE setting are you running the postmaster in? regards, tom lane
Tom Lane wrote: > > Max Pyziur <pyz@panix.com> writes: > > I've come across what I consider an odd sorting anomaly in 7.0.2-2 where running > > the following query: > > select headline from headlines where headline like 'Alb%' order by 1 ; > > > The results in the 7.0.2-2 install come back case-insensitive and > > oblivious to punctuation. > > That's pretty bizarre (not to say difficult to believe). What LOCALE > setting are you running the postmaster in? On none of the installations - the two 6.5.x and the 7.0.2-2 one - I don't have any locale set. I get (what I think are) correct results with the first two. Does 7.0.2-2 require the setting of locale. If so, where is it documented and/or how is it done? > regards, tom lane -- Max Pyziur BRAMA - Gateway Ukraine pyz@brama.com http://www.brama.com/
Max Pyziur <pyz@panix.com> writes: >> That's pretty bizarre (not to say difficult to believe). What LOCALE >> setting are you running the postmaster in? > On none of the installations - the two 6.5.x and the 7.0.2-2 one - I > don't have any locale set. I get (what I think are) correct results > with the first two. > Does 7.0.2-2 require the setting of locale. AFAIK its behavior should be the same as 6.5 for LOCALE issues. That's why I suspect an environment difference. I can assure you there is no code in the backend that will do case-insensitive, punctuation-insensitive comparisons --- much less any to do so without request. I'm betting that either this is your error, or the strcmp() library function is doing it; and as far as I've heard, only LOCALE environment variables might affect the behavior of strcmp(). It also seems possible that no sort is happening at all (which would be a planner bug), and the ordering you're getting is just whatever happens to be in the underlying table. Does EXPLAIN show that the query is being done with an explicit sort? regards, tom lane
Tom Lane wrote: > > Max Pyziur <pyz@panix.com> writes: > >> That's pretty bizarre (not to say difficult to believe). What LOCALE > >> setting are you running the postmaster in? > > > On none of the installations - the two 6.5.x and the 7.0.2-2 one - I > > don't have any locale set. I get (what I think are) correct results > > with the first two. > > > Does 7.0.2-2 require the setting of locale. > > AFAIK its behavior should be the same as 6.5 for LOCALE issues. That's > why I suspect an environment difference. Thanks for your quick replies. I dropped my databases, uninstalled the 7.0.2-2 rpms and installed 6.5.3 rpms on my development RH6.2 Linux 2.2.14-5.0 system and I still get the anomalous query result on that box. > I can assure you there is no code in the backend that will do > case-insensitive, punctuation-insensitive comparisons --- much less any > to do so without request. I'm betting that either this is your error, > or the strcmp() library function is doing it; and as far as I've heard, > only LOCALE environment variables might affect the behavior of strcmp(). There is a $LANG variable which is set to en_US; is this what might be causing the problem? I've tried unsetting it (unset LANG) and still get the problem. Is there something else which I should be looking at? > It also seems possible that no sort is happening at all (which would be > a planner bug), and the ordering you're getting is just whatever happens > to be in the underlying table. Does EXPLAIN show that the query is > being done with an explicit sort? Running EXPLAIN the results are: headlines1=> explain select headline from headlines where headline like 'Alb%' order by 1 ; NOTICE: QUERY PLAN: Sort (cost=221.15 rows=1 width=12) -> Seq Scan on headlines (cost=221.15 rows=1 width=12) EXPLAIN > regards, tom lane Thanks again, Max Pyziur BRAMA - Gateway Ukraine pyz@brama.com http://www.brama.com/
Max Pyziur wrote: > I dropped my databases, uninstalled the 7.0.2-2 rpms and installed 6.5.3 rpms on > my development RH6.2 Linux 2.2.14-5.0 system and I still get the anomalous query > result on that box. This is a RedHat 6.2 locale problem. Set LC_COLLATE=C and see if that fixes things. -- Lamar Owen WGCR Internet Radio 1 Peter 4:11
On Mon, 11 Sep 2000, Lamar Owen wrote: > Max Pyziur wrote: > > I dropped my databases, uninstalled the 7.0.2-2 rpms and installed 6.5.3 rpms on > > my development RH6.2 Linux 2.2.14-5.0 system and I still get the anomalous query > > result on that box. > > This is a RedHat 6.2 locale problem. Set LC_COLLATE=C and see if that > fixes things. That didn't do it. But I'll tell you what did give me the (original) results for which I was looking. Looking through the various *rc files which get called in the stock RH6.2 installation I saw that there was a /etc/profile.d/lang.sh file (for bash shells) and lang.csh (for csh shells). I moved the lang.sh to lang.sh.bak (since the login initialization script looks for /etc/profile.d/*sh files), shutdown postmaster and restarted and now I'm getting the results I got before. The thing which I'm concerned about is doing a 'env | sort' doesn't show any of the LC_xxx variable settings. Nor does 'echo $LC_COLLATE'. What command can I use to "expose" these settings to make sure that I don't run into a similar situation? Much thanks to you and tom lane (in his ferlinghetti way) for helping me out of this one (at least to get the original results I had). > -- > Lamar Owen > WGCR Internet Radio > 1 Peter 4:11 > Max Pyziur BRAMA - Gateway Ukraine pyz@brama.com http://www.brama.com/