Thread: "ORDER BY" issue - is this a bug?

"ORDER BY" issue - is this a bug?

From
Max Pyziur
Date:
Greetings,

I've been using postgresql for some time and currently am using it in three
different environments (specific details below) w/ the first two being 6.5.x
releases and the third  7.0.2-2:

I've come across what I consider an odd sorting anomaly in 7.0.2-2 where running
the following query:
select headline from headlines where headline like 'Alb%' order by 1 ;

The results in the 7.0.2-2 install come back case-insensitive and oblivious to
punctuation.  This seems to me to be a bug; case-insensitive ordering can be
achieved with the use of UPPER() or LOWER() functions, otherwise why have the
functions.  Also, I've created three test databases on the 7.0.2-2 system each
with a different specified encoding  -
         List of databases
  Database  |  Owner   | Encoding
------------+----------+-----------
 headlines  | deckard  | WIN
 headlines2 | deckard  | SQL_ASCII
 headlines3 | deckard  | LATIN1

to see if the results are different.  Each of the encodings produces the same
anomalous results.

Any help in this regard as well as resolving it would be appreciated.  Details
of systems and schema are appended below.

--
Max Pyziur                                     BRAMA - Gateway Ukraine
pyz@brama.com                                  http://www.brama.com/


The Details:

On the first two systems the results are as follows:
headline
-------------------------------------------------------------
Albright Arrives in Ukraine to Boost Reforms
Albright Calls on Ukraine to Announce Chernobyl Closure
Albright Calls on Ukraine to Speed up Military Reforms
Albright Hopeful on Chernobyl Cover
Albright Meets Ukraine Officials
Albright Reschedules Ukraine Visit
Albright Throws Weight Behind Ukraine's Kuchma
Albright To Hold Talks in Ukraine
Albright plans to show support for reform in visit to Ukraine
Albright plans to show support for reform in visit to Ukraine
Albright rushes to kyiv ahead of putin
Albright says U.S. still backs Kuchma
Albright to perform balancing act in Central Asia
Albright, in Kiev, Hails Russian START-2 Vote
(14 rows)
#####################################################################

On the 7.0.2-2 system it is:
                           headline
---------------------------------------------------------------
 Albright Arrives in Ukraine to Boost Reforms
 Albright Calls on Ukraine to Announce Chernobyl Closure
 Albright Calls on Ukraine to Speed up Military Reforms
 Albright Hopeful on Chernobyl Cover
 Albright, in Kiev, Hails Russian START-2 Vote
 Albright Meets Ukraine Officials
 Albright plans to show support for reform in visit to Ukraine
 Albright plans to show support for reform in visit to Ukraine
 Albright Reschedules Ukraine Visit
 Albright rushes to kyiv ahead of putin
 Albright says U.S. still backs Kuchma
 Albright Throws Weight Behind Ukraine's Kuchma
 Albright To Hold Talks in Ukraine
 Albright to perform balancing act in Central Asia
(14 rows)
#####################################################################



1 - Production - Solaris 2.6 running Postgresql 6.5.2 on a small Sparc 2
compiled from source
2 - Development - RH5.2 Linux 2.0.36 running Postgresql 6.5.3 installed from
rpms
3 - Development - RH6.2 Linux 2.2.14-5.0smp running Postgresql 7.0.2-2 installed
from stock rpms

Database table schema:
headlines=> \d headlines
Table    = headlines
+----------------------------------+----------------------------------+-------+
|              Field               |              Type                | Length|
+----------------------------------+----------------------------------+-------+
| headline_id                      | int4 not null default nextval (  |     4 |
| headline                         | text                             |   var |
| url                              | text                             |   var |
| postdate                         | date                             |     4 |
| source                           | text                             |   var |
| flags                            | text                             |   var |
| posttime                         | timestamp                        |     4 |
+----------------------------------+----------------------------------+-------+
Indices:  headlines_headline_id_key
          hl_pdate
          hl_s_pd
          hl_srce



--
Max Pyziur                                     BRAMA - Gateway Ukraine
pyz@brama.com                                  http://www.brama.com/

Re: [BUGS] "ORDER BY" issue - is this a bug?

From
Tom Lane
Date:
Max Pyziur <pyz@panix.com> writes:
> I've come across what I consider an odd sorting anomaly in 7.0.2-2 where running
> the following query:
> select headline from headlines where headline like 'Alb%' order by 1 ;

> The results in the 7.0.2-2 install come back case-insensitive and
> oblivious to punctuation.

That's pretty bizarre (not to say difficult to believe).  What LOCALE
setting are you running the postmaster in?

            regards, tom lane

Re: [BUGS] "ORDER BY" issue - is this a bug?

From
Max Pyziur
Date:
Tom Lane wrote:
>
> Max Pyziur <pyz@panix.com> writes:
> > I've come across what I consider an odd sorting anomaly in 7.0.2-2 where running
> > the following query:
> > select headline from headlines where headline like 'Alb%' order by 1 ;
>
> > The results in the 7.0.2-2 install come back case-insensitive and
> > oblivious to punctuation.
>
> That's pretty bizarre (not to say difficult to believe).  What LOCALE
> setting are you running the postmaster in?

On none of the installations - the two 6.5.x and the 7.0.2-2 one - I don't have
any locale set.  I get (what I think are) correct results with the first two.

Does 7.0.2-2 require the setting of locale.  If so, where is it documented
and/or how is it done?

>                         regards, tom lane

--
Max Pyziur                                     BRAMA - Gateway Ukraine
pyz@brama.com                                  http://www.brama.com/

Re: [BUGS] "ORDER BY" issue - is this a bug?

From
Tom Lane
Date:
Max Pyziur <pyz@panix.com> writes:
>> That's pretty bizarre (not to say difficult to believe).  What LOCALE
>> setting are you running the postmaster in?

> On none of the installations - the two 6.5.x and the 7.0.2-2 one - I
> don't have any locale set.  I get (what I think are) correct results
> with the first two.

> Does 7.0.2-2 require the setting of locale.

AFAIK its behavior should be the same as 6.5 for LOCALE issues.  That's
why I suspect an environment difference.

I can assure you there is no code in the backend that will do
case-insensitive, punctuation-insensitive comparisons --- much less any
to do so without request.  I'm betting that either this is your error,
or the strcmp() library function is doing it; and as far as I've heard,
only LOCALE environment variables might affect the behavior of strcmp().

It also seems possible that no sort is happening at all (which would be
a planner bug), and the ordering you're getting is just whatever happens
to be in the underlying table.  Does EXPLAIN show that the query is
being done with an explicit sort?

            regards, tom lane

Re: [SQL] Re: [BUGS] "ORDER BY" issue - is this a bug?

From
Max Pyziur
Date:
Tom Lane wrote:
>
> Max Pyziur <pyz@panix.com> writes:
> >> That's pretty bizarre (not to say difficult to believe).  What LOCALE
> >> setting are you running the postmaster in?
>
> > On none of the installations - the two 6.5.x and the 7.0.2-2 one - I
> > don't have any locale set.  I get (what I think are) correct results
> > with the first two.
>
> > Does 7.0.2-2 require the setting of locale.
>
> AFAIK its behavior should be the same as 6.5 for LOCALE issues.  That's
> why I suspect an environment difference.

Thanks for your quick replies.

I dropped my databases, uninstalled the 7.0.2-2 rpms and installed 6.5.3 rpms on
my development RH6.2 Linux 2.2.14-5.0 system and I still get the anomalous query
result on that box.

> I can assure you there is no code in the backend that will do
> case-insensitive, punctuation-insensitive comparisons --- much less any
> to do so without request.  I'm betting that either this is your error,
> or the strcmp() library function is doing it; and as far as I've heard,
> only LOCALE environment variables might affect the behavior of strcmp().

There is a $LANG variable which is set to en_US; is this what might be causing
the problem?  I've tried unsetting it (unset LANG) and still get the problem.
Is there something else which I should be looking at?

> It also seems possible that no sort is happening at all (which would be
> a planner bug), and the ordering you're getting is just whatever happens
> to be in the underlying table.  Does EXPLAIN show that the query is
> being done with an explicit sort?

Running EXPLAIN the results are:
headlines1=> explain select headline from headlines where headline like 'Alb%'
order by 1 ;
NOTICE:  QUERY PLAN:

Sort  (cost=221.15 rows=1 width=12)
  ->  Seq Scan on headlines  (cost=221.15 rows=1 width=12)

EXPLAIN



>                         regards, tom lane


Thanks again,

Max Pyziur                                     BRAMA - Gateway Ukraine
pyz@brama.com                                  http://www.brama.com/

Re: [SQL] Re: [BUGS] "ORDER BY" issue - is this a bug?

From
Lamar Owen
Date:
Max Pyziur wrote:
> I dropped my databases, uninstalled the 7.0.2-2 rpms and installed 6.5.3 rpms on
> my development RH6.2 Linux 2.2.14-5.0 system and I still get the anomalous query
> result on that box.

This is a RedHat 6.2 locale problem.  Set LC_COLLATE=C and see if that
fixes things.

--
Lamar Owen
WGCR Internet Radio
1 Peter 4:11

Re: [SQL] Re: [BUGS] "ORDER BY" issue - is this a bug?

From
Max Pyziur
Date:

On Mon, 11 Sep 2000, Lamar Owen wrote:

> Max Pyziur wrote:
> > I dropped my databases, uninstalled the 7.0.2-2 rpms and installed 6.5.3 rpms on
> > my development RH6.2 Linux 2.2.14-5.0 system and I still get the anomalous query
> > result on that box.
>
> This is a RedHat 6.2 locale problem.  Set LC_COLLATE=C and see if that
> fixes things.

That didn't do it.  But I'll tell you what did give me the
(original) results for which I was looking.  Looking through the various
*rc files which get called in the stock RH6.2 installation I saw that
there was a /etc/profile.d/lang.sh file (for bash shells) and lang.csh
(for csh shells).  I moved the lang.sh to lang.sh.bak (since the login
initialization script looks for /etc/profile.d/*sh files), shutdown
postmaster and restarted and now I'm getting the results I got before.

The thing which I'm concerned about is doing a 'env | sort' doesn't show
any of the LC_xxx variable settings.  Nor does 'echo $LC_COLLATE'.  What
command can I use to "expose" these settings to make sure that I don't
run into a similar situation?

Much thanks to you and tom lane (in his ferlinghetti way) for helping me
out of this one (at least to get the original results I had).

> --
> Lamar Owen
> WGCR Internet Radio
> 1 Peter 4:11
>


Max Pyziur                                     BRAMA - Gateway Ukraine
pyz@brama.com                                  http://www.brama.com/