Thread: Order by behaviour

Order by behaviour

From
"Carlos H. Reimer"
Date:
Hi,
 
We have a PostgreSQL 8.0.6 cluster configured with lc_collate=pt_BR.UTF-8 and when we run the following SELECT:
SELECT substr(nomerazao,1,4),
ascii(substr(nomerazao,1,1)),
ascii(substr(nomerazao,2,1))
from spunico.unico order by nomerazao;
 
is returning:
 
 substr | ascii | ascii
--------+-------+-------
        |    32 |     0
        |    32 |     0
 1000   |    49 |    48
 1.DI   |    49 |    46
 1° R   |    49 |   176
 2M C   |    50 |    77
 3A.G   |    51 |    65
 A. A   |    65 |    46
 AABA   |    65 |    65
 A.A.   |    65 |    46
 A.AG   |    65 |    46
 A.A.   |    65 |    46
 A.A.   |    65 |    46
 ABAS   |    65 |    66
 ABAS   |    65 |    66
 ABAT   |    65 |    66
 A.B.   |    65 |    46
 A.B.   |    65 |    46
 ABCC   |    65 |    66
 A.B.   |    65 |    46
 A.B.   |    65 |    46
 
Are not the lines out of order or is it a normal behaviour for a server with lc_collate=pt_BR.UTF-8?
 
Thank you in advance!
 
Reimer
 

Re: Order by behaviour

From
Raymond O'Donnell
Date:
On 28/03/2007 22:52, Carlos H. Reimer wrote:

> SELECT substr(nomerazao,1,4),
> ascii(substr(nomerazao,1,1)),
> ascii(substr(nomerazao,2,1))
> from spunico.unico order by nomerazao;

You need to add aliases to the returned column by which you'd like to
order the result: your query is ordering the rows according to the
column "nomerazao" in the original table, rather than by the substr()
value returned.

Do something like this:

    SELECT substr(nomerazao,1,4) AS my_col,
    ascii(substr(nomerazao,1,1)),
    ascii(substr(nomerazao,2,1))
    from spunico.unico order by my_col;

HTH

Ray.

---------------------------------------------------------------
Raymond O'Donnell, Director of Music, Galway Cathedral, Ireland
rod@iol.ie
---------------------------------------------------------------

Re: Order by behaviour

From
Peter Eisentraut
Date:
Carlos H. Reimer wrote:
> Are not the lines out of order

No.

> or is it a normal behaviour for a
> server with lc_collate=pt_BR.UTF-8?

Yes.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/

Re: Order by behaviour

From
Stephan Szabo
Date:
On Wed, 28 Mar 2007, Carlos H. Reimer wrote:

> Hi,
>
> We have a PostgreSQL 8.0.6 cluster configured with lc_collate=pt_BR.UTF-8
> and when we run the following SELECT:
> SELECT substr(nomerazao,1,4),
> ascii(substr(nomerazao,1,1)),
> ascii(substr(nomerazao,2,1))
> from spunico.unico order by nomerazao;
>
> is returning:
>
>  substr | ascii | ascii
> --------+-------+-------
>         |    32 |     0
>         |    32 |     0
>  1000   |    49 |    48
>  1.DI   |    49 |    46
>  1� R   |    49 |   176
>  2M C   |    50 |    77
>  3A.G   |    51 |    65
>  A. A   |    65 |    46
>  AABA   |    65 |    65
>  A.A.   |    65 |    46
>  A.AG   |    65 |    46
>  A.A.   |    65 |    46
>  A.A.   |    65 |    46
>  ABAS   |    65 |    66
>  ABAS   |    65 |    66
>  ABAT   |    65 |    66
>  A.B.   |    65 |    46
>  A.B.   |    65 |    46
>  ABCC   |    65 |    66
>  A.B.   |    65 |    46
>  A.B.   |    65 |    46
>
> Are not the lines out of order or is it a normal behaviour for a server with
> lc_collate=pt_BR.UTF-8?

Many collations ignore spaces and symbols on the first pass, so, for
example you might have "A Z" > "AB" despite the fact that a space has a
lower value than a B.

RES: Order by behaviour

From
"Carlos H. Reimer"
Date:
Humm, ok, it is clear now.

And is there a way to change something in this behaviour, like not ignore
spaces and some type of symbols?

A configuration file or a patch?

Thanks in advance!


> -----Mensagem original-----
> De: pgsql-general-owner@postgresql.org
> [mailto:pgsql-general-owner@postgresql.org]Em nome de Stephan Szabo
> Enviada em: quarta-feira, 28 de março de 2007 19:23
> Para: Carlos H. Reimer
> Cc: pgsql-general@postgresql.org
> Assunto: Re: [GENERAL] Order by behaviour
>
>
> On Wed, 28 Mar 2007, Carlos H. Reimer wrote:
>
> > Hi,
> >
> > We have a PostgreSQL 8.0.6 cluster configured with
> lc_collate=pt_BR.UTF-8
> > and when we run the following SELECT:
> > SELECT substr(nomerazao,1,4),
> > ascii(substr(nomerazao,1,1)),
> > ascii(substr(nomerazao,2,1))
> > from spunico.unico order by nomerazao;
> >
> > is returning:
> >
> >  substr | ascii | ascii
> > --------+-------+-------
> >         |    32 |     0
> >         |    32 |     0
> >  1000   |    49 |    48
> >  1.DI   |    49 |    46
> >  1° R   |    49 |   176
> >  2M C   |    50 |    77
> >  3A.G   |    51 |    65
> >  A. A   |    65 |    46
> >  AABA   |    65 |    65
> >  A.A.   |    65 |    46
> >  A.AG   |    65 |    46
> >  A.A.   |    65 |    46
> >  A.A.   |    65 |    46
> >  ABAS   |    65 |    66
> >  ABAS   |    65 |    66
> >  ABAT   |    65 |    66
> >  A.B.   |    65 |    46
> >  A.B.   |    65 |    46
> >  ABCC   |    65 |    66
> >  A.B.   |    65 |    46
> >  A.B.   |    65 |    46
> >
> > Are not the lines out of order or is it a normal behaviour for
> a server with
> > lc_collate=pt_BR.UTF-8?
>
> Many collations ignore spaces and symbols on the first pass, so, for
> example you might have "A Z" > "AB" despite the fact that a space has a
> lower value than a B.
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: if posting/reading through Usenet, please send an appropriate
>        subscribe-nomail command to majordomo@postgresql.org so that your
>        message can get through to the mailing list cleanly
>
>


Re: RES: Order by behaviour

From
Richard Huxton
Date:
Carlos H. Reimer wrote:
> Humm, ok, it is clear now.
>
> And is there a way to change something in this behaviour, like not ignore
> spaces and some type of symbols?
>
> A configuration file or a patch?

Pick a different locale?

--
   Richard Huxton
   Archonet Ltd

Re: RES: Order by behaviour

From
Stephan Szabo
Date:
On Thu, 29 Mar 2007, Carlos H. Reimer wrote:

> Humm, ok, it is clear now.
>
> And is there a way to change something in this behaviour, like not ignore
> spaces and some type of symbols?

Well, right now it's generally determined by your OS's definition of the
locale you've chosen. You might be able to pick another locale which has
different behavior if your system supports one or gives you a reasonable
way to create one, or as the last ditch attempt, locale "C" for bytewise
ordering. However, I think you'll need to recreate the database if you
change the locale.


RES: Order by behaviour

From
"Carlos H. Reimer"
Date:
Hi,

I was trying to find the docs about the collating sequence standards but
could not find.

Would like to know for example which characters are ignored by the "order
by" in some of the collating types.

Please, can anyone indicate me where could I find documentation about these
standards?

Thanks in advance!

> -----Mensagem original-----
> De: Stephan Szabo [mailto:sszabo@megazone.bigpanda.com]
> Enviada em: quarta-feira, 28 de março de 2007 19:23
> Para: Carlos H. Reimer
> Cc: pgsql-general@postgresql.org
> Assunto: Re: [GENERAL] Order by behaviour
>
>
> On Wed, 28 Mar 2007, Carlos H. Reimer wrote:
>
> > Hi,
> >
> > We have a PostgreSQL 8.0.6 cluster configured with
> lc_collate=pt_BR.UTF-8
> > and when we run the following SELECT:
> > SELECT substr(nomerazao,1,4),
> > ascii(substr(nomerazao,1,1)),
> > ascii(substr(nomerazao,2,1))
> > from spunico.unico order by nomerazao;
> >
> > is returning:
> >
> >  substr | ascii | ascii
> > --------+-------+-------
> >         |    32 |     0
> >         |    32 |     0
> >  1000   |    49 |    48
> >  1.DI   |    49 |    46
> >  1° R   |    49 |   176
> >  2M C   |    50 |    77
> >  3A.G   |    51 |    65
> >  A. A   |    65 |    46
> >  AABA   |    65 |    65
> >  A.A.   |    65 |    46
> >  A.AG   |    65 |    46
> >  A.A.   |    65 |    46
> >  A.A.   |    65 |    46
> >  ABAS   |    65 |    66
> >  ABAS   |    65 |    66
> >  ABAT   |    65 |    66
> >  A.B.   |    65 |    46
> >  A.B.   |    65 |    46
> >  ABCC   |    65 |    66
> >  A.B.   |    65 |    46
> >  A.B.   |    65 |    46
> >
> > Are not the lines out of order or is it a normal behaviour for
> a server with
> > lc_collate=pt_BR.UTF-8?
>
> Many collations ignore spaces and symbols on the first pass, so, for
> example you might have "A Z" > "AB" despite the fact that a space has a
> lower value than a B.
>
>


Re: RES: Order by behaviour

From
Martijn van Oosterhout
Date:
On Thu, Apr 05, 2007 at 06:46:27PM -0300, Carlos H. Reimer wrote:
> Hi,
>
> I was trying to find the docs about the collating sequence standards but
> could not find.
>
> Would like to know for example which characters are ignored by the "order
> by" in some of the collating types.

There are no standards. The collation used by the postgres is whatever
collation is done by your OS. So you need to check there...

Loosely defined, its sorts the same way as "sort" does. So you can test
by doing things like:

LC_COLLATE=foo sort <bar

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.

Attachment