Thread: RES: Issue on Varchar Ordering

RES: Issue on Varchar Ordering

From
Erika Terahata Torres Prada e Silva - MPS
Date:
Peter,

Thanks for answering.

But it's not correct to my language pt_BR and it worked correctly with
PostgreSQL at Windows and FreeBSD. I have this problem only with Debian and
now I'm looking for some Debian related issue.



-----Mensagem original-----
De: Peter Eisentraut [mailto:peter_e@gmx.net]
Enviada em: quinta-feira, 17 de agosto de 2006 10:04
Para: pgsql-admin@postgresql.org
Cc: Erika Terahata Torres Prada e Silva - MPS
Assunto: Re: [ADMIN] Issue on Varchar Ordering

Erika Terahata Torres Prada e Silva - MPS wrote:
> I'm having the following issue:
>
> create table aluno (id serial, nome varchar(60));
>
> rodrigo=# select * from aluno order by nome offset 35 limit 5;
>  id |               nome
> ----+-----------------------------------
>  36 | GABRIELA HELEDA DE SOUZA
>  37 | GABRIELA JACOBY NOS
>  38 | GABRIEL ALCIDES KLIM PERONDI
>  39 | GABRIELA LETICIA BATISTA NUNES
>  40 | GABRIEL ALEXANDRE DA SILVA MANICA
> (5 registros)
>
> The problem is that records with id 38 and 40 might come before the
> others. It seems that the spacing isn't considered on ordering.

That is, for practical purposes, correct.

May I suggest that you properly normalize your tables, by putting given
name and family name into separate columns.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/

Re: RES: Issue on Varchar Ordering

From
"Paul B. Anderson"
Date:
I just experienced this.  I changed the collating sequence to POSIX and sorting worked as I had expected.

To change the collating sequence you will have to backup the database, issue initdb with the --lc-collate=POSIX and then restore the data.  In my case, the encoding is still UTF8.

Paul


Erika Terahata Torres Prada e Silva - MPS wrote:
Peter, 

Thanks for answering.

But it's not correct to my language pt_BR and it worked correctly with
PostgreSQL at Windows and FreeBSD. I have this problem only with Debian and
now I'm looking for some Debian related issue.



-----Mensagem original-----
De: Peter Eisentraut [mailto:peter_e@gmx.net] 
Enviada em: quinta-feira, 17 de agosto de 2006 10:04
Para: pgsql-admin@postgresql.org
Cc: Erika Terahata Torres Prada e Silva - MPS
Assunto: Re: [ADMIN] Issue on Varchar Ordering

Erika Terahata Torres Prada e Silva - MPS wrote: 
I'm having the following issue:

create table aluno (id serial, nome varchar(60));

rodrigo=# select * from aluno order by nome offset 35 limit 5;id |               nome
----+-----------------------------------36 | GABRIELA HELEDA DE SOUZA37 | GABRIELA JACOBY NOS38 | GABRIEL ALCIDES KLIM PERONDI39 | GABRIELA LETICIA BATISTA NUNES40 | GABRIEL ALEXANDRE DA SILVA MANICA
(5 registros)

The problem is that records with id 38 and 40 might come before the
others. It seems that the spacing isn't considered on ordering.   
That is, for practical purposes, correct.

May I suggest that you properly normalize your tables, by putting given 
name and family name into separate columns.
 

Re: RES: Issue on Varchar Ordering

From
Alvaro Herrera
Date:
Erika Terahata Torres Prada e Silva - MPS wrote:
> Peter,
>
> Thanks for answering.
>
> But it's not correct to my language pt_BR and it worked correctly with
> PostgreSQL at Windows and FreeBSD. I have this problem only with Debian and
> now I'm looking for some Debian related issue.

It's still correct -- it's the ordering you'll find on the telephone
guide.  It works the same for portuguese and for spanish and a lot
others.  The problem with going to C locale (POSIX), as suggested in a
related follow-up, is that it sorts by byte value, which is not at all
correct when you start sorting stuff that contains funny letters
(accented vowels, etc).  It works for english people just because they
don't worry about accents at all.  I wonder if portuguese has special
sorting rules for "nh", for example?  You'll lose that if you just
change to the C locale.

I think you _can_ define a collation on which spaces are significant,
but it's also a significant amount of work.  (And IMHO not at all worth
the trouble.)

--
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Re: RES: Issue on Varchar Ordering

From
Peter Eisentraut
Date:
Erika Terahata Torres Prada e Silva - MPS wrote:
> But it's not correct to my language pt_BR and it worked correctly
> with PostgreSQL at Windows and FreeBSD. I have this problem only with
> Debian and now I'm looking for some Debian related issue.

String ordering is determined by national and international standards.
In your case, glibc seems to think that ISO 14651 applies.  If you
think differently, complain to glibc, or work on changing the standard,
or define your own locale.  I wouldn't know why Windows and FreeBSD
operate differently (I have some idea, but I won't spread rumours).
But unless they have a more substantial reason than "We thought this
looks reasonable", I'd rather go with the public standard.

For that matter, you have not told what you consider the "correct"
ordering in your example, nor have you proposed a computation rule to
get there.  But in the overall scheme of things, first normal form will
cooperate better with established sorting rules (for any data type),
and vice versa.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/