Thread: furiously yours

furiously yours

From

"Rony Khoury"

Date:

08 June 2001, 05:20:40

I've been using PostgreSQL for quite some time and I have nothing to say
about it except that it is wonderful.

I recently upgraded to Red Hat Linux 7.1, which had PostgreSQL 7.0.3 on it,
and started using it.

I know I should upgrade to the latest version, but since I was short on time
and I'm not using anything fancy, I thought I could just use the version at
hand until I get the opportunity to upgrade.

Today I noticed (the hard way), after an error came up and caused a big
fuss, that the way this version implements ordering is different from the
previous version I was using.

example
new version:
BOU MOUSLEH
BOUSTANI
BOU YAZBEK

old version
BOU MOUSLEH
BOU YAZBECK
BOUSTANI

It seems that the interpretation of the space differs.

I would like to know if this was solved in the new version. If not,
someone should definitely do something about it.

Once again, thank you to the development team for the inspiring and
wonderful effort you guys are putting in. But I really found this out the
hard way.

thanks,
Rony.

_________________________________________________________________________
Get Your Private, Free E-mail from MSN Hotmail at http://www.hotmail.com.

Re: furiously yours

From

Lamar Owen

Date:

08 June 2001, 12:17:04

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Friday 08 June 2001 05:20, Rony Khoury wrote:
[collation sequencing problem regarding space interpretation in ORDER BY]
> I would like to know if this was solved in the new version? and if not
> someone should definitely do something about it.

This is not a PostgreSQL issue, but a locale issue that was introduced by Red
Hat implementing 'correct' locale collation. PostgreSQL has to rely on the
system's C library for the string comparison/collation routines -- and those
routines are greatly changed by the locale setting.

Red Hat does not view this behavior as buggy.

Trond, I didn't know whether you are subscribed to the BUGS list or not, so I
am CC'ing you on this.  The two collations:

example
new version:
BOU MOUSLEH
BOUSTANI
BOU YAZBEK

old version
BOU MOUSLEH
BOU YAZBECK
BOUSTANI

This appears to me to be wrong, Trond.  Red Hat 7.1 out of the box
installation.

Rony, in the normal usage of these names, what would the correct collation be?
- --
Lamar Owen
WGCR Internet Radio
1 Peter 4:11

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.4 (GNU/Linux)
Comment: For info see http://www.gnupg.org

iD8DBQE7IPTi5kGGI8vV9eERAnesAKC0RxdvdPs3EcFR9MAkapPs5drwbQCfVQuY
xrUfcfOFjEzLdCPsLbV59PM=
=bsSL
-----END PGP SIGNATURE-----

Re: furiously yours

From

Peter Eisentraut

Date:

08 June 2001, 12:30:59

Rony Khoury writes:

> example
> new version:
> BOU MOUSLEH
> BOUSTANI
> BOU YAZBEK
>
> old version
> BOU MOUSLEH
> BOU YAZBECK
> BOUSTANI

Seems like the two installations are running under different locales.
Since you didn't give much information I can only guess that the "old
version" is prior to 7.1 and running in some non-C locale, whereas the
"new version" is 7.1 or later but was initialized in the C locale.  Note
that 7.1 or later always use the locale at initdb time for sort order.

--
Peter Eisentraut   peter_e@gmx.net   http://funkturm.homeip.net/~peter

Re: furiously yours

From

Lamar Owen

Date:

08 June 2001, 14:34:59

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Friday 08 June 2001 12:26, Peter Eisentraut wrote:
> Seems like the two installations are running under different locales.
> Since you didn't give much information I can only guess that the "old
> version" is prior to 7.1 and running in some non-C locale, whereas the
> "new version" is 7.1 or later but was initialized in the C locale.  Note
> that 7.1 or later always use the locale at initdb time for sort order.

He specified the _new_ installation as Red Hat 7.1, with PostgreSQL 7.0.3 --
which has no 'hard-wiring' to a C locale for initdb.

If it were an upgrade, it would almost have to have been from 6.4.2 or
earlier -- although there is a slim chance it was from 6.5.x, unless he had
7.0.x running on a Red Hat older than 7.0, which would almost have to be
either a machine at RH 6.0 or a machine that was upgraded from RH 6.0 to
RH 6.1 or 6.2.

I base that on the fact that the collation change was a Red Hat 6.1 and later
deal -- and Red Hat 6.1 shipped PostgreSQL 6.5.2.  Red Hat 6.0 shipped
PostgreSQL 6.4.2, RH 5.1 and 5.2 shipped PG6.3.2, and RH 5.0 shipped PG
6.2.1.  RH 7.0 shipped PG 7.0.2.  Also, an upgrade from RH 6.0 to either 6.1
or 6.2 doesn't force the i18n issue -- but, IIRC, an upgrade from 6.0 to 7.x
_does_.
- --
Lamar Owen
WGCR Internet Radio
1 Peter 4:11
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.4 (GNU/Linux)
Comment: For info see http://www.gnupg.org

iD8DBQE7IRpA5kGGI8vV9eERApdzAJ9MqqVB5EOhRJpqND/S1esE+Eu+UQCgkzvF
dBDVZHqXqVl1s67xLUXBfS0=
=Dvr4
-----END PGP SIGNATURE-----

Re: furiously yours

From

"Rony Khoury"

Date:

09 June 2001, 17:24:26

Hello again,

If I understood right, PostgreSQL depends on the C language to do the
sorting, and the C language depends on the Red Hat settings for the sorting
task. Before going to Red Hat with this, I took the liberty of running the
following test, and I would like your opinion on it first.

I installed a fresh copy of Red Hat 7.1 at home, which came with PostgreSQL
7.0.3. I then downloaded the RPMs for PostgreSQL 7.1.1 from the Internet and
upgraded to it. After following the proper installation procedure, I tried
the sorting on the new system and got the same results as at work.

I then took the liberty of writing a small program in C to see how the
sorting works and, surprisingly enough, I got the desired results that I
used to get on the old version (i.e. the space is interpreted as < A).

Following is the code I wrote and the results I got. I believe this requires
your comment before I go to Red Hat with it; I still think there might be
some parameters missing somewhere to put things back in order.

Lamar, I checked the parameters on my system and found neither LC_ALL nor
LC_COLL, but I did find LANG=en_US. I do not know what these do, so your
guidance is appreciated in this respect if you think this is related to the
problem.

The C program is:

main () {
  char tparr[10][500];
  char tpstring[500];
  int i,j;

  sprintf(tparr[0],"BOU ASSAF");
  sprintf(tparr[1],"BOUHAIDAR");
  sprintf(tparr[2],"BOU ZAHRA");

  for(i=0;i<3;i++) {
    for (j=0;j<3;j++) {
      if (strcmp(tparr[i],tparr[j]) == -1) {
        sprintf(tpstring,"%s",tparr[i]);
        sprintf(tparr[i],"%s",tparr[j]);
        sprintf(tparr[j],"%s",tpstring);
      }
    }
  }

  for (i=0;i<3;i++) {
    printf("\n---%s",tparr[i]);
  }
  printf("\n");
}


and the result is

---BOU ASSAF
---BOU ZAHRA
---BOUHAIDAR

while PostgreSQL continues to sort this data as

---BOU ASSAF
---BOUHAIDAR
---BOU ZAHRA


Note that I'm getting these results on the same system, which is a new
system built from scratch, so they cannot depend on any previous versions.

Thanks,
Rony.

Re: furiously yours

From

Tom Lane

Date:

09 June 2001, 18:22:01

"Rony Khoury" <rkrk@hotmail.com> writes:
> Lamar I checked the parameters on my system and did not find LC_ALL nor
> LC_COLL, but I found LANG=en_US.

Indeed, you are getting en_US collation order.  Try setting LANG=C and
then redoing initdb.  Or, if you really have no use for non-C locale,
you could recompile Postgres without any locale support at all...

            regards, tom lane

Re: furiously yours

From

Peter Eisentraut

Date:

09 June 2001, 19:08:08

Rony Khoury writes:

>   for(i=0;i<3;i++) {
>     for (j=0;j<3;j++) {
>       if (strcmp(tparr[i],tparr[j]) == -1) {
>         sprintf(tpstring,"%s",tparr[i]);
>         sprintf(tparr[i],"%s",tparr[j]);
>         sprintf(tparr[j],"%s",tpstring);
>       }
>     }
>   }

You need to use strcoll() to get locale-dependent sorting. -- Or not, if
you don't.

--
Peter Eisentraut   peter_e@gmx.net   http://funkturm.homeip.net/~peter

Re: furiously yours

From

Stephan Szabo

Date:

09 June 2001, 19:22:00

That's mostly because your C program is wrong.  If you reorder the strings
you'll see that it's not sorting at all.  It's just giving them back
in the order you gave them.

You should not compare the output from strcmp to -1.  The result is only
guaranteed to be <0, 0, or >0, not -1, 0, or 1.  When I run the program
below and add a printf of the strcmp values, I get values like 40 and -40.
Also, you'd probably want to be using strcoll rather than strcmp to make the
comparison valid.

[As a side note, I believe this means that the results from varstr_cmp
in varlena.c are not guaranteed to be -1/0/1 as the comment purports
if strncmp doesn't return -1/+1.]

Try the following program and switch which setlocale is enabled:
#include <locale.h>
#include <stdio.h>
#include <string.h>

int main (void) {
  char tparr[10][500];
  char tpstring[500];
  int i,j;

  /* Enable exactly one of the two setlocale calls below. */
 /* setlocale(LC_ALL, "en_US"); */
  setlocale(LC_ALL, "C");
  sprintf(tparr[0],"BOU ASSAF");
  sprintf(tparr[1],"BOU ZAHRA");
  sprintf(tparr[2],"BOUHAIDAR");

  /* Swap whenever tparr[i] collates before tparr[j]; once both loops
     have finished, the array is in ascending collation order. */
  for(i=0;i<3;i++) {
    for (j=0;j<3;j++) {
      if (strcoll(tparr[i],tparr[j])<0) {
        sprintf(tpstring,"%s",tparr[i]);
        sprintf(tparr[i],"%s",tparr[j]);
        sprintf(tparr[j],"%s",tpstring);
      }
    }
  }

  for (i=0;i<3;i++) {
    printf("\n---%s",tparr[i]);
  }
  printf("\n");
  return 0;
}

With C you should get:

---BOU ASSAF
---BOU ZAHRA
---BOUHAIDAR

With en_US you should get:

---BOU ASSAF
---BOUHAIDAR
---BOU ZAHRA
