Thread: More on character encoding in SELECTs

More on character encoding in SELECTs

From: Rolf Johansson
I'm running PostgreSQL 7.0.3 on Red Hat 7.0 (English).
In the database I have several lines of data beginning with "Å", "Ä" or "Ö"
(Swedish special characters).

When I do a SELECT xx FROM xx WHERE xx >= 'Å' in psql, it returns lines where
xx is starting with "A". With "Ö", it returns lines where xx starts with "O".

I've tried to set LC_ALL to sv_SE or swedish, nothing happens.
Is the rpm for Postgres 7.0.3 compiled with locale support? Does PostgreSQL
care at all? What can I do to make it right?

Database encoding is LATIN1.

It works fine on another system; the only difference is that the other
system runs the Swedish edition of Red Hat 7.

/Rolf

Re: More on character encoding in SELECTs

From: Tom Lane
Rolf Johansson <rojo@nocrew.org> writes:
> When I do a SELECT xx FROM xx WHERE xx >= 'Å' in psql, it returns lines where
> xx is starting with "A". With "Ö", it returns lines where xx starts with "O".

> I've tried to set LC_ALL to sv_SE or swedish, nothing happens.

Don't forget that the controlling value of LC_ALL is the one in the
postmaster's environment, not the psql client's.
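
To illustrate why the server-side collation setting decides the outcome of
the query above, here is a minimal Python sketch. It does not use real locale
data: swedish_key and folding_key are hypothetical, hand-written stand-ins for
a Swedish collation (Å, Ä, Ö sort after Z) and for a collation that treats
those letters as variants of A and O. The sample rows are made up.

```python
# Simplified stand-ins for two collation rules; real locale tables are richer.
SWEDISH_ALPHABET = "abcdefghijklmnopqrstuvwxyzåäö"
FOLD = str.maketrans("åäö", "aao")


def swedish_key(s):
    # Swedish-like rule: å, ä, ö are separate letters placed after z.
    return [SWEDISH_ALPHABET.index(ch) for ch in s.lower()]


def folding_key(s):
    # Accent-folding rule: å/ä compare as a, ö compares as o.
    # The unfolded string is kept as a tie-breaker.
    low = s.lower()
    return (low.translate(FOLD), low)


def select_ge(rows, bound, key):
    # Mimics SELECT xx FROM xx WHERE xx >= bound under the given rule.
    return [r for r in rows if key(r) >= key(bound)]


rows = ["Adam", "Bertil", "Olof", "Åke", "Ärlig", "Örjan"]

swedish_result = select_ge(rows, "Å", swedish_key)
folding_result = select_ge(rows, "Å", folding_key)

print("Swedish-like rule:", swedish_result)
print("Folding rule:     ", folding_result)
```

Under the Swedish-like rule only the Å, Ä and Ö rows qualify; under the
folding rule "Adam" and everything after it qualify as well, which resembles
the behaviour reported above. Which rule applies is decided where the
comparison runs, that is, in the server process.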

Also, if you restart the postmaster with a new LC_ALL setting, you will
have to drop and rebuild indexes on text columns that contain non-ASCII
characters, since they'll be out of order according to the new collation
rule.

It's possible that the misbehavior you are seeing comes from indexes
becoming corrupted because they've been run under different collation
rules at different times.  You need to be careful always to start the
postmaster with the same LC_ value(s).  (7.1 will enforce this by saving
LC_COLLATE at initdb time, but in current releases you have to be
careful.)
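
The effect of switching collation rules under an existing index can be
sketched in Python as follows. This is only an analogy, not PostgreSQL's btree
code: a sorted list plays the role of the index, bisect plays the role of the
index scan, and swedish_key / folding_key are hypothetical, simplified
collation rules invented for the illustration.

```python
import bisect

# Simplified stand-ins for two collation rules; real locale tables are richer.
SWEDISH_ALPHABET = "abcdefghijklmnopqrstuvwxyzåäö"
FOLD = str.maketrans("åäö", "aao")


def swedish_key(s):
    # Swedish-like rule: å, ä, ö are separate letters placed after z.
    return [SWEDISH_ALPHABET.index(ch) for ch in s.lower()]


def folding_key(s):
    # Accent-folding rule: å/ä compare as a, ö compares as o.
    low = s.lower()
    return (low.translate(FOLD), low)


def index_lookup(index, value, key):
    # Binary search that trusts `index` to be sorted according to `key`.
    keys = [key(entry) for entry in index]
    pos = bisect.bisect_left(keys, key(value))
    return pos < len(index) and index[pos] == value


rows = ["Adam", "Bertil", "Olof", "Zlatan", "Åke", "Örjan"]

# "Index" built while the folding rule was in effect.
index = sorted(rows, key=folding_key)

found_same_rule = index_lookup(index, "Åke", folding_key)
# Same index, but comparisons now follow the Swedish-like rule.
found_other_rule = index_lookup(index, "Åke", swedish_key)

# Rebuilding the "index" under the new rule restores correct lookups.
rebuilt = sorted(rows, key=swedish_key)
found_after_rebuild = index_lookup(rebuilt, "Åke", swedish_key)

print(found_same_rule, found_other_rule, found_after_rebuild)
```

The row "Åke" is present the whole time, yet the lookup misses it once the
comparison rule no longer matches the order the list was built in. Dropping
and rebuilding the index corresponds to the re-sort in the last step.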

> Is the rpm for Postgres 7.0.3 compiled with locale support?

It should be...

            regards, tom lane