Encoding problem with 7.4 - Mailing list pgsql-hackers

From E.Rodichev
Subject Encoding problem with 7.4
Date
Msg-id Pine.GSO.4.58.0311272251560.12305@ra.sai.msu.su
Responses Re: Encoding problem with 7.4  (Peter Eisentraut <peter_e@gmx.net>)
Re: Encoding problem with 7.4  (Jean-Michel POURE <jm@poure.com>)
Re: Encoding problem with 7.4  (Christopher Kings-Lynne <chriskl@familyhealth.com.au>)
Re: Encoding problem with 7.4  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Hi,

I just noticed some incorrect behaviour in postgresql-7.4 related
to locale.

After installing 7.4 I created a database completely from scratch
with a Cyrillic locale:

su postgres
export LC_CTYPE=ru_RU.KOI8-R
export LC_COLLATE=ru_RU.KOI8-R
/usr/local/pgsql/bin/initdb -D /db2/pgdata
/usr/local/pgsql/bin/createuser -d er

Then I switched back to my normal account. At this point I have:

/e:1>psql -l
        List of databases
   Name    |  Owner   | Encoding
-----------+----------+-----------
 template0 | postgres | SQL_ASCII
 template1 | postgres | SQL_ASCII
(2 rows)


Then I created a new db:

/e:2>createdb test
CREATE DATABASE
/e:3>psql -l
        List of databases
   Name    |  Owner   | Encoding
-----------+----------+-----------
 template0 | postgres | SQL_ASCII
 template1 | postgres | SQL_ASCII
 test      | er       | SQL_ASCII   <----- Incorrect!
(3 rows)

Note that the last line is in fact completely incorrect.
DB test is really in ru_RU.KOI8-R, not ASCII. I can create tables
with ASCII characters and with non-ASCII (Cyrillic) ones as well,
and ORDER BY, SELECT upper(), etc. work in the ru_RU.KOI8-R locale.

After the first initdb, the behaviour is no longer affected by my
LC_CTYPE and LC_COLLATE settings. I may set

export LC_CTYPE=ru_RU.KOI8-R
export LC_COLLATE=ru_RU.KOI8-R

or

export LC_CTYPE=C
export LC_COLLATE=C

but ORDER BY and SELECT upper() still work in the Cyrillic locale.


As far as I can see, there are two points here:

1. Reporting the Encoding as SQL_ASCII is incorrect: all dbs are in KOI8,
not in SQL_ASCII;

2. More generally, this kind of fixed locale behaviour is not very
convenient. A more natural approach would be for the user to get
the db encoding specified at the moment createdb is issued.
That way it would be possible to have different databases with
different encodings.
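
To make point 1 easy to check from a script, here is a minimal sketch of a filter that pulls the Encoding column for one database out of a `psql -l` listing. It assumes the three-column layout (Name | Owner | Encoding) shown in the listings earlier in this message; `db_encoding` is a hypothetical helper name, not part of PostgreSQL. The commented usage lines also assume that createdb's -E option accepts KOI8 as an encoding name in this build.

```shell
# db_encoding NAME: read a `psql -l` listing on stdin and print the
# Encoding column for database NAME (hypothetical helper, for illustration).
db_encoding() {
    awk -F'|' -v db="$1" '
        NF >= 3 {
            name = $1; enc = $3
            # Trim surrounding blanks from both fields.
            gsub(/^[ \t]+|[ \t]+$/, "", name)
            gsub(/^[ \t]+|[ \t]+$/, "", enc)
            if (name == db) { print enc; exit }
        }'
}

# Usage against a live server (not run here):
#   createdb -E KOI8 test      # assumes KOI8 is a valid encoding name here
#   psql -l | db_encoding test
```

The filter prints nothing if the database is not in the listing, so an empty result can be treated as "not found".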

Best regards,  Evgeny Rodichev


_________________________________________________________________________
Evgeny Rodichev                          Sternberg Astronomical Institute
email: er@sai.msu.su                              Moscow State University
Phone: 007 (095) 939 2383
Fax:   007 (095) 932 8841                       http://www.sai.msu.su/~er

