Re: Encoding problem with 7.4 - Mailing list pgsql-hackers

From Stephan Szabo
Subject Re: Encoding problem with 7.4
Date
Msg-id 20031203162509.F43160@megazone.bigpanda.com
In response to Re: Encoding problem with 7.4  ("E.Rodichev" <er@sai.msu.su>)
Responses Re: Encoding problem with 7.4  ("E.Rodichev" <er@sai.msu.su>)
List pgsql-hackers
On Thu, 4 Dec 2003, E.Rodichev wrote:

> On Wed, 3 Dec 2003, Stephan Szabo wrote:
>
> > Only the locale settings at initdb time matter.  Changing the LC_* later
> > is not going to change what the database does.  Encoding and locale are
> > separate (but related) and it is your responsibility to make sure the
> > choices are consistent. If you do not specify an encoding, SQL_ASCII is
> > used for the encoding. If the characters happen to line up appropriately
> > for what your ru_RU.KOI8-R locale expects it'll even happen to appear to
> > work for sorting and case changes (and things like isprint). Which part of
> > this are you not understanding?
>
>
> Thank you, it is much more consistent answer. But again, the things are
> going not exactly the way you wrote.
>
> From your opinion the chain is
>
> data -> encoding transform -> locale transform -> output
>
> It looks clean and reasonable.
>
> Encoding transform may be set during initdb or createdb (is it true?)
>
> But when locale transform is defined? In general unix flavor it should
> depend on LC_* setting (is it true?)
>
> As I described in my first posting the situation is different. Namely,
> locale setting now defines _encoding transform_ (and data representation
> in storage), but _locale transform_ doesnt depend on LC_*.

The locale settings depend on LC_* at initdb time only. When the
postmaster starts it sets the locale based on the stored values from
initdb, not on the current environment.

With an SQL_ASCII database accessed from a client whose client_encoding is
SQL_ASCII (which it should be if you aren't setting it), the byte values of
a string are passed along with no encoding conversion.  This means that
within one environment you should get back what you put in, so the data
might *look* like KOI8-R if that's what you're working in.  But it isn't
really stored as KOI8-R: someone accessing it from, say, an ISO8859-1
system may see something different.
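
As a rough illustration of the byte pass-through described above, here is
a small Python sketch. It does not talk to PostgreSQL at all; it only
models "store the bytes unchanged, let each client interpret them". The
sample word and the variable names are assumptions made up for the demo.

```python
# Illustration only: one stored byte string, read under two client assumptions.
# The sample word is an assumption for this demo, not taken from the thread.
text = "привет"

# A client working in KOI8-R sends these bytes. With SQL_ASCII on both
# sides no conversion happens, so this is what ends up being stored.
stored = text.encode("koi8_r")

# Reading back from the same KOI8-R environment returns what was put in.
same_env = stored.decode("koi8_r")

# A client that assumes ISO8859-1 gets the identical bytes, but interprets
# them as different characters.
other_env = stored.decode("iso8859_1")

print(stored.hex())           # the raw bytes, identical for both clients
print(same_env == text)       # round trip in one environment
print(other_env == text)      # same bytes, different interpretation
```

The bytes never change; only the reader's assumption about them does,
which is why the data can appear to be KOI8-R without the database
knowing anything about that encoding.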

