Re: utf8 issue - Mailing list pgsql-general

From Dean Gibson (DB Administrator)
Subject Re: utf8 issue
Date
Msg-id 47C48AEC.5020809@ultimeth.com
In response to Re: utf8 issue  (Tom Hart <tomhart@coopfed.org>)
List pgsql-general
On 2008-02-26 13:04, Tom Hart wrote:
>>
> I already have a php script that does some data scrubbing before the
> copy. I added this line to the script and things seem to be working
> better now
>
> $line = iconv("ISO-8859-1", "UTF-8", $line);
>
> Thanks for the help guys :-)
>

Read up on the difference between PostgreSQL's server_encoding and
client_encoding.

The "server_encoding" is how the data is stored in the database.  It is
fixed when the database is created, and can be any encoding that will
hold your character set (UTF8, LATIN1, which is PostgreSQL's name for
ISO-8859-1, or whatever else fits).

The "client_encoding" is the encoding PostgreSQL assumes for data coming
in from the client and uses for data going back out to it.  When it
differs from the "server_encoding", PostgreSQL does the necessary
conversion for you.

You can set/change the "client_encoding" in several ways, which gives
you a lot of flexibility.  In order of increasing priority:

1. You can set it as the default for any database (see ALTER DATABASE
... SET client_encoding ...).
2. You can set it in the PGCLIENTENCODING environment variable, which
means the client utilities (and I believe the libraries) use that.
3. In psql, you can set it with the "\encoding" meta-command (which
applies to the session or until changed), or with the SQL statement
"SET [SESSION | LOCAL] client_encoding TO ...", which will set it for
the session or just the current transaction.
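
As a rough sketch of the options above, a psql session might look
something like this.  The database name "mydb" and the choice of LATIN1
are only assumptions for illustration; substitute your own.

```sql
-- 1. Per-database default; applies to sessions started afterwards.
--    "mydb" is a hypothetical database name.
ALTER DATABASE mydb SET client_encoding TO 'LATIN1';

-- 2. Environment variable, set in the shell before starting the client:
--      PGCLIENTENCODING=LATIN1 psql mydb

-- 3a. psql meta-command; stays in effect until changed or disconnect.
\encoding LATIN1

-- 3b. Plain SQL, for the rest of the session.
SET client_encoding TO 'LATIN1';

-- 3c. Plain SQL, only until the current transaction ends.
BEGIN;
SET LOCAL client_encoding TO 'UTF8';
SHOW client_encoding;
COMMIT;

-- Check what is currently in effect on each side.
SHOW client_encoding;
SHOW server_encoding;
```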

I just went through this, and while I initially used "iconv" to get up
and running, I've since removed most of those calls from my scripts and
just use the PostgreSQL conversion instead.
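
For a load like the one discussed in this thread, letting PostgreSQL do
the conversion could look roughly like the following in psql.  This is
only a sketch: the table name "members" and the file name
"members_latin1.txt" are made up, and it assumes the input file really
is ISO-8859-1.

```sql
-- Tell PostgreSQL that the incoming data is Latin-1.  It is converted
-- to the server_encoding as it is loaded, so no iconv pass is needed.
SET client_encoding TO 'LATIN1';

-- Hypothetical table and file names.
\copy members FROM 'members_latin1.txt'

-- Go back to the session's default client_encoding afterwards.
RESET client_encoding;
```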

--
Mail to my list address MUST be sent via the mailing list.
All other mail to my list address will bounce.

