Thread: pg_client_encoding

pg_client_encoding

From
Tatsuo Ishii
Date:
Hi,

I'm going to add a new function "pg_client_encoding" that returns the
current client-side encoding name. I know similar functionality
already exists in PostgreSQL (show client_encoding), but it's a pain
to handle the notice message from a program.

Also note that the JDBC driver and maybe some other APIs use
getdatabaseencoding, but I think that's not adequate for FE APIs that
need to know the actual encoding passed to the FE side, since an
encoding conversion might be made on the BE side. For example, if
PGCLIENTENCODING is set to SJIS before starting postmaster, the actual
encoding passed to the FE would be SJIS even if the database encoding
is EUC_JP.
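
To make the SJIS/EUC_JP case concrete, here is a small Python sketch. It is only an illustration, not part of the proposal: `run_query` and `fake_run_query` are hypothetical stand-ins for whatever a driver uses to fetch a single value, and the codec names on the right-hand side of `PY_CODEC` are Python's, not PostgreSQL's.

```python
# Illustration only: the database is EUC_JP, but the BE converts to SJIS
# before sending, so the FE must decode with the *client* encoding.

def wire_encoding(run_query):
    """Return the name of the encoding actually sent to the frontend.

    run_query is a hypothetical helper: it takes SQL text and returns the
    single value produced by that query.
    """
    return run_query("SELECT pg_client_encoding()")


def fake_run_query(sql):
    # Stand-in for a real connection (assumed setup: EUC_JP database,
    # PGCLIENTENCODING=SJIS).
    answers = {
        "SELECT pg_client_encoding()": "SJIS",
        "SELECT getdatabaseencoding()": "EUC_JP",
    }
    return answers[sql]


# PostgreSQL name -> Python codec name (only the two needed here).
PY_CODEC = {"SJIS": "shift_jis", "EUC_JP": "euc_jp"}

text = "日本語"
on_the_wire = text.encode(PY_CODEC["SJIS"])  # bytes after BE-side conversion

# Decoding with the client encoding recovers the text.
right = on_the_wire.decode(PY_CODEC[wire_encoding(fake_run_query)])

# Decoding with the database encoding is the mistake described above.
db_encoding = fake_run_query("SELECT getdatabaseencoding()")
try:
    wrong = on_the_wire.decode(PY_CODEC[db_encoding])
except UnicodeDecodeError:
    wrong = None  # the bytes are not even valid in the database encoding
```

With a real driver, `fake_run_query` would be replaced by an actual query against the connection; the point is only that the answer of getdatabaseencoding can differ from the encoding of the bytes the client receives.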

Comments?
--
Tatsuo Ishii


Re: pg_client_encoding

From
Karel Zak
Date:
On Mon, Sep 10, 2001 at 01:46:28PM +0900, Tatsuo Ishii wrote:
> Hi,
> 
> I'm going to add a new function "pg_client_encoding" that returns the
> current client-side encoding name. I know similar functionality
> already exists in PostgreSQL (show client_encoding), but it's a pain
> to handle the notice message from a program.
> 
> Also note that the JDBC driver and maybe some other APIs use
> getdatabaseencoding, but I think that's not adequate for FE APIs that
> need to know the actual encoding passed to the FE side, since an
> encoding conversion might be made on the BE side. For example, if
> PGCLIENTENCODING is set to SJIS before starting postmaster, the actual
> encoding passed to the FE would be SJIS even if the database encoding
> is EUC_JP.
> 
> Comments?

What about a common function like pg_show():

  SELECT pg_show('CLIENT_ENCODING');
  SELECT pg_show('SERVER_ENCODING');
  SELECT pg_show('DATESTYLE');

that returns the same result as the standard 'SHOW' command, but not as a NOTICE?
A lot of the code for this function could be shared with the current SHOW routines.
I'm sure maintainers of non-libpq clients (like JDBC) would be happy with it.

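
The code-sharing idea can be modelled roughly as one lookup routine with two front ends. The Python below is only a toy model under stated assumptions: the `SETTINGS` values, the notice wording, and every function name except pg_show are made up for illustration and say nothing about how the backend is actually organized.

```python
# Toy model of the pg_show() idea: one shared lookup, two front ends.

SETTINGS = {  # made-up sample values
    "CLIENT_ENCODING": "SJIS",
    "SERVER_ENCODING": "EUC_JP",
    "DATESTYLE": "ISO",
}


def lookup_setting(name):
    """Shared code path: resolve a setting name to its current value."""
    try:
        return SETTINGS[name.upper()]
    except KeyError:
        raise ValueError(f"unknown setting: {name}") from None


def show(name, notices):
    """SHOW-style front end: reports the value as a notice, returns nothing."""
    # The notice text is invented here; a client would have to parse it.
    notices.append(f"{name.upper()} is {lookup_setting(name)}")


def pg_show(name):
    """Function-style front end: returns the value as an ordinary result."""
    return lookup_setting(name)


notices = []
show("client_encoding", notices)      # value is buried in a message string
encoding = pg_show("CLIENT_ENCODING")  # value arrives directly
```

A program using the function-style front end gets the value without parsing message text, which is the point of the suggestion.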
Karel

-- 
Karel Zak  <zakkr@zf.jcu.cz>
http://home.zf.jcu.cz/~zakkr/

C, PostgreSQL, PHP, WWW, http://docs.linux.cz, http://mape.jcu.cz


Re: pg_client_encoding

From
Tatsuo Ishii
Date:
> Tatsuo,
> 
> Did you ever commit this new function?  I just tried a 'select 
> pg_client_encoding()' and it told me that there was no such function. 
> This was on sources that I pulled and built two days ago.
> 
> I was planning on changing the JDBC code to use this function instead of 
> getdatabaseencoding().

Sorry for the delay. I have just added pg_client_encoding(), which
returns the client-side encoding name.

> Also, what names will this new function return (the old character set 
> names like getdatabaseencoding still does, or the new names)?

The "old" ones. To be clear, here is the list of encoding names
currently supported.

encoding        what pg_client_encoding/    alias
                getdatabaseencoding
                returns
----------------------------------------------------------------
ASCII           SQL_ASCII
UTF-8           UNICODE                     UTF_8
MULE-INTERNAL   MULE_INTERNAL
ISO-8859-1      LATIN1                      ISO_8859_1
ISO-8859-2      LATIN2                      ISO_8859_2
ISO-8859-3      LATIN3                      ISO_8859_3
ISO-8859-4      LATIN4                      ISO_8859_4
ISO-8859-5      ISO_8859_5
ISO-8859-6      ISO_8859_6
ISO-8859-7      ISO_8859_7
ISO-8859-8      ISO_8859_8
ISO-8859-9      LATIN5                      ISO_8859_9
ISO-8859-10     ISO_8859_10                 LATIN6
ISO-8859-13     ISO_8859_13                 LATIN7
ISO-8859-14     ISO_8859_14                 LATIN8
ISO-8859-15     ISO_8859_15                 LATIN9
ISO-8859-16     ISO_8859_16
EUC-JP          EUC_JP
EUC-CN          EUC_CN
EUC-KR          EUC_KR
EUC-TW          EUC_TW
Shift_JIS       SJIS                        SHIFT_JIS
Big5            BIG5
Windows1250     WIN1250
Windows1251     WIN
KOI8-R          KOI8                        KOI8R
IBM866          ALT
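
For a client written in another language, the names in the middle column have to be translated into that language's own codec names before decoding. As a rough sketch, here is how that might look in Python; the `PG_TO_PYTHON` mapping and the `python_codec` helper are my own assumptions built from the standard names in the first column, and only a subset of the rows is covered (Python has no codec for MULE_INTERNAL or EUC_TW, for instance).

```python
import codecs

# Name returned by pg_client_encoding/getdatabaseencoding -> Python codec.
# Subset only; chosen by matching the standard names in the table above.
PG_TO_PYTHON = {
    "UNICODE": "utf_8",
    "LATIN1": "latin_1",
    "LATIN2": "iso8859_2",
    "EUC_JP": "euc_jp",
    "EUC_KR": "euc_kr",
    "SJIS": "shift_jis",
    "BIG5": "big5",
    "WIN1250": "cp1250",
    "WIN": "cp1251",
    "KOI8": "koi8_r",
    "ALT": "cp866",
}


def python_codec(pg_name):
    """Return Python's CodecInfo for a PostgreSQL encoding name.

    Raises LookupError if the name is not in the (partial) mapping.
    """
    if pg_name not in PG_TO_PYTHON:
        raise LookupError(f"no Python codec known for {pg_name!r}")
    return codecs.lookup(PG_TO_PYTHON[pg_name])


# Example: bytes received from a connection whose client encoding is WIN.
raw = "привет".encode("cp1251")
decoded = raw.decode(python_codec("WIN").name)
```

Short names such as WIN and ALT are the ones most likely to need an explicit table like this, since they give no hint of the underlying code page.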

> thanks,
> --Barry


Re: pg_client_encoding

From
Peter Eisentraut
Date:
Tatsuo Ishii writes:

> encoding    what pg_client_encoding/    alias
>         getdatabaseencoding
>         returns
> ----------------------------------------------------------------
> ASCII        SQL_ASCII
> UTF-8        UNICODE                UTF_8
> MULE-INTERNAL    MULE_INTERNAL
> ISO-8859-1    LATIN1                ISO_8859_1
> ISO-8859-2    LATIN2                ISO_8859_2
> ISO-8859-3    LATIN3                ISO_8859_3
> ISO-8859-4    LATIN4                ISO_8859_4
> ISO-8859-5    ISO_8859_5
> ISO-8859-6    ISO_8859_6
> ISO-8859-7    ISO_8859_7
> ISO-8859-8    ISO_8859_8
> ISO-8859-9    LATIN5                ISO_8859_9
> ISO-8859-10    ISO_8859_10            LATIN6
> ISO-8859-13    ISO_8859_13            LATIN7
> ISO-8859-14    ISO_8859_14            LATIN8
> ISO-8859-15    ISO_8859_15            LATIN9
> ISO-8859-16    ISO_8859_16

Why aren't you using LATINx for (some of) these as well?

-- 
Peter Eisentraut   peter_e@gmx.net   http://funkturm.homeip.net/~peter



Re: pg_client_encoding

From
Tatsuo Ishii
Date:
> > ASCII        SQL_ASCII
> > UTF-8        UNICODE                UTF_8
> > MULE-INTERNAL    MULE_INTERNAL
> > ISO-8859-1    LATIN1                ISO_8859_1
> > ISO-8859-2    LATIN2                ISO_8859_2
> > ISO-8859-3    LATIN3                ISO_8859_3
> > ISO-8859-4    LATIN4                ISO_8859_4
> > ISO-8859-5    ISO_8859_5
> > ISO-8859-6    ISO_8859_6
> > ISO-8859-7    ISO_8859_7
> > ISO-8859-8    ISO_8859_8
> > ISO-8859-9    LATIN5                ISO_8859_9
> > ISO-8859-10    ISO_8859_10            LATIN6
> > ISO-8859-13    ISO_8859_13            LATIN7
> > ISO-8859-14    ISO_8859_14            LATIN8
> > ISO-8859-15    ISO_8859_15            LATIN9
> > ISO-8859-16    ISO_8859_16
> 
> Why aren't you using LATINx for (some of) these as well?

If LATIN6 to 9 are well defined in the SQL or some other standards, I
would not object to using them. I just don't have enough confidence.
For ISO-8859-5 to 8, and 16, I don't see well defined standards.
--
Tatsuo Ishii



Re: pg_client_encoding

From
Patrice Hédé
Date:
* Tatsuo Ishii <t-ishii@sra.co.jp> [011014 16:05]:
> > > ASCII        SQL_ASCII
> > > UTF-8        UNICODE                UTF_8
> > > MULE-INTERNAL    MULE_INTERNAL
> > > ISO-8859-1    LATIN1                ISO_8859_1
> > > ISO-8859-2    LATIN2                ISO_8859_2
> > > ISO-8859-3    LATIN3                ISO_8859_3
> > > ISO-8859-4    LATIN4                ISO_8859_4
> > > ISO-8859-5    ISO_8859_5
> > > ISO-8859-6    ISO_8859_6
> > > ISO-8859-7    ISO_8859_7
> > > ISO-8859-8    ISO_8859_8
> > > ISO-8859-9    LATIN5                ISO_8859_9
> > > ISO-8859-10    ISO_8859_10            LATIN6
> > > ISO-8859-13    ISO_8859_13            LATIN7
> > > ISO-8859-14    ISO_8859_14            LATIN8
> > > ISO-8859-15    ISO_8859_15            LATIN9
> > > ISO-8859-16    ISO_8859_16
> > 
> > Why aren't you using LATINx for (some of) these as well?
> 
> If LATIN6 to 9 are well defined in the SQL or some other standards, I
> would not object to using them. I just don't have enough confidence.
> For ISO-8859-5 to 8, and 16, I don't see well defined standards.

ISO-8859-16 *is* LATIN10, I just don't have the reference to prove it
(I can look for it, if you want to).

ISO-8859-5 to 8 aren't Latin scripts. From memory, 5 is Cyrillic, 6 is
Arabic, 7 is Greek, 8 is ??? (Hebrew?)...

So it would make sense to add LATIN10, still :)

Patrice

-- 
Patrice Hédé
email: patrice hede à islande org
www  : http://www.islande.org/


Re: pg_client_encoding

From
Tatsuo Ishii
Date:
> * Tatsuo Ishii <t-ishii@sra.co.jp> [011014 16:05]:
> > > > ASCII        SQL_ASCII
> > > > UTF-8        UNICODE                UTF_8
> > > > MULE-INTERNAL    MULE_INTERNAL
> > > > ISO-8859-1    LATIN1                ISO_8859_1
> > > > ISO-8859-2    LATIN2                ISO_8859_2
> > > > ISO-8859-3    LATIN3                ISO_8859_3
> > > > ISO-8859-4    LATIN4                ISO_8859_4
> > > > ISO-8859-5    ISO_8859_5
> > > > ISO-8859-6    ISO_8859_6
> > > > ISO-8859-7    ISO_8859_7
> > > > ISO-8859-8    ISO_8859_8
> > > > ISO-8859-9    LATIN5                ISO_8859_9
> > > > ISO-8859-10    ISO_8859_10            LATIN6
> > > > ISO-8859-13    ISO_8859_13            LATIN7
> > > > ISO-8859-14    ISO_8859_14            LATIN8
> > > > ISO-8859-15    ISO_8859_15            LATIN9
> > > > ISO-8859-16    ISO_8859_16
> > > 
> > > Why aren't you using LATINx for (some of) these as well?
> > 
> > If LATIN6 to 9 are well defined in the SQL or some other standards, I
> > would not object to using them. I just don't have enough confidence.
> > For ISO-8859-5 to 8, and 16, I don't see well defined standards.
> 
> ISO-8859-16 *is* LATIN10, I just don't have the reference to prove it
> (I can look for it, if you want to).
> 
> ISO-8859-5 to 8 aren't Latin scripts. From memory, 5 is Cyrillic, 6 is
> Arabic, 7 is Greek, 8 is ??? (Hebrew?)...
> 
> So it would make sense to add LATIN10, still :)

If you are sure that ISO-8859-16 == LATIN10, I can add it.

OK, here is the modified encoding table (column 1 is the standard name,
column 2 is our "official" name, and column 3 is the alias). If there
are no objections, I will change them.

ASCII        SQL_ASCII
UTF-8        UNICODE        UTF_8
MULE-INTERNAL    MULE_INTERNAL
ISO-8859-1    LATIN1        ISO_8859_1
ISO-8859-2    LATIN2        ISO_8859_2
ISO-8859-3    LATIN3        ISO_8859_3
ISO-8859-4    LATIN4        ISO_8859_4
ISO-8859-5    ISO_8859_5
ISO-8859-6    ISO_8859_6
ISO-8859-7    ISO_8859_7
ISO-8859-8    ISO_8859_8
ISO-8859-9    LATIN5        ISO_8859_9
ISO-8859-10    LATIN6        ISO_8859_10
ISO-8859-13    LATIN7        ISO_8859_13
ISO-8859-14    LATIN8        ISO_8859_14
ISO-8859-15    LATIN9        ISO_8859_15
ISO-8859-16    LATIN10        ISO_8859_16
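
On the client side, the modified table amounts to a small alias map. The Python sketch below shows one way a program might fold an alias back to the official name; the helper name `official_name` and the case-insensitive matching are assumptions made for illustration, and only the rows shown in the table above are included.

```python
# Official name -> alias, taken from columns 2 and 3 of the table above.
OFFICIAL_TO_ALIAS = {
    "UNICODE": "UTF_8",
    "LATIN1": "ISO_8859_1",
    "LATIN2": "ISO_8859_2",
    "LATIN3": "ISO_8859_3",
    "LATIN4": "ISO_8859_4",
    "LATIN5": "ISO_8859_9",
    "LATIN6": "ISO_8859_10",
    "LATIN7": "ISO_8859_13",
    "LATIN8": "ISO_8859_14",
    "LATIN9": "ISO_8859_15",
    "LATIN10": "ISO_8859_16",
}

# Official names that have no alias in the table.
NO_ALIAS = (
    "SQL_ASCII", "MULE_INTERNAL",
    "ISO_8859_5", "ISO_8859_6", "ISO_8859_7", "ISO_8859_8",
)

# Reverse index: every accepted spelling -> official name.
_LOOKUP = {name: name for name in OFFICIAL_TO_ALIAS}
_LOOKUP.update({alias: name for name, alias in OFFICIAL_TO_ALIAS.items()})
_LOOKUP.update({name: name for name in NO_ALIAS})


def official_name(name):
    """Map an official name or alias to the official name.

    Matching is case-insensitive here; that is a choice made for this
    sketch, not a statement about the server's behaviour.
    """
    try:
        return _LOOKUP[name.upper()]
    except KeyError:
        raise LookupError(f"unknown encoding name: {name}") from None
```

For example, `official_name("iso_8859_16")` gives `"LATIN10"`, while `official_name("ISO_8859_5")` is returned unchanged because that row has no LATINx name.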


Re: pg_client_encoding

From
Tatsuo Ishii
Date:
Done.

> OK, here is the modified encoding table (column 1 is the standard name,
> column 2 is our "official" name, and column 3 is the alias). If there
> are no objections, I will change them.
> 
> ASCII        SQL_ASCII
> UTF-8        UNICODE        UTF_8
> MULE-INTERNAL    MULE_INTERNAL
> ISO-8859-1    LATIN1        ISO_8859_1
> ISO-8859-2    LATIN2        ISO_8859_2
> ISO-8859-3    LATIN3        ISO_8859_3
> ISO-8859-4    LATIN4        ISO_8859_4
> ISO-8859-5    ISO_8859_5
> ISO-8859-6    ISO_8859_6
> ISO-8859-7    ISO_8859_7
> ISO-8859-8    ISO_8859_8
> ISO-8859-9    LATIN5        ISO_8859_9
> ISO-8859-10    LATIN6        ISO_8859_10
> ISO-8859-13    LATIN7        ISO_8859_13
> ISO-8859-14    LATIN8        ISO_8859_14
> ISO-8859-15    LATIN9        ISO_8859_15
> ISO-8859-16    LATIN10        ISO_8859_16