NLS vs error processing, again (was Re: Composite Type with Domain) - Mailing list pgsql-bugs

From Tom Lane
Subject NLS vs error processing, again (was Re: Composite Type with Domain)
Msg-id 18913.1144161673@sss.pgh.pa.us
In response to Re: Composite Type with Domain  (JiangWei <jw.pgsql@sduept.com>)
Responses Re: NLS vs error processing, again (was Re: Composite Type  (Euler Taveira de Oliveira <euler@timbira.com>)
Re: NLS vs error processing, again  (Tatsuo Ishii <ishii@sraoss.co.jp>)
List pgsql-bugs
JiangWei <jw.pgsql@sduept.com> writes:
>         LANG=zh_CN.UTF-8
> [ set client_encoding to LATIN1 and provoke an error ]

OK, I can reproduce the crash after initdb'ing with that LANG setting
(in an nls-enabled build).  The postmaster log fills with a whole lot
of occurrences of

警告:  忽略不能转换的 UTF-8 字符 0x00e9
警告:  忽略不能转换的 UTF-8 字符 0x00e8
警告:  忽略不能转换的 UTF-8 字符 0x00e8
警告:  忽略不能转换的 UTF-8 字符 0x00e8
比致命错误还过分的错误:  ERRORDATA_STACK_SIZE exceeded

Tracing through the dump shows that the error-handling code is
recursively producing this warning while trying to translate the word
WARNING to LATIN1.  The zh_CN.po file shows the translation as

#: utils/error/elog.c:1909
msgid "WARNING"
msgstr "¾¯¸æ"

(which apparently is GB2312?) and what's actually getting passed to
utf8_to_iso8859_1() is

(gdb) x/6o str
0x8b89d8:       0350    0255    0246    0345    0221    0212

I have no idea if this is a correct UTF-8 encoding of the GB2312
phrase --- can anyone confirm?  But anyway, if this is Chinese then it's
hardly surprising that there would be no LATIN1 equivalent.  And then
trying to report the problem gets us into a new instance of the same
problem.  Even the code that's supposed to stop error recursion doesn't
get us out of it.
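
One way to check the byte question above is to decode both byte strings and compare them.  The following is a minimal Python sketch, not anything taken from the backend: it assumes the msgstr shown above is GB2312 bytes being displayed as LATIN1 characters, and the variable names are made up for illustration.

```python
# Bytes from the gdb dump (octal), i.e. what utf8_to_iso8859_1() received.
dumped = bytes([0o350, 0o255, 0o246, 0o345, 0o221, 0o212])

# The msgstr as displayed in the .po excerpt, written with escapes for the
# four LATIN1 characters (0xBE 0xAF 0xB8 0xE6).  Assumption: these are
# really GB2312 bytes shown through a LATIN1 lens.
po_bytes = "\xbe\xaf\xb8\xe6".encode("latin-1")

from_dump = dumped.decode("utf-8")    # interpret the dump as UTF-8
from_po = po_bytes.decode("gb2312")   # interpret the msgstr as GB2312

# True if both byte strings spell the same two characters.
same_text = from_dump == from_po

# Check whether the phrase has any LATIN1 representation at all.
try:
    from_dump.encode("latin-1")
    latin1_ok = True
except UnicodeEncodeError:
    latin1_ok = False

print(same_text, latin1_ok)
```

Under those assumptions the two decodings agree, which suggests the dump is a correct UTF-8 form of the GB2312 phrase, and the LATIN1 encode fails, consistent with the conversion warnings in the log.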

It seems to me that there basically is no graceful solution to this sort
of mismatch.  It might be possible to kluge things so that we disable
NLS once we've recursed too many times in error processing, but that's
surely pretty ugly.  What would be a lot more user-friendly would be to
refuse the attempt to set client_encoding to something that can't handle
our error message encoding, but I don't know what a reasonable set of
restrictions would be.
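
To make the first idea concrete, here is a rough Python sketch of "disable NLS once we've recursed too many times".  It is only a toy model of the failure, not the elog.c code: MAX_DEPTH, CATALOG, and report() are hypothetical names, and the limit of 3 is arbitrary rather than ERRORDATA_STACK_SIZE.

```python
MAX_DEPTH = 3  # hypothetical recursion limit, chosen arbitrarily

# Hypothetical stand-in for the zh_CN message catalog.
CATALOG = {"WARNING": "\u8b66\u544a"}


def report(severity, text, client_encoding, depth=0):
    """Return the bytes that would be sent to the client for one message."""
    # Translate the severity label only while we are not recursing deeply.
    use_nls = depth < MAX_DEPTH
    label = CATALOG.get(severity, severity) if use_nls else severity
    line = f"{label}:  {text}"
    try:
        return line.encode(client_encoding)
    except UnicodeEncodeError as exc:
        if not use_nls:
            # Untranslated text still failed; substitute what cannot convert.
            return line.encode(client_encoding, errors="replace")
        # Reporting the conversion failure re-enters the same path, which
        # is the recursion described above; depth is what breaks the loop.
        bad = line[exc.start]
        return report(
            "WARNING",
            f"ignoring unconvertible character 0x{ord(bad):04x}",
            client_encoding,
            depth + 1,
        )


result = report("WARNING", "test", "latin-1")
print(result)
```

In this sketch the call eventually returns an untranslated ASCII warning, but the original message is lost along the way, which is part of why this approach looks ugly.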

Comments?

            regards, tom lane
