Re: [bug fix] strerror() returns ??? in a UTF-8/C database with LC_MESSAGES=non-ASCII - Mailing list pgsql-hackers

From Noah Misch
Subject Re: [bug fix] strerror() returns ??? in a UTF-8/C database with LC_MESSAGES=non-ASCII
Date
Msg-id 20130910022847.GB219994@tornado.leadboat.com
In response to Re: [bug fix] strerror() returns ??? in a UTF-8/C database with LC_MESSAGES=non-ASCII  ("MauMau" <maumau307@gmail.com>)
Responses Re: [bug fix] strerror() returns ??? in a UTF-8/C database with LC_MESSAGES=non-ASCII
List pgsql-hackers
On Tue, Sep 10, 2013 at 05:42:06AM +0900, MauMau wrote:
> From: "Tom Lane" <tgl@sss.pgh.pa.us>
>> Noah Misch <noah@leadboat.com> writes:
>>> ... I think
>>> MauMau's original bind_textdomain_codeset() proposal was on the right 
>>> track.
>>
>> It might well be.  My objection was to the proposal for back-patching it
>> when we have little idea of the possible side-effects.

Agreed.

> We are using 9.1/9.2 and 9.2 is probably dominant, so I would be relieved 
> with either of the following choices:
>
> 1. Take the approach that doesn't use bind_textdomain_codeset("libc") 
> (i.e. the second version of errno_str.patch) for 9.4 and older releases.
>
> 2. Use bind_textdomain_codeset("libc") (i.e. take strerror_codeset.patch) 
> for 9.4, and take the non-bind_textdomain_codeset approach for older  
> releases.

I like (2), at least at a high level.  The concept of errno_str.patch is safe
enough to back-patch.  One can verify that it only changes behavior when
strerror() returns NULL, an empty string, or something that begins with '?'.
I can't see anyone resenting the change in those cases.
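
To make that condition concrete, here is a minimal sketch of the check, not
the text of errno_str.patch itself.  The names strerror_result_is_damaged()
and errno_message() are hypothetical, and the numeric fallback text is an
assumption made for illustration; the patch may well substitute something
else.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: report whether strerror()'s result looks unusable,
 * i.e. it is NULL, an empty string, or begins with '?'.
 */
static int
strerror_result_is_damaged(const char *msg)
{
    return msg == NULL || msg[0] == '\0' || msg[0] == '?';
}

/*
 * Hypothetical wrapper: pass strerror()'s result through untouched unless
 * it is damaged.  Only in the damaged case is a fallback built in the
 * caller-supplied buffer.  The fallback wording is an assumption.
 */
static const char *
errno_message(int errnum, char *buf, size_t buflen)
{
    const char *msg = strerror(errnum);

    if (!strerror_result_is_damaged(msg))
        return msg;             /* common case: behavior is unchanged */

    snprintf(buf, buflen, "operating system error %d", errnum);
    return buf;
}
```

A caller would pass a scratch buffer, for example
errno_message(errno, buf, sizeof(buf)).  Any message that does not match the
three damaged cases comes back exactly as strerror() produced it, which is
what keeps the change narrow.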

Note that you can work around the problem today by linking PostgreSQL with a
better iconv() implementation.

Question-mark-damaged messages are not limited to strerror().  A combination
like lc_messages=ja_JP, encoding=LATIN1, lc_ctype=en_US will produce question
marks for PG and libc messages even with the bind_textdomain_codeset("libc")
change.  Is it worth doing anything about that?  That one looks self-inflicted
in comparison to the lc_messages=ja_JP, encoding=UTF8, lc_ctype=C case.

-- 
Noah Misch
EnterpriseDB                                 http://www.enterprisedb.com


