Re: [bug fix] strerror() returns ??? in a UTF-8/C database with LC_MESSAGES=non-ASCII - Mailing list pgsql-hackers

From	MauMau
Subject	Re: [bug fix] strerror() returns ??? in a UTF-8/C database with LC_MESSAGES=non-ASCII
Date	September 7, 2013 11:04:19
Msg-id	76E87134576A4731A8AAFE70DD4D3910@maumau
In response to	Re: [bug fix] strerror() returns ??? in a UTF-8/C database with LC_MESSAGES=non-ASCII (Tom Lane <tgl@sss.pgh.pa.us>)
Responses	Re: [bug fix] strerror() returns ??? in a UTF-8/C database with LC_MESSAGES=non-ASCII
List	pgsql-hackers

Thank you for your opinions and ideas.

From: "Tom Lane" <tgl@sss.pgh.pa.us>
> Greg Stark <stark@mit.edu> writes:
>> What would be nicer would be to display the C define, EINVAL, EPERM, etc.
>> Afaik there's no portable way to do that though. I suppose we could just
>> have a small array or hash table of all the errors we know about and look
>> it up.
>
> Yeah, I was just thinking the same thing.  We could do
>
> switch (errno)
> {
> case EINVAL: str = "EINVAL"; break;
> case ENOENT: str = "ENOENT"; break;
> ...
> #ifdef EFOOBAR
> case EFOOBAR: str = "EFOOBAR"; break;
> #endif
> ...
>
> for all the common or even less-common names, and only fall back on
> printing a numeric value if it's something really unusual.
>
> But I still maintain that we should only do this if we can't get a useful
> string out of strerror().

OK, I'll take this approach.  That is:

str = strerror(errnum);
if (str == NULL || *str == '\0' || *str == '?')
{
    switch (errnum)
    {
        case EINVAL: str = "errno=EINVAL"; break;
        case ENOENT: str = "errno=ENOENT"; break;
        ...
#ifdef EFOOBAR
        case EFOOBAR: str = "EFOOBAR"; break;
#endif
        default:
            snprintf(errorstr_buf, sizeof(errorstr_buf),
                     _("operating system error %d"), errnum);
            str = errorstr_buf;
    }
}

The number of question marks probably depends on the original message, so I
test only the first character instead of using strcmp() against "???".
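
To make the idea concrete, here is a rough, self-contained sketch of that fallback. It is not the patch itself: the helper names errno_symbol(), strerror_is_useless() and choose_error_text() are hypothetical, the symbol table is cut down to three entries, and the _() translation wrapper is left out so that it compiles on its own. The strerror() result is passed in as a parameter so the "???" case can be exercised without a particular locale setup.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: symbolic name for an errno value, or NULL if unknown. */
static const char *
errno_symbol(int errnum)
{
    switch (errnum)
    {
        case EINVAL: return "EINVAL";
        case ENOENT: return "ENOENT";
#ifdef EPERM
        case EPERM: return "EPERM";
#endif
        default: return NULL;
    }
}

/* Nonzero if the strerror() result is missing, empty, or starts with '?'. */
static int
strerror_is_useless(const char *str)
{
    return str == NULL || *str == '\0' || *str == '?';
}

/*
 * Hypothetical helper: return str if it looks usable; otherwise write
 * "errno=NAME" (or a numeric fallback for unknown codes) into buf and
 * return buf.
 */
static const char *
choose_error_text(const char *str, int errnum, char *buf, size_t buflen)
{
    const char *sym;

    if (!strerror_is_useless(str))
        return str;

    sym = errno_symbol(errnum);
    if (sym != NULL)
        snprintf(buf, buflen, "errno=%s", sym);
    else
        snprintf(buf, buflen, "operating system error %d", errnum);
    return buf;
}
```

A caller would pass strerror(errnum) as the first argument, along with a buffer that outlives the use of the returned pointer.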

From: "Tom Lane" <tgl@sss.pgh.pa.us>
> There is certainly no way we'd risk back-patching something with as
> many potential side-effects as fooling with libc's textdomain.

Agreed.  It would be better to avoid relying on undocumented behavior
(i.e. strerror() uses libc.mo) if we can take another approach.

> BTW: personally, I would say that what you're looking at is a glibc bug.
> I always thought the contract of gettext was to return the ASCII version
> if it fails to produce a translated version.  That might not be what the
> end user really wants to see, but surely returning something like "???"
> is completely useless to anybody.

I think so, too.  Under the same condition, PostgreSQL built with Oracle 
Studio on Solaris outputs correct Japanese for strerror(), and English is 
output on Windows.  I'll contact the glibc team to ask for an improvement.

From: "Tom Lane" <tgl@sss.pgh.pa.us>
> I dislike that on grounds of readability and translatability; and
> I'm also of the opinion that errno codes aren't really consistent
> enough across platforms to be all that trustworthy for remote diagnostic
> purposes.  I'm fine with printing the code if strerror fails to
> produce anything useful --- but not if it succeeds.

I don't think this is a concern, because we should ask trouble reporters 
about the operating system where they are running the database server.

From: "Tom Lane" <tgl@sss.pgh.pa.us>
> There isn't any way to cram this information
> into the current usage of %m without doing damage to the readability and
> translatability of the string.  Our style & translatability guidelines
> specifically recommend against assembling messages out of fragments,
> and also against sticking in parenthetical additions.

From: "Andres Freund" <andres@2ndquadrant.com>
> If we'd add the errno inside %m processing, I don't see how it's
> a problem for translation?

I agree with Andres.  I don't see any problem if we don't translate "errno=%d".

I'll submit a revised patch next week.  However, I believe my original
approach is better, because it outputs a user-friendly Japanese message
instead of "errno=ENOENT".  Plus, outputting both the errno value and its
descriptive text is more useful, because the former is convenient for
OS/library experts and the latter is convenient for PostgreSQL users.  Any
better ideas would be much appreciated.

Regards
MauMau
