Re: Internationalized error messages - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Internationalized error messages
Msg-id 4741.984103209@sss.pgh.pa.us
In response to Re: Internationalized error messages  (ncm@zembu.com (Nathan Myers))
List pgsql-hackers
> On Thu, Mar 08, 2001 at 11:49:50PM +0100, Peter Eisentraut wrote:
>> I really feel that translated error messages need to happen soon.

Agreed.

ncm@zembu.com (Nathan Myers) writes:
> Similar approaches have been tried frequently, and even enshrined 
> in standards (e.g. POSIX catgets), but have almost always proven too
> cumbersome.  The problem is that keeping programs that interpret the 
> numeric code in sync with the program they monitor is hard, and trying 
> to avoid breaking all those secondary programs hinders development on 
> the primary program.  Furthermore, assigning code numbers is a nuisance,
> and they add uninformative clutter.  

There's a difficult tradeoff to make here, but I think we do want to
distinguish between the "official error code" --- the thing that has
translations into various languages --- and what the backend is actually
allowed to print out.  It seems to me that a fairly large fraction of
the unique messages found in the backend can all be lumped under the
category of "internal error", and that we need to have only one official
error code and one user-level translated message for the lot of them.
But we do want to be able to print out different detail messages for
each of those internal errors.  There are other categories that might be
lumped together, but that one alone is sufficiently large to force us
to recognize it.  This suggests a distinction between a "primary" or
"user-level" error message, which we catalog and provide translations
for, and a "secondary", "detail", or "wizard-level" error message that
exists only in the backend source code, and only in English, and so
can be made up on the spur of the moment.
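
A minimal sketch of that primary/secondary split, for illustration only. The names ErrCode, CatalogEntry, primary_message and describe_error, the catalog contents, and the German string are all assumptions made up for this example; nothing like this exists in the backend as described in this message.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical official error codes; only these get catalog entries. */
typedef enum { ERR_INTERNAL, ERR_UNREC_IDENT } ErrCode;

/* One translatable primary message per (code, language) pair. */
typedef struct {
    ErrCode     code;
    const char *lang;
    const char *primary;
} CatalogEntry;

static const CatalogEntry catalog[] = {
    { ERR_INTERNAL,    "en", "Internal error" },
    { ERR_INTERNAL,    "de", "Interner Fehler" },
    { ERR_UNREC_IDENT, "en", "Attribute or table name not known" },
};

/* Look up the primary message, falling back to English if untranslated. */
static const char *
primary_message(ErrCode code, const char *lang)
{
    const char *fallback = "Unknown error";
    size_t      i;

    for (i = 0; i < sizeof(catalog) / sizeof(catalog[0]); i++)
    {
        if (catalog[i].code != code)
            continue;
        if (strcmp(catalog[i].lang, lang) == 0)
            return catalog[i].primary;
        if (strcmp(catalog[i].lang, "en") == 0)
            fallback = catalog[i].primary;
    }
    return fallback;
}

/* Combine the translated primary text with an untranslated, English-only
 * detail string written ad hoc at the call site. */
static int
describe_error(char *out, size_t outlen, ErrCode code, const char *lang,
               const char *detail)
{
    int n = snprintf(out, outlen, "%s: %s",
                     primary_message(code, lang), detail);

    return (n < 0 || (size_t) n >= outlen) ? -1 : 0;
}
```

Every internal error shares the single ERR_INTERNAL entry, so translators see one string, while each call site remains free to pass whatever detail text it likes.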

Another thing that's bothered me for a long time is our inconsistent
approach to determining where in the code a message comes from.  A lot
of the messages currently embed the name of the generating routine right
into the error text.  Again, we ought to separate the functionality:
the source-code location is valuable but ought not form part of the
primary error message.  I would like to see elog() become a macro that
invokes __FILE__ and __LINE__ to automatically make the *exact* source
code location become part of the secondary error information, and then
drop the convention of using the routine name in the message text.
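
One way the macro idea might look, as a rough sketch rather than a proposal for the actual elog() signature. ELOG and elog_at are hypothetical names, the output goes to a caller-supplied buffer only to keep the example self-contained, and the variadic macro assumes a C99 preprocessor, which not every compiler provides.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical stand-in for the reporting routine: writes
 * "file:line: message" into buf.  The call site is passed in
 * explicitly, so the message text itself need not name a routine. */
static void
elog_at(char *buf, size_t buflen, const char *file, int line,
        const char *fmt, ...)
{
    va_list ap;
    int     n;

    n = snprintf(buf, buflen, "%s:%d: ", file, line);
    if (n < 0 || (size_t) n >= buflen)
        return;                 /* no room left for the message */

    va_start(ap, fmt);
    vsnprintf(buf + n, buflen - (size_t) n, fmt, ap);
    va_end(ap);
}

/* The macro captures the exact source location of each call. */
#define ELOG(buf, buflen, ...) \
    elog_at((buf), (buflen), __FILE__, __LINE__, __VA_ARGS__)
```

A call such as ELOG(buf, sizeof buf, "attribute \"%s\" not found", name) then records the file and line of the call itself without the caller spelling out either one.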

Something else we have talked about off-and-on is providing locator
information for errors that can be associated with a particular point in
the query string (lexical and syntactic errors).  This would probably be
best returned as a character index.
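
To illustrate what a frontend could do with such an index, here is a small sketch that prints a caret under the offending character. mark_query_position is a hypothetical helper, and the example assumes a 0-based index, a single-line query, and single-byte characters.

```c
#include <stdio.h>
#include <string.h>

/* Write the query, a newline, and a '^' under position `index`
 * (0-based, an assumption) into out.  Returns 0, or -1 if out is
 * too small. */
static int
mark_query_position(char *out, size_t outlen, const char *query,
                    size_t index)
{
    size_t qlen = strlen(query);
    size_t pos;
    size_t i;

    if (index > qlen)
        index = qlen;           /* clamp to end of query */

    /* query + '\n' + padding + '^' + terminating NUL */
    if (qlen + index + 3 > outlen)
        return -1;

    memcpy(out, query, qlen);
    pos = qlen;
    out[pos++] = '\n';
    for (i = 0; i < index; i++)
        out[pos++] = ' ';
    out[pos++] = '^';
    out[pos] = '\0';
    return 0;
}
```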

Another thing that I missed in Peter's proposal is how we are going to
cope with messages that include parameters.  Surely we do not expect
gettext to start with 'Attribute "foo" not found' and distinguish fixed
from variable parts of that string?
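
One possible answer, sketched under assumptions: translate a template with numbered placeholders and pass the parameters separately, so the catalog key never contains the variable part. The %1..%9 placeholder syntax and the expand_message function are invented for this example and are not gettext features.

```c
#include <stdio.h>
#include <string.h>

/* Expand %1..%9 in tmpl using params[0..nparams-1]; "%%" yields '%'.
 * A placeholder with no matching parameter expands to "?".
 * Returns 0 on success, -1 if out is too small. */
static int
expand_message(char *out, size_t outlen, const char *tmpl,
               const char *const *params, int nparams)
{
    size_t      pos = 0;
    const char *p;

    if (outlen == 0)
        return -1;

    for (p = tmpl; *p != '\0'; p++)
    {
        const char *piece = p;  /* default: copy one literal char */
        size_t      len = 1;

        if (p[0] == '%' && p[1] >= '1' && p[1] <= '9')
        {
            int idx = p[1] - '1';

            piece = (idx < nparams) ? params[idx] : "?";
            len = strlen(piece);
            p++;                /* skip the digit */
        }
        else if (p[0] == '%' && p[1] == '%')
            p++;                /* emit a single '%' */

        if (pos + len + 1 > outlen)
            return -1;
        memcpy(out + pos, piece, len);
        pos += len;
    }
    out[pos] = '\0';
    return 0;
}
```

Because placeholders are numbered, a translated template is free to reorder them, which positional printf-style arguments in the original call would not allow.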

So it's clear that we need to devise a way of breaking an "error
message" into multiple portions, including:
    Primary error message (localizable)
    Parameters to insert into error message (user identifiers, etc)
    Secondary (wizard) error message (optional)
    Source code location
    Query text location (optional)
 

and perhaps others that I have forgotten about.  One of the key things
to think about is whether we can, or should try to, transmit all this
stuff in a backwards-compatible protocol.  That would mean we'd have
to dump all the info into a single string, which is doable but would
perhaps look pretty ugly:
    ERROR: Attribute "foo" not found  -- basic message for dumb frontends
    ERRORCODE: UNREC_IDENT            -- key for finding localized message
    PARAM1: foo                       -- something to embed in the localized message
    MESSAGE: Attribute or table name not known within context of query
    CODELOC: src/backend/parser/parse_clause.c line 345
    QUERYLOC: 22
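
A sketch of how the portions listed earlier could be held in one structure and flattened into a single string of this shape. ErrorReport and format_report are hypothetical names, the field labels simply mirror the example above, and the newline-separated layout is an assumption rather than a settled wire format.

```c
#include <stdio.h>

/* Hypothetical container for the pieces of one error report. */
typedef struct {
    const char *basic;      /* plain text for old frontends */
    const char *errcode;    /* key for finding the localized message */
    const char *param1;     /* value to embed, or NULL */
    const char *message;    /* the MESSAGE line, or NULL */
    const char *file;       /* source file of the report */
    int         line;       /* source line of the report */
    int         queryloc;   /* character index in query, or -1 if none */
} ErrorReport;

/* Flatten the report into "LABEL: value" lines, omitting absent optional
 * fields.  Returns 0 on success, -1 if out is too small. */
static int
format_report(char *out, size_t outlen, const ErrorReport *r)
{
    size_t pos = 0;
    int    n;

#define APPEND(...) \
    do { \
        n = snprintf(out + pos, outlen - pos, __VA_ARGS__); \
        if (n < 0 || (size_t) n >= outlen - pos) \
            return -1; \
        pos += (size_t) n; \
    } while (0)

    APPEND("ERROR: %s\n", r->basic);
    APPEND("ERRORCODE: %s\n", r->errcode);
    if (r->param1 != NULL)
        APPEND("PARAM1: %s\n", r->param1);
    if (r->message != NULL)
        APPEND("MESSAGE: %s\n", r->message);
    APPEND("CODELOC: %s line %d\n", r->file, r->line);
    if (r->queryloc >= 0)
        APPEND("QUERYLOC: %d\n", r->queryloc);

#undef APPEND
    return 0;
}
```

An old frontend would simply display the whole string, while a newer one could split on the labels and look up ERRORCODE in its own message catalog.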
 

Alternatively we could suppress most of this stuff unless the frontend
specifically asks for it (and presumably is willing to digest it for
the user).

Bottom line for me is that if we are going to go to the trouble of
examining and changing every single elog() in the system, we should
try to get all of these issues cleaned up at once.  Let's not have to
go back and do it again later.
        regards, tom lane

