Thread: Re: [PATCHES] Patch for more readable parse error messages

Re: [PATCHES] Patch for more readable parse error messages

From
Jeroen van Vianen
Date:
At 00:56 22-02-00 +0100, Peter Eisentraut wrote:
>On 2000-02-20, Jeroen van Vianen mentioned:
>
> > The format of the error messages is changed to:
> >
> > jeroen=# create abc ( a int4, b int4 );
> > ERROR:  parser: parse error at or near "abc":
> > create abc ( a int4, b int4 )
> >         *
>
>I believe this is the wrong approach because it's highly psql specific. If
>you use PHP or JDBC or something not character cell based you will get
>misleading output.
>
>You might want to start thinking about putting a little more information
>into an ERROR than just a text string, such as error codes or
>supplementary information like this. psql could then choose to print a
>star, other interfaces might set the cursor to the specified place, etc.
>Much more flexible.

Good idea. As far as I understand things, libpq uses special datastructures 
to access the error code and message and it's up to the application (psql, 
and others) to do what it wants to do with it (let's say print the error). 
These structures might be enhanced with an error location, but this might 
be breaking things. And my question is how to do this.

Note that this location part is only filled now when yyerror() throws an 
error, but other parts of the backend might use a similar approach. OTOH it 
might be nice then to have every token know its location in the query 
string (as Don suggested), so you might end up with error messages like:

mydb-> select * from t1, t2 where ...
ERROR: table t2 not found:
select * from t1, t2 where ...                  ^

(which may be nice, or not).

What I see now is something like this (for psql):

psql sends a query
psql reads response        if response is error                get error location and find context in which error
occurred               print error message, with error location and context        otherwise                do what it
usedto do
 

and for the other interfaces nothing changes.

This is something I might be able to implement for 7.1.

What do you think?


Jeroen



Re: [HACKERS] Re: [PATCHES] Patch for more readable parse error messages

From
Tom Lane
Date:
Jeroen van Vianen <jeroen@design.nl> writes:
> What I see now is something like this (for psql):

> psql sends a query
> psql reads response
>          if response is error
>                  get error location and find context in which error occurred
>                  print error message, with error location and context
>          otherwise
>                  do what it used to do

> and for the other interfaces nothing changes.

> This is something I might be able to implement for 7.1.

This looks much better to me than doing it in the backend.  What still
needs a little thought is how to send back the error location from
backend to client app.

I'd be inclined to say that the location info should be imbedded as
text in the existing textual error message, rather than trying to add
a separate message with a machine-readable location value.  The first
way is much less likely to create compatibility problems with old client
apps.  One way to do it is to say that if the last line of the error
message looks like

Error-location: nnn

then libpq should recognize that, strip the line out of the saved
textual error message, and make the location value available through
a new API call.

The reason I suggest a label is that we could further extend this
protocol to handle some other things that people have been griping
about for a long time: providing identifying error code numbers that
client code could rely on instead of trying to match against the error
text, and separating out the info about which routine generated the
error (which is mighty handy for backend debugging but is useless
info for Joe Average user).  Someday the message being sent back
might look less like

ERROR: relation_info: Relation 12345 not found

and more like

ERROR: Failed to find relation OID 12345 in system catalogs
Error-code: 4242
Reporting-routine: relation_info, plancat.c line 543

of which only the first line is really meant for the user.

Of course, making that happen will be a lot of work, and I'm not
asking you to volunteer for it.  But what you do now should fit
in with further development of the error handling stuff...
        regards, tom lane


Re: [HACKERS] Re: [PATCHES] Patch for more readable parse error messages

From
Jeroen van Vianen
Date:
At 11:12 22-02-00 -0500, Tom Lane wrote:
>Jeroen van Vianen <jeroen@design.nl> writes:
> > What I see now is something like this (for psql):
>
> > psql sends a query
> > psql reads response
> >          if response is error
> >                  get error location and find context in which error 
> occurred
> >                  print error message, with error location and context
> >          otherwise
> >                  do what it used to do
>
> > and for the other interfaces nothing changes.
>
> > This is something I might be able to implement for 7.1.
>
>This looks much better to me than doing it in the backend.  What still
>needs a little thought is how to send back the error location from
>backend to client app.
>
>I'd be inclined to say that the location info should be imbedded as
>text in the existing textual error message, rather than trying to add
>a separate message with a machine-readable location value.  The first
>way is much less likely to create compatibility problems with old client
>apps.  One way to do it is to say that if the last line of the error
>message looks like
>
>Error-location: nnn
>
>then libpq should recognize that, strip the line out of the saved
>textual error message, and make the location value available through
>a new API call.

Isn't it possible to get this kind of information from a call to a new API 
struct errorinfo *  PQerrorInfo(conn) where the struct contains info about 
the error message, location and code, rather than a call to 
PQerrorMessage(conn) ?

>The reason I suggest a label is that we could further extend this
>protocol to handle some other things that people have been griping
>about for a long time: providing identifying error code numbers that
>client code could rely on instead of trying to match against the error
>text, and separating out the info about which routine generated the
>error (which is mighty handy for backend debugging but is useless
>info for Joe Average user).  Someday the message being sent back
>might look less like
>
>ERROR: relation_info: Relation 12345 not found
>
>and more like
>
>ERROR: Failed to find relation OID 12345 in system catalogs
>Error-code: 4242
>Reporting-routine: relation_info, plancat.c line 543
>
>of which only the first line is really meant for the user.

This might even allow the client app to write out a customized error 
message, instead of 'foreign key ... violation' write 'You cannot delete 
any ... when there are still ...', based upon error codes.

>Of course, making that happen will be a lot of work, and I'm not
>asking you to volunteer for it.  But what you do now should fit
>in with further development of the error handling stuff...



Jeroen



Re: [HACKERS] Re: [PATCHES] Patch for more readable parse error messages

From
Peter Eisentraut
Date:
On 2000-02-24, Jeroen van Vianen mentioned:

> At 11:12 22-02-00 -0500, Tom Lane wrote:

> >I'd be inclined to say that the location info should be imbedded as
> >text in the existing textual error message, rather than trying to add
> >a separate message with a machine-readable location value.

> Isn't it possible to get this kind of information from a call to a new API 
> struct errorinfo *  PQerrorInfo(conn) where the struct contains info about 
> the error message, location and code, rather than a call to 
> PQerrorMessage(conn) ?

IMHO, the use of error messages in PostgreSQL has a big conceptual
problem. It's only too tempting to write elog(ERROR, "I don't know what to
do now.") anywhere and any time. This is very convenient for the
developers but not very nice for client applications that want to
recognize, categorize, and recover from errors. There isn't even a clean
separation of perfectly normal user-level errors ("referential integrity
violation") and internal errors (bugs) ("can't attach node 718 to
T_ParseNodeFoo"). Sure, there's FATAL, but it's not always appropriate.

Chapter 22 of SQL92 defines error codes ("SQLSTATE") for (presumably)
every condition that could come up. It has classes and subclasses and
its code space is extensible. It would be very nice if we could classify
error messages in the backend according to that list and, say, do an
error(PGE_TRIGGERED_DATA_CHANGE_VIOLATION);

instead. The frontend could then call PQsqlstate(connection) to get this
code, or it could call something equivalent to strerror that would convert
this code to a string (potentially language-dependent even). If someone
wants to communicate an internal yet non-fatal error, there would be a
special code reserved for it, telling the client application that it might
as well forget about it. Legacy applications could still call
PQerrorMessage which would interally call the above two.

A necessary extension to the above would be a way to pass along supportive 
data. The tricky part will be to figure out a syntax that is not too
cumbersome, not too restrictive, and encourages help by the compiler. For
example,
error(PG_PARSE_ERROR, 2345)error(PG_PARSE_ERROR(2345))error(PG_PARSE_ERROR, errorIntData(2345))error(PG_INTERNAL,
errorStrData("I'mway lost"))
 

or something hopefully much better. If error() is made a macro, then we
could include file and line number and have some libpq accessor function
for them. Somehow, the client should also be able to access the "2345"
directly.

In any case, I believe that the actual error message string should be
assembled in the front-end. I'm not too fond of the idea of letting
clients parse out the interesting parts of an error out of a blob of text.

Comments? Anyone interested? This would be very dear to my heart so I'd be
very willing to spend a lot of time on it.

-- 
Peter Eisentraut                  Sernanders väg 10:115
peter_e@gmx.net                   75262 Uppsala
http://yi.org/peter-e/            Sweden



Re: [HACKERS] Re: [PATCHES] Patch for more readable parse error messages

From
Bruce Momjian
Date:
> In any case, I believe that the actual error message string should be
> assembled in the front-end. I'm not too fond of the idea of letting
> clients parse out the interesting parts of an error out of a blob of text.
> 
> Comments? Anyone interested? This would be very dear to my heart so I'd be
> very willing to spend a lot of time on it.

Vadim strongly believes in error mesage numbers.  We certainly should do
better, if only to print a code before the error code or something.

--  Bruce Momjian                        |  http://www.op.net/~candle pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: [HACKERS] Re: [PATCHES] Patch for more readable parse error messages

From
Tom Lane
Date:
Peter Eisentraut <peter_e@gmx.net> writes:
> IMHO, the use of error messages in PostgreSQL has a big conceptual
> problem. It's only too tempting to write elog(ERROR, "I don't know what to
> do now.") anywhere and any time. This is very convenient for the
> developers but not very nice for client applications that want to
> recognize, categorize, and recover from errors.

The vast majority of the one-off error messages are internal consistency
checks.  It seems to me that a workable compromise is to insist on
standardized error codes/texts for reporting user mistakes, but to
continue to allow spur-of-the-moment messages for internal errors.
Most or all internal errors would have the same classification anyway
from the point of view of an application trying to decide what to do,
so they could all share one or a few "error ID numbers".

> A necessary extension to the above would be a way to pass along supportive 
> data. The tricky part will be to figure out a syntax that is not too
> cumbersome, not too restrictive, and encourages help by the compiler.

A printf/elog-like syntax should still work --- the message catalog that
PGE_TRIGGERED_DATA_CHANGE_VIOLATION indexes into would contain strings
that still have %-escapes, but that shouldn't make life any more
difficult for internationalization.  And we do have the opportunity
to check mistakes with gcc, if we stick to the standard printf escapes.

Or do we?  Hmm ... not if the error message text isn't available at
the call site ...  Here's a thought: suppose that error code macros like
PGE_TRIGGERED_DATA_CHANGE_VIOLATION normally expand to an error code
number, which eventually gets used as an index into a localizable table
of error format strings; but we have the option to run with header files
that define all these macros as the actual error message literal
strings.  Then gcc could check for parameter mismatch in that case.
For development work that might even be the normal thing, and only
in production scenarios would you introduce the extra level of
indirection to get to an error message string.

> In any case, I believe that the actual error message string should be
> assembled in the front-end.

That will not work, because the set of possible error messages will
undoubtedly change with every backend release, and we do *not* want
to get into a situation where out-of-date clients mean you get no
error message (or worse, a wrong error message).  It will be better
to have the message table on the backend side.  As long as the backend
ships an identifying code number along with the message text, I think
that will satisfy the needs of applications to avoid reverse-parsing
error messages.

Other than that, I agree with everything you say ;-)

> Comments? Anyone interested? This would be very dear to my heart so I'd be
> very willing to spend a lot of time on it.

It will take a lot of time to clean this up, but I think everyone agrees
we need to do it.  It's just been a matter of someone taking on the job.
        regards, tom lane


Re: [HACKERS] Re: [PATCHES] Patch for more readable parse error messages

From
Don Baccus
Date:
At 06:50 PM 2/24/00 -0500, Bruce Momjian wrote:
>> In any case, I believe that the actual error message string should be
>> assembled in the front-end. I'm not too fond of the idea of letting
>> clients parse out the interesting parts of an error out of a blob of text.
>> 
>> Comments? Anyone interested? This would be very dear to my heart so I'd be
>> very willing to spend a lot of time on it.
>
>Vadim strongly believes in error mesage numbers.  We certainly should do
>better, if only to print a code before the error code or something.

I do, too.  Anyone else with a language implementation background is likely
to share that bias.

For starters ... you can at least imagine doing things like provide error
messages in languages other than English.  Actually...Vadim could probably
force the issue by commiting a version with all the error messages in
Russian!   Hmmm...wonder if he's thought of that? :)

And for applications it often makes a lot more sense to just get a 
defined code.

When I improved on the AOLserver driver for Postgres, one of my goals
was to have it survive the closing of a backend.  This gets less 
crucial with each bug fix, but, heck ... the backend still pees its
pants and crashes occasionally, let's face it.  In this case, the
driver wants to reestablish the connection to the backend (because
it's being managed as part of a persistent pool of connections by
the web server) but return an error.

Afterwards, all other backends close themselves and pass back a
delightfully wordy message that one should retry their query because
it didn't really crash, but rather is closing just in case shared
memory has been corrupted by the very naughty backend that really
did crash.  In this case, the driver wants to reconnect and 
retry the query, and if it succeeds return normally, with the
web server none the wiser.

(works great, BTW)

There's no documented way to distinguish between the two kinds of
backend closures that I could find.  Interpreting the string in
general seems to be how one is expected to probe to determine exactly
what has happened, not only in this case but with other errors, too.

This sucks, IMO.

It turns out there's a trivial way to distinguish these two particular
cases I mention, without resorting to looking at the actual error message,
but I think it illustrates the general kludginess of returning strings
with no error code.




- Don Baccus, Portland OR <dhogaza@pacifier.com> Nature photos, on-line guides, Pacific Northwest Rare Bird Alert
Serviceand other goodies at http://donb.photo.net.
 


Re: [HACKERS] Re: [PATCHES] Patch for more readable parse error messages

From
Don Baccus
Date:
At 07:17 PM 2/24/00 -0500, Tom Lane wrote:

>The vast majority of the one-off error messages are internal consistency
>checks.  It seems to me that a workable compromise is to insist on
>standardized error codes/texts for reporting user mistakes, but to
>continue to allow spur-of-the-moment messages for internal errors.
>Most or all internal errors would have the same classification anyway
>from the point of view of an application trying to decide what to do,
>so they could all share one or a few "error ID numbers".

I have no problem with this.  Why not just prepend them with an "internal"
error code?  Clients can't really do much other than gasp "omigosh!" when
confronted with an internal error anyway...

>Or do we?  Hmm ... not if the error message text isn't available at
>the call site ...  Here's a thought: suppose that error code macros like
>PGE_TRIGGERED_DATA_CHANGE_VIOLATION normally expand to an error code
>number, which eventually gets used as an index into a localizable table
>of error format strings; but we have the option to run with header files
>that define all these macros as the actual error message literal
>strings.  Then gcc could check for parameter mismatch in that case.
>For development work that might even be the normal thing, and only
>in production scenarios would you introduce the extra level of
>indirection to get to an error message string.

Something like this sounds like a fine.

>> In any case, I believe that the actual error message string should be
>> assembled in the front-end.
>
>That will not work, because the set of possible error messages will
>undoubtedly change with every backend release, and we do *not* want
>to get into a situation where out-of-date clients mean you get no
>error message (or worse, a wrong error message).  It will be better
>to have the message table on the backend side.

Yes, this is where it belongs.  An application gets an error number,
then asks for a message to go with it if it wants one.  Or, the
error's returned as an error code and message, either way.  

>  As long as the backend
>ships an identifying code number along with the message text, I think
>that will satisfy the needs of applications to avoid reverse-parsing
>error messages.

Yep.

>
>Other than that, I agree with everything you say ;-)
>
>> Comments? Anyone interested? This would be very dear to my heart so I'd be
>> very willing to spend a lot of time on it.
>
>It will take a lot of time to clean this up, but I think everyone agrees
>we need to do it.  It's just been a matter of someone taking on the job.

Go, Peter!



- Don Baccus, Portland OR <dhogaza@pacifier.com> Nature photos, on-line guides, Pacific Northwest Rare Bird Alert
Serviceand other goodies at http://donb.photo.net.