Re: Error message style guide - Mailing list pgsql-hackers

From Steve Crawford
Subject Re: Error message style guide
Date
Msg-id 20030315011010.78E09103C2@polaris.pinpointresearch.com
In response to Error message style guide  (Peter Eisentraut <peter_e@gmx.net>)
Responses Re: Error message style guide
List pgsql-hackers
One thing that would be great from a user's perspective (and which might 
reduce the volume of support questions as well) is to uniquely number all 
errors as in:
Error 1036: the foo could not faz the fleep

The advantages of this include:
Ease of documentation: a manual could contain a section discussing each 
message. Similarly, an error number could be used to easily access a web page 
discussing the error in more detail than a simple message allows.

Ease of searching: Google searches like "postgresql error 1036" tend to yield 
lots of relevant information - I've found that including an error number 
where available in a Google search yields far better results than searching 
with text alone.

Pinpointing trouble: unique IDs would mean that anyone looking into a 
specific problem would know exactly which line of code in PostgreSQL sent the 
error.

If one wants to get fancy the numbers could run in series depending on the 
category of error similar to http/smtp/ftp response codes.

Of course this would require appointing a keeper of the error codes who would 
dole them out as required to prevent dups.
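
To make the idea concrete, here is a rough sketch of what such a "keeper" could look like if it were a small registry rather than a person. Everything in it is an assumption for illustration only: the category names, the number ranges, and the ErrorCodeRegistry class are hypothetical and not an existing or proposed PostgreSQL scheme.

```python
# Hypothetical sketch: the categories and ranges below are illustrative
# assumptions, not an actual PostgreSQL numbering scheme.
CATEGORY_RANGES = {
    "connection": range(1000, 2000),
    "syntax": range(2000, 3000),
    "storage": range(3000, 4000),
}


class ErrorCodeRegistry:
    """Plays the 'keeper of the error codes': hands out unique numbers."""

    def __init__(self):
        self._messages = {}  # code -> message text

    def allocate(self, category, message):
        """Return the lowest unused code in the category's series."""
        for code in CATEGORY_RANGES[category]:
            if code not in self._messages:
                self._messages[code] = message
                return code
        raise RuntimeError(f"no codes left in category {category!r}")

    def category_of(self, code):
        """Map a code back to its series, like reading the 4xx in HTTP."""
        for name, codes in CATEGORY_RANGES.items():
            if code in codes:
                return name
        raise KeyError(code)

    def format(self, code):
        """Render the message in the 'Error NNNN: text' form shown above."""
        return f"Error {code}: {self._messages[code]}"


registry = ErrorCodeRegistry()
faz_code = registry.allocate("connection", "the foo could not faz the fleep")
print(registry.format(faz_code))
```

Because every number is handed out from one place, duplicates cannot occur, and the range a number falls in tells you the category without looking anything up.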

Just a thought - now for a pint of Guinness.

Cheers,
Steve




On Friday 14 March 2003 4:43 pm, Peter Eisentraut wrote:
> Some people were mentioning an error message style guide.  Here's a start
> of one that I put together a while ago.  Feel free to consider it.
>
>
> Size of message
> ---------------
>
> The main part of a message should be at most 72 characters long.  For
> embedded format specifiers (%s, %d, etc.), a reasonable estimate of
> the expected string should be taken into account.  The rest should be
> distributed to the detail and the hint parts.
>
> RATIONALE: 72 characters is typically considered an appropriate line
> length on terminal-type displays. Consequently, this length is fair to
> psql users and readers of the server log.  Also, longer messages will
> tend to get chatty.
>
>
> Newlines, tabs
> --------------
>
> A message may not contain a newline or a tab.
>
> RATIONALE: Messages are not necessarily displayed on terminal-type
> displays.  In GUI displays or browsers these formatting instructions
> are at best ignored.
>
> QUESTION: I think formatting characters should be avoided in detail
> and hint messages as well, for the same reasons.
>
>
> Quotation marks
> ---------------
>
> English text should use double quotes when quoting is appropriate.
> Text in other languages should consistently use one kind of quotes
> that is consistent with publishing customs and computer output of
> other programs.
>
> RATIONALE: The choice of double quotes over single quotes is somewhat
> arbitrary, but tends to be the preferred use.  Do not distinguish the
> kind of quotes depending on the type of object in SQL terms (i.e.,
> strings single quoted, identifiers double quoted).  This is a
> language-internal technical issue that many users aren't even familiar
> with, it won't scale to all quoted terms, it doesn't translate to
> other languages, and it's pretty pointless, too.
>
>
> Use of quotes
> -------------
>
> Use quotes always to denote files, database objects, and other
> variables of a character-string nature.  Do not use them to mark up
> nonvariable items.
>
> RATIONALE: Objects can have names that create ambiguity when embedded
> in a message.  Be consistent about denoting where a plugged-in name
> starts and ends.
>
> NOTE: This format encourages embedding data items into the message in
> grammatical positions instead of the old style 'invalid value: bar'.
>
>
> Punctuation
> -----------
>
> Do not end the message with a period.  Do not even think about ending
> a message with an exclamation point.
>
> RATIONALE: Avoiding punctuation makes it easier for client
> applications to embed the message into a variety of grammatical
> contexts.  Often, messages are not grammatically complete sentences
> anyway.  (And if they're long enough to be more than one sentence,
> split them up.)
>
>
> Upper case vs. lower case
> -------------------------
>
> Use lower case for message wording, including the first letter of the
> message.  Use upper case for SQL commands and key words if the message
> refers to the command string.
>
> RATIONALE: It's easier to make everything look more consistent this
> way, since some messages are complete sentences and some not.
>
>
> Grammar
> -------
>
> Use the active voice.  Use complete sentences when there is an acting
> subject ("A could not do B").  Use telegram style without subject if
> the subject would be the program itself; do not use "I" for the
> program.
>
> RATIONALE: The program is not human.  Don't pretend otherwise.
>
> Instead of multiple sentences, consider using semicolons or commas.
>
> RATIONALE: This avoids peculiar punctuation if you follow the request
> to leave off the final period.
>
>
> Present vs past tense
> ---------------------
>
> There is a nontrivial semantic difference between sentences of the
> form
>
> | could not open file "%s"
>
> and
>
> | cannot open file "%s"
>
> The first one means that the attempt to open the file failed.  The
> message should give a reason, such as "disk full" or "file doesn't
> exist".  The past tense is appropriate because next time the disk
> might not be full anymore or the file in question may exist.
>
> The second form indicates that the functionality of opening the named
> file does not exist at all in the program, or that it's conceptually
> impossible.  The present tense is appropriate because the condition
> will persist indefinitely.
>
> RATIONALE: Granted, the average user will not be able to draw great
> conclusions merely from the tense of the message, but since the
> language provides us with a grammar we should use it correctly.
>
>
> Type of the object
> ------------------
>
> When citing the name of an object, state what kind of object it is.
>
> RATIONALE:  Else no one will know what "foo.bar.baaz" is.
>
>
> Brackets
> --------
>
> Brackets are only to be used in command synopses to denote optional
> arguments, or to denote an array subscript.
>
> RATIONALE: Anything else does not correspond to widely-known customary
> usage and will confuse people.
>
>
> Parentheses
> -----------
>
> Parentheses can be used to separate subsentences when they are
> generated elsewhere.  For example:
> | could not open file %s (%m)
>
> RATIONALE: It would be difficult to account for all possible error codes
> to paste this into a single smooth sentence.  It also looks better and is
> more flexible than colons or dashes to separate the sentences.
>
>
> Reasons for errors
> ------------------
>
> Messages should always state the reason why an error occurred.
> For example:
>
> BAD: could not open file %s
> BETTER: could not open file %s (I/O failure)
>
> If the reason is not known you better fix the code. ;-)
>
>
> Tricky words to avoid
> ---------------------
>
> unable:
>
> "unable" is nearly the passive voice.  Better use "cannot" or "could
> not", as appropriate.
>
> bad:
>
> Error messages like "bad result" are really hard to interpret
> intelligently.  It's better to write why the result is "bad", e.g.,
> "invalid format".
>
> illegal:
>
> "Illegal" stands for a violation of the law, the rest is "invalid".
> Better yet, say why it's invalid.
>
> unknown:
>
> Try to avoid "unknown".  Consider, "error: unknown response".  If you
> don't know what the response is, how do you know it's erroneous?  If,
> however, the error lies in the fact that you don't know the response,
> this wording is clearly confusing.
>
>
> Function names
> --------------
>
> Rather than mentioning what the function or system call was that
> failed, describe what the function was trying to do, e.g., "could not
> open file".  This may admittedly be difficult to do with candidates
> such as "select()".
>
> RATIONALE: Users don't know what all those functions do.
>
>
> Find vs Exists
> --------------
>
> If the program uses a nontrivial algorithm to locate a resource (e.g.,
> a path search) and that algorithm fails, it is fair to say that the
> program couldn't "find" the resource.  If, on the other hand, the
> location of the resource is known and the program cannot locate it
> then just say that the resource doesn't "exist".  Using "find" in this
> case sounds weak and confuses the issue.
>
>
> Proper spelling
> ---------------
>
> Spell out words in full.  For instance, avoid:
>
> spec
> stats
> parens
> auth
> xact
>
> RATIONALE: This will improve consistency.
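
Several of the rules quoted above are mechanical enough to check automatically. The following is only a rough sketch under stated assumptions: check_message and its word list are hypothetical names made up for illustration, it covers just the length, newline/tab, punctuation, capitalization, and word-choice rules, and it measures the raw format string, so it cannot estimate how long %s or %d will expand.

```python
import re

MAX_LENGTH = 72  # limit from the "Size of message" section

# Words the guide advises against, plus its list of abbreviations.
DISCOURAGED = {"unable", "bad", "illegal", "unknown",
               "spec", "stats", "parens", "auth", "xact"}


def check_message(message):
    """Return a list of style complaints for a primary error message."""
    problems = []
    if len(message) > MAX_LENGTH:
        problems.append("longer than 72 characters")
    if "\n" in message or "\t" in message:
        problems.append("contains a newline or tab")
    if message.endswith((".", "!")):
        problems.append("ends with a period or exclamation point")
    words = message.split()
    first = words[0] if words else ""
    # An all-caps first word (e.g. an SQL key word) is allowed;
    # an ordinary capitalized word is not.
    if first[:1].isupper() and not first.isupper():
        problems.append("starts with a capital letter")
    for word in sorted(set(re.findall(r"[a-z]+", message.lower()))):
        if word in DISCOURAGED:
            problems.append(f'uses discouraged word "{word}"')
    return problems


for text in ['could not open file "%s" (%m)', "Unable to open file."]:
    print(repr(text), "->", check_message(text) or "ok")
```

A check like this cannot judge the rules that need human judgment, such as whether a reason is given or whether "could not" versus "cannot" is the right tense.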

