Re: [COMMITTERS] pgsql: Mark JSON error detail messages for translation. - Mailing list pgsql-hackers

From Tom Lane
Subject Re: [COMMITTERS] pgsql: Mark JSON error detail messages for translation.
Date
Msg-id 28472.1339552376@sss.pgh.pa.us
In response to Re: [COMMITTERS] pgsql: Mark JSON error detail messages for translation.  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: [COMMITTERS] pgsql: Mark JSON error detail messages for translation.
List pgsql-hackers
I wrote:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
>> I am not sure about the idea of letting the detail run to the end of the
>> line; that would be problematic should the line be long (there might not
>> be newlines in the literal at all, which is not that unusual).  I think
>> it should be truncated at, say, 76 chars or so.

> Yeah, I was wondering about trying to provide a given amount of context
> instead of fixing it to "one line".  We could do something like
> (1) back up N characters;
> (2) find the next newline, if there is one at least M characters before
> the error point;
> (3) print from there to the error point.
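
For illustration only, one way to read the three steps above is the following standalone sketch. The function name choose_context_start and the parameters n and m are hypothetical, and it counts bytes rather than multibyte characters; it is not code from the patch.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not from the patch): choose where the displayed
 * context should begin.  n is how far to back up from the error point,
 * m is the minimum distance a newline must have from the error point
 * for us to start right after it.  Distances are in bytes.
 */
static const char *
choose_context_start(const char *input, const char *error_point,
                     size_t n, size_t m)
{
    const char *start;
    const char *nl;

    /* (1) back up n bytes, but never past the start of the input */
    start = ((size_t) (error_point - input) > n) ? error_point - n : input;

    /* (2) find the next newline between that point and the error point */
    nl = memchr(start, '\n', (size_t) (error_point - start));

    /* use it only if it lies at least m bytes before the error point */
    if (nl != NULL && (size_t) (error_point - nl) >= m)
        start = nl + 1;

    /* (3) the caller prints from the returned pointer to error_point */
    return start;
}
```

With n = 9 and m = 5, the input "abc\ndefghij" with the error point at its end yields a context starting at "defghij"; raising m to 9 rejects the newline and keeps the full nine bytes.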

After experimenting with this for a while, I concluded that the above is
overcomplicated, and that we might as well just print up to N characters
of context; in most input, the line breaks are far enough apart that
preferentially breaking at them just leads to not having very much
context.  Also, it seems like it might be a good idea to present the
input as a CONTEXT line, because that provides more space; you can fit
50 or so characters of data without overrunning standard display width.
This gives me output like

regression=# select '{"unique1":8800,"unique2":0,"two":0,"four":0,"ten":0,"twenty":0,"hundred":0,"thousand":800,
"twothous
and":800,"fivethous":3800,"tenthous":8800,"odd":0,"even":1,
"stringu1":"MAAAAA","stringu2":"AAAAAA","string4":"AAAAxx"}'
::json;
ERROR:  invalid input syntax for type json
LINE 1: select '{"unique1":8800,"unique2":0,"two":0,"four":0,"ten":0...
               ^
DETAIL:  Character with value "0x0a" must be escaped.
CONTEXT:  JSON data, line 1: ..."twenty":0,"hundred":0,"thousand":800,
"twothous

regression=# 

I can't give too many examples because I've only bothered to context-ify
this single error case as yet ;-) ... but does this seem like a sane way
to go?

The code for this is as attached.  Note that I'd rip out the normal-path
tracking of line boundaries; it seems better to have a second scan of
the data in the error case and save the cycles in non-error cases.

            ...
            ereport(ERROR,
                    (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                     errmsg("invalid input syntax for type json"),
                     errdetail("Character with value \"0x%02x\" must be escaped.",
                               (unsigned char) *s),
                     report_json_context(lex)));
            ...


/*
 * Report a CONTEXT line for bogus JSON input.
 *
 * The return value isn't meaningful, but we make it non-void so that this
 * can be invoked inside ereport().
 */
static int
report_json_context(JsonLexContext *lex)
{
    char       *context_start;
    char       *context_end;
    char       *line_start;
    int         line_number;
    char       *ctxt;
    int         ctxtlen;

    /* Choose boundaries for the part of the input we will display */
    context_start = lex->input;
    context_end = lex->token_terminator;
    line_start = context_start;
    line_number = 1;
    for (;;)
    {
        /* Always advance over a newline, unless it's the current token */
        if (*context_start == '\n' && context_start < lex->token_start)
        {
            context_start++;
            line_start = context_start;
            line_number++;
            continue;
        }
        /* Otherwise, done as soon as we are close enough to context_end */
        if (context_end - context_start < 50)
            break;
        /* Advance to next multibyte character */
        if (IS_HIGHBIT_SET(*context_start))
            context_start += pg_mblen(context_start);
        else
            context_start++;
    }

    /* Get a null-terminated copy of the data to present */
    ctxtlen = context_end - context_start;
    ctxt = palloc(ctxtlen + 1);
    memcpy(ctxt, context_start, ctxtlen);
    ctxt[ctxtlen] = '\0';

    return errcontext("JSON data, line %d: %s%s",
                      line_number,
                      (context_start > line_start) ? "..." : "",
                      ctxt);
}
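
To try the boundary logic outside the backend, a rough standalone approximation of the function above might look like the sketch below. It assumes single-byte characters (the patch uses pg_mblen() to step over multibyte ones), writes into a caller-supplied buffer with snprintf() instead of calling palloc() and errcontext(), and takes the token pointers as plain arguments. The names format_json_context and CONTEXT_LIMIT are made up for this illustration.

```c
#include <stddef.h>
#include <stdio.h>

#define CONTEXT_LIMIT 50        /* same threshold the patch hard-codes */

/*
 * Hypothetical standalone variant of report_json_context().
 * Formats the context message into buf and returns the line number
 * that appears in it.
 */
static int
format_json_context(const char *input, const char *token_start,
                    const char *token_end, char *buf, size_t buflen)
{
    const char *context_start = input;
    const char *line_start = input;
    int         line_number = 1;

    for (;;)
    {
        /* Always advance over a newline that precedes the current token */
        if (*context_start == '\n' && context_start < token_start)
        {
            context_start++;
            line_start = context_start;
            line_number++;
            continue;
        }
        /* Done once fewer than CONTEXT_LIMIT bytes remain to display */
        if (token_end - context_start < CONTEXT_LIMIT)
            break;
        /* Assumption: single-byte data, so one step is one character */
        context_start++;
    }

    /* Prefix "..." when the display starts partway into a line */
    snprintf(buf, buflen, "JSON data, line %d: %s%.*s",
             line_number,
             (context_start > line_start) ? "..." : "",
             (int) (token_end - context_start), context_start);
    return line_number;
}
```

Because the scan happens only when formatting the message, nothing here runs on the non-error path, which matches the intent of doing a second pass over the data only in the error case.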

        regards, tom lane

