Re: [COMMITTERS] pgsql: Mark JSON error detail messages for translation. - Mailing list pgsql-hackers
| From | Tom Lane |
|---|---|
| Subject | Re: [COMMITTERS] pgsql: Mark JSON error detail messages for translation. |
| Date | |
| Msg-id | 28472.1339552376@sss.pgh.pa.us |
| In response to | Re: [COMMITTERS] pgsql: Mark JSON error detail messages for translation. (Tom Lane <tgl@sss.pgh.pa.us>) |
| Responses | Re: [COMMITTERS] pgsql: Mark JSON error detail messages for translation. |
| List | pgsql-hackers |
I wrote:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
>> I am not sure about the idea of letting the detail run to the end of the
>> line; that would be problematic should the line be long (there might not
>> be newlines in the literal at all, which is not that unusual). I think
>> it should be truncated at, say, 76 chars or so.
> Yeah, I was wondering about trying to provide a given amount of context
> instead of fixing it to "one line". We could do something like
> (1) back up N characters;
> (2) find the next newline, if there is one at least M characters before
> the error point;
> (3) print from there to the error point.
After experimenting with this for a while I concluded that the above is
overcomplicated, and that we might as well just print up to N characters
of context; in most input, the line breaks are far enough apart that
preferentially breaking at them just leads to not having very much
context. Also, it seems like it might be a good idea to present the
input as a CONTEXT line, because that provides more space; you can fit
50 or so characters of data without overrunning standard display width.
This gives me output like
regression=# select '{"unique1":8800,"unique2":0,"two":0,"four":0,"ten":0,"twenty":0,"hundred":0,"thousand":800,
"twothous
and":800,"fivethous":3800,"tenthous":8800,"odd":0,"even":1,
"stringu1":"MAAAAA","stringu2":"AAAAAA","string4":"AAAAxx"}'
::json;
ERROR: invalid input syntax for type json
LINE 1: select '{"unique1":8800,"unique2":0,"two":0,"four":0,"ten":0...
               ^
DETAIL: Character with value "0x0a" must be escaped.
CONTEXT: JSON data, line 1: ..."twenty":0,"hundred":0,"thousand":800,
"twothous
regression=#
I can't give too many examples because I've only bothered to context-ify
this single error case as yet ;-) ... but does this seem like a sane way
to go?
The code for this is as attached. Note that I'd rip out the normal-path
tracking of line boundaries; it seems better to have a second scan of
the data in the error case and save the cycles in non-error cases.
...
    ereport(ERROR,
            (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
             errmsg("invalid input syntax for type json"),
             errdetail("Character with value \"0x%02x\" must be escaped.",
                       (unsigned char) *s),
             report_json_context(lex)));
...

/*
 * Report a CONTEXT line for bogus JSON input.
 *
 * The return value isn't meaningful, but we make it non-void so that this
 * can be invoked inside ereport().
 */
static int
report_json_context(JsonLexContext *lex)
{
    char       *context_start;
    char       *context_end;
    char       *line_start;
    int         line_number;
    char       *ctxt;
    int         ctxtlen;

    /* Choose boundaries for the part of the input we will display */
    context_start = lex->input;
    context_end = lex->token_terminator;
    line_start = context_start;
    line_number = 1;
    for (;;)
    {
        /* Always advance over a newline, unless it's the current token */
        if (*context_start == '\n' && context_start < lex->token_start)
        {
            context_start++;
            line_start = context_start;
            line_number++;
            continue;
        }
        /* Otherwise, done as soon as we are close enough to context_end */
        if (context_end - context_start < 50)
            break;
        /* Advance to next multibyte character */
        if (IS_HIGHBIT_SET(*context_start))
            context_start += pg_mblen(context_start);
        else
            context_start++;
    }

    /* Get a null-terminated copy of the data to present */
    ctxtlen = context_end - context_start;
    ctxt = palloc(ctxtlen + 1);
    memcpy(ctxt, context_start, ctxtlen);
    ctxt[ctxtlen] = '\0';

    return errcontext("JSON data, line %d: %s%s",
                      line_number,
                      (context_start > line_start) ? "..." : "",
                      ctxt);
}
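
For anyone who wants to poke at the boundary-selection loop outside the backend, here is a minimal standalone sketch of the same logic. It is not part of the patch, and it rests on several assumptions: the names format_json_context and CONTEXT_LIMIT are hypothetical, the input is taken to be single-byte (ASCII) so the pg_mblen() step becomes a plain increment, and snprintf() into a caller-supplied buffer stands in for palloc() plus errcontext().

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Same cutoff as the "< 50" test in report_json_context() */
#define CONTEXT_LIMIT 50

/*
 * Hypothetical standalone analogue of report_json_context().
 *
 * input       - start of the JSON text
 * token_start - start of the token being lexed when the error was found
 * context_end - one past the last byte to display
 *
 * Assumes single-byte input.  Writes the CONTEXT text into buf and
 * returns the line number it reports.
 */
static int
format_json_context(const char *input, const char *token_start,
                    const char *context_end, char *buf, size_t buflen)
{
    const char *context_start = input;
    const char *line_start = input;
    int         line_number = 1;

    assert(input <= token_start && token_start <= context_end);

    for (;;)
    {
        /* Always step over a newline, unless it is inside the current token */
        if (*context_start == '\n' && context_start < token_start)
        {
            context_start++;
            line_start = context_start;
            line_number++;
            continue;
        }
        /* Stop once fewer than CONTEXT_LIMIT bytes remain to display */
        if (context_end - context_start < CONTEXT_LIMIT)
            break;
        /* Single-byte assumption: the real code uses pg_mblen() here */
        context_start++;
    }

    /* Prefix "..." only when the display starts partway through a line */
    snprintf(buf, buflen, "JSON data, line %d: %s%.*s",
             line_number,
             (context_start > line_start) ? "..." : "",
             (int) (context_end - context_start), context_start);
    return line_number;
}
```

Called on the sample input above, with token_start at the opening quote of "twothous and context_end just past the unescaped newline, this produces the same text as the CONTEXT line shown earlier. The reported line number is that of the first displayed byte, which is why that example says line 1 even though the offending newline sits on the second line.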
regards, tom lane