Re: [COMMITTERS] pgsql: Mark JSON error detail messages for translation. - Mailing list pgsql-hackers
From | Tom Lane |
---|---|
Subject | Re: [COMMITTERS] pgsql: Mark JSON error detail messages for translation. |
Date | |
Msg-id | 28472.1339552376@sss.pgh.pa.us Whole thread Raw |
In response to | Re: [COMMITTERS] pgsql: Mark JSON error detail messages for translation. (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: [COMMITTERS] pgsql: Mark JSON error detail messages for translation.
|
List | pgsql-hackers |
I wrote: > Alvaro Herrera <alvherre@commandprompt.com> writes: >> I am not sure about the idea of letting the detail run to the end of the >> line; that would be problematic should the line be long (there might not >> be newlines in the literal at all, which is not that unusual). I think >> it should be truncated at, say, 76 chars or so. > Yeah, I was wondering about trying to provide a given amount of context > instead of fixing it to "one line". We could do something like > (1) back up N characters; > (2) find the next newline, if there is one at least M characters before > the error point; > (3) print from there to the error point. After experimenting with this for awhile I concluded that the above is overcomplicated, and that we might as well just print up to N characters of context; in most input, the line breaks are far enough apart that preferentially breaking at them just leads to not having very much context. Also, it seems like it might be a good idea to present the input as a CONTEXT line, because that provides more space; you can fit 50 or so characters of data without overrunning standard display width. This gives me output like regression=# select '{"unique1":8800,"unique2":0,"two":0,"four":0,"ten":0,"twenty":0,"hundred":0,"thousand":800, "twothous and":800,"fivethous":3800,"tenthous":8800,"odd":0,"even":1, "stringu1":"MAAAAA","stringu2":"AAAAAA","string4":"AAAAxx"}' ::json; ERROR: invalid input syntax for type json LINE 1: select '{"unique1":8800,"unique2":0,"two":0,"four":0,"ten":0... ^ DETAIL: Character with value "0x0a" must be escaped. CONTEXT: JSON data, line 1: ..."twenty":0,"hundred":0,"thousand":800, "twothous regression=# I can't give too many examples because I've only bothered to context-ify this single error case as yet ;-) ... but does this seem like a sane way to go? The code for this is as attached. Note that I'd rip out the normal-path tracking of line boundaries; it seems better to have a second scan of the data in the error case and save the cycles in non-error cases. ... ereport(ERROR, (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION), errmsg("invalid input syntax for type json"), errdetail("Character with value \"0x%02x\" must be escaped.", (unsigned char) *s), report_json_context(lex))); ... /** Report a CONTEXT line for bogus JSON input.** The return value isn't meaningful, but we make it non-void so that this*can be invoked inside ereport().*/ static int report_json_context(JsonLexContext *lex) { char *context_start; char *context_end; char *line_start; int line_number; char *ctxt; int ctxtlen; /* Choose boundaries for the part of the input we will display */ context_start = lex->input; context_end = lex->token_terminator; line_start = context_start; line_number = 1; for (;;) { /* Always advance over a newline,unless it's the current token */ if (*context_start == '\n' && context_start < lex->token_start) { context_start++; line_start = context_start; line_number++; continue; } /* Otherwise, done as soon as we are close enough to context_end */ if (context_end - context_start < 50) break; /* Advance to next multibyte character */ if (IS_HIGHBIT_SET(*context_start)) context_start+= pg_mblen(context_start); else context_start++; } /* Get a null-terminated copy of the data to present */ ctxtlen = context_end - context_start; ctxt = palloc(ctxtlen+ 1); memcpy(ctxt, context_start, ctxtlen); ctxt[ctxtlen] = '\0'; return errcontext("JSON data, line %d: %s%s", line_number, (context_start > line_start)? "..." : "", ctxt); } regards, tom lane
pgsql-hackers by date: