Thread: Mention column name in error messages
Hi all, As far as I know, there is currently no way to find which column is triggering an error on an INSERT or ALTER COLUMN statement. Example: # create table foo(bar varchar(4), baz varchar(2)); CREATE TABLE # insert into foo values ('foo!', 'ok'); INSERT 0 1 # insert into foo values ('foo2!', 'ok'); ERROR: value too long for type character varying(4) # insert into foo values ('foo!', 'ok2'); ERROR: value too long for type character varying(2) I browsed the list and found a conversation from last year (http://www.postgresql.org/message-id/3214.1390155040@sss.pgh.pa.us) that discussed adding the actual value in the output. The behavior I am proposing differs in the sense we will be able to see in the "ERROR: xxxx" what column triggered the error: # create table foo(bar varchar(4), baz varchar(2)); CREATE TABLE # insert into foo values ('foo!', 'ok'); INSERT 0 1 # insert into foo values ('foo2!', 'ok'); ERROR: value too long for bar of type character varying(4) # insert into foo values ('foo!', 'ok2'); ERROR: value too long for baz of type character varying(2) In that same conversation, Tom Lane said: > [...] People have speculated about ways to > name the target column in the error report, which would probably be > far more useful; but it's not real clear how to do that in our system > structure. Given my very restricted knowledge of PG's codebase I am not sure whether my modifications are legitimate or not, so please don't hesitate to comment on it and pointing where things are subpar to PG's codebase. In any case, it's to be considered as WIP for the moment.
Thanks in advance, Franck
Attachment
Franck Verrot <franck@verrot.fr> writes: > As far as I know, there is currently no way to find which column is triggering > an error on an INSERT or ALTER COLUMN statement. Example: Indeed ... > Given my very restricted knowledge of PG's codebase I am not sure whether my > modifications are legitimate or not, so please don't hesitate to comment on > it and pointing where things are subpar to PG's codebase. In any case, it's > to be considered as WIP for the moment. I think the way you're going at this is fundamentally wrong. It amounts to an (undocumented) change to the API for datatype input functions, and the only way it can work is to change *every datatype's* input function to know about it. That's not practical. More, it doesn't cover errors that are thrown from someplace other than a datatype input function. Evaluation errors in domain check constraint expressions are an example that would be very hard to manage this way. And the approach definitely doesn't scale nicely when you start thinking about other contexts wherein a datatype input operation or coercion might fail. What seems more likely to lead to a usable patch is to arrange for the extra information you want to be emitted as error "context", via an error context callback that gets installed at the right times. We already have that set up for datatype-specific errors detected by COPY IN, for example. If you feed garbage data to COPY you get something like ERROR: invalid input syntax for integer: "foo" CONTEXT: COPY int8_tbl, line 2, column q2: "foo" with no need for int8in to be directly aware of the context. You should try adapting that methodology for the cases you're worried about. regards, tom lane
On Wed, Jul 1, 2015 at 12:30 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
What seems more likely to lead to a usable patch is to arrange for the
extra information you want to be emitted as error "context", via an error
context callback that gets installed at the right times. ...
...
with no need for int8in to be directly aware of the context. You should
try adapting that methodology for the cases you're worried about.
Hi Tom (and others),
Sorry it took so long for me to follow up on this, hopefully I found a couple
a hours today to try writing another patch.
In any case, thanks for reviewing my first attempt and taking time to write
such a detailed critique... I've learned a lot!
I am now using the error context callback stack. The current column name
and column type are passed to the callback packed inside a new structure
of type "TransformExprState".
Those information are then passed to `errhint` and will be presented to the
user later on (in case of coercion failure).
Please find the WIP patch attached.
(I've pushed the patch on my GH fork[1] too).
Thanks again,
Franck
Attachment
On Sun, Aug 9, 2015 at 11:44 AM, Franck Verrot <franck@verrot.fr> wrote: > On Wed, Jul 1, 2015 at 12:30 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> What seems more likely to lead to a usable patch is to arrange for the >> extra information you want to be emitted as error "context", via an error >> context callback that gets installed at the right times. ... >> ... >> with no need for int8in to be directly aware of the context. You should >> try adapting that methodology for the cases you're worried about. > > > Hi Tom (and others), > > Sorry it took so long for me to follow up on this, hopefully I found a > couple > a hours today to try writing another patch. > > In any case, thanks for reviewing my first attempt and taking time to write > such a detailed critique... I've learned a lot! > > I am now using the error context callback stack. The current column name > and column type are passed to the callback packed inside a new structure > of type "TransformExprState". > Those information are then passed to `errhint` and will be presented to the > user later on (in case of coercion failure). > > > Please find the WIP patch attached. > (I've pushed the patch on my GH fork[1] too). To make sure this doesn't get forgotten about, you may want to add it here: https://commitfest.postgresql.org/action/commitfest_view/open -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Sun, Aug 9, 2015 at 8:44 AM, Franck Verrot <franck@verrot.fr> wrote:
On Wed, Jul 1, 2015 at 12:30 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:What seems more likely to lead to a usable patch is to arrange for the
extra information you want to be emitted as error "context", via an error
context callback that gets installed at the right times. ...
...
with no need for int8in to be directly aware of the context. You should
try adapting that methodology for the cases you're worried about.
Hi Tom (and others),Sorry it took so long for me to follow up on this, hopefully I found a couplea hours today to try writing another patch.In any case, thanks for reviewing my first attempt and taking time to writesuch a detailed critique... I've learned a lot!I am now using the error context callback stack. The current column nameand column type are passed to the callback packed inside a new structureof type "TransformExprState".Those information are then passed to `errhint` and will be presented to theuser later on (in case of coercion failure).Please find the WIP patch attached.(I've pushed the patch on my GH fork[1] too).
Hi Franck,
I like the idea of having the column name.
I took this for a test drive, and had some comments.on the user visible parts.
The hints you add end in a new line, which then gives two new lines once they are emitted. This is contrary to how other HINTs are formatted.
Other HINTs are complete sentences (start with a capital letter, end with a period).
But I think these belong as CONTEXT or as DETAIL, not as HINT. The messages are giving me details about where (which column) the error occured, they aren't giving suggestions to me about what to do about it.
Cheers,
Jeff
Jeff Janes <jeff.janes@gmail.com> writes: > The hints you add end in a new line, which then gives two new lines once > they are emitted. This is contrary to how other HINTs are formatted. > Other HINTs are complete sentences (start with a capital letter, end with a > period). > But I think these belong as CONTEXT or as DETAIL, not as HINT. The > messages are giving me details about where (which column) the error > occured, they aren't giving suggestions to me about what to do about it. We have specific style guidelines about this: http://www.postgresql.org/docs/devel/static/error-message-reporting.html http://www.postgresql.org/docs/devel/static/error-style-guide.html regards, tom lane
On Wed, Aug 19, 2015 at 11:31 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
I took this for a test drive, and had some comments.on the user visible parts.[...]But I think these belong as CONTEXT or as DETAIL, not as HINT. The messages are giving me details about where (which column) the error occurred, they aren't giving suggestions to me about what to do about it.
Hi Jeff,
Somehow my email never went through. So thank you for the test drive, I've tried to update my patch (that you will find attached) to reflect what you said (DETAIL instead of HINT).
Thanks Tom for pointing me at the docs, I clearly wasn't respecting the guidelines.
Cheers,
Franck
Attachment
Franck Verrot wrote: > On Wed, Aug 19, 2015 at 11:31 PM, Jeff Janes <jeff.janes@gmail.com> wrote: > > > > I took this for a test drive, and had some comments.on the user visible > > parts. > > [...] > > But I think these belong as CONTEXT or as DETAIL, not as HINT. The > > messages are giving me details about where (which column) the error > > occurred, they aren't giving suggestions to me about what to do about it. > > > > Hi Jeff, > > Somehow my email never went through. So thank you for the test drive, I've > tried to update my patch (that you will find attached) to reflect what you > said (DETAIL instead of HINT). I think you need errcontext(), not errdetail(). The most important difference is that you can have multiple CONTEXT lines but only one DETAIL; I think if you attach a second errdetail(), the first one is overwritten. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Tue, Sep 15, 2015 at 7:12 AM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
I think you need errcontext(), not errdetail(). The most important
difference is that you can have multiple CONTEXT lines but only one
DETAIL; I think if you attach a second errdetail(), the first one is
overwritten.
Indeed, the first errdetail() will be overwritten. Here's another try.
Thanks again for looking at my patches!
Franck Verrot
Attachment
On 2015-09-15 12:00:25 +0200, Franck Verrot wrote: > diff --git a/src/backend/parser/parse_target.c b/src/backend/parser/parse_target.c > index 1b3fcd6..78f82cd 100644 > --- a/src/backend/parser/parse_target.c > +++ b/src/backend/parser/parse_target.c > @@ -389,6 +389,17 @@ markTargetListOrigin(ParseState *pstate, TargetEntry *tle, > * omits the column name list. So we should usually prefer to use > * exprLocation(expr) for errors that can happen in a default INSERT. > */ > + > +void > +TransformExprCallback(void *arg) > +{ > + TransformExprState *state = (TransformExprState *) arg; > + > + errcontext("coercion failed for column \"%s\" of type %s", > + state->column_name, > + format_type_be(state->expected_type)); > +} Why is this not a static function? > Expr * > transformAssignedExpr(ParseState *pstate, > Expr *expr, > @@ -405,6 +416,14 @@ transformAssignedExpr(ParseState *pstate, > Oid attrcollation; /* collation of target column */ > ParseExprKind sv_expr_kind; > > + ErrorContextCallback errcallback; > + TransformExprState testate; Why the newline between the declarations? Why none afterwards? > diff --git a/src/include/parser/parse_target.h b/src/include/parser/parse_target.h > index a86b79d..5e12c12 100644 > --- a/src/include/parser/parse_target.h > +++ b/src/include/parser/parse_target.h > @@ -42,4 +42,11 @@ extern TupleDesc expandRecordVariable(ParseState *pstate, Var *var, > extern char *FigureColname(Node *node); > extern char *FigureIndexColname(Node *node); > > +/* Support for TransformExprCallback */ > +typedef struct TransformExprState > +{ > + const char *column_name; > + Oid expected_type; > +} TransformExprState; I see no need for this to be a struct defined in the header. Given that TransformExprCallback isn't public, and the whole thing is specific to TransformExprState... My major complaint though, is that this doesn't contain any tests... Could you update the patch to address these issues? Greetings, Andres Freund
On Sun, Oct 4, 2015 at 12:23 AM, Andres Freund <andres@anarazel.de> wrote: > On 2015-09-15 12:00:25 +0200, Franck Verrot wrote: >> diff --git a/src/backend/parser/parse_target.c b/src/backend/parser/parse_target.c >> index 1b3fcd6..78f82cd 100644 >> --- a/src/backend/parser/parse_target.c >> +++ b/src/backend/parser/parse_target.c >> @@ -389,6 +389,17 @@ markTargetListOrigin(ParseState *pstate, TargetEntry *tle, >> * omits the column name list. So we should usually prefer to use >> * exprLocation(expr) for errors that can happen in a default INSERT. >> */ >> + >> +void >> +TransformExprCallback(void *arg) >> +{ >> + TransformExprState *state = (TransformExprState *) arg; >> + >> + errcontext("coercion failed for column \"%s\" of type %s", >> + state->column_name, >> + format_type_be(state->expected_type)); >> +} > > Why is this not a static function? > >> Expr * >> transformAssignedExpr(ParseState *pstate, >> Expr *expr, >> @@ -405,6 +416,14 @@ transformAssignedExpr(ParseState *pstate, >> Oid attrcollation; /* collation of target column */ >> ParseExprKind sv_expr_kind; >> >> + ErrorContextCallback errcallback; >> + TransformExprState testate; > > Why the newline between the declarations? Why none afterwards? > >> diff --git a/src/include/parser/parse_target.h b/src/include/parser/parse_target.h >> index a86b79d..5e12c12 100644 >> --- a/src/include/parser/parse_target.h >> +++ b/src/include/parser/parse_target.h >> @@ -42,4 +42,11 @@ extern TupleDesc expandRecordVariable(ParseState *pstate, Var *var, >> extern char *FigureColname(Node *node); >> extern char *FigureIndexColname(Node *node); >> >> +/* Support for TransformExprCallback */ >> +typedef struct TransformExprState >> +{ >> + const char *column_name; >> + Oid expected_type; >> +} TransformExprState; > > I see no need for this to be a struct defined in the header. Given that > TransformExprCallback isn't public, and the whole thing is specific to > TransformExprState... > > > My major complaint though, is that this doesn't contain any tests... > > > Could you update the patch to address these issues? Patch is marked as returned with feedback, it has been two months since the last review, and comments have not been addressed. -- Michael
Thanks Andres for the review.
--
Michael, please find attached a revised patch addressing, amongst some other changes, the testing issue (`make check` passes now) and the visibility of the ` TransformExprState` struct.
Thanks,
Franck
On Tue, Dec 22, 2015 at 1:49 PM, Michael Paquier <michael.paquier@gmail.com> wrote:
Patch is marked as returned with feedback, it has been two monthsOn Sun, Oct 4, 2015 at 12:23 AM, Andres Freund <andres@anarazel.de> wrote:
> On 2015-09-15 12:00:25 +0200, Franck Verrot wrote:
>> diff --git a/src/backend/parser/parse_target.c b/src/backend/parser/parse_ target.c
>> index 1b3fcd6..78f82cd 100644
>> --- a/src/backend/parser/parse_target.c
>> +++ b/src/backend/parser/parse_target.c
>> @@ -389,6 +389,17 @@ markTargetListOrigin(ParseState *pstate, TargetEntry *tle,
>> * omits the column name list. So we should usually prefer to use
>> * exprLocation(expr) for errors that can happen in a default INSERT.
>> */
>> +
>> +void
>> +TransformExprCallback(void *arg)
>> +{
>> + TransformExprState *state = (TransformExprState *) arg;
>> +
>> + errcontext("coercion failed for column \"%s\" of type %s",
>> + state->column_name,
>> + format_type_be(state->expected_type));
>> +}
>
> Why is this not a static function?
>
>> Expr *
>> transformAssignedExpr(ParseState *pstate,
>> Expr *expr,
>> @@ -405,6 +416,14 @@ transformAssignedExpr(ParseState *pstate,
>> Oid attrcollation; /* collation of target column */
>> ParseExprKind sv_expr_kind;
>>
>> + ErrorContextCallback errcallback;
>> + TransformExprState testate;
>
> Why the newline between the declarations? Why none afterwards?
>
>> diff --git a/src/include/parser/parse_target.h b/src/include/parser/parse_ target.h
>> index a86b79d..5e12c12 100644
>> --- a/src/include/parser/parse_target.h
>> +++ b/src/include/parser/parse_target.h
>> @@ -42,4 +42,11 @@ extern TupleDesc expandRecordVariable(ParseState *pstate, Var *var,
>> extern char *FigureColname(Node *node);
>> extern char *FigureIndexColname(Node *node);
>>
>> +/* Support for TransformExprCallback */
>> +typedef struct TransformExprState
>> +{
>> + const char *column_name;
>> + Oid expected_type;
>> +} TransformExprState;
>
> I see no need for this to be a struct defined in the header. Given that
> TransformExprCallback isn't public, and the whole thing is specific to
> TransformExprState...
>
>
> My major complaint though, is that this doesn't contain any tests...
>
>
> Could you update the patch to address these issues?
since the last review, and comments have not been addressed.
--
Michael
Franck Verrot
Attachment
On Thu, Oct 6, 2016 at 2:58 PM, Franck Verrot <franck@verrot.fr> wrote: > Michael, please find attached a revised patch addressing, amongst some other > changes, the testing issue (`make check` passes now) and the visibility of > the ` TransformExprState` struct. + /* Set up callback to identify error line number. */ + errcallback.callback = TransformExprCallback; Er, no. You want to know at least column name here, not a line number *** /Users/mpaquier/git/postgres/src/test/regress/expected/xml_2.outMon Oct 17 11:32:26 2016 --- /Users/mpaquier/git/postgres/src/test/regress/results/xml.outMon Oct 17 15:58:42 2016 *************** *** 9,14 **** --- 9,15 ---- LINE 1: INSERT INTO xmltest VALUES (3, '<wrong'); ^ DETAIL: line 1:Couldn't find end of Start Tag wrong line 1 + CONTEXT: coercion failed for column "data" of type xml SELECT * FROM xmltest; id | data ----+-------------------- make check is still broken here. You did not update the expected output used when build is done with the --with-libxml switch. It would be good to check other cases as well. On top of that, I have found that a couple of other regression tests are broken in contrib/, citext being one. The context message is specifying only the column type and name. I think that it would be useful to provide as well the relation name. Imagine for example the case of a CTE using an INSERT query... Providing the query type would be also useful I think. You can look at state->p_is_insert for that. In short the context message could just have this shape: CONTEXT { INSERT | UPDATE } relname, column "col", type coltype. Andres wrote: > +/* Support for TransformExprCallback */ > +typedef struct TransformExprState > +{ > + const char *column_name; > + Oid expected_type; > +} TransformExprState; > I see no need for this to be a struct defined in the header. Given that > TransformExprCallback isn't public, and the whole thing is specific to > TransformExprState... That's a matter of taste, really. Personally I find cleaner to declare that with the other static declarations at the top of the fil, and keep the comments of transformAssignedExpr clean of everything. -- Michael
On Mon, Oct 17, 2016 at 3:18 AM, Michael Paquier <michael.paquier@gmail.com> wrote: > Andres wrote: >> +/* Support for TransformExprCallback */ >> +typedef struct TransformExprState >> +{ >> + const char *column_name; >> + Oid expected_type; >> +} TransformExprState; >> I see no need for this to be a struct defined in the header. Given that >> TransformExprCallback isn't public, and the whole thing is specific to >> TransformExprState... > > That's a matter of taste, really. Personally I find cleaner to declare > that with the other static declarations at the top of the fil, and > keep the comments of transformAssignedExpr clean of everything. It's pretty standard practice for PostgreSQL to keep declarations private to particular files whenever they are used only in that file. And I'd argue that it's good practice in general. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Wed, Oct 19, 2016 at 3:14 AM, Robert Haas <robertmhaas@gmail.com> wrote: > On Mon, Oct 17, 2016 at 3:18 AM, Michael Paquier > <michael.paquier@gmail.com> wrote: >> Andres wrote: >>> +/* Support for TransformExprCallback */ >>> +typedef struct TransformExprState >>> +{ >>> + const char *column_name; >>> + Oid expected_type; >>> +} TransformExprState; >>> I see no need for this to be a struct defined in the header. Given that >>> TransformExprCallback isn't public, and the whole thing is specific to >>> TransformExprState... >> >> That's a matter of taste, really. Personally I find cleaner to declare >> that with the other static declarations at the top of the fil, and >> keep the comments of transformAssignedExpr clean of everything. > > It's pretty standard practice for PostgreSQL to keep declarations > private to particular files whenever they are used only in that file. > And I'd argue that it's good practice in general. Yes, definitely. My comment was referring to the fact that the declaration of this structure was not at the top of this .c file as the last version of the patch is doing (would be better to declare them at the beginning of this .c file for clarity actually). It seems like I did not get Andres' review point, sorry for that. -- Michael
On Mon, Oct 17, 2016 at 12:48 PM, Michael Paquier <michael.paquier@gmail.com> wrote: > *** /Users/mpaquier/git/postgres/src/test/regress/expected/xml_2.out > Mon Oct 17 11:32:26 2016 > --- /Users/mpaquier/git/postgres/src/test/regress/results/xml.out > Mon Oct 17 15:58:42 2016 > *************** > *** 9,14 **** > --- 9,15 ---- > LINE 1: INSERT INTO xmltest VALUES (3, '<wrong'); > ^ > DETAIL: line 1: Couldn't find end of Start Tag wrong line 1 > + CONTEXT: coercion failed for column "data" of type xml > SELECT * FROM xmltest; > id | data > ----+-------------------- > make check is still broken here. You did not update the expected > output used when build is done with the --with-libxml switch. It would > be good to check other cases as well. > > On top of that, I have found that a couple of other regression tests > are broken in contrib/, citext being one. I've also tested with the patch. As Michael already pointed out, you should update the expected output for citext and xml regression tests. > > + /* Set up callback to identify error line number. */ > + errcallback.callback = TransformExprCallback; > Er, no. You want to know at least column name here, not a line number > Please change the comment accordingly. > The context message is specifying only the column type and name. I > think that it would be useful to provide as well the relation name. > Imagine for example the case of a CTE using an INSERT query... > Providing the query type would be also useful I think. You can look at > state->p_is_insert for that. In short the context message could just > have this shape: > CONTEXT { INSERT | UPDATE } relname, column "col", type coltype. > +1 for providing relation name in the context message. -- Thanks & Regards, Kuntal Ghosh EnterpriseDB: http://www.enterprisedb.com
On Wed, Oct 26, 2016 at 3:15 PM, Kuntal Ghosh <kuntalghosh.2007@gmail.com> wrote: > On Mon, Oct 17, 2016 at 12:48 PM, Michael Paquier > <michael.paquier@gmail.com> wrote: > >> *** /Users/mpaquier/git/postgres/src/test/regress/expected/xml_2.out >> Mon Oct 17 11:32:26 2016 >> --- /Users/mpaquier/git/postgres/src/test/regress/results/xml.out >> Mon Oct 17 15:58:42 2016 >> *************** >> *** 9,14 **** >> --- 9,15 ---- >> LINE 1: INSERT INTO xmltest VALUES (3, '<wrong'); >> ^ >> DETAIL: line 1: Couldn't find end of Start Tag wrong line 1 >> + CONTEXT: coercion failed for column "data" of type xml >> SELECT * FROM xmltest; >> id | data >> ----+-------------------- >> make check is still broken here. You did not update the expected >> output used when build is done with the --with-libxml switch. It would >> be good to check other cases as well. >> >> On top of that, I have found that a couple of other regression tests >> are broken in contrib/, citext being one. > I've also tested with the patch. As Michael already pointed out, you > should update the expected output for citext and xml regression tests. > >> >> + /* Set up callback to identify error line number. */ >> + errcallback.callback = TransformExprCallback; >> Er, no. You want to know at least column name here, not a line number >> > Please change the comment accordingly. > >> The context message is specifying only the column type and name. I >> think that it would be useful to provide as well the relation name. >> Imagine for example the case of a CTE using an INSERT query... >> Providing the query type would be also useful I think. You can look at >> state->p_is_insert for that. In short the context message could just >> have this shape: >> CONTEXT { INSERT | UPDATE } relname, column "col", type coltype. >> > +1 for providing relation name in the context message. Okay, so I have reworked the patch a bit and finished with the attached, adapting the context message to give more information. I have noticed as well a bug in the patch: the context callback was set before checking if the column used for transformation is checked on being a system column or not, the problem being that the callback could be called without the fields set. I have updated the regression tests that I found, the main portion of the patch being dedicated to that and being sure that all the alternate outputs are correctly refreshed. In this case int8, float4, float8, xml and contrib/citext are the ones impacted by the change with alternate outputs. I am passing that down to a committer for review. The patch looks large, but at 95% it involves diffs in the regression tests, alternative outputs taking a large role in the bloat. -- Michael
Attachment
On Sun, Oct 30, 2016 at 12:04 AM, Michael Paquier <michael.paquier@gmail.com> wrote:
Okay, so I have reworked the patch a bit and finished with the
attached, adapting the context message to give more information. I
have noticed as well a bug in the patch: the context callback was set
before checking if the column used for transformation is checked on
being a system column or not, the problem being that the callback
could be called without the fields set.
Interesting. I wasn't sure it was set at the right time indeed.
I have updated the regression tests that I found, the main portion of
the patch being dedicated to that and being sure that all the
alternate outputs are correctly refreshed. In this case int8, float4,
float8, xml and contrib/citext are the ones impacted by the change
with alternate outputs.
Thanks! I couldn't find time to update it last week, so thanks for chiming in.
I am passing that down to a committer for review. The patch looks
large, but at 95% it involves diffs in the regression tests,
alternative outputs taking a large role in the bloat.
Great, thanks Michael!
--
Franck
Michael Paquier <michael.paquier@gmail.com> writes: > I am passing that down to a committer for review. The patch looks > large, but at 95% it involves diffs in the regression tests, > alternative outputs taking a large role in the bloat. This is kind of cute, but it doesn't seem to cover very much territory, because it only catches errors that are found in the parse stage. For instance, it fails to cover Franck's original example: # create table foo(bar varchar(4), baz varchar(2)); CREATE TABLE # insert into foo values ('foo2!', 'ok'); ERROR: value too long for type character varying(4) That's still all you get, because the parser sets that up as a varchar literal with a runtime length coercion, and the failure doesn't occur till later. In this particular example it happens during the planner's constant-folding stage, but with a non-constant expression it would happen in the executor. Maybe it'd be all right to commit this anyway, but I'm afraid the most common reaction would be "why's it give me this info some of the time, but not when I really need it?" I'm inclined to think that an acceptable patch will need to provide context for the plan-time and run-time cases too, and I'm not very sure where would be a sane place to plug in for those cases. Less important comments: * I don't really see the point of including the column type name in the error message. We don't do that in the COPY context message this is based on. If we were going to print it, I should think we'd need the typmod as well --- eg, the difference between varchar(4) and varchar(4000) could be pretty relevant. * The coding order here is subtly wrong: + /* Set up callback to fetch more details regarding the error */ + errcallback.callback = TransformExprCallback; + errcallback.arg = (void *) &te_state; + errcallback.previous = error_context_stack; + error_context_stack = &errcallback; + te_state.relation_name = RelationGetRelationName(rd); + te_state.column_name = colname; + te_state.expected_type = attrtype; + te_state.is_insert = pstate->p_is_insert; The callback is "live" the instant you assign to error_context_stack; any data it depends on had better be valid before that. In this case you're effectively depending on the assumption that RelationGetRelationName can't throw an error. While it probably can't today, better style would be to set up te_state first, eg + /* Set up callback to fetch more details regarding the error */ + te_state.relation_name = RelationGetRelationName(rd); + te_state.column_name = colname; + te_state.expected_type = attrtype; + te_state.is_insert = pstate->p_is_insert; + errcallback.callback = TransformExprCallback; + errcallback.arg = (void *) &te_state; + errcallback.previous = error_context_stack; + error_context_stack = &errcallback; regards, tom lane
Michael Paquier <michael.paquier@gmail.com> writes:
> I am passing that down to a committer for review. The patch looks
> large, but at 95% it involves diffs in the regression tests,
> alternative outputs taking a large role in the bloat.
This is kind of cute, but it doesn't seem to cover very much territory,
because it only catches errors that are found in the parse stage.
For instance, it fails to cover Franck's original example:[...]
Maybe it'd be all right to commit this anyway, but I'm afraid the most
common reaction would be "why's it give me this info some of the time,
but not when I really need it?" I'm inclined to think that an acceptable
patch will need to provide context for the plan-time and run-time cases
too, and I'm not very sure where would be a sane place to plug in for
those cases.
Agreed.
David J.
On Sat, Nov 5, 2016 at 4:15 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > The callback is "live" the instant you assign to error_context_stack; > any data it depends on had better be valid before that. In this case > you're effectively depending on the assumption that > RelationGetRelationName can't throw an error. Argh, that's obvious. Thanks for the notice. -- Michael
I wrote: > Maybe it'd be all right to commit this anyway, but I'm afraid the most > common reaction would be "why's it give me this info some of the time, > but not when I really need it?" I'm inclined to think that an acceptable > patch will need to provide context for the plan-time and run-time cases > too, and I'm not very sure where would be a sane place to plug in for > those cases. I thought about that a little more. It would probably not be that hard to fix the plan-time-error case, because that type of error would happen while applying constant folding to the target list, ie basically here: parse->targetList = (List *) preprocess_expression(root, (Node *) parse->targetList, EXPRKIND_TARGET); Right now that processes the whole tlist indiscriminately, but it would not be very much additional code to make it iterate over the individual tlist elements and set a callback while processing one that corresponds directly to an insert or update target column. However, getting it to happen for run-time errors seems like a complete mess. The evaluation of the expression doesn't even happen in the ModifyTable node per se, but in some child plan node that has no idea that it's supplying data that's destined to get stored into a table column. I can think of ways around that --- the most obvious one is to invent some kind of wrapper expression node that has no impact on semantics but sets up an errcontext callback. But I'm sure Andres will object, because that would break his idea of getting rid of recursive-descent expression evaluation. And any way of doing this would probably create a measurable performance impact, which nobody would be happy about. So it seems to me that we have to decide whether supplying context only for assignment of constant expressions is good enough. If it isn't, we'd be best off to reject this patch rather than create an expectation among users that such context should be there. And personally, I think it probably isn't good enough --- non-constant expressions would be precisely the case where you most need some localization. The cases that are successfully annotated by the current patch seem to mostly already have error cursor information, which really is good enough IMO --- you can certainly figure out which column corresponds to the textual spot that the cursor is pointing at. I wonder whether we should not try to move in the direction of expanding the set of errors that we can provide error cursor info for. One aspect of that would be making sure that the text of the current statement is always available at runtime. I believe we do always have it somewhere now, but I'm not sure that it's passed through to the executor in any convenient way. The other aspect would then be to provide errlocation information based on the expression node that we're currently executing. That seems do-able probably. The overhead would likely consist of storing a pointer to the current node someplace where an error context callback could find it. Which is not zero cost, but I think it wouldn't be awful. regards, tom lane
I wrote: > ... I wonder whether we should > not try to move in the direction of expanding the set of errors that we > can provide error cursor info for. One aspect of that would be making > sure that the text of the current statement is always available at > runtime. I believe we do always have it somewhere now, but I'm not sure > that it's passed through to the executor in any convenient way. The > other aspect would then be to provide errlocation information based on > the expression node that we're currently executing. That seems do-able > probably. The overhead would likely consist of storing a pointer to the > current node someplace where an error context callback could find it. > Which is not zero cost, but I think it wouldn't be awful. Just to demonstrate what I'm talking about, I made a really dirty proof-of-concept patch, attached. With this, you get results like regression=# create table foo(bar varchar(4), baz varchar(2)); CREATE TABLE regression=# insert into foo values ('foo2!', 'ok'); ERROR: value too long for type character varying(4) LINE 1: insert into foo values ('foo2!', 'ok'); ^ which was the original complaint, and it also works for run-time cases: regression=# insert into foo values (random()::text, 'ok'); ERROR: value too long for type character varying(4) LINE 1: insert into foo values (random()::text, 'ok'); ^ If that seems like a satisfactory behavior, somebody could push forward with turning this into a committable patch. There's a lot to do: * I just used debug_query_string as the statement text source, which means it only works correctly for errors in directly client-issued statements. We'd need some work on appropriately passing the correct text down to execution. (It would be a good idea to decouple elog.c from debug_query_string while at it, I think.) * I didn't do anything about nested execution contexts. You'd want only the innermost one to set the cursor pointer, but as the code stands I think the outermost one would win. * Some attention needs to be paid to factorizing the support nicely --- multiple copies of the callback isn't good, and we might end up with many copies of the setup boilerplate. Maybe the parser's pcb_error_callback infrastructure would be a useful prototype. * I only bothered to hook in execution of function/operator nodes. That's probably a 90% solution already, but it's not meant to be the final version. Likely, doing anything about the last point should wait on Andres' work with linearizing expression execution. As this patch stands, you have to think carefully about where within an ExecEvalFoo function is the right place to set cur_expr, since it shouldn't get set till after control returns from subsidiary nodes. But those recursive calls would go away. regards, tom lane diff --git a/src/backend/executor/execQual.c b/src/backend/executor/execQual.c index 743e7d6..81aa0b8 100644 *** a/src/backend/executor/execQual.c --- b/src/backend/executor/execQual.c *************** *** 44,49 **** --- 44,50 ---- #include "executor/execdebug.h" #include "executor/nodeSubplan.h" #include "funcapi.h" + #include "mb/pg_wchar.h" #include "miscadmin.h" #include "nodes/makefuncs.h" #include "nodes/nodeFuncs.h" *************** *** 51,56 **** --- 52,58 ---- #include "parser/parse_coerce.h" #include "parser/parsetree.h" #include "pgstat.h" + #include "tcop/tcopprot.h" #include "utils/acl.h" #include "utils/builtins.h" #include "utils/date.h" *************** restart: *** 1772,1777 **** --- 1774,1782 ---- fcache->setArgsValid = false; } + /* Treat function as active beginning here */ + econtext->cur_expr = (ExprState *) fcache; + /* * Now call the function, passing the evaluated parameter values. */ *************** restart: *** 1969,1974 **** --- 1974,1980 ---- if (fcinfo->argnull[i]) { *isNull = true; + econtext->cur_expr = NULL; return (Datum) 0; } } *************** restart: *** 1983,1988 **** --- 1989,1996 ---- pgstat_end_function_usage(&fcusage, true); } + econtext->cur_expr = NULL; + return result; } *************** ExecMakeFunctionResultNoSets(FuncExprSta *** 2040,2045 **** --- 2048,2056 ---- } } + /* Treat function as active beginning here */ + econtext->cur_expr = (ExprState *) fcache; + pgstat_init_function_usage(fcinfo, &fcusage); fcinfo->isnull = false; *************** ExecMakeFunctionResultNoSets(FuncExprSta *** 2048,2053 **** --- 2059,2066 ---- pgstat_end_function_usage(&fcusage, true); + econtext->cur_expr = NULL; + return result; } *************** ExecPrepareExpr(Expr *node, EState *esta *** 5309,5314 **** --- 5322,5350 ---- * ---------------------------------------------------------------- */ + static void + ExprErrorCallback(void *arg) + { + ExprContext *econtext = (ExprContext *) arg; + + if (econtext->cur_expr) + { + int location = exprLocation((Node *) econtext->cur_expr->expr); + int pos; + + /* No-op if location was not provided */ + if (location < 0) + return; + /* Can't do anything if source text is not available */ + if (debug_query_string == NULL) + return; + /* Convert offset to character number */ + pos = pg_mbstrlen_with_len(debug_query_string, location) + 1; + /* And pass it to the ereport mechanism */ + errposition(pos); + } + } + /* ---------------------------------------------------------------- * ExecQual * *************** bool *** 5341,5346 **** --- 5377,5383 ---- ExecQual(List *qual, ExprContext *econtext, bool resultForNull) { bool result; + ErrorContextCallback errcallback; MemoryContext oldContext; ListCell *l; *************** ExecQual(List *qual, ExprContext *econte *** 5351,5356 **** --- 5388,5400 ---- EV_nodeDisplay(qual); EV_printf("\n"); + /* Setup error traceback support for ereport() */ + econtext->cur_expr = NULL; + errcallback.callback = ExprErrorCallback; + errcallback.arg = (void *) econtext; + errcallback.previous = error_context_stack; + error_context_stack = &errcallback; + /* * Run in short-lived per-tuple context while computing expressions. */ *************** ExecQual(List *qual, ExprContext *econte *** 5398,5403 **** --- 5442,5449 ---- MemoryContextSwitchTo(oldContext); + error_context_stack = errcallback.previous; + return result; } *************** ExecTargetList(List *targetlist, *** 5463,5472 **** --- 5509,5526 ---- ExprDoneCond *isDone) { Form_pg_attribute *att = tupdesc->attrs; + ErrorContextCallback errcallback; MemoryContext oldContext; ListCell *tl; bool haveDoneSets; + /* Setup error traceback support for ereport() */ + econtext->cur_expr = NULL; + errcallback.callback = ExprErrorCallback; + errcallback.arg = (void *) econtext; + errcallback.previous = error_context_stack; + error_context_stack = &errcallback; + /* * Run in short-lived per-tuple context while computing expressions. */ *************** ExecTargetList(List *targetlist, *** 5524,5529 **** --- 5578,5584 ---- */ *isDone = ExprEndResult; MemoryContextSwitchTo(oldContext); + error_context_stack = errcallback.previous; return false; } else *************** ExecTargetList(List *targetlist, *** 5587,5592 **** --- 5642,5648 ---- } MemoryContextSwitchTo(oldContext); + error_context_stack = errcallback.previous; return false; } } *************** ExecTargetList(List *targetlist, *** 5595,5600 **** --- 5651,5658 ---- /* Report success */ MemoryContextSwitchTo(oldContext); + error_context_stack = errcallback.previous; + return true; } *************** ExecProject(ProjectionInfo *projInfo, Ex *** 5616,5621 **** --- 5674,5680 ---- { TupleTableSlot *slot; ExprContext *econtext; + ErrorContextCallback errcallback; int numSimpleVars; /* *************** ExecProject(ProjectionInfo *projInfo, Ex *** 5629,5634 **** --- 5688,5700 ---- slot = projInfo->pi_slot; econtext = projInfo->pi_exprContext; + /* Setup error traceback support for ereport() */ + econtext->cur_expr = NULL; + errcallback.callback = ExprErrorCallback; + errcallback.arg = (void *) econtext; + errcallback.previous = error_context_stack; + error_context_stack = &errcallback; + /* Assume single result row until proven otherwise */ if (isDone) *isDone = ExprSingleResult; *************** ExecProject(ProjectionInfo *projInfo, Ex *** 5714,5722 **** --- 5780,5793 ---- slot->tts_isnull, projInfo->pi_itemIsDone, isDone)) + { + error_context_stack = errcallback.previous; return slot; /* no more result rows, return empty slot */ + } } + error_context_stack = errcallback.previous; + /* * Successfully formed a result row. Mark the result slot as containing a * valid virtual tuple. diff --git a/src/backend/optimizer/util/clauses.c b/src/backend/optimizer/util/clauses.c index 1688310..583cf43 100644 *** a/src/backend/optimizer/util/clauses.c --- b/src/backend/optimizer/util/clauses.c *************** *** 29,34 **** --- 29,35 ---- #include "executor/executor.h" #include "executor/functions.h" #include "funcapi.h" + #include "mb/pg_wchar.h" #include "miscadmin.h" #include "nodes/makefuncs.h" #include "nodes/nodeFuncs.h" *************** sql_inline_error_callback(void *arg) *** 4698,4703 **** --- 4699,4727 ---- errcontext("SQL function \"%s\" during inlining", callback_arg->proname); } + static void + ExprErrorCallback(void *arg) + { + ExprContext *econtext = (ExprContext *) arg; + + if (econtext->cur_expr) + { + int location = exprLocation((Node *) econtext->cur_expr->expr); + int pos; + + /* No-op if location was not provided */ + if (location < 0) + return; + /* Can't do anything if source text is not available */ + if (debug_query_string == NULL) + return; + /* Convert offset to character number */ + pos = pg_mbstrlen_with_len(debug_query_string, location) + 1; + /* And pass it to the ereport mechanism */ + errposition(pos); + } + } + /* * evaluate_expr: pre-evaluate a constant expression * *************** evaluate_expr(Expr *expr, Oid result_typ *** 4710,4715 **** --- 4734,4741 ---- { EState *estate; ExprState *exprstate; + ExprContext *econtext; + ErrorContextCallback errcallback; MemoryContext oldcontext; Datum const_val; bool const_is_null; *************** evaluate_expr(Expr *expr, Oid result_typ *** 4733,4738 **** --- 4759,4772 ---- */ exprstate = ExecInitExpr(expr, NULL); + /* Setup error traceback support for ereport() */ + econtext = GetPerTupleExprContext(estate); + econtext->cur_expr = NULL; + errcallback.callback = ExprErrorCallback; + errcallback.arg = (void *) econtext; + errcallback.previous = error_context_stack; + error_context_stack = &errcallback; + /* * And evaluate it. * *************** evaluate_expr(Expr *expr, Oid result_typ *** 4742,4750 **** * not depend on context, by definition, n'est ce pas? */ const_val = ExecEvalExprSwitchContext(exprstate, ! GetPerTupleExprContext(estate), &const_is_null, NULL); /* Get info needed about result datatype */ get_typlenbyval(result_type, &resultTypLen, &resultTypByVal); --- 4776,4786 ---- * not depend on context, by definition, n'est ce pas? */ const_val = ExecEvalExprSwitchContext(exprstate, ! econtext, &const_is_null, NULL); + error_context_stack = errcallback.previous; + /* Get info needed about result datatype */ get_typlenbyval(result_type, &resultTypLen, &resultTypByVal); diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h index f6f73f3..a390206 100644 *** a/src/include/nodes/execnodes.h --- b/src/include/nodes/execnodes.h *************** typedef struct ExprContext *** 147,152 **** --- 147,155 ---- Datum domainValue_datum; bool domainValue_isNull; + /* Currently executing expression node state, or NULL if indeterminate */ + struct ExprState *cur_expr; + /* Link to containing EState (NULL if a standalone ExprContext) */ struct EState *ecxt_estate;
On Sat, Nov 5, 2016 at 11:13 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
The cases that are successfully annotated by the current patch seem to
mostly already have error cursor information, which really is good enough
IMO --- you can certainly figure out which column corresponds to the
textual spot that the cursor is pointing at.
The original intent of that patch tried to cover the case where we insert records
made of dozens columns sharing the same type definition, and trying to understand
what is going on, at a glance, when we debugged something like this:
# create table probes (
id int,
pin_1 varchar(2),
pin_2 varchar(2),
...
pin_19 varchar(2),
pin_20 varchar(2));
CREATE TABLE
# insert into probes (
pin_1,
pin_2,
...
pin_19,
pin_20)
values ( <only valid values> );
INSERT 0 1
# insert into probes (
pin_1,
pin_2,
...
pin_19,
pin_20)
values ( <values, some subjects to type casting errors> );
ERROR: value too long for type character varying(2)
Relying on the cursor seems to be of little help I'm afraid.
Thanks for having looked into that, very useful to try understanding all
the mechanisms that are involved to make that happen.
Franck
Franck Verrot
Franck Verrot <franck@verrot.fr> writes: > On Sat, Nov 5, 2016 at 11:13 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> The cases that are successfully annotated by the current patch seem to >> mostly already have error cursor information, which really is good enough >> IMO --- you can certainly figure out which column corresponds to the >> textual spot that the cursor is pointing at. > The original intent of that patch tried to cover the case where we insert > records > made of dozens columns sharing the same type definition, and trying to > understand > what is going on, at a glance, when we debugged something like this: > ... > Relying on the cursor seems to be of little help I'm afraid. Well, it would be an improvement over what we've got now. Also, a feature similar to what I suggested would help in localizing many types of errors that have nothing to do with coercion to a target column type. regards, tom lane
On Sun, Nov 6, 2016 at 1:13 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> The original intent of that patch tried to cover the case where we insert
> records
> made of dozens columns sharing the same type definition, and trying to
> understand
> what is going on, at a glance, when we debugged something like this:
> ...
> Relying on the cursor seems to be of little help I'm afraid.
Well, it would be an improvement over what we've got now. Also, a feature
similar to what I suggested would help in localizing many types of errors
that have nothing to do with coercion to a target column type.
Yes, it's a neat improvement in any case.
Franck Verrot
On Mon, Nov 7, 2016 at 10:26 AM, Franck Verrot <franck@verrot.fr> wrote: > > > On Sun, Nov 6, 2016 at 1:13 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> >> > The original intent of that patch tried to cover the case where we >> > insert >> > records >> > made of dozens columns sharing the same type definition, and trying to >> > understand >> > what is going on, at a glance, when we debugged something like this: >> > ... >> > Relying on the cursor seems to be of little help I'm afraid. >> >> Well, it would be an improvement over what we've got now. Also, a feature >> similar to what I suggested would help in localizing many types of errors >> that have nothing to do with coercion to a target column type. > > > Yes, it's a neat improvement in any case. I have marked this patch as "Returned with Feedback" as it got its share of reviews. -- Michael