Thread: enhanced error fields
Hello here is patch with enhancing ErrorData structure. Now constraints errors and RI uses these fields Regards Pavel Stehule
Attachment
On Wed, May 9, 2012 at 9:33 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote: > here is patch with enhancing ErrorData structure. Now constraints > errors and RI uses these fields Please add this to https://commitfest.postgresql.org/action/commitfest_view/open -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 9 May 2012 14:33, Pavel Stehule <pavel.stehule@gmail.com> wrote: > here is patch with enhancing ErrorData structure. Now constraints > errors and RI uses these fields So I took a look at the patch eelog-2012-05-09.diff today. All of the following remarks apply to it alone. The patch has bitrotted some, due to the fact that Tom bashed around ri_triggers.c somewhat in recent weeks. I took the opportunity to resolve merge conflicts, and attach a revised patch for everyone's convenience. The regression tests pass when the patch is applied. The first thing I noticed about the patch was that inline functions are used freely. While I personally don't find this unreasonable, we recently revisited the question of whether or not it is necessary to continue to support compilers that do not support something equivalent to GNU C inline functions (or that cannot be made to support them via macro hacks). The outcome of that was that it remains necessary to use macros to provide non-inline equivalent functions. For a good example of how that was dealt with quite extensively, see the sortsupport commit: http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=c6e3ac11b60ac4a8942ab964252d51c1c0bd8845 In particular, take a look at the new file sortsupport.h . However, whatever we might some day allow with inline functions, the use of "extern inline", a C99 feature (or, according to some, anti-feature) seems like a nonstarter into the foreseeable future, unfortunately. Perhaps more to the point, are whatever performance benefit is to be had by the use of inline functions here actually going to pay for the maintenance cost? We already have functions declarations like this, in the same header as your proposed new extern inline functions: extern int geterrcode(void); This function's definition is trivial, and it would probably look like a good candidate for inlining to the compiler, if it had the chance (which, incidentally, it *still* may). However, why bother writing code to support (or encourage) this, considering that this is hardly a performance critical code-path, particularly unlikely to make any appreciable difference? Other style gripes: There are various points at which 80 columns are exceeded, contrary to our style guidelines. Sorry to have to mention it, but the comments need to be cleaned up (I am of course aware that English is not your first language, so that's not a big deal, and I don't think the content of the comments is lacking). The patch does essentially work as advertised. Sites that use the new infrastructure are limited to integrity constraint violations (which includes referential integrity constraint violations). So, based on your description, I'd have imagined that all of the sites using errcodes under this banner would have been covered: /* Class 23 - Integrity Constraint Violation */ This would be a reasonably well-defined place to require that this new infrastructure be used going forward (emphasis on the well-defined). Note that the authors of third-party database drivers define exception classes whose structure reflects these errcodes.h codes. To be inconsistent here seems unacceptable, since some future client of, say, pqxx (the example that I am personally most familiar with) might reasonably hope to always see some relation name when they call the e.relation_name() of some pqxx::integrity_constraint_violation object. If we were to commit the patch as-is, that would not be possible, because the following such sites that have not been touched: src/backend/commands/typecmds.c 2233: (errcode(ERRCODE_NOT_NULL_VIOLATION), src/backend/commands/tablecmds.c 3809: (errcode(ERRCODE_NOT_NULL_VIOLATION), src/backend/executor/execQual.c 3786: (errcode(ERRCODE_NOT_NULL_VIOLATION), src/backend/utils/adt/domains.c 126: (errcode(ERRCODE_NOT_NULL_VIOLATION), .... src/backend/utils/sort/tuplesort.c 3088: (errcode(ERRCODE_UNIQUE_VIOLATION), .... src/backend/commands/typecmds.c 2602: (errcode(ERRCODE_CHECK_VIOLATION), src/backend/commands/tablecmds.c 3823: (errcode(ERRCODE_CHECK_VIOLATION), 6633: (errcode(ERRCODE_CHECK_VIOLATION), src/backend/executor/execQual.c 3815: (errcode(ERRCODE_CHECK_VIOLATION), src/backend/utils/adt/domains.c 162: (errcode(ERRCODE_CHECK_VIOLATION), This function appears to be entirely vestigial, and can be removed, as it is never called: + extern inline int schema_table_column_error(const char *schema_name, const char *table_name, + const char *colname); This function is also vestigial and unused: + extern inline int rel_column_error(Oid table_oid, const char *colname); I have taken the liberty of removing both functions for you within the attached revision - I hope that's okay. Further gripes with the mechanism you've chosen: * Couldn't constraint_relation_error(Relation rel) just be implemented in terms of constraint_error(Relation rel, const char* cname)? * I doubt that relation_column_error() and friends belong in relcache.c at all, but if they do, then their prototypes belong in relcache.h, not rel.h * This seems rather broken to me: + static inline void + set_field(char **ptr, const char *str, bool overwrite) + { Why doesn't the function take "char *ptr" as its first argument? This looks like a modularity violation, since it would be perfectly possible to write the function as suggested. * Those functions use underscores rather than CamelCase, which is not consistent with the code that surround either the definitions or declarations. * ereport is used so frequently that it occurs to me that it would be nice to build some error-detection code into this expansion of the mechanism, to detect incorrect use (at least on debug configurations). So maybe constraint_relation_error() lives alongside errdetail() within elog.c. Maybe they return values of each function are some magic value, that is later read within errfinish(int dummy,...) via varargs. From there, on debug configurations, raise a warning if each argument is in a less than idiomatic order (so that we formalise the order). I'm not sure if that's worth the hassle of formalising, but it's worth considering, particularly as there are calls to make our error reporting mechanism more sophisticated. In the original thread on this (which was, regrettably, sidetracked by my tangential complaints about error severity), Robert expressed concerns about the performance impact of this patch, when plpgsql exception blocks were entered. I think it's a reasonable thing to be concerned about in general, and I'll address it when I address eelog-plpgsql-2012-05-09.diff separately. -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services
Attachment
On 2 July 2012 15:19, Peter Geoghegan <peter@2ndquadrant.com> wrote: > On 9 May 2012 14:33, Pavel Stehule <pavel.stehule@gmail.com> wrote: >> here is patch with enhancing ErrorData structure. Now constraints >> errors and RI uses these fields > > So I took a look at the patch eelog-2012-05-09.diff today. All of the > following remarks apply to it alone. I decided to follow through and take a look at eelog-plpgsql-2012-05-09.diff today too, while I have all of this swapped into my head. This patch is not an atomic unit - it builds upon the first patch. I successfully merged the local feature branch that I'd created for eelog-2012-05-09.diff without any merge conflicts, and I can build Postgres and get the regression tests to pass (including a couple of new ones, for this added functionality for plggsql - the functionality is testing exclusively using the new (9.2) "get stacked diagnostics" and "raise custom exception 'some_custom_exception' using...." feature). Since that feature branch had all my revisions committed, my observations about redundancies in the other base patch still stand - the 2 functions mentioned did not exist for the benefit of this further patch either. There is a typo here: + case PLPGSQL_RAISEOPTION_TRIGGER_SCHEMA: + printf(" TRIGGER_SCHENA = "); + break; } I'm not sure about this inconsistency within unreserved_keyword: For routines: + | K_DIAG_ROUTINE_NAME + | K_DIAG_ROUTINE_SCHEMA .... For triggers: + | K_DIAG_TRIGGER_NAME + | K_DIAG_TRIGGER_SCHEMA .... For tables: + | K_DIAG_SCHEMA_NAME . . **SNIP** . + | K_DIAG_TABLE_NAME The same inconsistency exists within the anonymous enum that contains PLPGSQL_GETDIAG_TABLE_NAME (and other constants), as well as the new token keywords within plpgsql's gram.y . The doc changes need a little work here too. I'm not sure that I agree with the extensive use of the term "routine" in all of these constants - sure, information_schema has a view called "routines". But wouldn't it be more appropriate to use a Postgres-centric term within our own code? So, what about the concern about performance taking a hit when plpgsql exception blocks are entered as a result of this patch? Well, while I think that an effort to reduce the overhead of PL exception handling would be worthwhile, these patches do not appear to alter things appreciable (though the overhead *is* measurable): [peter@peterlaptop eelog]$ ls exceptions.sql test_eelog_outer.sql Patch (eelog-plpgsql): [peter@peterlaptop eelog]$ pgbench -T 300 -f exceptions.sql -c 10 -n transaction type: Custom query scaling factor: 1 query mode: simple number of clients: 10 number of threads: 1 duration: 300 s number of transactions actually processed: 305756 tps = 1019.026055 (including connections establishing) tps = 1019.090135 (excluding connections establishing) Master: [peter@peterlaptop eelog]$ pgbench -T 300 -f exceptions.sql -c 10 -n transaction type: Custom query scaling factor: 1 query mode: simple number of clients: 10 number of threads: 1 duration: 300 s number of transactions actually processed: 308376 tps = 1027.908182 (including connections establishing) tps = 1027.977879 (excluding connections establishing) An archive with simple scripts for repeating this are attached, if anyone is interested. -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services
Attachment
2012/7/2 Peter Geoghegan <peter@2ndquadrant.com>: > On 2 July 2012 15:19, Peter Geoghegan <peter@2ndquadrant.com> wrote: >> On 9 May 2012 14:33, Pavel Stehule <pavel.stehule@gmail.com> wrote: >>> here is patch with enhancing ErrorData structure. Now constraints >>> errors and RI uses these fields >> >> So I took a look at the patch eelog-2012-05-09.diff today. All of the >> following remarks apply to it alone. > > I decided to follow through and take a look at > eelog-plpgsql-2012-05-09.diff today too, while I have all of this > swapped into my head. > > This patch is not an atomic unit - it builds upon the first patch. I > successfully merged the local feature branch that I'd created for > eelog-2012-05-09.diff without any merge conflicts, and I can build > Postgres and get the regression tests to pass (including a couple of > new ones, for this added functionality for plggsql - the functionality > is testing exclusively using the new (9.2) "get stacked diagnostics" > and "raise custom exception 'some_custom_exception' using...." > feature). > > Since that feature branch had all my revisions committed, my > observations about redundancies in the other base patch still stand - > the 2 functions mentioned did not exist for the benefit of this > further patch either. > > There is a typo here: > > + case PLPGSQL_RAISEOPTION_TRIGGER_SCHEMA: > + printf(" TRIGGER_SCHENA = "); > + break; > } > > > I'm not sure about this inconsistency within unreserved_keyword: > > For routines: > + | K_DIAG_ROUTINE_NAME > + | K_DIAG_ROUTINE_SCHEMA > .... > > For triggers: > + | K_DIAG_TRIGGER_NAME > + | K_DIAG_TRIGGER_SCHEMA > .... > > For tables: > + | K_DIAG_SCHEMA_NAME > . > . **SNIP** > . > + | K_DIAG_TABLE_NAME > > The same inconsistency exists within the anonymous enum that contains > PLPGSQL_GETDIAG_TABLE_NAME (and other constants), as well as the new > token keywords within plpgsql's gram.y . > > The doc changes need a little work here too. > > I'm not sure that I agree with the extensive use of the term "routine" > in all of these constants - sure, information_schema has a view called > "routines". But wouldn't it be more appropriate to use a > Postgres-centric term within our own code? > > So, what about the concern about performance taking a hit when plpgsql > exception blocks are entered as a result of this patch? Well, while I > think that an effort to reduce the overhead of PL exception handling > would be worthwhile, these patches do not appear to alter things > appreciable (though the overhead *is* measurable): > > [peter@peterlaptop eelog]$ ls > exceptions.sql test_eelog_outer.sql > > Patch (eelog-plpgsql): > > [peter@peterlaptop eelog]$ pgbench -T 300 -f exceptions.sql -c 10 -n > transaction type: Custom query > scaling factor: 1 > query mode: simple > number of clients: 10 > number of threads: 1 > duration: 300 s > number of transactions actually processed: 305756 > tps = 1019.026055 (including connections establishing) > tps = 1019.090135 (excluding connections establishing) > > Master: > > [peter@peterlaptop eelog]$ pgbench -T 300 -f exceptions.sql -c 10 -n > transaction type: Custom query > scaling factor: 1 > query mode: simple > number of clients: 10 > number of threads: 1 > duration: 300 s > number of transactions actually processed: 308376 > tps = 1027.908182 (including connections establishing) > tps = 1027.977879 (excluding connections establishing) > > An archive with simple scripts for repeating this are attached, if > anyone is interested. yes, I think so slowdown is not significant for usual situations and patterns. I got about 3% slowdown on code like begin insert into tab -- raise exception exception when others .. ... end; and only when all inserts fails. Regards Pavel > > -- > Peter Geoghegan http://www.2ndQuadrant.com/ > PostgreSQL Development, 24x7 Support, Training and Services
Hello Peter, thank you very much for review 2012/7/2 Peter Geoghegan <peter@2ndquadrant.com>: > On 9 May 2012 14:33, Pavel Stehule <pavel.stehule@gmail.com> wrote: >> here is patch with enhancing ErrorData structure. Now constraints >> errors and RI uses these fields > > So I took a look at the patch eelog-2012-05-09.diff today. All of the > following remarks apply to it alone. > > The patch has bitrotted some, due to the fact that Tom bashed around > ri_triggers.c somewhat in recent weeks. I took the opportunity to > resolve merge conflicts, and attach a revised patch for everyone's > convenience. > > The regression tests pass when the patch is applied. > > The first thing I noticed about the patch was that inline functions > are used freely. While I personally don't find this unreasonable, we > recently revisited the question of whether or not it is necessary to > continue to support compilers that do not support something equivalent > to GNU C inline functions (or that cannot be made to support them via > macro hacks). The outcome of that was that it remains necessary to use > macros to provide non-inline equivalent functions. For a good example > of how that was dealt with quite extensively, see the sortsupport > commit: using "inline" keyword is probably premature optimization in this context and better to remove it. This code is not on critical path. > > http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=c6e3ac11b60ac4a8942ab964252d51c1c0bd8845 > > In particular, take a look at the new file sortsupport.h . > > However, whatever we might some day allow with inline functions, the > use of "extern inline", a C99 feature (or, according to some, > anti-feature) seems like a nonstarter into the foreseeable future, > unfortunately. Perhaps more to the point, are whatever performance > benefit is to be had by the use of inline functions here actually > going to pay for the maintenance cost? We already have functions > declarations like this, in the same header as your proposed new extern > inline functions: > > extern int geterrcode(void); > > This function's definition is trivial, and it would probably look like > a good candidate for inlining to the compiler, if it had the chance > (which, incidentally, it *still* may). However, why bother writing > code to support (or encourage) this, considering that this is hardly a > performance critical code-path, particularly unlikely to make any > appreciable difference? > > Other style gripes: There are various points at which 80 columns are > exceeded, contrary to our style guidelines. Sorry to have to mention > it, but the comments need to be cleaned up (I am of course aware that > English is not your first language, so that's not a big deal, and I > don't think the content of the comments is lacking). it is ok. I'll reformat my code. > > The patch does essentially work as advertised. Sites that use the new > infrastructure are limited to integrity constraint violations (which > includes referential integrity constraint violations). So, based on > your description, I'd have imagined that all of the sites using > errcodes under this banner would have been covered: > > /* Class 23 - Integrity Constraint Violation */ > > This would be a reasonably well-defined place to require that this new > infrastructure be used going forward (emphasis on the well-defined). > Note that the authors of third-party database drivers define exception > classes whose structure reflects these errcodes.h codes. To be > inconsistent here seems unacceptable, since some future client of, > say, pqxx (the example that I am personally most familiar with) might > reasonably hope to always see some relation name when they call the > e.relation_name() of some pqxx::integrity_constraint_violation object. > If we were to commit the patch as-is, that would not be possible, > because the following such sites that have not been touched: > > src/backend/commands/typecmds.c > 2233: (errcode(ERRCODE_NOT_NULL_VIOLATION), > > src/backend/commands/tablecmds.c > 3809: (errcode(ERRCODE_NOT_NULL_VIOLATION), > > src/backend/executor/execQual.c > 3786: (errcode(ERRCODE_NOT_NULL_VIOLATION), > > src/backend/utils/adt/domains.c > 126: (errcode(ERRCODE_NOT_NULL_VIOLATION), > > .... > > src/backend/utils/sort/tuplesort.c > 3088: (errcode(ERRCODE_UNIQUE_VIOLATION), > > .... > > src/backend/commands/typecmds.c > 2602: (errcode(ERRCODE_CHECK_VIOLATION), > > src/backend/commands/tablecmds.c > 3823: (errcode(ERRCODE_CHECK_VIOLATION), > 6633: (errcode(ERRCODE_CHECK_VIOLATION), > > src/backend/executor/execQual.c > 3815: (errcode(ERRCODE_CHECK_VIOLATION), > > src/backend/utils/adt/domains.c > 162: (errcode(ERRCODE_CHECK_VIOLATION), yes, it should be fixed > > This function appears to be entirely vestigial, and can be removed, as > it is never called: > > + extern inline int schema_table_column_error(const char *schema_name, > const char *table_name, > + const char *colname); > > This function is also vestigial and unused: > > + extern inline int rel_column_error(Oid table_oid, const char *colname); > > I have taken the liberty of removing both functions for you within the > attached revision - I hope that's okay. > > Further gripes with the mechanism you've chosen: > > * Couldn't constraint_relation_error(Relation rel) just be implemented > in terms of constraint_error(Relation rel, const char* cname)? > > * I doubt that relation_column_error() and friends belong in > relcache.c at all, but if they do, then their prototypes belong in > relcache.h, not rel.h > > * This seems rather broken to me: > > + static inline void > + set_field(char **ptr, const char *str, bool overwrite) > + { > > Why doesn't the function take "char *ptr" as its first argument? This > looks like a modularity violation, since it would be perfectly > possible to write the function as suggested. > No, it is not possible - I have to modify structure's fields - so I need addresses of pointers > * Those functions use underscores rather than CamelCase, which is not > consistent with the code that surround either the definitions or > declarations. It is hard question again - functions related to Relation usually use CamelCase, but functions related error processing use underscores - and I used it, because they used together with ereport. > > * ereport is used so frequently that it occurs to me that it would be > nice to build some error-detection code into this expansion of the > mechanism, to detect incorrect use (at least on debug configurations). > So maybe constraint_relation_error() lives alongside errdetail() > within elog.c. Maybe they return values of each function are some > magic value, that is later read within errfinish(int dummy,...) via > varargs. From there, on debug configurations, raise a warning if each > argument is in a less than idiomatic order (so that we formalise the > order). I'm not sure if that's worth the hassle of formalising, but > it's worth considering, particularly as there are calls to make our > error reporting mechanism more sophisticated. It is question. If I move constraint_relation_error to elog.c, then I have to include utils/rel.h to utils/elog.h. It was a issue previous version - criticised by Tom So current logic is - if you use "rel.h" and related structs, then you can set these fields in ErrorData. Regards Pavel > > In the original thread on this (which was, regrettably, sidetracked by > my tangential complaints about error severity), Robert expressed > concerns about the performance impact of this patch, when plpgsql > exception blocks were entered. I think it's a reasonable thing to be > concerned about in general, and I'll address it when I address > eelog-plpgsql-2012-05-09.diff separately. > > -- > Peter Geoghegan http://www.2ndQuadrant.com/ > PostgreSQL Development, 24x7 Support, Training and Services
2012/7/2 Peter Geoghegan <peter@2ndquadrant.com>: > On 2 July 2012 15:19, Peter Geoghegan <peter@2ndquadrant.com> wrote: >> On 9 May 2012 14:33, Pavel Stehule <pavel.stehule@gmail.com> wrote: >>> here is patch with enhancing ErrorData structure. Now constraints >>> errors and RI uses these fields >> >> So I took a look at the patch eelog-2012-05-09.diff today. All of the >> following remarks apply to it alone. > > I decided to follow through and take a look at > eelog-plpgsql-2012-05-09.diff today too, while I have all of this > swapped into my head. > > This patch is not an atomic unit - it builds upon the first patch. I > successfully merged the local feature branch that I'd created for > eelog-2012-05-09.diff without any merge conflicts, and I can build > Postgres and get the regression tests to pass (including a couple of > new ones, for this added functionality for plggsql - the functionality > is testing exclusively using the new (9.2) "get stacked diagnostics" > and "raise custom exception 'some_custom_exception' using...." > feature). > > Since that feature branch had all my revisions committed, my > observations about redundancies in the other base patch still stand - > the 2 functions mentioned did not exist for the benefit of this > further patch either. > > There is a typo here: > > + case PLPGSQL_RAISEOPTION_TRIGGER_SCHEMA: > + printf(" TRIGGER_SCHENA = "); > + break; > } > > > I'm not sure about this inconsistency within unreserved_keyword: > > For routines: > + | K_DIAG_ROUTINE_NAME > + | K_DIAG_ROUTINE_SCHEMA > .... > > For triggers: > + | K_DIAG_TRIGGER_NAME > + | K_DIAG_TRIGGER_SCHEMA > .... > > For tables: > + | K_DIAG_SCHEMA_NAME > . > . **SNIP** > . > + | K_DIAG_TABLE_NAME This inconsistency is based on ANSI SQL design - we can use own identifiers, but I prefer identifiers based on keyword strings. > > The same inconsistency exists within the anonymous enum that contains > PLPGSQL_GETDIAG_TABLE_NAME (and other constants), as well as the new > token keywords within plpgsql's gram.y . see standard, please :) - it is not consistent. But it use short and clean names, and it is relative widely used. > > The doc changes need a little work here too. please, if you can enhance documentation, please, do it. I am not native speaker. > > I'm not sure that I agree with the extensive use of the term "routine" > in all of these constants - sure, information_schema has a view called > "routines". But wouldn't it be more appropriate to use a > Postgres-centric term within our own code? for me - routine is general world for functions and procedures. So using routine in PostgreSQL is ok. And I believe so we will support procedures too some day. > > So, what about the concern about performance taking a hit when plpgsql > exception blocks are entered as a result of this patch? Well, while I > think that an effort to reduce the overhead of PL exception handling > would be worthwhile, these patches do not appear to alter things > appreciable (though the overhead *is* measurable): > > [peter@peterlaptop eelog]$ ls > exceptions.sql test_eelog_outer.sql > > Patch (eelog-plpgsql): > > [peter@peterlaptop eelog]$ pgbench -T 300 -f exceptions.sql -c 10 -n > transaction type: Custom query > scaling factor: 1 > query mode: simple > number of clients: 10 > number of threads: 1 > duration: 300 s > number of transactions actually processed: 305756 > tps = 1019.026055 (including connections establishing) > tps = 1019.090135 (excluding connections establishing) > > Master: > > [peter@peterlaptop eelog]$ pgbench -T 300 -f exceptions.sql -c 10 -n > transaction type: Custom query > scaling factor: 1 > query mode: simple > number of clients: 10 > number of threads: 1 > duration: 300 s > number of transactions actually processed: 308376 > tps = 1027.908182 (including connections establishing) > tps = 1027.977879 (excluding connections establishing) > > An archive with simple scripts for repeating this are attached, if > anyone is interested. thank you for performance testing. It verify my own testing. Thank you for review, Regards Pavel > > -- > Peter Geoghegan http://www.2ndQuadrant.com/ > PostgreSQL Development, 24x7 Support, Training and Services
Hi, On Monday, July 02, 2012 04:19:56 PM Peter Geoghegan wrote: > The first thing I noticed about the patch was that inline functions > are used freely. While I personally don't find this unreasonable, we > recently revisited the question of whether or not it is necessary to > continue to support compilers that do not support something equivalent > to GNU C inline functions (or that cannot be made to support them via > macro hacks). The outcome of that was that it remains necessary to use > macros to provide non-inline equivalent functions. For a good example > of how that was dealt with quite extensively, see the sortsupport > commit: > > http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=c6e3ac11b60ac > 4a8942ab964252d51c1c0bd8845 > > In particular, take a look at the new file sortsupport.h . I actually find sortsupport.h not to be a good example in general because it duplicates the code. Unless youre dealing with minor amounts of code playing macro tricks like I did in the ilist stuff seems to be a better idea. Greetings, Andres -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Excerpts from Pavel Stehule's message of mar jul 03 12:26:57 -0400 2012: > 2012/7/2 Peter Geoghegan <peter@2ndquadrant.com>: > > * ereport is used so frequently that it occurs to me that it would be > > nice to build some error-detection code into this expansion of the > > mechanism, to detect incorrect use (at least on debug configurations). > > So maybe constraint_relation_error() lives alongside errdetail() > > within elog.c. Maybe they return values of each function are some > > magic value, that is later read within errfinish(int dummy,...) via > > varargs. From there, on debug configurations, raise a warning if each > > argument is in a less than idiomatic order (so that we formalise the > > order). I'm not sure if that's worth the hassle of formalising, but > > it's worth considering, particularly as there are calls to make our > > error reporting mechanism more sophisticated. > > It is question. If I move constraint_relation_error to elog.c, then I > have to include utils/rel.h to utils/elog.h. It was a issue previous > version - criticised by Tom Including rel.h in elog.h is really really bad. Even if it was only relcache.h it would be bad, because elog.h is included *everywhere*. > So current logic is - if you use "rel.h" and related structs, then you > can set these fields in ErrorData. Hm. These new functions do not operate on Relations at all, so having them on relcache.c doesn't seem to me very good ... How about putting the functions in elog.c as Peter suggests, and the declarations in a new header (say relationerror.h or something like that)? That new header would #include relcache.h so if you need to set those fields you include the new header and you have everything you need. -- Álvaro Herrera <alvherre@commandprompt.com> The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
2012/7/3 Alvaro Herrera <alvherre@commandprompt.com>: > > Excerpts from Pavel Stehule's message of mar jul 03 12:26:57 -0400 2012: > >> 2012/7/2 Peter Geoghegan <peter@2ndquadrant.com>: > >> > * ereport is used so frequently that it occurs to me that it would be >> > nice to build some error-detection code into this expansion of the >> > mechanism, to detect incorrect use (at least on debug configurations). >> > So maybe constraint_relation_error() lives alongside errdetail() >> > within elog.c. Maybe they return values of each function are some >> > magic value, that is later read within errfinish(int dummy,...) via >> > varargs. From there, on debug configurations, raise a warning if each >> > argument is in a less than idiomatic order (so that we formalise the >> > order). I'm not sure if that's worth the hassle of formalising, but >> > it's worth considering, particularly as there are calls to make our >> > error reporting mechanism more sophisticated. >> >> It is question. If I move constraint_relation_error to elog.c, then I >> have to include utils/rel.h to utils/elog.h. It was a issue previous >> version - criticised by Tom > > Including rel.h in elog.h is really really bad. Even if it was only > relcache.h it would be bad, because elog.h is included *everywhere*. > >> So current logic is - if you use "rel.h" and related structs, then you >> can set these fields in ErrorData. > > Hm. These new functions do not operate on Relations at all, so having > them on relcache.c doesn't seem to me very good ... > > How about putting the functions in elog.c as Peter suggests, and the > declarations in a new header (say relationerror.h or something like > that)? That new header would #include relcache.h so if you need to set > those fields you include the new header and you have everything you > need. > it could be Pavel > -- > Álvaro Herrera <alvherre@commandprompt.com> > The PostgreSQL Company - Command Prompt, Inc. > PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On 3 July 2012 17:26, Pavel Stehule <pavel.stehule@gmail.com> wrote: > Hello Peter, > > thank you very much for review No problem. I'll do some copy-editing of comments and doc changes when you produce another revision. -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services
Peter Geoghegan <peter@2ndquadrant.com> writes: > So I took a look at the patch eelog-2012-05-09.diff today. All of the > following remarks apply to it alone. I've been trying out this patch for my own interest (I'm very pleased to see work on this feature), and I have a couple of suggestions from a user's point of view. First: if a not null constraint is violated, the error report includes CONSTRAINT NAME 'not_null_violation'. I think I would find it more useful if CONSTRAINT NAME were left unset rather than given a value that doesn't correspond to a real constraint. A client program can tell it's a null constraint violation from the SQLSTATE. Second: in the case where a foreign-key constraint is violated by a change in the primary-key table, the error report gives the following information: TABLE NAME: name of primary-key table SCHEMA NAME: schema of primary-key table CONSTRAINT NAME: name of foreign-keyconstraint CONSTRAINT SCHEMA: schema of foreign-key table It doesn't include the name of the foreign-key table (except in the human-readable error message). But in principle you need to know that table name to reliably identify the constraint that was violated. I think what's going on is a mismatch between the way the constraint namespace works in the SQL standard and in PostgreSQL: it looks like the standard expects constraint names to be unique within a schema, while PostgreSQL only requires them to be unique within a table. (A similar issue makes information_schema less useful than the pg_ tables for foreign key constraints.) So I think it would be helpful to go beyond the standard in this case and include the foreign-key table name somewhere in the report. Possibly the enhanced-error reports could somehow add the table name to the string in the CONSTRAINT NAME field, so that the interface PostgreSQL provides looks like the one the standard envisages (which ought to make it easier to write cross-database client code). Or it might be simpler to just add a new enhanced-error field; I can imagine cases where that table name would be the main thing I'd be interested in. -M-
Hello 2012/7/3 Matthew Woodcraft <matthew@woodcraft.me.uk>: > Peter Geoghegan <peter@2ndquadrant.com> writes: >> So I took a look at the patch eelog-2012-05-09.diff today. All of the >> following remarks apply to it alone. > > I've been trying out this patch for my own interest (I'm very pleased to > see work on this feature), and I have a couple of suggestions from a > user's point of view. > > > First: if a not null constraint is violated, the error report includes > CONSTRAINT NAME 'not_null_violation'. I think I would find it more > useful if CONSTRAINT NAME were left unset rather than given a value that > doesn't correspond to a real constraint. A client program can tell it's > a null constraint violation from the SQLSTATE. > I don't think so generation some special name is good idea. In this case - important values are in COLUMN_NAME, TABLE_NAME, SCHEMA_NAME postgres=# create table ff(a int not null); CREATE TABLE postgres=# \set VERBOSITY verbose postgres=# insert into ff values(null); ERROR: 23502: null value in column "a" violates not-null constraint DETAIL: Failing row contains (null). LOCATION: ExecConstraints, execMain.c:1527 COLUMN NAME: a TABLE NAME: ff SCHEMA NAME: public CONSTRAINT NAME: not_null_violation CONSTRAINT SCHEMA: public > > Second: in the case where a foreign-key constraint is violated by a > change in the primary-key table, the error report gives the following > information: > > TABLE NAME: name of primary-key table > SCHEMA NAME: schema of primary-key table > CONSTRAINT NAME: name of foreign-key constraint > CONSTRAINT SCHEMA: schema of foreign-key table > postgres=# create table a1(a int primary key); NOTICE: 00000: CREATE TABLE / PRIMARY KEY will create implicit index "a1_pkey" for table "a1" LOCATION: DefineIndex, indexcmds.c:600 CREATE TABLE postgres=# create table a2(a int references a1(a)); CREATE TABLE postgres=# insert into a2 values(10); ERROR: 23503: insert or update on table "a2" violates foreign key constraint "a2_a_fkey" DETAIL: Key (a)=(10) is not present in table "a1". LOCATION: ri_ReportViolation, ri_triggers.c:3228 TABLE NAME: a2 SCHEMA NAME: public CONSTRAINT NAME: a2_a_fkey CONSTRAINT SCHEMA: public postgres=# \d a2 Table "public.a2"Column │ Type │ Modifiers ────────┼─────────┼───────────a │ integer │ Foreign-key constraints: "a2_a_fkey" FOREIGN KEY (a) REFERENCES a1(a) I agree so access to related table is not simple, but you know constraint name, and you can take referenced table from constraint definition. so any special column is not necessary. It can be done in future, but for this moment I would add only really necessary fields. > It doesn't include the name of the foreign-key table (except in the > human-readable error message). But in principle you need to know that > table name to reliably identify the constraint that was violated. > > I think what's going on is a mismatch between the way the constraint > namespace works in the SQL standard and in PostgreSQL: it looks like the > standard expects constraint names to be unique within a schema, while > PostgreSQL only requires them to be unique within a table. (A similar > issue makes information_schema less useful than the pg_ tables for > foreign key constraints.) > > So I think it would be helpful to go beyond the standard in this case > and include the foreign-key table name somewhere in the report. > > Possibly the enhanced-error reports could somehow add the table name to > the string in the CONSTRAINT NAME field, so that the interface > PostgreSQL provides looks like the one the standard envisages (which > ought to make it easier to write cross-database client code). same situation is with triggers I prefer add two new fields CONSTRAINT_TABLE and TRIGGER_TABLE so NAME, TABLE and SCHEMA is unique Regards and thank you for comments Pavel > > Or it might be simpler to just add a new enhanced-error field; I can > imagine cases where that table name would be the main thing I'd be > interested in. > > -M-
Excerpts from Pavel Stehule's message of mié jul 04 05:33:48 -0400 2012: > Hello > > 2012/7/3 Matthew Woodcraft <matthew@woodcraft.me.uk>: > > Peter Geoghegan <peter@2ndquadrant.com> writes: > >> So I took a look at the patch eelog-2012-05-09.diff today. All of the > >> following remarks apply to it alone. > > > > I've been trying out this patch for my own interest (I'm very pleased to > > see work on this feature), and I have a couple of suggestions from a > > user's point of view. > > > > > > First: if a not null constraint is violated, the error report includes > > CONSTRAINT NAME 'not_null_violation'. I think I would find it more > > useful if CONSTRAINT NAME were left unset rather than given a value that > > doesn't correspond to a real constraint. A client program can tell it's > > a null constraint violation from the SQLSTATE. > > > > I don't think so generation some special name is good idea. In this > case - important values are in COLUMN_NAME, TABLE_NAME, SCHEMA_NAME > > postgres=# create table ff(a int not null); > CREATE TABLE > postgres=# \set VERBOSITY verbose > postgres=# insert into ff values(null); > ERROR: 23502: null value in column "a" violates not-null constraint > DETAIL: Failing row contains (null). > LOCATION: ExecConstraints, execMain.c:1527 > COLUMN NAME: a > TABLE NAME: ff > SCHEMA NAME: public > CONSTRAINT NAME: not_null_violation > CONSTRAINT SCHEMA: public I think if you don't have a true constraint name to use here, you shouldn't use anything. When and if we get NOT NULL constraints catalogued, we can add a constraint name field as a new error field. In other words +1 for Matthew's opinion. -- Álvaro Herrera <alvherre@commandprompt.com> The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Alvaro Herrera <alvherre@commandprompt.com> writes: > I think if you don't have a true constraint name to use here, you > shouldn't use anything. Yeah, I agree. Don't invent a value, just omit the field. regards, tom lane
2012/7/4 Tom Lane <tgl@sss.pgh.pa.us>: > Alvaro Herrera <alvherre@commandprompt.com> writes: >> I think if you don't have a true constraint name to use here, you >> shouldn't use anything. > > Yeah, I agree. Don't invent a value, just omit the field. ok Pavel > > regards, tom lane
Hello there is a updated patch: * renamed auxiliary functions and moved it elog.c - header is new file "relerror.h" * new fields "constraint_table" and "trigger_table" - constraints and triggers are related to relation in pg, not just to schema * removed using implicit constraints without unique name * better coverage of enhancing errors in source code * removed "inline" keywords > > /* Class 23 - Integrity Constraint Violation */ > > This would be a reasonably well-defined place to require that this new > infrastructure be used going forward (emphasis on the well-defined). > Note that the authors of third-party database drivers define exception > classes whose structure reflects these errcodes.h codes. To be > inconsistent here seems unacceptable, since some future client of, > say, pqxx (the example that I am personally most familiar with) might > reasonably hope to always see some relation name when they call the > e.relation_name() of some pqxx::integrity_constraint_violation object. > If we were to commit the patch as-is, that would not be possible, > because the following such sites that have not been touched: > > > src/backend/executor/execQual.c > 3786: (errcode(ERRCODE_NOT_NULL_VIOLATION), > > src/backend/utils/adt/domains.c > 126: (errcode(ERRCODE_NOT_NULL_VIOLATION), > src/backend/executor/execQual.c > 3815: (errcode(ERRCODE_CHECK_VIOLATION), > > src/backend/utils/adt/domains.c > 162: (errcode(ERRCODE_CHECK_VIOLATION), > these exceptions are related to domains - we has not adequate fields now - and these fields are not in standards it needs some like DOMAIN_NAME and DOMAIN_SCHEMA ??? Regards Pavel
Attachment
Hello * fixed typo * support two new fields: constraint_table and trigger_table * routine_table, routine_schema, trigger_name, trigger_table, trigger_schema has value when exception coming from plpgsql. Regards Pavel 2012/7/2 Peter Geoghegan <peter@2ndquadrant.com>: > On 2 July 2012 15:19, Peter Geoghegan <peter@2ndquadrant.com> wrote: >> On 9 May 2012 14:33, Pavel Stehule <pavel.stehule@gmail.com> wrote: >>> here is patch with enhancing ErrorData structure. Now constraints >>> errors and RI uses these fields >> >> So I took a look at the patch eelog-2012-05-09.diff today. All of the >> following remarks apply to it alone. > > I decided to follow through and take a look at > eelog-plpgsql-2012-05-09.diff today too, while I have all of this > swapped into my head. > > This patch is not an atomic unit - it builds upon the first patch. I > successfully merged the local feature branch that I'd created for > eelog-2012-05-09.diff without any merge conflicts, and I can build > Postgres and get the regression tests to pass (including a couple of > new ones, for this added functionality for plggsql - the functionality > is testing exclusively using the new (9.2) "get stacked diagnostics" > and "raise custom exception 'some_custom_exception' using...." > feature). > > Since that feature branch had all my revisions committed, my > observations about redundancies in the other base patch still stand - > the 2 functions mentioned did not exist for the benefit of this > further patch either. > > There is a typo here: > > + case PLPGSQL_RAISEOPTION_TRIGGER_SCHEMA: > + printf(" TRIGGER_SCHENA = "); > + break; > } > > > I'm not sure about this inconsistency within unreserved_keyword: > > For routines: > + | K_DIAG_ROUTINE_NAME > + | K_DIAG_ROUTINE_SCHEMA > .... > > For triggers: > + | K_DIAG_TRIGGER_NAME > + | K_DIAG_TRIGGER_SCHEMA > .... > > For tables: > + | K_DIAG_SCHEMA_NAME > . > . **SNIP** > . > + | K_DIAG_TABLE_NAME > > The same inconsistency exists within the anonymous enum that contains > PLPGSQL_GETDIAG_TABLE_NAME (and other constants), as well as the new > token keywords within plpgsql's gram.y . > > The doc changes need a little work here too. > > I'm not sure that I agree with the extensive use of the term "routine" > in all of these constants - sure, information_schema has a view called > "routines". But wouldn't it be more appropriate to use a > Postgres-centric term within our own code? > > So, what about the concern about performance taking a hit when plpgsql > exception blocks are entered as a result of this patch? Well, while I > think that an effort to reduce the overhead of PL exception handling > would be worthwhile, these patches do not appear to alter things > appreciable (though the overhead *is* measurable): > > [peter@peterlaptop eelog]$ ls > exceptions.sql test_eelog_outer.sql > > Patch (eelog-plpgsql): > > [peter@peterlaptop eelog]$ pgbench -T 300 -f exceptions.sql -c 10 -n > transaction type: Custom query > scaling factor: 1 > query mode: simple > number of clients: 10 > number of threads: 1 > duration: 300 s > number of transactions actually processed: 305756 > tps = 1019.026055 (including connections establishing) > tps = 1019.090135 (excluding connections establishing) > > Master: > > [peter@peterlaptop eelog]$ pgbench -T 300 -f exceptions.sql -c 10 -n > transaction type: Custom query > scaling factor: 1 > query mode: simple > number of clients: 10 > number of threads: 1 > duration: 300 s > number of transactions actually processed: 308376 > tps = 1027.908182 (including connections establishing) > tps = 1027.977879 (excluding connections establishing) > > An archive with simple scripts for repeating this are attached, if > anyone is interested. > > -- > Peter Geoghegan http://www.2ndQuadrant.com/ > PostgreSQL Development, 24x7 Support, Training and Services
Attachment
Attached is a revision of this patch, with a few clean-ups, mostly to the wording of certain things. On 5 July 2012 17:41, Pavel Stehule <pavel.stehule@gmail.com> wrote: > * renamed auxiliary functions and moved it elog.c - header is new file > "relerror.h" In my revision, I've just added a pre-declaration and removed the dedicated header, which didn't make too much sense to me: + /* Pre-declare Relation, in order to avoid a build dependency on rel.h. */ + typedef struct RelationData *Relation; This works fine, and is precedented. See src/include/storage/buffile.h, for example. If you think it's unreasonable that none of the functions now added to elog.h are callable without also including rel.h, consider that all call sites are naturally already doing that, for the errmsg() string itself. Besides, this is a perfectly valid way of reducing build dependencies, or at least a more valid way than creating a new header that does not really represent a natural new division for these new functions, IMHO. Opaque pointers are ordinarily used to encapsulate things in C, rather than to prevent build dependencies, but I believe that's only because in general that's something that C programmers are more likely to want. > * new fields "constraint_table" and "trigger_table" - constraints and > triggers are related to relation in pg, not just to schema I've added some remarks to that effect in the docs of my revision for your consideration. > * better coverage of enhancing errors in source code Good. I think it's important that we nail down just where these are expected to be available. It would be nice if there was a quick and easy answer to the question "Just what ErrorResponse fields should this new "sub-category of class 23" ereport() site have?". We clearly have yet to work those details out. I have another concern with this patch. log_error_verbosity appears to be intended as an exact analogue of client verbosity (as set by PQsetErrorVerbosity()). While this generally holds for this patch, there is one notable exception: You always log all of these new fields within write_csvlog(), even if (Log_error_verbosity < PGERROR_VERBOSE). Why? There will be a bunch of commas in most CSV logs once this happens, so that the schema of the log is consistent. That is kind of ugly, but I don't see a way around it. We already do this for location and application_name (though that's separately controlled by the application_name guc). I haven't touched that in the attached revision, as I'd like to hear what you have to say. Another problem that will need to be fixed is that of the follow values: +#define PG_DIAG_COLUMN_NAME 'c' +#define PG_DIAG_TABLE_NAME 't' +#define PG_DIAG_SCHEMA_NAME 's' +#define PG_DIAG_CONSTRAINT_NAME 'n' +#define PG_DIAG_CONSTRAINT_TABLE 'o' +#define PG_DIAG_CONSTRAINT_SCHEMA 'm' +#define PG_DIAG_ROUTINE_NAME 'r' +#define PG_DIAG_ROUTINE_SCHEMA 'u' +#define PG_DIAG_TRIGGER_NAME 'g' +#define PG_DIAG_TRIGGER_TABLE 'i' +#define PG_DIAG_TRIGGER_SCHEMA 'h' Not all appear to have a way of setting the value within the ereport interface. For example, there is nothing like "errrelation_column()" (or "errrelcol()", as I call it) to set PG_DIAG_ROUTINE_NAME. This is something I haven't touched. >> src/backend/utils/adt/domains.c >> 162: (errcode(ERRCODE_CHECK_VIOLATION), >> > > these exceptions are related to domains - we has not adequate fields > now - and these fields are not in standards > > it needs some like DOMAIN_NAME and DOMAIN_SCHEMA ??? Hmm. I'm not sure that it's worth it. I took a look at recent JDBC documentation, because I'd expect it to offer the most complete support for exposing this in exception classes. Turns out that it does not expose things at as fine a granularity as you have here at all, which is disappointing. It seems to suppose that just a vendor code and "cause" will be sufficient. If you're using this one proprietary database, there is a command that'll let you get diagnostics. The wisdom for users of another proprietary database seems to be that you just use stored procedures. So I agree that CONSTRAINT NAME should be unset in the event of uncatalogued, unnamed not null integrity constraint violations. The standard seems to be loose on this, and I think we'll have to be too. However, I'd find it pretty intolerable if we were inconsistent between ereport() callsites that were *effectively the same error* - this could result in a user's application breaking based on the phase of the moon. Do you suppose it's worth stashing the last set of these fields to one side, and exposing this through an SQL utility command, or even a bundled function? I don't imagine that it's essential to have that right away, but it's something to consider. -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services
Attachment
2012/7/7 Peter Geoghegan <peter@2ndquadrant.com>: > Attached is a revision of this patch, with a few clean-ups, mostly to > the wording of certain things. > > On 5 July 2012 17:41, Pavel Stehule <pavel.stehule@gmail.com> wrote: >> * renamed auxiliary functions and moved it elog.c - header is new file >> "relerror.h" > > In my revision, I've just added a pre-declaration and removed the > dedicated header, which didn't make too much sense to me: > > + /* Pre-declare Relation, in order to avoid a build dependency on rel.h. */ > + typedef struct RelationData *Relation; > > This works fine, and is precedented. See > src/include/storage/buffile.h, for example. If you think it's > unreasonable that none of the functions now added to elog.h are > callable without also including rel.h, consider that all call sites > are naturally already doing that, for the errmsg() string itself. > Besides, this is a perfectly valid way of reducing build dependencies, > or at least a more valid way than creating a new header that does not > really represent a natural new division for these new functions, IMHO. > Opaque pointers are ordinarily used to encapsulate things in C, rather > than to prevent build dependencies, but I believe that's only because > in general that's something that C programmers are more likely to > want. > It is question for Alvaro or Tom. I have not strong opinion on it. >> * new fields "constraint_table" and "trigger_table" - constraints and >> triggers are related to relation in pg, not just to schema > > I've added some remarks to that effect in the docs of my revision for > your consideration. > >> * better coverage of enhancing errors in source code > > Good. I think it's important that we nail down just where these are > expected to be available. It would be nice if there was a quick and > easy answer to the question "Just what ErrorResponse fields should > this new "sub-category of class 23" ereport() site have?". We clearly > have yet to work those details out. > > I have another concern with this patch. log_error_verbosity appears to > be intended as an exact analogue of client verbosity (as set by > PQsetErrorVerbosity()). While this generally holds for this patch, > there is one notable exception: You always log all of these new fields > within write_csvlog(), even if (Log_error_verbosity < > PGERROR_VERBOSE). Why? There will be a bunch of commas in most CSV > logs once this happens, so that the schema of the log is consistent. > That is kind of ugly, but I don't see a way around it. We already do > this for location and application_name (though that's separately > controlled by the application_name guc). I haven't touched that in the > attached revision, as I'd like to hear what you have to say. it is bug - these new fields should be used only when verbosity is >= VERBOSE > > Another problem that will need to be fixed is that of the follow values: > > +#define PG_DIAG_COLUMN_NAME 'c' > +#define PG_DIAG_TABLE_NAME 't' > +#define PG_DIAG_SCHEMA_NAME 's' > +#define PG_DIAG_CONSTRAINT_NAME 'n' > +#define PG_DIAG_CONSTRAINT_TABLE 'o' > +#define PG_DIAG_CONSTRAINT_SCHEMA 'm' > +#define PG_DIAG_ROUTINE_NAME 'r' > +#define PG_DIAG_ROUTINE_SCHEMA 'u' > +#define PG_DIAG_TRIGGER_NAME 'g' > +#define PG_DIAG_TRIGGER_TABLE 'i' > +#define PG_DIAG_TRIGGER_SCHEMA 'h' > > Not all appear to have a way of setting the value within the ereport > interface. For example, there is nothing like "errrelation_column()" > (or "errrelcol()", as I call it) to set PG_DIAG_ROUTINE_NAME. This is > something I haven't touched. When I sent this patch first time, then one issue was new functions for these fields. Tom proposal was using a generic function for these new fields. These fields holds separated values, but in almost all cases some combinations are used - "ROUTINE_NAME, ROUTINE_SCHEMA", "TABLE_NAME, TABLE_SCHEMA" - so these fields are not independent - this is difference from original ErrorData fields - so axillary functions doesn't follow these fields - because it is not practical. > >>> src/backend/utils/adt/domains.c >>> 162: (errcode(ERRCODE_CHECK_VIOLATION), >>> >> >> these exceptions are related to domains - we has not adequate fields >> now - and these fields are not in standards >> >> it needs some like DOMAIN_NAME and DOMAIN_SCHEMA ??? > > Hmm. I'm not sure that it's worth it. > > I took a look at recent JDBC documentation, because I'd expect it to > offer the most complete support for exposing this in exception > classes. Turns out that it does not expose things at as fine a > granularity as you have here at all, which is disappointing. It seems > to suppose that just a vendor code and "cause" will be sufficient. If > you're using this one proprietary database, there is a command that'll > let you get diagnostics. The wisdom for users of another proprietary > database seems to be that you just use stored procedures. So I agree > that CONSTRAINT NAME should be unset in the event of uncatalogued, > unnamed not null integrity constraint violations. The standard seems > to be loose on this, and I think we'll have to be too. However, I'd > find it pretty intolerable if we were inconsistent between ereport() > callsites that were *effectively the same error* - this could result > in a user's application breaking based on the phase of the moon. > > Do you suppose it's worth stashing the last set of these fields to one > side, and exposing this through an SQL utility command, or even a > bundled function? I don't imagine that it's essential to have that > right away, but it's something to consider. I understand, but fixing any ereport in core is difficult for processing. So coverage only some subset is practical (first stage) - with some basic infrastructure in core all other patches with better covering will be simpler for review and for commit too. RI and constraints is more often use cases where you would to parse error messages - these will be covered in first stage. Regards Pavel > > -- > Peter Geoghegan http://www.2ndQuadrant.com/ > PostgreSQL Development, 24x7 Support, Training and Services
On 7 July 2012 13:57, Pavel Stehule <pavel.stehule@gmail.com> wrote: >> In my revision, I've just added a pre-declaration and removed the >> dedicated header, which didn't make too much sense to me: >> >> + /* Pre-declare Relation, in order to avoid a build dependency on rel.h. */ >> + typedef struct RelationData *Relation; >> Opaque pointers are ordinarily used to encapsulate things in C, rather >> than to prevent build dependencies, but I believe that's only because >> in general that's something that C programmers are more likely to >> want. >> > > It is question for Alvaro or Tom. I have not strong opinion on it. Fair enough. >> You always log all of these new fields within write_csvlog(), even if (Log_error_verbosity < >> PGERROR_VERBOSE). Why? > it is bug - these new fields should be used only when verbosity is >= VERBOSE Please fix it. >> +#define PG_DIAG_TRIGGER_SCHEMA 'h' >> >> Not all appear to have a way of setting the value within the ereport >> interface. For example, there is nothing like "errrelation_column()" >> (or "errrelcol()", as I call it) to set PG_DIAG_ROUTINE_NAME. This is >> something I haven't touched. > > When I sent this patch first time, then one issue was new functions > for these fields. Tom proposal was using a generic function for these > new fields. These fields holds separated values, but in almost all > cases some combinations are used - "ROUTINE_NAME, ROUTINE_SCHEMA", > "TABLE_NAME, TABLE_SCHEMA" - so these fields are not independent - > this is difference from original ErrorData fields - so axillary > functions doesn't follow these fields - because it is not practical. Maybe it isn't practical to do it that way, but I think that we need to have a way of setting the fields from an ereport callsite. I am willing to accept that it may make sense to add existing ereport sites by piecemeal, in later patches, but I think you should figure out how regular ereport sites are supposed to do this before anything is committed. We need to nail down the interface first. > I understand, but fixing any ereport in core is difficult for > processing. So coverage only some subset is practical (first stage) - > with some basic infrastructure in core all other patches with better > covering will be simpler for review and for commit too. RI and > constraints is more often use cases where you would to parse error > messages - these will be covered in first stage. Okay. What subset? I would hope that it was a well-defined subset. -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services
Excerpts from Peter Geoghegan's message of mar jul 10 14:56:40 -0400 2012: > On 7 July 2012 13:57, Pavel Stehule <pavel.stehule@gmail.com> wrote: > >> +#define PG_DIAG_TRIGGER_SCHEMA 'h' > >> > >> Not all appear to have a way of setting the value within the ereport > >> interface. For example, there is nothing like "errrelation_column()" > >> (or "errrelcol()", as I call it) to set PG_DIAG_ROUTINE_NAME. This is > >> something I haven't touched. > > > > When I sent this patch first time, then one issue was new functions > > for these fields. Tom proposal was using a generic function for these > > new fields. These fields holds separated values, but in almost all > > cases some combinations are used - "ROUTINE_NAME, ROUTINE_SCHEMA", > > "TABLE_NAME, TABLE_SCHEMA" - so these fields are not independent - > > this is difference from original ErrorData fields - so axillary > > functions doesn't follow these fields - because it is not practical. > > Maybe it isn't practical to do it that way, but I think that we need > to have a way of setting the fields from an ereport callsite. I am > willing to accept that it may make sense to add existing ereport sites > by piecemeal, in later patches, but I think you should figure out how > regular ereport sites are supposed to do this before anything is > committed. We need to nail down the interface first. I think we should just define constants for the set of fields the patch currently uses. When and if we later add new fields to other callsites, we can define more constants. FWIW about the new include: I feel a strong dislike about the forward declaration you suggest. Defining Relation in elog.h seems completely out of place. The one you suggested as precedent (BufFile) is completely unlike it, in that the declaration is clearly placed in the header (buffile.h) of the module that works with the struct in question. -- Álvaro Herrera <alvherre@commandprompt.com> The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On 10 July 2012 20:28, Alvaro Herrera <alvherre@commandprompt.com> wrote: > I think we should just define constants for the set of fields the patch > currently uses. When and if we later add new fields to other callsites, > we can define more constants. Fair enough. Let's do that. > FWIW about the new include: I feel a strong dislike about the forward > declaration you suggest. Defining Relation in elog.h seems completely > out of place. The one you suggested as precedent (BufFile) is > completely unlike it, in that the declaration is clearly placed in the > header (buffile.h) of the module that works with the struct in question. I haven't defined Relation in elog.h; I have pre-declared it there. Maybe that isn't to your taste, but there is surely something to be said for adding exactly one line of code in preference to adding an entire new header file, and having a bunch of existing files include that new header. That said, it's not as if I feel strongly about it. -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services
Excerpts from Peter Geoghegan's message of mar jul 10 15:54:59 -0400 2012: > On 10 July 2012 20:28, Alvaro Herrera <alvherre@commandprompt.com> wrote: > > FWIW about the new include: I feel a strong dislike about the forward > > declaration you suggest. Defining Relation in elog.h seems completely > > out of place. The one you suggested as precedent (BufFile) is > > completely unlike it, in that the declaration is clearly placed in the > > header (buffile.h) of the module that works with the struct in question. > > I haven't defined Relation in elog.h; I have pre-declared it there. > Maybe that isn't to your taste, but there is surely something to be > said for adding exactly one line of code in preference to adding an > entire new header file, and having a bunch of existing files include > that new header. That is true. I'd like to hear others' opinions. -- Álvaro Herrera <alvherre@commandprompt.com> The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On 10 July 2012 21:26, Alvaro Herrera <alvherre@commandprompt.com> wrote: >> I haven't defined Relation in elog.h; I have pre-declared it there. >> Maybe that isn't to your taste, but there is surely something to be >> said for adding exactly one line of code in preference to adding an >> entire new header file, and having a bunch of existing files include >> that new header. > > That is true. I'd like to hear others' opinions. It seems that the code, exactly as written, relies upon a GNU extension that didn't make it into the standard until C11 - the redefinition of a typedef. Clang warns about this. Still, it ought to be possible, if not entirely straightforward, to use a pre-declaration all the same. -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services
Alvaro Herrera <alvherre@commandprompt.com> writes: > FWIW about the new include: I feel a strong dislike about the forward > declaration you suggest. Defining Relation in elog.h seems completely > out of place. Agreed. Maybe a reasonable solution is to allow some ereport helper functions (or, really, wrappers for the helper functions) to be declared someplace else than elog.h. They'd likely need to be implemented someplace else than elog.c, too, so this doesn't seem unreasonable. The generic helper function approach doesn't seem too unreasonable for this: elog.h/.c would provide something like err_generic_string(int fieldid, const char *str) and then someplace else could provide functions built on this that insert table/schema/column/constraint/etc names into suitable fields. regards, tom lane
Hello * renamed erritem to err_generic_string * fixed CSVlog generation * new file /utils/error/relerror.c with axillary functions - declarations are in utils/rel.h Regards Pavel 2012/7/11 Tom Lane <tgl@sss.pgh.pa.us>: > Alvaro Herrera <alvherre@commandprompt.com> writes: >> FWIW about the new include: I feel a strong dislike about the forward >> declaration you suggest. Defining Relation in elog.h seems completely >> out of place. > > Agreed. Maybe a reasonable solution is to allow some ereport helper > functions (or, really, wrappers for the helper functions) to be declared > someplace else than elog.h. They'd likely need to be implemented > someplace else than elog.c, too, so this doesn't seem unreasonable. > > The generic helper function approach doesn't seem too unreasonable for > this: elog.h/.c would provide something like > > err_generic_string(int fieldid, const char *str) > > and then someplace else could provide functions built on this that > insert table/schema/column/constraint/etc names into suitable fields. > > regards, tom lane
Attachment
On 18 July 2012 19:13, Pavel Stehule <pavel.stehule@gmail.com> wrote: > * renamed erritem to err_generic_string > * fixed CSVlog generation > * new file /utils/error/relerror.c with axillary functions - > declarations are in utils/rel.h Why has this revision retained none of my editorialisations? In particular, none of the documentation updates that I made were retained. You also haven't included changes where I attempted to make very large ereport statements (often with verbose use of ternary conditionals) clearer, nor have you included my adjustments to normalise the appearance of new code to be consistent with existing code in various ways. You don't have to agree with all of those things of course, but you should have at least commented on them. I didn't spend time cleaning things up only to have those changes ignored. I'm particularly surprised that the documentation alterations were not retained, as you yourself asked me to make those revisions. /* file error location */ - if (Log_error_verbosity >= PGERROR_VERBOSE) + { StringInfoData msgbuf; Why have you retained the scope here? Couldn't you have just moved the single declaration instead? -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services
2012/7/23 Peter Geoghegan <peter@2ndquadrant.com>: > On 18 July 2012 19:13, Pavel Stehule <pavel.stehule@gmail.com> wrote: >> * renamed erritem to err_generic_string >> * fixed CSVlog generation >> * new file /utils/error/relerror.c with axillary functions - >> declarations are in utils/rel.h > > Why has this revision retained none of my editorialisations? In > particular, none of the documentation updates that I made were > retained. I am sorry, I can't do merge - I try to use diff but without success, so I did structure changes and merge with your patch postponed > > You also haven't included changes where I attempted to make very large > ereport statements (often with verbose use of ternary conditionals) > clearer, nor have you included my adjustments to normalise the > appearance of new code to be consistent with existing code in various > ways. > > You don't have to agree with all of those things of course, but you > should have at least commented on them. I didn't spend time cleaning > things up only to have those changes ignored. I'm particularly > surprised that the documentation alterations were not retained, as you > yourself asked me to make those revisions. again, I am sorry - my last patch should to define structure (files) - because it was significant topic. When we will find a agreement, then I'll merge changes messages, comments, doc > > /* file error location */ > - if (Log_error_verbosity >= PGERROR_VERBOSE) > + > { > StringInfoData msgbuf; > > > Why have you retained the scope here? Couldn't you have just moved the > single declaration instead? I am not sure, I have to recheck it. Regards Pavel > > -- > Peter Geoghegan http://www.2ndQuadrant.com/ > PostgreSQL Development, 24x7 Support, Training and Services
On Mon, Jul 23, 2012 at 7:03 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote: >> You don't have to agree with all of those things of course, but you >> should have at least commented on them. I didn't spend time cleaning >> things up only to have those changes ignored. I'm particularly >> surprised that the documentation alterations were not retained, as you >> yourself asked me to make those revisions. > > again, I am sorry - my last patch should to define structure (files) - > because it was significant topic. When we will find a agreement, then > I'll merge changes messages, comments, doc It seems clear that it's not reasonable to expect any more review to be done here this CommitFest, given that the changes from reviews already made haven't been incorporated into the patch. Therefore, I'm going to mark this one Returned with Feedback. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 23 July 2012 15:23, Robert Haas <robertmhaas@gmail.com> wrote: > It seems clear that it's not reasonable to expect any more review to > be done here this CommitFest, given that the changes from reviews > already made haven't been incorporated into the patch. Therefore, I'm > going to mark this one Returned with Feedback. I agree that that's the right thing to do at this point. -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services
Hello here is updated patch - merge comments, docs, formatting, some identifiers from Peter Geoghegan's patch Regards Pavel 2012/7/18 Pavel Stehule <pavel.stehule@gmail.com>: > Hello > > * renamed erritem to err_generic_string > * fixed CSVlog generation > * new file /utils/error/relerror.c with axillary functions - > declarations are in utils/rel.h > > Regards > > Pavel > > 2012/7/11 Tom Lane <tgl@sss.pgh.pa.us>: >> Alvaro Herrera <alvherre@commandprompt.com> writes: >>> FWIW about the new include: I feel a strong dislike about the forward >>> declaration you suggest. Defining Relation in elog.h seems completely >>> out of place. >> >> Agreed. Maybe a reasonable solution is to allow some ereport helper >> functions (or, really, wrappers for the helper functions) to be declared >> someplace else than elog.h. They'd likely need to be implemented >> someplace else than elog.c, too, so this doesn't seem unreasonable. >> >> The generic helper function approach doesn't seem too unreasonable for >> this: elog.h/.c would provide something like >> >> err_generic_string(int fieldid, const char *str) >> >> and then someplace else could provide functions built on this that >> insert table/schema/column/constraint/etc names into suitable fields. >> >> regards, tom lane
Attachment
On 20 August 2012 14:09, Pavel Stehule <pavel.stehule@gmail.com> wrote: > (eelog-2012-08-20.diff) This patch has the following reference to relerror.c: """"""" *** a/src/include/utils/rel.h --- b/src/include/utils/rel.h *************** *** 394,397 **** typedef struct StdRdOptions --- 394,402 ---- extern void RelationIncrementReferenceCount(Relation rel); extern void RelationDecrementReferenceCount(Relationrel); + /* routines in utils/error/relerror.c */ """"""" Indeed, the new C file has been separately added to a makefile: ! OBJS = assert.o elog.o relerror.o However, the new file itself does not appear in this patch. Therefore, the code does not compile. Please post a revision with the new file included. -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services
Hello Peter here is updated patch, sorry for missing file Regards Pavel 2012/10/8 Peter Geoghegan <peter@2ndquadrant.com>: > On 20 August 2012 14:09, Pavel Stehule <pavel.stehule@gmail.com> wrote: >> (eelog-2012-08-20.diff) > > This patch has the following reference to relerror.c: > > """"""" > > *** a/src/include/utils/rel.h > --- b/src/include/utils/rel.h > *************** > *** 394,397 **** typedef struct StdRdOptions > --- 394,402 ---- > extern void RelationIncrementReferenceCount(Relation rel); > extern void RelationDecrementReferenceCount(Relation rel); > > + /* routines in utils/error/relerror.c */ > > """"""" > > Indeed, the new C file has been separately added to a makefile: > > ! OBJS = assert.o elog.o relerror.o > > However, the new file itself does not appear in this patch. Therefore, > the code does not compile. Please post a revision with the new file > included. > > -- > Peter Geoghegan http://www.2ndQuadrant.com/ > PostgreSQL Development, 24x7 Support, Training and Services
Attachment
On 10 October 2012 14:56, Pavel Stehule <pavel.stehule@gmail.com> wrote: > (eelog3.diff) This looks better. You need a better comment here: + * relerror.c + * relation error loging functions + * I'm still not satisfied with the lack of firm guarantees about what errcodes one can assume these fields will be available for. I suggest that this be explicitly documented within errcodes.h. For example, right now if some client code wants to discriminate against a certain check constraint in its error-handling code (typically to provide a user-level message), it might do something like this: try {... } catch(CheckViolation e) { // This works: if (e.constraint_name == "postive_balance") MessageBox("Your account must have a positive balance."); // This is based on a check constraint in a domain, and is therefore broken right now: else if (e.constraint_name == "valid_barcode") MessageBox("You inputted an invalidbarcode - check digit was wrong"); } This is broken right now, simply because the application cannot rely on the constraint name being available, since for no particular reason some of the ERRCODE_CHECK_VIOLATION ereport sites (like in execQual.c) don't provide an errconstraint(). What is needed is a coding standard that says "ERRCODE_CHECK_VIOLATION ereport call sites need to have an errconstraint()". Without this, the patch is quite pointless. My mind is not 100% closed to the idea that we provide these extra fields on a "best-effort" basis per errcode, but it's pretty close to there. Why should we allow this unreasonable inconsistency? The usage pattern that I describe above is a real one, and I thought that the whole point was to support it. I have previously outlined places where this type of inconsistency exists, in an earlier revision. [1] If you want to phase in the introduction of requiring that all relevant callsites use this infrastructure, I guess I'm okay with that. However, I'm going to have to insist that for each of these new fields, for any errcode you identify as requiring the field, either all callsites get all relevant fields, or no call sites get all relevant fields, and that each errcode be documented as such. So you should probably just bite the bullet and figure out a reasonable and comprehensive set of rules on adding these fields based on errcode. Loosey goosey isn't going to cut it. I'm having a difficult time imagining why we'd only have the constraint/tablename for these errcodes (with one exception, noted below): /* Class 23 - Integrity Constraint Violation */ #define ERRCODE_INTEGRITY_CONSTRAINT_VIOLATION MAKE_SQLSTATE('2','3','0','0','0') #define ERRCODE_RESTRICT_VIOLATION MAKE_SQLSTATE('2','3','0','0','1') #define ERRCODE_NOT_NULL_VIOLATION MAKE_SQLSTATE('2','3','5','0','2') #define ERRCODE_FOREIGN_KEY_VIOLATION MAKE_SQLSTATE('2','3','5','0','3') #define ERRCODE_UNIQUE_VIOLATION MAKE_SQLSTATE('2','3','5','0','5') #define ERRCODE_CHECK_VIOLATION MAKE_SQLSTATE('2','3','5','1','4') #define ERRCODE_EXCLUSION_VIOLATION MAKE_SQLSTATE('2','3','P','0','1') You previously defending some omissions [2] on the basis that they involved domains, so some fields were unavailable. This doesn't appear to be quite valid, though. For example, consider this untouched callsite within execQual, that relates to a domain: if (!conIsNull && !DatumGetBool(conResult)) ereport(ERROR, (errcode(ERRCODE_CHECK_VIOLATION), errmsg("value for domain %s violates checkconstraint \"%s\"", format_type_be(ctest->resulttype), con->name))); There is no reason why you couldn't have at least given the constraint name. It might represent an unreasonable burden for you to determine the table that these constraints relate to by going through the rabbit hole of executor state, since we haven't had any complaints about this information being available within error messages before, AFAIK. If that is the case, the general non-availability of this information for domains needs to be documented. I guess that's logical enough, since there doesn't necessarily have to be a table involved in the event of a domain constraint violation. However, there does have to be a constraint, for example, involved. FWIW, I happen to think that not-null constraints at the domain level are kind of stupid (though check constraints are great), but what do I know... Anyway, the bottom line is that authors of Postgres client libraries (and their users) ought to have a reasonable set of guarantees about when this information is available. If that means you have to make one or two explicit, documented exceptions to my previous "all or nothing" proviso, such as "table names won't be available in the event of domain constraints", so be it. I'm going to suggest you add a note to both the docs and errcodes.h regarding all this in your next revision. People need to be adding these fields for all errcodes that *require* them going forward. If, in the future, ERRCODE_UNIQUE_VIOLATION errors, for example, cannot supply a constraint name that was violated, then that is, almost by definition, the wrong errcode to use. [1] http://archives.postgresql.org/message-id/CAEYLb_VJK+AZe6fO_s0Md0ge5D=RenDf7wg+g4NxN8mhKQ4Gzg@mail.gmail.com [2] CAFj8pRDtTDvoSvJT8PP08mQ_LW2HaOmWXvRUdoYLhk9xF7KMyw@mail.gmail.com -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services
Hello 2012/10/11 Peter Geoghegan <peter@2ndquadrant.com>: > On 10 October 2012 14:56, Pavel Stehule <pavel.stehule@gmail.com> wrote: >> (eelog3.diff) > > This looks better. > > You need a better comment here: > > + * relerror.c > + * relation error loging functions > + * > > I'm still not satisfied with the lack of firm guarantees about what > errcodes one can assume these fields will be available for. I suggest > that this be explicitly documented within errcodes.h. For example, > right now if some client code wants to discriminate against a certain > check constraint in its error-handling code (typically to provide a > user-level message), it might do something like this: > > try > { > ... > } > catch(CheckViolation e) > { > // This works: > if (e.constraint_name == "postive_balance") > MessageBox("Your account must have a positive balance."); > // This is based on a check constraint in a domain, and is > therefore broken right now: > else if (e.constraint_name == "valid_barcode") > MessageBox("You inputted an invalid barcode - check digit was wrong"); > } > > This is broken right now, simply because the application cannot rely > on the constraint name being available, since for no particular reason > some of the ERRCODE_CHECK_VIOLATION ereport sites (like in execQual.c) > don't provide an errconstraint(). What is needed is a coding standard > that says "ERRCODE_CHECK_VIOLATION ereport call sites need to have an > errconstraint()". Without this, the patch is quite pointless. I understand to your request, but I don't thing so this request is 100% valid. Check violation is good example. Constraint names are "optional" in PostgreSQL - so we cannot require constraint_name. One from first prototypes I used generated name for NULL constraints and it was rejected - because It can be confusing, because a user doesn't find these names in catalogue. I agree with it now - it better show nothing, than show some phantom. More - a design of these feature from SQL/PSM and ANSI/SQL is not too strict. There is no exception, when you asking any unfilled value - you get a empty string instead. And more - there are no info in standard, what fields are optional and what fields are mandatory. And although we don't checking consistence of exception fields, I think so this patch is very usable. I have a three text fields now: message, detail, hint - and I can do same error, that you are described. This patch doesn't change it. But it creates a few new basic variables (for all possible exceptions), that can be used for simplification of error processing. It is not silver bullet. And it is not C++. Creating some new tool for checking consistency of exceptions is not good way - and you are newer ensure consistency of custom exceptions. > > My mind is not 100% closed to the idea that we provide these extra > fields on a "best-effort" basis per errcode, but it's pretty close to > there. Why should we allow this unreasonable inconsistency? The usage > pattern that I describe above is a real one, and I thought that the > whole point was to support it. > > I have previously outlined places where this type of inconsistency > exists, in an earlier revision. [1] > > If you want to phase in the introduction of requiring that all > relevant callsites use this infrastructure, I guess I'm okay with > that. However, I'm going to have to insist that for each of these new > fields, for any errcode you identify as requiring the field, either > all callsites get all relevant fields, or no call sites get all > relevant fields, and that each errcode be documented as such. So you > should probably just bite the bullet and figure out a reasonable and > comprehensive set of rules on adding these fields based on errcode. > Loosey goosey isn't going to cut it. > > I'm having a difficult time imagining why we'd only have the > constraint/tablename for these errcodes (with one exception, noted > below): > > /* Class 23 - Integrity Constraint Violation */ > #define ERRCODE_INTEGRITY_CONSTRAINT_VIOLATION > MAKE_SQLSTATE('2','3','0','0','0') > #define ERRCODE_RESTRICT_VIOLATION MAKE_SQLSTATE('2','3','0','0','1') > #define ERRCODE_NOT_NULL_VIOLATION MAKE_SQLSTATE('2','3','5','0','2') > #define ERRCODE_FOREIGN_KEY_VIOLATION MAKE_SQLSTATE('2','3','5','0','3') > #define ERRCODE_UNIQUE_VIOLATION MAKE_SQLSTATE('2','3','5','0','5') > #define ERRCODE_CHECK_VIOLATION MAKE_SQLSTATE('2','3','5','1','4') > #define ERRCODE_EXCLUSION_VIOLATION MAKE_SQLSTATE('2','3','P','0','1') ERRCODE_UNIQUE_VIOLATION and ERRCODE_EXCLUSION_VIOLATION should be related to index relation, not parent relation. Then we don't need set COLUMN_NAME, that can be expression or more columns. > > You previously defending some omissions [2] on the basis that they > involved domains, so some fields were unavailable. This doesn't appear > to be quite valid, though. For example, consider this untouched > callsite within execQual, that relates to a domain: > > if (!conIsNull && > !DatumGetBool(conResult)) > ereport(ERROR, > (errcode(ERRCODE_CHECK_VIOLATION), > errmsg("value for domain %s violates check constraint\"%s\"", > format_type_be(ctest->resulttype), > con->name))); > > There is no reason why you couldn't have at least given the constraint > name. It might represent an unreasonable burden for you to determine > the table that these constraints relate to by going through the rabbit > hole of executor state, since we haven't had any complaints about this > information being available within error messages before, AFAIK. If > that is the case, the general non-availability of this information for > domains needs to be documented. I guess that's logical enough, since > there doesn't necessarily have to be a table involved in the event of > a domain constraint violation. However, there does have to be a > constraint, for example, involved. yes, CONSTRAINT_NAME in this case should be used. TABLE_NAME can be or should not be empty, but this information is not available, because some facts can be changed in rewriter stage. > > FWIW, I happen to think that not-null constraints at the domain level > are kind of stupid (though check constraints are great), but what do I > know... > > Anyway, the bottom line is that authors of Postgres client libraries > (and their users) ought to have a reasonable set of guarantees about > when this information is available. If that means you have to make one > or two explicit, documented exceptions to my previous "all or nothing" > proviso, such as "table names won't be available in the event of > domain constraints", so be it. > > I'm going to suggest you add a note to both the docs and errcodes.h > regarding all this in your next revision. People need to be adding > these fields for all errcodes that *require* them going forward. If, > in the future, ERRCODE_UNIQUE_VIOLATION errors, for example, cannot > supply a constraint name that was violated, then that is, almost by > definition, the wrong errcode to use. I can agree, so some documentation is necessary (maybe some table) - now we have not described context of all errors. Other needs a searching of some consensus - or searching solution - our syntax allows some variations that are unsupported in ANSI/SQL - and then we have to use some generated name or we don't show this information. It is a basic and most important question. So first we have to find reply to following question: this patch should to follow our current implementation of exceptions or we modify exceptions to be more close to ANSI/SQL (and we have to modify model)? Regards Pavel > > [1] http://archives.postgresql.org/message-id/CAEYLb_VJK+AZe6fO_s0Md0ge5D=RenDf7wg+g4NxN8mhKQ4Gzg@mail.gmail.com > > [2] CAFj8pRDtTDvoSvJT8PP08mQ_LW2HaOmWXvRUdoYLhk9xF7KMyw@mail.gmail.com > > -- > Peter Geoghegan http://www.2ndQuadrant.com/ > PostgreSQL Development, 24x7 Support, Training and Services
On 12 October 2012 20:27, Pavel Stehule <pavel.stehule@gmail.com> wrote: > I understand to your request, but I don't thing so this request is > 100% valid. Check violation is good example. Constraint names are > "optional" in PostgreSQL - so we cannot require constraint_name. One > from first prototypes I used generated name for NULL constraints and > it was rejected - because It can be confusing, because a user doesn't > find these names in catalogue. I agree with it now - it better show > nothing, than show some phantom. More - a design of these feature from > SQL/PSM and ANSI/SQL is not too strict. There is no exception, when > you asking any unfilled value - you get a empty string instead. That's beside the point. NOT NULL constraints are not catalogued (for now), so sure, the only reasonable thing to do is to have an empty string in that case. Since no one expects to be able to get the name of a NOT NULL constraint anyway, that isn't a problem. Once again, the problem, in particular, is that there is no well-defined set of rules that client code can follow to be sure that a field name they're interested in will be available at all. In all revisions thus far, you have seemingly arbitrarily decided to not add some some fields in some places. I mentioned already that some ERRCODE_CHECK_VIOLATION sites didn't name a constraint - other places don't name a table when one is available, as with some ERRCODE_NOT_NULL_VIOLATION sites. These fields need to be added, and what's more, the rules for where they need to be added need to be coherently described. So, that's about 3 sentences of extra documentation, saying to both users and hackers (at the very least): * NOT NULL constraints won't have a CONSTRAINT_NAME available, since they aren't catalogued. * Domains won't have a TABLE_NAME available, even though there may actually be a table name associated with the error. Have I missed one? That all seems pretty simple to me, and I don't see what the problem is. > And although we don't checking consistence of exception fields, I > think so this patch is very usable. I have a three text fields now: > message, detail, hint - and I can do same error, that you are > described. This patch doesn't change it. But it creates a few new > basic variables (for all possible exceptions), that can be used for > simplification of error processing. It is not silver bullet. And it is > not C++. The simplification of error processing is that they can now reliably get these fields - they don't have to use some kludge like parsing a (possibly localised) error message to look for a check constraint name. I'm not asking you to add run-time verification - I'm asking you to institute a coding standard, that is limited to backend code, and to document what assumptions applications can make. To my mind, if the user cannot rely on the fields accurately indicating error conditions according to some coherent set of rules (in actuality, one or two simple and obvious exceptions, only one of which is slightly surprising), then this patch is not only not helpful, it's harmful. If they're only available according to some completely arbitrary and obscure criteria, (like the fact that you've included one ERRCODE_CHECK_VIOLATION site but not another), that's a footgun. > Creating some new tool for checking consistency of exceptions > is not good way - and you are newer ensure consistency of custom > exceptions. Pointing out that some extension author, or pl/pgsql function, could in principle ignore the documentation I'm asking you to write and not supply a constraint, while raising their own, say, magical ERRCODE_CHECK_VIOLATION is a bit of a cop-out. I should also mention that things like ERRCODE_INTERNAL_ERROR are naturally going to be a bit fuzzy, and that's fine. Nobody is ever going to expect those anyway. > yes, CONSTRAINT_NAME in this case should be used. TABLE_NAME can be or > should not be empty, but this information is not available, because > some facts can be changed in rewriter stage. Right, or because you could do this and get an exception: select 'foo'::bar_domain; > I can agree, so some documentation is necessary (maybe some table) - > now we have not described context of all errors. Other needs a > searching of some consensus - or searching solution - our syntax > allows some variations that are unsupported in ANSI/SQL - and then we > have to use some generated name or we don't show this information. It > is a basic and most important question. Can you give an example of when a generated name might be used, beyond the example you've already given (that is, NULL constraints)? I'm not completely opposed to the idea of a generated name. I think that it's very much a secondary issue, though. > So first we have to find reply > to following question: this patch should to follow our current > implementation of exceptions or we modify exceptions to be more close > to ANSI/SQL (and we have to modify model)? What does the option of following the SQL standard offer us? What have I said that is fundamentally incompatible with how things work in this patch right now? -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services
I think that we're both going to be busy next week, since we're both attending pgconf.eu. For that reason, I would like to spend some time tomorrow to get something in shape, that I can mark "ready for committer". I'd like to get this patch committed during this commitfest. You are welcome to do this work instead. I want to avoid a redundant effort. Let me know if you think that that's a good idea. -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services
Hello 2012/10/20 Peter Geoghegan <peter@2ndquadrant.com>: > I think that we're both going to be busy next week, since we're both > attending pgconf.eu. For that reason, I would like to spend some time > tomorrow to get something in shape, that I can mark "ready for > committer". I'd like to get this patch committed during this > commitfest. You are welcome to do this work instead. I want to avoid a > redundant effort. I invite a materialization of your ideas :) - and I have to work on preparing presentation for pgconf.eu :( Regards Pavel > > Let me know if you think that that's a good idea. > > -- > Peter Geoghegan http://www.2ndQuadrant.com/ > PostgreSQL Development, 24x7 Support, Training and Services
Peter Geoghegan escribió: > > I think that we're both going to be busy next week, since we're both > attending pgconf.eu. For that reason, I would like to spend some time > tomorrow to get something in shape, that I can mark "ready for > committer". I'd like to get this patch committed during this > commitfest. You are welcome to do this work instead. I want to avoid a > redundant effort. > > Let me know if you think that that's a good idea. I guess you didn't get around to it. Here are my own notes about this patch. * Why doesn't errconstraint() set the err table directly? Having to call errrel() separately seems pointless. I proposeerrconstraint sets both things; when the two tables differ, call errconstraint first and then errrel() to overwrite. * Some constraints do not have an associated relation name; for example constraints on domains. I think we should reportthe constraint name there, if one exists (in domain_check_input, ExecEvalCoerceToDomain it doesn't). How about errconstraint()does not take a relation, and have a new errconstraintrel() that receives the relation to which the constraintis attached. Alternatively, have errconstraint() accept a NULL relation for the cases where there is none. * The distinction between oldrel/newrel in certain callers seems useless; for example if we're rewriting a table due to ALTERTABLE, both the new and old rel have the same name. That could be cleaned up. * Some errrel() calls are getting an index ... is this sane? I think we should be reporting the table name, not the indexname. * There are some pointless whitespace changes in elog.h. I suggest passing everything through pgindent. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On 24 October 2012 23:29, Alvaro Herrera <alvherre@2ndquadrant.com> wrote: >> >> Let me know if you think that that's a good idea. > > I guess you didn't get around to it. I did get some work on this done, which does change things somewhat. In particular, I think that the need to have so many new fields is questionable, and so have removed some. I will get around to this, and will incorporate those ideas. The errrel() calls with index relations are not sane, but that's just an oversight. The next revision will actually do this: + Assert(table->rd_rel->relkind == RELKIND_RELATION); -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services
Peter Geoghegan escribió: > On 24 October 2012 23:29, Alvaro Herrera <alvherre@2ndquadrant.com> wrote: > >> > >> Let me know if you think that that's a good idea. > > > > I guess you didn't get around to it. > > I did get some work on this done, which does change things somewhat. > In particular, I think that the need to have so many new fields is > questionable, and so have removed some. I will get around to this, and > will incorporate those ideas. Excellent. We will await the next version of this patch then ... during the upcoming commitfest. I'm closing it as returned with feedback. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
So, I took a look at this, and made some more changes. I have a hard time seeing the utility of some fields that were in this patch, and so they have been removed from this revision. Consider, for example: + constraint_table text, + constraint_schema text, While constraint_name remains, I find it really hard to believe that applications need a perfectly unambiguous idea of what constraint they're dealing with. If applications have a constraint that has a different domain-specific meaning depending on what schema it is in, while a table in one schema uses that constraint in another, well, that's a fairly awful design. Is it something that we really need to care about? You mentioned the SQL standard, but as far as I know the SQL standard has nothing to say about these fields within something like errdata - their names are lifted from information_schema, but being unambiguous is hardly so critical here. We want to identify an object for the purposes of displaying an error message only. Caring deeply about ambiguity seems wrong-headed to me in this context. + routine_name text, + trigger_name text, + trigger_table text, + trigger_schema text, The whole point of an exception (which ereport() is very similar to) is to decouple errors from error handling as much as possible - any application that magically takes a different course of action based on where that error occurred as reported by the error itself (and not the location of where handling the error occurs) seems kind of wrong-headed to me. Sure, a backtrace may be supplied, but that's only for diagnostic purposes, and a human can just as easily get that information already. If you can present an example of this information actually being present in a way that is amenable to that, either in any other RDBMS or any major programming language or framework, I might reconsider. This just leaves: + column_name text, + table_name text, + constraint_name text, + schema_name text, This seems like enough information for any reasonable use of this infrastructure, and I highly doubt anyone would avail of the other fields if they remained. I guess an application might have done something with "constraint_table", as with foreign keys for example, but I just can't see that being used when constraint_name can be used. Removing these fields has also allowed me to remove the "avoid setting field at lower point in the callstack" logic, which was itself kind of ugly. Since fields like routine_name only set themselves at the top of the callstack, the potential for astonishing outcomes was pretty high - what if you expect one routine_name, but then that routine follows a slightly different code-path one day and you get another function setting routine_name and undermining that expectation? There were some bugs left in the patch eelog3.diff, mostly due to things like setting table name to what is actually an index name. As I mentioned, we now assert that: + Assert(table->rd_rel->relkind == RELKIND_RELATION); in the functions within relerror.c. I have tightened up where these fields are available, and appropriately documented that for the benefit of both application developers and developers of client libraries. I went so far as to hack the Perl scripts that generate .sgml and .h files from errcodes.txt (i.e. the infrastrucutre that produces tables of errcodes in various places from a single authoritative place) - I have instituted a coding standard so that these fields are reliably available and have documented that requirement at both the user and hacker level. It would be nice if a Perl hacker could eyeball those changes - this is my first time writing Perl, and I suspect it may be worth having someone else to polish the Perl code a bit. I have emphasized the need for consistency and a sane contract for application developers and third-party client driver authors - they *need* to know that certain fields will always be available, or at least won't be unavailable due to a change in the phase of the moon. errcodes.txt now says: + # Postgres coding standards mandate that certain fields be available in all + # instances for some of the Class 23 errcodes, documented under "Requirement: " + # here. Some other errcode's ereport sites may, at their own discretion, make + # errcolumn, errtable, errconstraint and errschema fields available too. + # Furthermore, it is possible to make some fields available beyond those + # formally required at callsites involving these Class 23 errcodes with + # "Requirements: ". Section: Class 23 - Integrity Constraint Violation ! Requirement: unused 23000 E ERRCODE_INTEGRITY_CONSTRAINT_VIOLATION integrity_constraint_violation + Requirement: unused 23001 E ERRCODE_RESTRICT_VIOLATION restrict_violation + # Note that requirements for ERRCODE_NOT_NULL do not apply to domains: + Requirement: schema_name, table_name 23502 E ERRCODE_NOT_NULL_VIOLATION not_null_violation + Requirement: schema_name, table_name, constraint_name 23503 E ERRCODE_FOREIGN_KEY_VIOLATION foreign_key_violation + Requirement: schema_name, table_name, constraint_name 23505 E ERRCODE_UNIQUE_VIOLATION unique_violation + Requirement: constraint_name 23514 E ERRCODE_CHECK_VIOLATION check_violation + Requirement: schema_name, table_name, constraint_name 23P01 E ERRCODE_EXCLUSION_VIOLATION exclusion_violation Now, there are one or two places where these fields are not actually available even though they're formally required according to a literal reading of the above. This is only because there is clearly no such field sensibly available, even in principle - to my mind this cannot be a problem, because the application developer cannot have any reasonable expectation of a field being set. I'm really talking about two cases in particular: * For ERRCODE_NOT_NULL_VIOLATION, we don't actually provide schema_name and table_name in the event of domains. This was previously identified as an issue. If it is judged better to not have any requirements there at all, so be it. * For the validateDomainConstraint() ERRCODE_CHECK_VIOLATION ereport call, we may not provide a constraint name iff a Constraint.connname is NULL. Since there isn't a constraint name to give even in principle, and this is an isolated case, this seems reasonable. To make logging less verbose, TABLE NAME isn't consistently split out as a separate field - this seems fine to me, since application code doesn't target logs: + if (edata->column_name && edata->table_name) + { + log_line_prefix(&buf, edata); + appendStringInfo(&buf, _("COLUMN NAME: %s:%s\n"), + edata->table_name, edata->column_name); + } + else if (edata->table_name) + { + log_line_prefix(&buf, edata); + appendStringInfo(&buf, _("TABLE NAME: %s\n"), + edata->table_name); + } I used pgindent to selectively indent parts of the codebase affected by this patch. I am about ready to mark this one "ready for committer", but it would be nice at this point to get some buy-in on the basic view of how these things should work that informed this revision. Does anyone disagree with my contention that there should be a sane, well-defined contract, or any of the details of what that should look like? Was I right to suggest that some of the set of fields that appeared in Pavel's eelog3.diff revision are unnecessary? I'm sorry it took me as long as it did to get back to you on this. -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services
Attachment
> -----Original Message----- > From: pgsql-hackers-owner@postgresql.org [mailto:pgsql-hackers- > owner@postgresql.org] On Behalf Of Peter Geoghegan > Sent: Monday, December 10, 2012 3:29 PM > To: Pavel Stehule > Cc: PostgreSQL Hackers; Alvaro Herrera; Tom Lane > Subject: Re: [HACKERS] enhanced error fields > > Now, there are one or two places where these fields are not actually > available even though they're formally required according to a literal reading > of the above. This is only because there is clearly no such field sensibly > available, even in principle - to my mind this cannot be a problem, because > the application developer cannot have any reasonable expectation of a field > being set. I'm really talking about two cases in particular: > > * For ERRCODE_NOT_NULL_VIOLATION, we don't actually provide > schema_name and table_name in the event of domains. This was previously > identified as an issue. If it is judged better to not have any requirements > there at all, so be it. > > * For the validateDomainConstraint() ERRCODE_CHECK_VIOLATION ereport > call, we may not provide a constraint name iff a Constraint.connname is > NULL. Since there isn't a constraint name to give even in principle, and this is > an isolated case, this seems reasonable. > Just skimming this topic but if these enhanced error fields are going to be used by software, and we have 99% adherence to a standard, then my first reaction is why not just supply "<Not Applicable>" (or "<Not Available>" as appropriate) instead of suppressing the field altogether in these (and possibly other, future) cases and make adherence for these fields 100%? From an "ease-of-use" aspect for the API if I can simply always query each of those fields and know I will be receiving a string it does at least seem theoretically easier to interface with. If I am expecting special string values (enclosed in symbols making them invalid identifiers) I can then handle those as desired without either receiving an error or a NULL when I go to poll the missing field if those couple of instances. I may be paranoid or mistaken regarding how this work but figured I'd at least throw it out for consideration. David J.
On 10 December 2012 20:52, David Johnston <polobo@yahoo.com> wrote: > Just skimming this topic but if these enhanced error fields are going to be > used by software, and we have 99% adherence to a standard, then my first > reaction is why not just supply "<Not Applicable>" (or "<Not Available>" as > appropriate) instead of suppressing the field altogether in these (and > possibly other, future) cases and make adherence for these fields 100%? Well, this is an area that the standard doesn't have anything to say about. The standard defines errcodes, but not what fields they make available. I suppose you could say that the patch proposes that they become a Postgres extension to the standard. I probably could have found more places to set the fields. Certainly, I've already set them in some places where they are not actually required to be set by the new contract of errcodes.txt/errcodes.h. My immediate concern is getting something minimal committed, though. I think the latest revision handles all of the important cases (i.e. cases where applications may want a well-principled way of detecting a particular kind of error, like an error due to the violation of a particular, known constraint). -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services
Hello 2012/12/10 Peter Geoghegan <peter@2ndquadrant.com>: > So, I took a look at this, and made some more changes. > > I have a hard time seeing the utility of some fields that were in this > patch, and so they have been removed from this revision. > > Consider, for example: > > + constraint_table text, > + constraint_schema text, > > While constraint_name remains, I find it really hard to believe that > applications need a perfectly unambiguous idea of what constraint > they're dealing with. If applications have a constraint that has a > different domain-specific meaning depending on what schema it is in, > while a table in one schema uses that constraint in another, well, > that's a fairly awful design. Is it something that we really need to > care about? You mentioned the SQL standard, but as far as I know the > SQL standard has nothing to say about these fields within something > like errdata - their names are lifted from information_schema, but > being unambiguous is hardly so critical here. We want to identify an > object for the purposes of displaying an error message only. Caring > deeply about ambiguity seems wrong-headed to me in this context. > > + routine_name text, > + trigger_name text, > + trigger_table text, > + trigger_schema text, > > The whole point of an exception (which ereport() is very similar to) > is to decouple errors from error handling as much as possible - any > application that magically takes a different course of action based on > where that error occurred as reported by the error itself (and not the > location of where handling the error occurs) seems kind of > wrong-headed to me. Sure, a backtrace may be supplied, but that's only > for diagnostic purposes, and a human can just as easily get that > information already. If you can present an example of this information > actually being present in a way that is amenable to that, either in > any other RDBMS or any major programming language or framework, I > might reconsider. > > This just leaves: > > + column_name text, > + table_name text, > + constraint_name text, > + schema_name text, I agree so lot of mentioned fields are difficult defined in PostgreSQL environment due different meaning between ANSI SQL and PostgreSQL and because PostgreSQL doesn't support some ANSI SQL features - assertions. There are two basic aspects of design * usability - we would to remove necessity of parsing error messages for getting interesting data, it decrease dependency on current error messages - then I am thinking so some informations about triggers or functions where exception was born are practical (current implementation is different task - we can find better solution than I proposed highly probably) * compatibility - personally, lot of diagnostics fields are very vague defined, so here can be lot of issues - I have no experience with DB2 or Oracle, so I cannot to speak more, but we should to say, what we expect. > > This seems like enough information for any reasonable use of this > infrastructure, and I highly doubt anyone would avail of the other > fields if they remained. I guess an application might have done > something with "constraint_table", as with foreign keys for example, > but I just can't see that being used when constraint_name can be used. > I afraid so constraint name is not afraid postgres=# create table a (a int primary key); CREATE TABLE postgres=# create table b (b int references a(a)); CREATE TABLE postgres=# insert into b values (10); ERROR: insert or update on table "b" violates foreign key constraint "b_b_fkey" DETAIL: Key (b)=(10) is not present in table "a". postgres=# \set VERBOSITY verbose postgres=# insert into b values (10); ERROR: 23503: insert or update on table "b" violates foreign key constraint "b_b_fkey" DETAIL: Key (b)=(10) is not present in table "a". LOCATION: ri_ReportViolation, ri_triggers.c:3222 postgres=# insert into a values(10); INSERT 0 1 postgres=# insert into b values (10); INSERT 0 1 postgres=# delete from a; ERROR: 23503: update or delete on table "a" violates foreign key constraint "b_b_fkey" on table "b" DETAIL: Key (a)=(10) is still referenced from table "b". LOCATION: ri_ReportViolation, ri_triggers.c:3232 there are two different bugs - with same ERRCODE and constraint name - once is problem on "b", second on "a" so constraint_table is interesting information > Removing these fields has also allowed me to remove the "avoid setting > field at lower point in the callstack" logic, which was itself kind of > ugly. Since fields like routine_name only set themselves at the top of > the callstack, the potential for astonishing outcomes was pretty high > - what if you expect one routine_name, but then that routine follows a > slightly different code-path one day and you get another function > setting routine_name and undermining that expectation? > My implementation is probably too dumb - but a question - "where exception is coming from?" is relative typical - but we can to clean implementation. I see a issue in definition. Usually I am interesting about most deep custom code - not most deep code. so I am not against to removing some fields, but constraint_table and routine_name is useful and we should to produce this information. > There were some bugs left in the patch eelog3.diff, mostly due to > things like setting table name to what is actually an index name. As I > mentioned, we now assert that: > > + Assert(table->rd_rel->relkind == RELKIND_RELATION); > > in the functions within relerror.c. > > I have tightened up where these fields are available, and > appropriately documented that for the benefit of both application > developers and developers of client libraries. I went so far as to > hack the Perl scripts that generate .sgml and .h files from > errcodes.txt (i.e. the infrastrucutre that produces tables of errcodes > in various places from a single authoritative place) - I have > instituted a coding standard so that these fields are reliably > available and have documented that requirement at both the user and > hacker level. > > It would be nice if a Perl hacker could eyeball those changes - this > is my first time writing Perl, and I suspect it may be worth having > someone else to polish the Perl code a bit. > > I have emphasized the need for consistency and a sane contract for > application developers and third-party client driver authors - they > *need* to know that certain fields will always be available, or at > least won't be unavailable due to a change in the phase of the moon. > > errcodes.txt now says: > > + # Postgres coding standards mandate that certain fields be available in all > + # instances for some of the Class 23 errcodes, documented under > "Requirement: " > + # here. Some other errcode's ereport sites may, at their own discretion, make > + # errcolumn, errtable, errconstraint and errschema fields available too. > + # Furthermore, it is possible to make some fields available beyond those > + # formally required at callsites involving these Class 23 errcodes with > + # "Requirements: ". > Section: Class 23 - Integrity Constraint Violation > ! Requirement: unused > 23000 E ERRCODE_INTEGRITY_CONSTRAINT_VIOLATION > integrity_constraint_violation > + Requirement: unused > 23001 E ERRCODE_RESTRICT_VIOLATION > restrict_violation > + # Note that requirements for ERRCODE_NOT_NULL do not apply to domains: > + Requirement: schema_name, table_name > 23502 E ERRCODE_NOT_NULL_VIOLATION > not_null_violation > + Requirement: schema_name, table_name, constraint_name > 23503 E ERRCODE_FOREIGN_KEY_VIOLATION > foreign_key_violation > + Requirement: schema_name, table_name, constraint_name > 23505 E ERRCODE_UNIQUE_VIOLATION > unique_violation > + Requirement: constraint_name > 23514 E ERRCODE_CHECK_VIOLATION > check_violation > + Requirement: schema_name, table_name, constraint_name > 23P01 E ERRCODE_EXCLUSION_VIOLATION > exclusion_violation > > Now, there are one or two places where these fields are not actually > available even though they're formally required according to a literal > reading of the above. This is only because there is clearly no such > field sensibly available, even in principle - to my mind this cannot > be a problem, because the application developer cannot have any > reasonable expectation of a field being set. I'm really talking about > two cases in particular: > > * For ERRCODE_NOT_NULL_VIOLATION, we don't actually provide > schema_name and table_name in the event of domains. This was > previously identified as an issue. If it is judged better to not have > any requirements there at all, so be it. > > * For the validateDomainConstraint() ERRCODE_CHECK_VIOLATION ereport > call, we may not provide a constraint name iff a Constraint.connname > is NULL. Since there isn't a constraint name to give even in > principle, and this is an isolated case, this seems reasonable. ok > > To make logging less verbose, TABLE NAME isn't consistently split out > as a separate field - this seems fine to me, since application code > doesn't target logs: > > + if (edata->column_name && edata->table_name) > + { > + log_line_prefix(&buf, edata); > + appendStringInfo(&buf, _("COLUMN NAME: %s:%s\n"), > + edata->table_name, edata->column_name); > + } > + else if (edata->table_name) > + { > + log_line_prefix(&buf, edata); > + appendStringInfo(&buf, _("TABLE NAME: %s\n"), > + edata->table_name); > + } uff, no - I don't like it - it is not consistent and I don't think so one row more is a issue. But is a question if we don't need a second "VERBOSE" level for showing these fields - maybe VERBOSE_ENHANCED or some similar we don't need log these fields usually ?? > > I used pgindent to selectively indent parts of the codebase affected > by this patch. I am about ready to mark this one "ready for > committer", but it would be nice at this point to get some buy-in on > the basic view of how these things should work that informed this > revision. Does anyone disagree with my contention that there should be > a sane, well-defined contract, or any of the details of what that > should look like? Was I right to suggest that some of the set of > fields that appeared in Pavel's eelog3.diff revision are unnecessary? > > I'm sorry it took me as long as it did to get back to you on this. > It is ok, thank you very much Regards Pavel > -- > Peter Geoghegan http://www.2ndQuadrant.com/ > PostgreSQL Development, 24x7 Support, Training and Services
Hello 2012/12/10 Peter Geoghegan <peter@2ndquadrant.com>: > On 10 December 2012 20:52, David Johnston <polobo@yahoo.com> wrote: >> Just skimming this topic but if these enhanced error fields are going to be >> used by software, and we have 99% adherence to a standard, then my first >> reaction is why not just supply "<Not Applicable>" (or "<Not Available>" as >> appropriate) instead of suppressing the field altogether in these (and >> possibly other, future) cases and make adherence for these fields 100%? > > Well, this is an area that the standard doesn't have anything to say > about. The standard defines errcodes, but not what fields they make > available. I suppose you could say that the patch proposes that they > become a Postgres extension to the standard. standard speaking relative cleanly about content of diagnostics fields - it should to be empty string or value. We don't know what and how to be filled from standards, but we know, so "<Not Applicable>" or some similar is not compatible with ANSI SQL Regards Pavel > > I probably could have found more places to set the fields. Certainly, > I've already set them in some places where they are not actually > required to be set by the new contract of errcodes.txt/errcodes.h. My > immediate concern is getting something minimal committed, though. I > think the latest revision handles all of the important cases (i.e. > cases where applications may want a well-principled way of detecting a > particular kind of error, like an error due to the violation of a > particular, known constraint). > > -- > Peter Geoghegan http://www.2ndQuadrant.com/ > PostgreSQL Development, 24x7 Support, Training and Services
On 11 December 2012 13:01, Pavel Stehule <pavel.stehule@gmail.com> wrote: > There are two basic aspects of design > > * usability - we would to remove necessity of parsing error messages > for getting interesting data, it decrease dependency on current error > messages - then I am thinking so some informations about triggers or > functions where exception was born are practical (current > implementation is different task - we can find better solution than I > proposed highly probably) > > * compatibility - personally, lot of diagnostics fields are very vague > defined, so here can be lot of issues - I have no experience with DB2 > or Oracle, so I cannot to speak more, but we should to say, what we > expect. Me neither. > I afraid so constraint name is not afraid I'm sorry, I don't follow you here. > postgres=# create table a (a int primary key); > CREATE TABLE > postgres=# create table b (b int references a(a)); > CREATE TABLE > postgres=# insert into b values (10); > ERROR: insert or update on table "b" violates foreign key constraint "b_b_fkey" > DETAIL: Key (b)=(10) is not present in table "a". > postgres=# \set VERBOSITY verbose > postgres=# insert into b values (10); > ERROR: 23503: insert or update on table "b" violates foreign key > constraint "b_b_fkey" > DETAIL: Key (b)=(10) is not present in table "a". > LOCATION: ri_ReportViolation, ri_triggers.c:3222 > postgres=# insert into a values(10); > INSERT 0 1 > postgres=# insert into b values (10); > INSERT 0 1 > postgres=# delete from a; > ERROR: 23503: update or delete on table "a" violates foreign key > constraint "b_b_fkey" on table "b" > DETAIL: Key (a)=(10) is still referenced from table "b". > LOCATION: ri_ReportViolation, ri_triggers.c:3232 > > there are two different bugs - with same ERRCODE and constraint name - > once is problem on "b", second on "a" > > so constraint_table is interesting information Really? I'm not so sure that that's a distinct that the user would have to care about, or at least care much about. I defer to others here. Certainly, I am generally conscious of the fact that we don't want to create an excessive number of fields. > My implementation is probably too dumb - but a question - "where > exception is coming from?" is relative typical - but we can to clean > implementation. I see a issue in definition. Usually I am interesting > about most deep custom code - not most deep code. I think that knowing where an exception comes from is not useful information for end-users. Therefore, I am unconvinced of the need to present information to end users that is based on that. That's my difficulty. > so I am not against to removing some fields, but constraint_table and > routine_name is useful and we should to produce this information. I'm afraid that you and I differ on that. >> To make logging less verbose, TABLE NAME isn't consistently split out >> as a separate field - this seems fine to me, since application code >> doesn't target logs: > uff, no - I don't like it - it is not consistent and I don't think so > one row more is a issue. But is a question if we don't need a second > "VERBOSE" level for showing these fields - maybe VERBOSE_ENHANCED or > some similar VERBOSE_ENHANCED? I'm not sure about that. If you think it's important for them to be consistent, I concede the point. > we don't need log these fields usually ?? Well, I don't think we *usually* need to log *any* verbose fields. -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services
Hello Peter I rechecked your version eelog4 and I am thinking so it is ok. From my view it can be enough for application developer and I am not against to commit this (or little bit reduced version) as first step. As plpgsql developer I really miss a fields "routine_name, routine_schema and trigger_name". These fields can be simply implemented without any impact on performance. Because routine_name and routine_schema is not unique in postgres, I propose third field "routine_oid". My patch eelog5 is based on your eelog4 with enhancing for these mentioned fields - 5KB more - but only PL/pgSQL is supported now. I would to find a acceptable design now. Second - I don't see any value for forwarding these new fields (column_name, table_name, constraint_name, schema_name, routine_name, routine_schema, routine_id, trigger_name) to log or to csvlog and I propose don't push it to log. We can drop logging in next iteration if you agree. What do you thinking about new version? Regards Pavel 2012/12/10 Peter Geoghegan <peter@2ndquadrant.com>: > So, I took a look at this, and made some more changes. > > I have a hard time seeing the utility of some fields that were in this > patch, and so they have been removed from this revision. > > Consider, for example: > > + constraint_table text, > + constraint_schema text, > > While constraint_name remains, I find it really hard to believe that > applications need a perfectly unambiguous idea of what constraint > they're dealing with. If applications have a constraint that has a > different domain-specific meaning depending on what schema it is in, > while a table in one schema uses that constraint in another, well, > that's a fairly awful design. Is it something that we really need to > care about? You mentioned the SQL standard, but as far as I know the > SQL standard has nothing to say about these fields within something > like errdata - their names are lifted from information_schema, but > being unambiguous is hardly so critical here. We want to identify an > object for the purposes of displaying an error message only. Caring > deeply about ambiguity seems wrong-headed to me in this context. > > + routine_name text, > + trigger_name text, > + trigger_table text, > + trigger_schema text, > > The whole point of an exception (which ereport() is very similar to) > is to decouple errors from error handling as much as possible - any > application that magically takes a different course of action based on > where that error occurred as reported by the error itself (and not the > location of where handling the error occurs) seems kind of > wrong-headed to me. Sure, a backtrace may be supplied, but that's only > for diagnostic purposes, and a human can just as easily get that > information already. If you can present an example of this information > actually being present in a way that is amenable to that, either in > any other RDBMS or any major programming language or framework, I > might reconsider. > > This just leaves: > > + column_name text, > + table_name text, > + constraint_name text, > + schema_name text, > > This seems like enough information for any reasonable use of this > infrastructure, and I highly doubt anyone would avail of the other > fields if they remained. I guess an application might have done > something with "constraint_table", as with foreign keys for example, > but I just can't see that being used when constraint_name can be used. > > Removing these fields has also allowed me to remove the "avoid setting > field at lower point in the callstack" logic, which was itself kind of > ugly. Since fields like routine_name only set themselves at the top of > the callstack, the potential for astonishing outcomes was pretty high > - what if you expect one routine_name, but then that routine follows a > slightly different code-path one day and you get another function > setting routine_name and undermining that expectation? > > There were some bugs left in the patch eelog3.diff, mostly due to > things like setting table name to what is actually an index name. As I > mentioned, we now assert that: > > + Assert(table->rd_rel->relkind == RELKIND_RELATION); > > in the functions within relerror.c. > > I have tightened up where these fields are available, and > appropriately documented that for the benefit of both application > developers and developers of client libraries. I went so far as to > hack the Perl scripts that generate .sgml and .h files from > errcodes.txt (i.e. the infrastrucutre that produces tables of errcodes > in various places from a single authoritative place) - I have > instituted a coding standard so that these fields are reliably > available and have documented that requirement at both the user and > hacker level. > > It would be nice if a Perl hacker could eyeball those changes - this > is my first time writing Perl, and I suspect it may be worth having > someone else to polish the Perl code a bit. > > I have emphasized the need for consistency and a sane contract for > application developers and third-party client driver authors - they > *need* to know that certain fields will always be available, or at > least won't be unavailable due to a change in the phase of the moon. > > errcodes.txt now says: > > + # Postgres coding standards mandate that certain fields be available in all > + # instances for some of the Class 23 errcodes, documented under > "Requirement: " > + # here. Some other errcode's ereport sites may, at their own discretion, make > + # errcolumn, errtable, errconstraint and errschema fields available too. > + # Furthermore, it is possible to make some fields available beyond those > + # formally required at callsites involving these Class 23 errcodes with > + # "Requirements: ". > Section: Class 23 - Integrity Constraint Violation > ! Requirement: unused > 23000 E ERRCODE_INTEGRITY_CONSTRAINT_VIOLATION > integrity_constraint_violation > + Requirement: unused > 23001 E ERRCODE_RESTRICT_VIOLATION > restrict_violation > + # Note that requirements for ERRCODE_NOT_NULL do not apply to domains: > + Requirement: schema_name, table_name > 23502 E ERRCODE_NOT_NULL_VIOLATION > not_null_violation > + Requirement: schema_name, table_name, constraint_name > 23503 E ERRCODE_FOREIGN_KEY_VIOLATION > foreign_key_violation > + Requirement: schema_name, table_name, constraint_name > 23505 E ERRCODE_UNIQUE_VIOLATION > unique_violation > + Requirement: constraint_name > 23514 E ERRCODE_CHECK_VIOLATION > check_violation > + Requirement: schema_name, table_name, constraint_name > 23P01 E ERRCODE_EXCLUSION_VIOLATION > exclusion_violation > > Now, there are one or two places where these fields are not actually > available even though they're formally required according to a literal > reading of the above. This is only because there is clearly no such > field sensibly available, even in principle - to my mind this cannot > be a problem, because the application developer cannot have any > reasonable expectation of a field being set. I'm really talking about > two cases in particular: > > * For ERRCODE_NOT_NULL_VIOLATION, we don't actually provide > schema_name and table_name in the event of domains. This was > previously identified as an issue. If it is judged better to not have > any requirements there at all, so be it. > > * For the validateDomainConstraint() ERRCODE_CHECK_VIOLATION ereport > call, we may not provide a constraint name iff a Constraint.connname > is NULL. Since there isn't a constraint name to give even in > principle, and this is an isolated case, this seems reasonable. > > To make logging less verbose, TABLE NAME isn't consistently split out > as a separate field - this seems fine to me, since application code > doesn't target logs: > > + if (edata->column_name && edata->table_name) > + { > + log_line_prefix(&buf, edata); > + appendStringInfo(&buf, _("COLUMN NAME: %s:%s\n"), > + edata->table_name, edata->column_name); > + } > + else if (edata->table_name) > + { > + log_line_prefix(&buf, edata); > + appendStringInfo(&buf, _("TABLE NAME: %s\n"), > + edata->table_name); > + } > > I used pgindent to selectively indent parts of the codebase affected > by this patch. I am about ready to mark this one "ready for > committer", but it would be nice at this point to get some buy-in on > the basic view of how these things should work that informed this > revision. Does anyone disagree with my contention that there should be > a sane, well-defined contract, or any of the details of what that > should look like? Was I right to suggest that some of the set of > fields that appeared in Pavel's eelog3.diff revision are unnecessary? > > I'm sorry it took me as long as it did to get back to you on this. > > -- > Peter Geoghegan http://www.2ndQuadrant.com/ > PostgreSQL Development, 24x7 Support, Training and Services
Attachment
Pavel, Peter, To be honest, I haven't been following this thread at all, but I'm kind of amazed that this patch seems to be progressing. I spent quite a bit of time previously trying to change the CSV log structure to add a single column and this patch is adding a whole slew of them. My prior patch also allowed customization of the CSV log output, which seems like it would be even more valuable if we're going to be adding a bunch more fields. I know there are dissenting viewpoints on that, however, as application authors would prefer to not have to deal with variable CSV output formats, which I can understand. As regards this specific patch, I trust that the protocol error response changes won't have any impact on existing clients? Do we anticipate something like the JDBC driver having any issues? If there's any chance that we're going to break a client by sending it things which it won't understand, it seems we'd need to bump the protocol (which hasn't been done in quite some time and would likely needs its own discussion...). Regarding qualifying what we're returning- I tend to agree with Pavel that if we're going to provide this information, we should do it in a fully qualified manner. I can certainly see situations where constraint names are repeated in environments and it may not be clear what schema it's in (think application development with a dev and a test schema in the same database), and the same could hold for constraints against tables (though having those repeated would certainly be more rare, since you can't repeat a named constraint in a given schema if it has an index behind it, though you could with simple CHECK constraints). Again, my feeling is that if we're going to provide such information, it should be fully-qualified. There are some additional concerns regarding the patch itself that I have (do we really want to modify ereport() in this way? How can we implement something which scales better than just adding more and more parameters?) but I think we need to figure out exactly what we're agreed to be doing with this patch and get buy-in from everyone first. Thanks, Stephen * Pavel Stehule (pavel.stehule@gmail.com) wrote: > I rechecked your version eelog4 and I am thinking so it is ok. From my > view it can be enough for application developer and I am not against > to commit this (or little bit reduced version) as first step. > > As plpgsql developer I really miss a fields "routine_name, > routine_schema and trigger_name". These fields can be simply > implemented without any impact on performance. Because routine_name > and routine_schema is not unique in postgres, I propose third field > "routine_oid". My patch eelog5 is based on your eelog4 with enhancing > for these mentioned fields - 5KB more - but only PL/pgSQL is supported > now. I would to find a acceptable design now. > > Second - I don't see any value for forwarding these new fields > (column_name, table_name, constraint_name, schema_name, routine_name, > routine_schema, routine_id, trigger_name) to log or to csvlog and I > propose don't push it to log. We can drop logging in next iteration if > you agree. > > What do you thinking about new version? > > Regards > > Pavel > > > > 2012/12/10 Peter Geoghegan <peter@2ndquadrant.com>: > > So, I took a look at this, and made some more changes. > > > > I have a hard time seeing the utility of some fields that were in this > > patch, and so they have been removed from this revision. > > > > Consider, for example: > > > > + constraint_table text, > > + constraint_schema text, > > > > While constraint_name remains, I find it really hard to believe that > > applications need a perfectly unambiguous idea of what constraint > > they're dealing with. If applications have a constraint that has a > > different domain-specific meaning depending on what schema it is in, > > while a table in one schema uses that constraint in another, well, > > that's a fairly awful design. Is it something that we really need to > > care about? You mentioned the SQL standard, but as far as I know the > > SQL standard has nothing to say about these fields within something > > like errdata - their names are lifted from information_schema, but > > being unambiguous is hardly so critical here. We want to identify an > > object for the purposes of displaying an error message only. Caring > > deeply about ambiguity seems wrong-headed to me in this context. > > > > + routine_name text, > > + trigger_name text, > > + trigger_table text, > > + trigger_schema text, > > > > The whole point of an exception (which ereport() is very similar to) > > is to decouple errors from error handling as much as possible - any > > application that magically takes a different course of action based on > > where that error occurred as reported by the error itself (and not the > > location of where handling the error occurs) seems kind of > > wrong-headed to me. Sure, a backtrace may be supplied, but that's only > > for diagnostic purposes, and a human can just as easily get that > > information already. If you can present an example of this information > > actually being present in a way that is amenable to that, either in > > any other RDBMS or any major programming language or framework, I > > might reconsider. > > > > This just leaves: > > > > + column_name text, > > + table_name text, > > + constraint_name text, > > + schema_name text, > > > > This seems like enough information for any reasonable use of this > > infrastructure, and I highly doubt anyone would avail of the other > > fields if they remained. I guess an application might have done > > something with "constraint_table", as with foreign keys for example, > > but I just can't see that being used when constraint_name can be used. > > > > Removing these fields has also allowed me to remove the "avoid setting > > field at lower point in the callstack" logic, which was itself kind of > > ugly. Since fields like routine_name only set themselves at the top of > > the callstack, the potential for astonishing outcomes was pretty high > > - what if you expect one routine_name, but then that routine follows a > > slightly different code-path one day and you get another function > > setting routine_name and undermining that expectation? > > > > There were some bugs left in the patch eelog3.diff, mostly due to > > things like setting table name to what is actually an index name. As I > > mentioned, we now assert that: > > > > + Assert(table->rd_rel->relkind == RELKIND_RELATION); > > > > in the functions within relerror.c. > > > > I have tightened up where these fields are available, and > > appropriately documented that for the benefit of both application > > developers and developers of client libraries. I went so far as to > > hack the Perl scripts that generate .sgml and .h files from > > errcodes.txt (i.e. the infrastrucutre that produces tables of errcodes > > in various places from a single authoritative place) - I have > > instituted a coding standard so that these fields are reliably > > available and have documented that requirement at both the user and > > hacker level. > > > > It would be nice if a Perl hacker could eyeball those changes - this > > is my first time writing Perl, and I suspect it may be worth having > > someone else to polish the Perl code a bit. > > > > I have emphasized the need for consistency and a sane contract for > > application developers and third-party client driver authors - they > > *need* to know that certain fields will always be available, or at > > least won't be unavailable due to a change in the phase of the moon. > > > > errcodes.txt now says: > > > > + # Postgres coding standards mandate that certain fields be available in all > > + # instances for some of the Class 23 errcodes, documented under > > "Requirement: " > > + # here. Some other errcode's ereport sites may, at their own discretion, make > > + # errcolumn, errtable, errconstraint and errschema fields available too. > > + # Furthermore, it is possible to make some fields available beyond those > > + # formally required at callsites involving these Class 23 errcodes with > > + # "Requirements: ". > > Section: Class 23 - Integrity Constraint Violation > > ! Requirement: unused > > 23000 E ERRCODE_INTEGRITY_CONSTRAINT_VIOLATION > > integrity_constraint_violation > > + Requirement: unused > > 23001 E ERRCODE_RESTRICT_VIOLATION > > restrict_violation > > + # Note that requirements for ERRCODE_NOT_NULL do not apply to domains: > > + Requirement: schema_name, table_name > > 23502 E ERRCODE_NOT_NULL_VIOLATION > > not_null_violation > > + Requirement: schema_name, table_name, constraint_name > > 23503 E ERRCODE_FOREIGN_KEY_VIOLATION > > foreign_key_violation > > + Requirement: schema_name, table_name, constraint_name > > 23505 E ERRCODE_UNIQUE_VIOLATION > > unique_violation > > + Requirement: constraint_name > > 23514 E ERRCODE_CHECK_VIOLATION > > check_violation > > + Requirement: schema_name, table_name, constraint_name > > 23P01 E ERRCODE_EXCLUSION_VIOLATION > > exclusion_violation > > > > Now, there are one or two places where these fields are not actually > > available even though they're formally required according to a literal > > reading of the above. This is only because there is clearly no such > > field sensibly available, even in principle - to my mind this cannot > > be a problem, because the application developer cannot have any > > reasonable expectation of a field being set. I'm really talking about > > two cases in particular: > > > > * For ERRCODE_NOT_NULL_VIOLATION, we don't actually provide > > schema_name and table_name in the event of domains. This was > > previously identified as an issue. If it is judged better to not have > > any requirements there at all, so be it. > > > > * For the validateDomainConstraint() ERRCODE_CHECK_VIOLATION ereport > > call, we may not provide a constraint name iff a Constraint.connname > > is NULL. Since there isn't a constraint name to give even in > > principle, and this is an isolated case, this seems reasonable. > > > > To make logging less verbose, TABLE NAME isn't consistently split out > > as a separate field - this seems fine to me, since application code > > doesn't target logs: > > > > + if (edata->column_name && edata->table_name) > > + { > > + log_line_prefix(&buf, edata); > > + appendStringInfo(&buf, _("COLUMN NAME: %s:%s\n"), > > + edata->table_name, edata->column_name); > > + } > > + else if (edata->table_name) > > + { > > + log_line_prefix(&buf, edata); > > + appendStringInfo(&buf, _("TABLE NAME: %s\n"), > > + edata->table_name); > > + } > > > > I used pgindent to selectively indent parts of the codebase affected > > by this patch. I am about ready to mark this one "ready for > > committer", but it would be nice at this point to get some buy-in on > > the basic view of how these things should work that informed this > > revision. Does anyone disagree with my contention that there should be > > a sane, well-defined contract, or any of the details of what that > > should look like? Was I right to suggest that some of the set of > > fields that appeared in Pavel's eelog3.diff revision are unnecessary? > > > > I'm sorry it took me as long as it did to get back to you on this. > > > > -- > > Peter Geoghegan http://www.2ndQuadrant.com/ > > PostgreSQL Development, 24x7 Support, Training and Services > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers
Hello 2012/12/28 Stephen Frost <sfrost@snowman.net>: > Pavel, Peter, > > To be honest, I haven't been following this thread at all, but I'm kind > of amazed that this patch seems to be progressing. I spent quite a bit > of time previously trying to change the CSV log structure to add a > single column and this patch is adding a whole slew of them. My prior > patch also allowed customization of the CSV log output, which seems like > it would be even more valuable if we're going to be adding a bunch more > fields. I know there are dissenting viewpoints on that, however, as > application authors would prefer to not have to deal with variable CSV > output formats, which I can understand. Some smarter design of decision what will be stored to CSV and what now can be great. I am not thinking so these enhanced fields has value pro typical DBA and should be stored to CSV only when somebody need it. But it is different story - although it can simplify our work because then we don't need to solve this issue. > > As regards this specific patch, I trust that the protocol error response > changes won't have any impact on existing clients? Do we anticipate > something like the JDBC driver having any issues? If there's any chance > that we're going to break a client by sending it things which it won't > understand, it seems we'd need to bump the protocol (which hasn't been > done in quite some time and would likely needs its own discussion...). > Depends on implementation. Implementations designed similar to libpq should not have a problems. > Regarding qualifying what we're returning- I tend to agree with Pavel > that if we're going to provide this information, we should do it in a > fully qualified manner. I can certainly see situations where constraint > names are repeated in environments and it may not be clear what schema > it's in (think application development with a dev and a test schema in > the same database), and the same could hold for constraints against > tables (though having those repeated would certainly be more rare, since > you can't repeat a named constraint in a given schema if it has an index > behind it, though you could with simple CHECK constraints). Again, my > feeling is that if we're going to provide such information, it should be > fully-qualified. > > There are some additional concerns regarding the patch itself that I > have (do we really want to modify ereport() in this way? How can we > implement something which scales better than just adding more and more > parameters?) but I think we need to figure out exactly what we're agreed > to be doing with this patch and get buy-in from everyone first. Related fields are fundamental - and I am thinking so is good to optimize it - it has semantic tag, and tag has one char size. Some set of fields is given by ANSI - we can select some subset because not all fields described by ANSI has sense in pg. I don't would to enhance range of this patch too much and I don't would to redesign current concept of error handling. I am thinking so it works well and I have no negative feedback from PostgreSQL users that I know. But probably only reserved memory for ErrorData is limit for "serialized custom fields" - I believe so we will have a associative array - and one field can be of this type for any custom fields. Regards Pavel > > Thanks, > > Stephen > > * Pavel Stehule (pavel.stehule@gmail.com) wrote: >> I rechecked your version eelog4 and I am thinking so it is ok. From my >> view it can be enough for application developer and I am not against >> to commit this (or little bit reduced version) as first step. >> >> As plpgsql developer I really miss a fields "routine_name, >> routine_schema and trigger_name". These fields can be simply >> implemented without any impact on performance. Because routine_name >> and routine_schema is not unique in postgres, I propose third field >> "routine_oid". My patch eelog5 is based on your eelog4 with enhancing >> for these mentioned fields - 5KB more - but only PL/pgSQL is supported >> now. I would to find a acceptable design now. >> >> Second - I don't see any value for forwarding these new fields >> (column_name, table_name, constraint_name, schema_name, routine_name, >> routine_schema, routine_id, trigger_name) to log or to csvlog and I >> propose don't push it to log. We can drop logging in next iteration if >> you agree. >> >> What do you thinking about new version? >> >> Regards >> >> Pavel >> >> >> >> 2012/12/10 Peter Geoghegan <peter@2ndquadrant.com>: >> > So, I took a look at this, and made some more changes. >> > >> > I have a hard time seeing the utility of some fields that were in this >> > patch, and so they have been removed from this revision. >> > >> > Consider, for example: >> > >> > + constraint_table text, >> > + constraint_schema text, >> > >> > While constraint_name remains, I find it really hard to believe that >> > applications need a perfectly unambiguous idea of what constraint >> > they're dealing with. If applications have a constraint that has a >> > different domain-specific meaning depending on what schema it is in, >> > while a table in one schema uses that constraint in another, well, >> > that's a fairly awful design. Is it something that we really need to >> > care about? You mentioned the SQL standard, but as far as I know the >> > SQL standard has nothing to say about these fields within something >> > like errdata - their names are lifted from information_schema, but >> > being unambiguous is hardly so critical here. We want to identify an >> > object for the purposes of displaying an error message only. Caring >> > deeply about ambiguity seems wrong-headed to me in this context. >> > >> > + routine_name text, >> > + trigger_name text, >> > + trigger_table text, >> > + trigger_schema text, >> > >> > The whole point of an exception (which ereport() is very similar to) >> > is to decouple errors from error handling as much as possible - any >> > application that magically takes a different course of action based on >> > where that error occurred as reported by the error itself (and not the >> > location of where handling the error occurs) seems kind of >> > wrong-headed to me. Sure, a backtrace may be supplied, but that's only >> > for diagnostic purposes, and a human can just as easily get that >> > information already. If you can present an example of this information >> > actually being present in a way that is amenable to that, either in >> > any other RDBMS or any major programming language or framework, I >> > might reconsider. >> > >> > This just leaves: >> > >> > + column_name text, >> > + table_name text, >> > + constraint_name text, >> > + schema_name text, >> > >> > This seems like enough information for any reasonable use of this >> > infrastructure, and I highly doubt anyone would avail of the other >> > fields if they remained. I guess an application might have done >> > something with "constraint_table", as with foreign keys for example, >> > but I just can't see that being used when constraint_name can be used. >> > >> > Removing these fields has also allowed me to remove the "avoid setting >> > field at lower point in the callstack" logic, which was itself kind of >> > ugly. Since fields like routine_name only set themselves at the top of >> > the callstack, the potential for astonishing outcomes was pretty high >> > - what if you expect one routine_name, but then that routine follows a >> > slightly different code-path one day and you get another function >> > setting routine_name and undermining that expectation? >> > >> > There were some bugs left in the patch eelog3.diff, mostly due to >> > things like setting table name to what is actually an index name. As I >> > mentioned, we now assert that: >> > >> > + Assert(table->rd_rel->relkind == RELKIND_RELATION); >> > >> > in the functions within relerror.c. >> > >> > I have tightened up where these fields are available, and >> > appropriately documented that for the benefit of both application >> > developers and developers of client libraries. I went so far as to >> > hack the Perl scripts that generate .sgml and .h files from >> > errcodes.txt (i.e. the infrastrucutre that produces tables of errcodes >> > in various places from a single authoritative place) - I have >> > instituted a coding standard so that these fields are reliably >> > available and have documented that requirement at both the user and >> > hacker level. >> > >> > It would be nice if a Perl hacker could eyeball those changes - this >> > is my first time writing Perl, and I suspect it may be worth having >> > someone else to polish the Perl code a bit. >> > >> > I have emphasized the need for consistency and a sane contract for >> > application developers and third-party client driver authors - they >> > *need* to know that certain fields will always be available, or at >> > least won't be unavailable due to a change in the phase of the moon. >> > >> > errcodes.txt now says: >> > >> > + # Postgres coding standards mandate that certain fields be available in all >> > + # instances for some of the Class 23 errcodes, documented under >> > "Requirement: " >> > + # here. Some other errcode's ereport sites may, at their own discretion, make >> > + # errcolumn, errtable, errconstraint and errschema fields available too. >> > + # Furthermore, it is possible to make some fields available beyond those >> > + # formally required at callsites involving these Class 23 errcodes with >> > + # "Requirements: ". >> > Section: Class 23 - Integrity Constraint Violation >> > ! Requirement: unused >> > 23000 E ERRCODE_INTEGRITY_CONSTRAINT_VIOLATION >> > integrity_constraint_violation >> > + Requirement: unused >> > 23001 E ERRCODE_RESTRICT_VIOLATION >> > restrict_violation >> > + # Note that requirements for ERRCODE_NOT_NULL do not apply to domains: >> > + Requirement: schema_name, table_name >> > 23502 E ERRCODE_NOT_NULL_VIOLATION >> > not_null_violation >> > + Requirement: schema_name, table_name, constraint_name >> > 23503 E ERRCODE_FOREIGN_KEY_VIOLATION >> > foreign_key_violation >> > + Requirement: schema_name, table_name, constraint_name >> > 23505 E ERRCODE_UNIQUE_VIOLATION >> > unique_violation >> > + Requirement: constraint_name >> > 23514 E ERRCODE_CHECK_VIOLATION >> > check_violation >> > + Requirement: schema_name, table_name, constraint_name >> > 23P01 E ERRCODE_EXCLUSION_VIOLATION >> > exclusion_violation >> > >> > Now, there are one or two places where these fields are not actually >> > available even though they're formally required according to a literal >> > reading of the above. This is only because there is clearly no such >> > field sensibly available, even in principle - to my mind this cannot >> > be a problem, because the application developer cannot have any >> > reasonable expectation of a field being set. I'm really talking about >> > two cases in particular: >> > >> > * For ERRCODE_NOT_NULL_VIOLATION, we don't actually provide >> > schema_name and table_name in the event of domains. This was >> > previously identified as an issue. If it is judged better to not have >> > any requirements there at all, so be it. >> > >> > * For the validateDomainConstraint() ERRCODE_CHECK_VIOLATION ereport >> > call, we may not provide a constraint name iff a Constraint.connname >> > is NULL. Since there isn't a constraint name to give even in >> > principle, and this is an isolated case, this seems reasonable. >> > >> > To make logging less verbose, TABLE NAME isn't consistently split out >> > as a separate field - this seems fine to me, since application code >> > doesn't target logs: >> > >> > + if (edata->column_name && edata->table_name) >> > + { >> > + log_line_prefix(&buf, edata); >> > + appendStringInfo(&buf, _("COLUMN NAME: %s:%s\n"), >> > + edata->table_name, edata->column_name); >> > + } >> > + else if (edata->table_name) >> > + { >> > + log_line_prefix(&buf, edata); >> > + appendStringInfo(&buf, _("TABLE NAME: %s\n"), >> > + edata->table_name); >> > + } >> > >> > I used pgindent to selectively indent parts of the codebase affected >> > by this patch. I am about ready to mark this one "ready for >> > committer", but it would be nice at this point to get some buy-in on >> > the basic view of how these things should work that informed this >> > revision. Does anyone disagree with my contention that there should be >> > a sane, well-defined contract, or any of the details of what that >> > should look like? Was I right to suggest that some of the set of >> > fields that appeared in Pavel's eelog3.diff revision are unnecessary? >> > >> > I'm sorry it took me as long as it did to get back to you on this. >> > >> > -- >> > Peter Geoghegan http://www.2ndQuadrant.com/ >> > PostgreSQL Development, 24x7 Support, Training and Services > > >> >> -- >> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) >> To make changes to your subscription: >> http://www.postgresql.org/mailpref/pgsql-hackers >
Stephen Frost <sfrost@snowman.net> writes: > To be honest, I haven't been following this thread at all, but I'm kind > of amazed that this patch seems to be progressing. I spent quite a bit > of time previously trying to change the CSV log structure to add a > single column and this patch is adding a whole slew of them. I don't think that part's been agreed to at all; it seems entirely possible that it'll get dropped if/when the patch gets committed. I'm not convinced that having these fields in the log is worth the compatibility hit of changing the CSV column set. > As regards this specific patch, I trust that the protocol error response > changes won't have any impact on existing clients? Do we anticipate > something like the JDBC driver having any issues? If there's any chance > that we're going to break a client by sending it things which it won't > understand, it seems we'd need to bump the protocol (which hasn't been > done in quite some time and would likely needs its own discussion...). I think we are okay on this. The protocol definition has said from the very beginning "Since more field types might be added in future, frontends should silently ignore fields of unrecognized type". A client that fails to do that is buggy, not grounds for bumping the protocol. I haven't been following the patch so have no thoughts about your other comments. regards, tom lane
On 28 December 2012 14:00, Stephen Frost <sfrost@snowman.net> wrote: > There are some additional concerns regarding the patch itself that I > have (do we really want to modify ereport() in this way? How can we > implement something which scales better than just adding more and more > parameters?) but I think we need to figure out exactly what we're agreed > to be doing with this patch and get buy-in from everyone first. I don't think that the need to scale beyond what we have in my revision really exists. Some of the ereport sites are a bit unwieldy, but I don't see that there is much that can be done about that - you need to specify the information somewhere, and it makes sense to do it at that point. The field names are frequently expanded in the error message presented to the user anyway. -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services
On 28 December 2012 15:57, Tom Lane <tgl@sss.pgh.pa.us> wrote: > I don't think that part's been agreed to at all; it seems entirely > possible that it'll get dropped if/when the patch gets committed. > I'm not convinced that having these fields in the log is worth > the compatibility hit of changing the CSV column set. I am willing to let that go, because it just doesn't seem important to me. I just thought that if we were going to break compatibility, we might as well not hold back on the inclusion of fields. I'm not sure where this leaves the regular log. Pavel wants to remove that, too. I defer to others. Now, as to the question of whether we need to make sure that everything is always fully qualified - I respectfully disagree with Stephen, and maintain my position set out in the last round of feedback [1], which is that we don't need to fully qualify everything, because *for the purposes of error handling*, which is what I think we should care about, these fields are sufficient: + column_name text, + table_name text, + constraint_name text, + schema_name text, If you use the same constraint name everywhere, you might get the same error message. The situations in which this scheme might fall down are too implausible for me to want to bloat up all those ereport sites any further (something that Stephen also complained about). I think that the major outstanding issues are concerning whether or not I have the API here right. I make explicit guarantees as to the availability of certain fields for certain errcodes (a small number of "Class 23 - Integrity Constraint Violation" codes). No one has really said anything about that, though I think it's important. I also continue to think that we shouldn't have "routine_name", "routine_schema" and "trigger_name" fields - I think it's wrong-headed to have an exception handler magically act on knowledge about where the exception came from that has been baked into the exception - is there any sort of precedent for this? Pavel disagrees here. Again, I defer to others. [1] Post of revision "eelog4.diff": http://archives.postgresql.org/message-id/CAEYLb_UM9Z8vitJcKAOgG2shAB1N-71dozNhj2PJm2Ls96QVPg@mail.gmail.com -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services
> > I think that the major outstanding issues are concerning whether or > not I have the API here right. I make explicit guarantees as to the > availability of certain fields for certain errcodes (a small number of > "Class 23 - Integrity Constraint Violation" codes). No one has really > said anything about that, though I think it's important. > > I also continue to think that we shouldn't have "routine_name", > "routine_schema" and "trigger_name" fields - I think it's wrong-headed > to have an exception handler magically act on knowledge about where > the exception came from that has been baked into the exception - is > there any sort of precedent for this? Pavel disagrees here. Again, I > defer to others. depends on perspective - lines of error context are taken by same mechanism and some other fields from ErrorData too. I have not strong opinion if last implementation is the best - but I am sure about context. Developer usually should to know, where exception was created but interesting is position in custom code - so it usually different function than function that raises exception. If I understand you, we have a fields that has behave that you expected - filename and funcname. And I have not used these fields for application programming. Regards Pavel > > [1] Post of revision "eelog4.diff": > http://archives.postgresql.org/message-id/CAEYLb_UM9Z8vitJcKAOgG2shAB1N-71dozNhj2PJm2Ls96QVPg@mail.gmail.com > -- > Peter Geoghegan http://www.2ndQuadrant.com/ > PostgreSQL Development, 24x7 Support, Training and Services
On 28 December 2012 18:40, Pavel Stehule <pavel.stehule@gmail.com> wrote: > If > I understand you, we have a fields that has behave that you expected - > filename and funcname. And I have not used these fields for > application programming. No, that's not what I mean. What I mean is that it seems questionable to want to do anything *within* an exception handler on the basis of where an exception was raised from. You should just place exception handlers more carefully instead. Are you aware of any popular programming language that provides this kind of information? Can you tell in a well-principled way what function a Python exception originated from, for example? These are the built-in Python 2 exception classes: http://docs.python.org/2/library/exceptions.html None of the Python built-in exception types have this kind of information available from fields or anything. Now, you might say that Python isn't Postgres, and you'd be right. I'm fairly confident that you'll find no precedent for a routine_name field in any code that is even roughly analogous to the elog infrastructure, though, because acting on the basis of what particular function the exception was raised from seems quite hazardous - it breaks any kind of encapsulation that might have existed. If one person slightly refactors their code, it could break the code of another person who has never even met or talked to the first. -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services
2012/12/28 Peter Geoghegan <peter@2ndquadrant.com>: > On 28 December 2012 18:40, Pavel Stehule <pavel.stehule@gmail.com> wrote: >> If >> I understand you, we have a fields that has behave that you expected - >> filename and funcname. And I have not used these fields for >> application programming. > > No, that's not what I mean. What I mean is that it seems questionable > to want to do anything *within* an exception handler on the basis of > where an exception was raised from. You should just place exception > handlers more carefully instead. > > Are you aware of any popular programming language that provides this > kind of information? Can you tell in a well-principled way what > function a Python exception originated from, for example? These are > the built-in Python 2 exception classes: > > http://docs.python.org/2/library/exceptions.html > for this subject ANSI SQL is more relevant source or manual for DB2 or Oracle. Design of Python and native PL languages are different. Python can use complex nested structures. PL - PL/pgSQL or PL/PSM is designed for work with simply scalar types. So these environments are not comparable. > None of the Python built-in exception types have this kind of > information available from fields or anything. Now, you might say that > Python isn't Postgres, and you'd be right. I'm fairly confident that > you'll find no precedent for a routine_name field in any code that is > even roughly analogous to the elog infrastructure, though, because > acting on the basis of what particular function the exception was > raised from seems quite hazardous - it breaks any kind of > encapsulation that might have existed. > > If one person slightly refactors their code, it could break the code > of another person who has never even met or talked to the first. yes, it is not valid argument, I am sorry. Lot of error fields doesn't work if developer doesn't respect some coding standards. It is not just routine_name. Only when your implementation is correct, then code works. Best regards Pavel > > -- > Peter Geoghegan http://www.2ndQuadrant.com/ > PostgreSQL Development, 24x7 Support, Training and Services
On 28 December 2012 19:23, Pavel Stehule <pavel.stehule@gmail.com> wrote: > for this subject ANSI SQL is more relevant source or manual for DB2 or > Oracle. Design of Python and native PL languages are different. Python > can use complex nested structures. PL - PL/pgSQL or PL/PSM is designed > for work with simply scalar types. So these environments are not > comparable. I don't see how the fact that Python can use nested data structures has any bearing (you could argue that plpgsql does too, fwiw). Please direct me towards the manual of DB2 or Oracle where it says that something like routine_name is exposed for error handling purposes. Correct me if I'm mistaken, but I don't believe that ANSI SQL has anything to say about any of these error fields. You've just lifted the names of the fields from various information_schema catalogs, which is hardly the same thing. -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services
2012/12/28 Peter Geoghegan <peter@2ndquadrant.com>: > On 28 December 2012 19:23, Pavel Stehule <pavel.stehule@gmail.com> wrote: >> for this subject ANSI SQL is more relevant source or manual for DB2 or >> Oracle. Design of Python and native PL languages are different. Python >> can use complex nested structures. PL - PL/pgSQL or PL/PSM is designed >> for work with simply scalar types. So these environments are not >> comparable. > > I don't see how the fact that Python can use nested data structures > has any bearing (you could argue that plpgsql does too, fwiw). > > Please direct me towards the manual of DB2 or Oracle where it says > that something like routine_name is exposed for error handling > purposes. Correct me if I'm mistaken, but I don't believe that ANSI > SQL has anything to say about any of these error fields. You've just > lifted the names of the fields from various information_schema > catalogs, which is hardly the same thing. http://publib.boulder.ibm.com/infocenter/iseries/v5r3/index.jsp?topic=%2Fdb2%2Frbafzmstgetdiag.htm http://savage.net.au/SQL/sql-2003-2.bnf - GET DIAGNOSTICS statement SQL: 1999, Jim Melton,Alan R. Simon - description of GET DIAGNOSTICS statement ANSI SQL - Feature F122, "Enhanced diagnostics management" - table identifier>s for use with <get diagnostics statement> - and related description Regards Pavel please, search > > -- > Peter Geoghegan http://www.2ndQuadrant.com/ > PostgreSQL Development, 24x7 Support, Training and Services
On 12/10/12 4:23 PM, Peter Geoghegan wrote: > Well, this is an area that the standard doesn't have anything to say > about. The standard defines errcodes, but not what fields they make > available. I suppose you could say that the patch proposes that they > become a Postgres extension to the standard. The standard certainly does define extra fields for errors; see <get diagnostics statement>. We have a GET DIAGNOSTICS statement in PL/pgSQL, so there is certainly precedent for all of this.
On 12/28/12 2:03 PM, Peter Geoghegan wrote: > No, that's not what I mean. What I mean is that it seems questionable > to want to do anything *within* an exception handler on the basis of > where an exception was raised from. Isn't that the whole point of this patch? The only purpose of this feature is to make the exception information available in a "machine-readable" way. That functionality has been requested many times over the years.
On 28 December 2012 19:55, Pavel Stehule <pavel.stehule@gmail.com> wrote: > http://publib.boulder.ibm.com/infocenter/iseries/v5r3/index.jsp?topic=%2Fdb2%2Frbafzmstgetdiag.htm I'm unconvinced by this. First of all, it only applies to the GET DIAGNOSTICS statement, and the only implementation that actually currently uses that is DB2, AFAICT. Secondly, DB2 only provides it for very specific errcode classes that relate to problems that are peculiar to routines/functions, so in general you cannot rely on the information being available in the same way as you intend. Clearly, DB2 provides it for completeness, and not because you can rely on it being available for error handling purposes. On the other hand, your latest revision of the patch (eelog5.patch) sees plpgsql jury-rigged to set the fields itself, which seems like a whole other proposition entirely. What's more, you're changing ErrorData to make this happen, rather than having the user interrogate the server for this information as GET DIAGNOSTICS does. So I don't see that this supports your case at all. I maintain that having an exception handler's behaviour vary based on a field that describes a routine that originally raised the function is a really bad idea, and that we should not facilitate it. -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services
On 12/28/12 2:03 PM, Peter Geoghegan wrote: > Are you aware of any popular programming language that provides this > kind of information? Can you tell in a well-principled way what > function a Python exception originated from, for example? These are > the built-in Python 2 exception classes: > > http://docs.python.org/2/library/exceptions.html > > None of the Python built-in exception types have this kind of > information available from fields or anything. Sure, OSError has a filename attribute (which I'm sure is qualified by a directory name if necessary), SyntaxError has filename, lineno, etc. OSError.filename is essentially the equivalent of what is being proposed here.
On 28 December 2012 20:34, Peter Eisentraut <peter_e@gmx.net> wrote: > Isn't that the whole point of this patch? The only purpose of this > feature is to make the exception information available in a > "machine-readable" way. That functionality has been requested many > times over the years. Right, and I agree that it's very useful for some fields (if you can actually have a reasonable set of guarantees about where each becomes available). I just don't think that it's worth including fields like routine_name within ErrorData, and in fact it may be harmful to do so. -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services
2012/12/28 Peter Geoghegan <peter@2ndquadrant.com>: > On 28 December 2012 19:55, Pavel Stehule <pavel.stehule@gmail.com> wrote: >> http://publib.boulder.ibm.com/infocenter/iseries/v5r3/index.jsp?topic=%2Fdb2%2Frbafzmstgetdiag.htm > > I'm unconvinced by this. First of all, it only applies to the GET > DIAGNOSTICS statement, and the only implementation that actually > currently uses that is DB2, AFAICT. Secondly, DB2 only provides it for > very specific errcode classes that relate to problems that are > peculiar to routines/functions, so in general you cannot rely on the > information being available in the same way as you intend. Clearly, > DB2 provides it for completeness, and not because you can rely on it > being available for error handling purposes. On the other hand, your > latest revision of the patch (eelog5.patch) sees plpgsql jury-rigged > to set the fields itself, which seems like a whole other proposition > entirely. > > What's more, you're changing ErrorData to make this happen, rather > than having the user interrogate the server for this information as > GET DIAGNOSTICS does. So I don't see that this supports your case at > all. I maintain that having an exception handler's behaviour vary > based on a field that describes a routine that originally raised the > function is a really bad idea, and that we should not facilitate it. It cannot to wait to GET DIAGNOSTICS request - because when GET DIAGNOSTICS is called, then all unsupported info about exception is lost, all used memory will be released. So if we would to support ROUTINE_NAME or similar fields like CONTEXT or MESSAGE, we have to store these values to ErrorData without knowledge if this value will be used or not. Pavel > > -- > Peter Geoghegan http://www.2ndQuadrant.com/ > PostgreSQL Development, 24x7 Support, Training and Services
On 28 December 2012 20:40, Pavel Stehule <pavel.stehule@gmail.com> wrote: > It cannot to wait to GET DIAGNOSTICS request - because when GET > DIAGNOSTICS is called, then all unsupported info about exception is > lost, all used memory will be released. I'm not suggesting that you do. I'm only suggesting that the two are not comparable. -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services
2012/12/28 Peter Geoghegan <peter@2ndquadrant.com>: > On 28 December 2012 20:40, Pavel Stehule <pavel.stehule@gmail.com> wrote: >> It cannot to wait to GET DIAGNOSTICS request - because when GET >> DIAGNOSTICS is called, then all unsupported info about exception is >> lost, all used memory will be released. > > I'm not suggesting that you do. I'm only suggesting that the two are > not comparable MESSAGE and similar yes. But CONTEXT is comparable. And there is interesting precedent. Some years PL/Python or PL/Perl doesn't support CONTEXT and PL/pgSQL yes. And it was working. Regards Pavel > > -- > Peter Geoghegan http://www.2ndQuadrant.com/ > PostgreSQL Development, 24x7 Support, Training and Services
On 28 December 2012 20:40, Peter Eisentraut <peter_e@gmx.net> wrote: > Sure, OSError has a filename attribute (which I'm sure is qualified by a > directory name if necessary), SyntaxError has filename, lineno, etc. > > OSError.filename is essentially the equivalent of what is being proposed > here. I disagree. That isn't the same as what's being proposed here, because that's something you're only going to get for those particular exception classes, and I'm guessing that the fields only exist to facilitate refactoring tools, IDEs and the like. If BaseException, Exception or StandardError had a function_name field, and it was idiomatic to change the behaviour of an exception handler on the basis of the field's value, that would be equivalent. Obviously that is not the case. -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services
Peter Eisentraut <peter_e@gmx.net> writes: > On 12/28/12 2:03 PM, Peter Geoghegan wrote: >> None of the Python built-in exception types have this kind of >> information available from fields or anything. > Sure, OSError has a filename attribute (which I'm sure is qualified by a > directory name if necessary), SyntaxError has filename, lineno, etc. No no no ... that's completely unrelated. Those fields report information about the user-visible object that the error is about. The point Peter G is making is that reporting the source of the error is like having the kernel report the name of the function inside the kernel that threw the error. Or perhaps the name of the kernel source file containing that function. Now, these things are quite useful *to a kernel developer* who's trying to debug a problem involving an error report. But they're pretty useless to an application developer, and certainly any application developer who relied on such information to control his error handling code would receive no sympathy when (not if) a new kernel version broke it. In the same way, the filename/lineno stuff that we include in ereports is sometimes useful to Postgres developers --- but it is very hard to see a reason why application code would do anything else with it except possibly print it for human reference. And, in the same way, a CONTEXT traceback is sometimes useful for debugging a collection of server-side functions --- but it's hard to see any client-side code using that information in a mechanized way, at least not while respecting a sane degree of separation between server-side and client-side code. So I'm with Peter G on this: the existing CONTEXT mechanism is good enough, we do not need to split out selected sub-parts of that as separate error fields. Moreover, doing so would provide only an utterly arbitrary subset of the context stack: who's to say whether it would be more important to report the most closely nested function, or the outermost one? regards, tom lane
2012/12/29 Tom Lane <tgl@sss.pgh.pa.us>: > Peter Eisentraut <peter_e@gmx.net> writes: >> On 12/28/12 2:03 PM, Peter Geoghegan wrote: >>> None of the Python built-in exception types have this kind of >>> information available from fields or anything. > >> Sure, OSError has a filename attribute (which I'm sure is qualified by a >> directory name if necessary), SyntaxError has filename, lineno, etc. > > No no no ... that's completely unrelated. Those fields report > information about the user-visible object that the error is about. > > The point Peter G is making is that reporting the source of the error is > like having the kernel report the name of the function inside the kernel > that threw the error. Or perhaps the name of the kernel source file > containing that function. > > Now, these things are quite useful *to a kernel developer* who's trying > to debug a problem involving an error report. But they're pretty > useless to an application developer, and certainly any application > developer who relied on such information to control his error handling > code would receive no sympathy when (not if) a new kernel version broke > it. > > In the same way, the filename/lineno stuff that we include in ereports > is sometimes useful to Postgres developers --- but it is very hard to > see a reason why application code would do anything else with it except > possibly print it for human reference. I wrote exactly this - our filename, funcname is not useful for PL application developers > > And, in the same way, a CONTEXT traceback is sometimes useful for > debugging a collection of server-side functions --- but it's hard to see > any client-side code using that information in a mechanized way, at > least not while respecting a sane degree of separation between > server-side and client-side code. again - it is reason why I propose ROUTINE_NAME and TRIGGER_NAME - it can be useful for some use cases, where developer should to handle exception when they coming from known functions/triggers and he would to raise exception, when it was raised elsewhere. Is possible working with CONTEXT, but it needs little bit more work and situation is very similar to fields COLUMN_NAME, where I can parse message and I can get this information. But then it depends on message format. > > So I'm with Peter G on this: the existing CONTEXT mechanism is good > enough, we do not need to split out selected sub-parts of that as > separate error fields. Moreover, doing so would provide only an utterly > arbitrary subset of the context stack: who's to say whether it would be > more important to report the most closely nested function, or the > outermost one? I don't agree with this argument - you can say too "COLUMN_NAME, TABLE_NAME" is not necessary, because MESSAGE is good enough. I don't see any difference - and I would to use these variables for error handling (not for logging) without dependency on current format of MESSAGE or CONTEXT. Second question - what should be context is important. I am thinking so standard and implementation in other databases is relative clean - it is name of custom routine, that doesn't handle exception. Moving to pg - it is name of routine on the top on error callback stack, because builtin functions doesn't set it - I don't need on top of CONTEX - div_int or other name of builtin function - but I need there function foo(b): a := 10/b. other point - theoretically I can get ROUTINE_NAME from CONTEXT, but I cannot get TRIGGER_NAME - this information is lost. Regards Pavel > > regards, tom lane
On 29 December 2012 05:04, Pavel Stehule <pavel.stehule@gmail.com> wrote: > again - it is reason why I propose ROUTINE_NAME and TRIGGER_NAME - it > can be useful for some use cases, where developer should to handle > exception when they coming from known functions/triggers and he would > to raise exception, when it was raised elsewhere. Is possible working > with CONTEXT, but it needs little bit more work and situation is very > similar to fields COLUMN_NAME, where I can parse message and I can get > this information. But then it depends on message format. That seems very thin. Again, I have to point out that a precedent for this doesn't really appear to exist in any major programming language, even though ISTM it would be just as useful in a wide variety of programming environments as it would be within Postgres. As I've said, the DB2 GET DIAGNOSTIC stuff isn't anything like what you've proposed. >> So I'm with Peter G on this: the existing CONTEXT mechanism is good >> enough, we do not need to split out selected sub-parts of that as >> separate error fields. Moreover, doing so would provide only an utterly >> arbitrary subset of the context stack: who's to say whether it would be >> more important to report the most closely nested function, or the >> outermost one? > > I don't agree with this argument - you can say too "COLUMN_NAME, > TABLE_NAME" is not necessary, because MESSAGE is good enough. I don't > see any difference - and I would to use these variables for error > handling (not for logging) without dependency on current format of > MESSAGE or CONTEXT. In my judgement, COLUMN_NAME and TABLE_NAME can be used without having an unreasonable degree of coupling between client and server-side code. They are at least easily understood, and not at all astonishing, unlike ROUTINE_NAME. That said, CONSTRAINT_NAME clearly is by far the most compelling new field, and I actually wouldn't mind losing the other ones too much. I might have suggested losing COLUMN_NAME and TABLE_NAME myself if we could reliably get something like a constraint name for NOT NULL violations. > Second question - what should be context is important. I am thinking > so standard and implementation in other databases is relative clean - > it is name of custom routine, that doesn't handle exception. Moving to > pg - it is name of routine on the top on error callback stack, because > builtin functions doesn't set it - I don't need on top of CONTEX - > div_int or other name of builtin function - but I need there function > foo(b): a := 10/b. I don't think the fact that built-in functions don't set ROUTINE_NAME supports your position. In fact, it seems pretty broken to me that one pl handler sets the value, while others may not. Furthermore, the distinction between built-in and not built-in is fairly small within Postgres - who is to say that even if a person thinks the proposed semantics are useful, they'll continue to when they find out that ROUTINE_NAME isn't set to the name of their C function? > other point - theoretically I can get ROUTINE_NAME from CONTEXT, but I > cannot get TRIGGER_NAME - this information is lost. I don't think there should be a TRIGGER_NAME either - I think that we should make interfaces easy to use correctly, and hard to use incorrectly, and a mechanised response to a TRIGGER_NAME seems incorrect. If you think that there should be a trigger name within CONTEXT, there might be a case to be made for that, but I would prefer to have that reviewed separately. -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services
Peter Geoghegan <peter@2ndquadrant.com> writes: > On 29 December 2012 05:04, Pavel Stehule <pavel.stehule@gmail.com> wrote: >>> So I'm with Peter G on this: the existing CONTEXT mechanism is good >>> enough, we do not need to split out selected sub-parts of that as >>> separate error fields. Moreover, doing so would provide only an utterly >>> arbitrary subset of the context stack: who's to say whether it would be >>> more important to report the most closely nested function, or the >>> outermost one? >> I don't agree with this argument - you can say too "COLUMN_NAME, >> TABLE_NAME" is not necessary, because MESSAGE is good enough. I don't >> see any difference - and I would to use these variables for error >> handling (not for logging) without dependency on current format of >> MESSAGE or CONTEXT. > In my judgement, COLUMN_NAME and TABLE_NAME can be used without having > an unreasonable degree of coupling between client and server-side > code. Yeah, I was about to reply in almost exactly those words. Table and column names are, almost by definition, part of the shared understanding of the client-side and server-side portions of any application, because both sides have to manipulate those in order to do anything. However, if client-side code were to rely on something like ROUTINE_NAME for error processing, it would become closely tied to the *code structure* of the server-side functions, which is a bad idea. Basically the value proposition here is "let's contort the backend code in order to support very bad programming practices in applications". I'm not attracted by either part of that. > I don't think there should be a TRIGGER_NAME either - I think that we > should make interfaces easy to use correctly, and hard to use > incorrectly, and a mechanised response to a TRIGGER_NAME seems > incorrect. If you think that there should be a trigger name within > CONTEXT, there might be a case to be made for that, but I would prefer > to have that reviewed separately. If the calling of a trigger isn't visible in the CONTEXT stack then that's something we should address --- but in practice, wouldn't it be visible anyway as an ordinary function call? At least if the trigger is written in a reasonable PL. If the trigger is written in C, then I'm outright against adding any such overhead. I don't think this patch should be adding any cycles whatsoever to non-error code paths. regards, tom lane
2012/12/29 Peter Geoghegan <peter@2ndquadrant.com>: > On 29 December 2012 05:04, Pavel Stehule <pavel.stehule@gmail.com> wrote: >> again - it is reason why I propose ROUTINE_NAME and TRIGGER_NAME - it >> can be useful for some use cases, where developer should to handle >> exception when they coming from known functions/triggers and he would >> to raise exception, when it was raised elsewhere. Is possible working >> with CONTEXT, but it needs little bit more work and situation is very >> similar to fields COLUMN_NAME, where I can parse message and I can get >> this information. But then it depends on message format. > > That seems very thin. Again, I have to point out that a precedent for > this doesn't really appear to exist in any major programming language, > even though ISTM it would be just as useful in a wide variety of > programming environments as it would be within Postgres. As I've said, > the DB2 GET DIAGNOSTIC stuff isn't anything like what you've proposed. > >>> So I'm with Peter G on this: the existing CONTEXT mechanism is good >>> enough, we do not need to split out selected sub-parts of that as >>> separate error fields. Moreover, doing so would provide only an utterly >>> arbitrary subset of the context stack: who's to say whether it would be >>> more important to report the most closely nested function, or the >>> outermost one? >> >> I don't agree with this argument - you can say too "COLUMN_NAME, >> TABLE_NAME" is not necessary, because MESSAGE is good enough. I don't >> see any difference - and I would to use these variables for error >> handling (not for logging) without dependency on current format of >> MESSAGE or CONTEXT. > > In my judgement, COLUMN_NAME and TABLE_NAME can be used without having > an unreasonable degree of coupling between client and server-side > code. They are at least easily understood, and not at all astonishing, > unlike ROUTINE_NAME. That said, CONSTRAINT_NAME clearly is by far the > most compelling new field, and I actually wouldn't mind losing the > other ones too much. I might have suggested losing COLUMN_NAME and > TABLE_NAME myself if we could reliably get something like a constraint > name for NOT NULL violations. > Maybe this is main difference of our views. In my mind I never thinking about using ROUTINE_NAME on client side part. My motivation and reason why I push this feature is a using ROUTINE_NAME inside PL/pgSQL - or other PL, where it can helps with more precious exception processing - one real example - handling exception in business process routines BEGIN ... ... business logic EXCEPTION WHEN OTHERS THEN .. all is returned back GET STACKED DIAGNOSTICS _message = MESSAGE, _routine_name = ROUTINE_NAME, ... IF verbosity <> 'verbose' THEN INSERT INTO log(...) VALUES(_message, _routine_name) ELSE -- log CONTEXT too INSERT INTO log(...) VALUES(_message, _routine_name, _context); END IF; END; ROUTINE_NAME is information, that has different character from MESSAGE --> and it is simply (not too detailed) information, where exception coming from. Usually I don't know what check or builtin function failed, because it is described by MESSAGE. And I don't would to use CONTEXT everywhere, because it is too detailed. >> Second question - what should be context is important. I am thinking >> so standard and implementation in other databases is relative clean - >> it is name of custom routine, that doesn't handle exception. Moving to >> pg - it is name of routine on the top on error callback stack, because >> builtin functions doesn't set it - I don't need on top of CONTEX - >> div_int or other name of builtin function - but I need there function >> foo(b): a := 10/b. > > I don't think the fact that built-in functions don't set ROUTINE_NAME > supports your position. In fact, it seems pretty broken to me that one > pl handler sets the value, while others may not. Furthermore, the > distinction between built-in and not built-in is fairly small within > Postgres - who is to say that even if a person thinks the proposed > semantics are useful, they'll continue to when they find out that > ROUTINE_NAME isn't set to the name of their C function? If C function sets error callback then it can fill ROUTINE_NAME. yes, now errors callback is used only for PL languages - probably it is more intuitive design than explicit design, but it is working. It is different design than in generic languages, where you can track exception to really last point. But I don't think so it is necessary in PL - typical customer is not interested in PostgreSQL internals. > >> other point - theoretically I can get ROUTINE_NAME from CONTEXT, but I >> cannot get TRIGGER_NAME - this information is lost. > > I don't think there should be a TRIGGER_NAME either - I think that we > should make interfaces easy to use correctly, and hard to use > incorrectly, and a mechanised response to a TRIGGER_NAME seems > incorrect. If you think that there should be a trigger name within > CONTEXT, there might be a case to be made for that, but I would prefer > to have that reviewed separately. We can do it safely for any PL language - because PL handlers correctly use error callbacks. > > -- > Peter Geoghegan http://www.2ndQuadrant.com/ > PostgreSQL Development, 24x7 Support, Training and Services
2012/12/29 Tom Lane <tgl@sss.pgh.pa.us>: > Peter Geoghegan <peter@2ndquadrant.com> writes: >> On 29 December 2012 05:04, Pavel Stehule <pavel.stehule@gmail.com> wrote: >>>> So I'm with Peter G on this: the existing CONTEXT mechanism is good >>>> enough, we do not need to split out selected sub-parts of that as >>>> separate error fields. Moreover, doing so would provide only an utterly >>>> arbitrary subset of the context stack: who's to say whether it would be >>>> more important to report the most closely nested function, or the >>>> outermost one? > >>> I don't agree with this argument - you can say too "COLUMN_NAME, >>> TABLE_NAME" is not necessary, because MESSAGE is good enough. I don't >>> see any difference - and I would to use these variables for error >>> handling (not for logging) without dependency on current format of >>> MESSAGE or CONTEXT. > >> In my judgement, COLUMN_NAME and TABLE_NAME can be used without having >> an unreasonable degree of coupling between client and server-side >> code. > > Yeah, I was about to reply in almost exactly those words. Table and > column names are, almost by definition, part of the shared understanding > of the client-side and server-side portions of any application, because > both sides have to manipulate those in order to do anything. However, > if client-side code were to rely on something like ROUTINE_NAME for > error processing, it would become closely tied to the *code structure* > of the server-side functions, which is a bad idea. > > Basically the value proposition here is "let's contort the backend code > in order to support very bad programming practices in applications". > I'm not attracted by either part of that. I don't think so it is necessary bad practice. first - my motivation is related primary for usage in PL/pgSQL. we can handle exceptions very near to place where exception was raised. And then I can take same information like ROUTINE_NAME. Later I can forward exception. A disadvantage of this design is higher number of subtransactions. Other design, I talked about it - is based on one relative high positioned subtransaction, where I can handle lot of types of exceptions or can raise final exception. I can do same work as in previous mentioned design but just with one subtransaction. But for design I need a data. And if is possible in simple form - because it is better for PL/pgSQL. > >> I don't think there should be a TRIGGER_NAME either - I think that we >> should make interfaces easy to use correctly, and hard to use >> incorrectly, and a mechanised response to a TRIGGER_NAME seems >> incorrect. If you think that there should be a trigger name within >> CONTEXT, there might be a case to be made for that, but I would prefer >> to have that reviewed separately. > > If the calling of a trigger isn't visible in the CONTEXT stack then > that's something we should address --- but in practice, wouldn't it be > visible anyway as an ordinary function call? At least if the trigger > is written in a reasonable PL. If the trigger is written in C, then > I'm outright against adding any such overhead. I don't think this patch > should be adding any cycles whatsoever to non-error code paths. > visibility in CONTEXT depends on creating and registering error callbacks - I don't would to change it. Regards Pavel > regards, tom lane
* Peter Geoghegan (peter@2ndquadrant.com) wrote: > On 28 December 2012 15:57, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > I don't think that part's been agreed to at all; it seems entirely > > possible that it'll get dropped if/when the patch gets committed. > > I'm not convinced that having these fields in the log is worth > > the compatibility hit of changing the CSV column set. > > I am willing to let that go, because it just doesn't seem important to > me. I just thought that if we were going to break compatibility, we > might as well not hold back on the inclusion of fields. I'm not sure > where this leaves the regular log. Pavel wants to remove that, too. I > defer to others. I'd like to see more options for what is logged, as I've mentioned in the past, and I think all of these would be good candidates for logging in some circumstances. The insistence on having one CSV format to rule them all and which doesn't change (except that we've been regularly changing it across major releases anyway..) doesn't strike me as the right approach. > Now, as to the question of whether we need to make sure that > everything is always fully qualified - I respectfully disagree with > Stephen, and maintain my position set out in the last round of > feedback [1], which is that we don't need to fully qualify everything, > because *for the purposes of error handling*, which is what I think we > should care about, these fields are sufficient: It's not entirely clear to me what distinction you're making here or if we're simply in agreement about what the necessary fields are. Having the schema name, table name, column name and constraint name seems like it's sufficient to fully qualify a specific constraint..? Where I see this being useful would be something along these lines: COMMENT ON my_constraint ON my_schema.my_table IS 'Please update XYZ first.'; Application runs, gets back a constraint violation, then: SELECT description, consrc FROM pg_description a JOIN pg_constraint b ON (a.objid = b.oid) JOIN pg_namespace c ON (b.connamespace = c.oid) JOIN pg_class d ON (b.connrelid = d.oid) WHERE classoid = (select oid from pg_class x join pg_namespace y on (x.relnamespace = y.oid) wherey.nspname = 'pg_catalog' and x.relname = 'pg_constraint') AND c.nspname = 'my_schema' AND d.relname = 'my_table' ; and have that information available to return to the client. Thanks, Stephen
On 29 December 2012 17:43, Stephen Frost <sfrost@snowman.net> wrote: > I'd like to see more options for what is logged, as I've mentioned in > the past, and I think all of these would be good candidates for logging > in some circumstances. The insistence on having one CSV format to rule > them all and which doesn't change (except that we've been regularly > changing it across major releases anyway..) doesn't strike me as the > right approach. Yeah, you're probably right about that. I'm just not sure where it leaves this patch. > It's not entirely clear to me what distinction you're making here or if > we're simply in agreement about what the necessary fields are. I think we might be in agreement. > Having the schema name, table name, column name and constraint name > seems like it's sufficient to fully qualify a specific constraint..? Not necessarily, strictly speaking. It's my position that we should not care about the theoretical edge-cases that this presents. For example, I don't have any sympathy for the idea that we need to fully qualify that we're talking about a constraint in a particular schema, in case there are two distinct constraints in different schemas *that have the same name but don't do exactly the same thing anyway*. In cases where there are two distinct constraints with the same name in the same code path, it seems very likely that the custom error message to be displayed (or whatever) should not need to differ for each (this could come up if you were using schemas for multi-tenancy, for example, and each schema contained the same objects). So, in my latest revision of the patch, the only thing that isn't fully-qualified is constraint_name (because routine_name and so on are no longer included). It just seems unnecessary, given that only ERRCODE_CHECK_VIOLATION errors will lack a schema name and table name (and even then, only sometimes). Pavel originally included a constraint_schema field, because he figured that the way constraints are catalogued necessitated such a field. -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services
2012/12/29 Peter Geoghegan <peter@2ndquadrant.com>: > On 29 December 2012 17:43, Stephen Frost <sfrost@snowman.net> wrote: >> I'd like to see more options for what is logged, as I've mentioned in >> the past, and I think all of these would be good candidates for logging >> in some circumstances. The insistence on having one CSV format to rule >> them all and which doesn't change (except that we've been regularly >> changing it across major releases anyway..) doesn't strike me as the >> right approach. > > Yeah, you're probably right about that. I'm just not sure where it > leaves this patch. > >> It's not entirely clear to me what distinction you're making here or if >> we're simply in agreement about what the necessary fields are. > > I think we might be in agreement. > >> Having the schema name, table name, column name and constraint name >> seems like it's sufficient to fully qualify a specific constraint..? > > Not necessarily, strictly speaking. It's my position that we should > not care about the theoretical edge-cases that this presents. For > example, I don't have any sympathy for the idea that we need to fully > qualify that we're talking about a constraint in a particular schema, > in case there are two distinct constraints in different schemas *that > have the same name but don't do exactly the same thing anyway*. In > cases where there are two distinct constraints with the same name in > the same code path, it seems very likely that the custom error message > to be displayed (or whatever) should not need to differ for each (this > could come up if you were using schemas for multi-tenancy, for > example, and each schema contained the same objects). > > So, in my latest revision of the patch, the only thing that isn't > fully-qualified is constraint_name (because routine_name and so on are > no longer included). It just seems unnecessary, given that only > ERRCODE_CHECK_VIOLATION errors will lack a schema name and table name > (and even then, only sometimes). Pavel originally included a > constraint_schema field, because he figured that the way constraints > are catalogued necessitated such a field. Design of constraints is little bit different between ANSI SQL and PostgreSQL. And then some fields proposed by standard has no sense in pg - TRIGGER_SCHEMA and probably CONSTRAINT_SCHEMA. Ours constraints are related to some relation - everywhere. You cannot create constraint without relation - so everywhere where CONSTRAINT_NAME is not empty, then TABLE_NAME and TABLE_SCHEMA should be defined or CONSTRAINT_NAME should be unique in database. In my first patch used for these field some expected generated value, but I agree, so it is not necessary and better to drop it, although it can help with readability of some queries. Regards Pavel > > -- > Peter Geoghegan http://www.2ndQuadrant.com/ > PostgreSQL Development, 24x7 Support, Training and Services
Peter, * Peter Geoghegan (peter@2ndquadrant.com) wrote: > Pavel originally included a > constraint_schema field, because he figured that the way constraints > are catalogued necessitated such a field. That's exactly what I was getting at also- in order to do a lookup in the catalog, you need to know certain information to avoid potentially getting multiple records back (which would be an error...). If it isn't possible to uniquely identify a constraint using the information returned, that's a problem. It's not a question about what information is necessary- the catalog dictates what the definition of a constraint is and it includes schema, table, and costraint name. In my view, we shouldn't be returing a constraint name without returning the other information to identify that constraint. How is that different from the schema_name and table_name fields that you mentioned were going to be included..? Are they not always filled out for constraint violations and if not, why not? Is there a situation where we could be getting an error back about a constraint where the table returned isn't the table which the constraint is on? If you review the use-case from my last email, I'm hopeful that you'll see and understand where I think returning this information would be useful and the information which would be necessary to make that query work. Even without looking up the comment and instead just trying to return the source of the constraint which was violated, you need the information whcih will allow you to uniquely identify the constraint. Thanks, Stephen
On 29 December 2012 18:37, Stephen Frost <sfrost@snowman.net> wrote: > That's exactly what I was getting at also- in order to do a lookup in > the catalog, you need to know certain information to avoid potentially > getting multiple records back (which would be an error...). Well, Pavel said that since a constraint is necessarily associated with another object, the constraint name doesn't need to be separately qualified. That isn't quite the truth, but I think it's close enough. Note that I've documented a new set of requirements for various errcodes: Section: Class 23 - Integrity Constraint Violation ! Requirement: unused 23000 E ERRCODE_INTEGRITY_CONSTRAINT_VIOLATION integrity_constraint_violation + Requirement: unused 23001 E ERRCODE_RESTRICT_VIOLATION restrict_violation + # Note that requirements for ERRCODE_NOT_NULL do not apply to domains: + Requirement: schema_name, table_name 23502 E ERRCODE_NOT_NULL_VIOLATION not_null_violation + Requirement: schema_name, table_name, constraint_name 23503 E ERRCODE_FOREIGN_KEY_VIOLATION foreign_key_violation + Requirement: schema_name, table_name, constraint_name 23505 E ERRCODE_UNIQUE_VIOLATION unique_violation + Requirement: constraint_name 23514 E ERRCODE_CHECK_VIOLATION check_violation + Requirement: schema_name, table_name, constraint_name 23P01 E ERRCODE_EXCLUSION_VIOLATION exclusion_violation So, unless someone adds a constraint name outside of these errcodes (I doubt that would make sense), there is exactly one case where a constraint_name could not have a schema_name (which, as I've said, is almost the same thing as constraint_schema, the exception being when referencing FKs on *other* tables are involved) - that case is ERRCODE_CHECK_VIOLATION. That's because this SQL could cause ERRCODE_CHECK_VIOLATION: select '123'::upc_barcode; What should schema_name be set to now? Surely not the schema of the type upc_barcode, since that would be inconsistent with a few other ERRCODE_CHECK_VIOLATION sites where we do know schema_name + table_name (those two are always either available together or not at all). The bottom line is that I'm not promising that you can reliably look up the constraint, and I don't think that that should be a blocker, or even that it's all that important. You could do it reliably with the schema_name + table_name, though I'm not strongly encouraging that you do. So I guess we disagree on that, though I'm not *that* strongly opposed to adding back in a constraint_schema field if the extra code is deemed worth it. Does anyone else have an opinion? Tom? -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services
2012/12/29 Peter Geoghegan <peter@2ndquadrant.com>: > On 29 December 2012 18:37, Stephen Frost <sfrost@snowman.net> wrote: >> That's exactly what I was getting at also- in order to do a lookup in >> the catalog, you need to know certain information to avoid potentially >> getting multiple records back (which would be an error...). > > Well, Pavel said that since a constraint is necessarily associated > with another object, the constraint name doesn't need to be separately > qualified. That isn't quite the truth, but I think it's close enough. > > Note that I've documented a new set of requirements for various errcodes: > > Section: Class 23 - Integrity Constraint Violation > ! Requirement: unused > 23000 E ERRCODE_INTEGRITY_CONSTRAINT_VIOLATION > integrity_constraint_violation > + Requirement: unused > 23001 E ERRCODE_RESTRICT_VIOLATION > restrict_violation > + # Note that requirements for ERRCODE_NOT_NULL do not apply to domains: > + Requirement: schema_name, table_name > 23502 E ERRCODE_NOT_NULL_VIOLATION > not_null_violation > + Requirement: schema_name, table_name, constraint_name > 23503 E ERRCODE_FOREIGN_KEY_VIOLATION > foreign_key_violation > + Requirement: schema_name, table_name, constraint_name > 23505 E ERRCODE_UNIQUE_VIOLATION > unique_violation > + Requirement: constraint_name > 23514 E ERRCODE_CHECK_VIOLATION > check_violation > + Requirement: schema_name, table_name, constraint_name > 23P01 E ERRCODE_EXCLUSION_VIOLATION > exclusion_violation > > So, unless someone adds a constraint name outside of these errcodes (I > doubt that would make sense), there is exactly one case where a > constraint_name could not have a schema_name (which, as I've said, is > almost the same thing as constraint_schema, the exception being when > referencing FKs on *other* tables are involved) - that case is > ERRCODE_CHECK_VIOLATION. > > That's because this SQL could cause ERRCODE_CHECK_VIOLATION: > > select '123'::upc_barcode; > > What should schema_name be set to now? Surely not the schema of the > type upc_barcode, since that would be inconsistent with a few other > ERRCODE_CHECK_VIOLATION sites where we do know schema_name + > table_name (those two are always either available together or not at > all). I forgot on domain :( this is use case, where CONSTRAINT_SCHEMA has sense > > The bottom line is that I'm not promising that you can reliably look > up the constraint, and I don't think that that should be a blocker, or > even that it's all that important. You could do it reliably with the > schema_name + table_name, though I'm not strongly encouraging that you > do. so then we probably need a CONSTRAINT_SCHEMA > > So I guess we disagree on that, though I'm not *that* strongly opposed > to adding back in a constraint_schema field if the extra code is > deemed worth it. > > Does anyone else have an opinion? Tom? > > -- > Peter Geoghegan http://www.2ndQuadrant.com/ > PostgreSQL Development, 24x7 Support, Training and Services
* Peter Geoghegan (peter@2ndquadrant.com) wrote: > So, unless someone adds a constraint name outside of these errcodes (I > doubt that would make sense), there is exactly one case where a > constraint_name could not have a schema_name (which, as I've said, is > almost the same thing as constraint_schema, the exception being when > referencing FKs on *other* tables are involved) To be honest, I expected the concern to be about FKs and RESTRICT-type relationships, which I think we do need to figure out an answer for. Is there a distinction between the errors thrown for violating an FK on an insert vs. violating a FK on a delete? Perhaps with that we could identify referring table vs referred table and provide all of that information to the application in a structured way? > - that case is > ERRCODE_CHECK_VIOLATION. > > That's because this SQL could cause ERRCODE_CHECK_VIOLATION: > > select '123'::upc_barcode; I'm surprised to see that as a constraint violation rather than a domain violation..? ala: =*> select '3000000000'::int; ERROR: value "3000000000" is out of range for type integer > What should schema_name be set to now? Surely not the schema of the > type upc_barcode, since that would be inconsistent with a few other > ERRCODE_CHECK_VIOLATION sites where we do know schema_name + > table_name (those two are always either available together or not at > all). I'm not sure that the schema of the type would be entirely wrong in that specific case, along with the table name being set to the name of the domain. I still think a domain violation-type error would be more appropriate than calling it a constraint violation though. > The bottom line is that I'm not promising that you can reliably look > up the constraint, and I don't think that that should be a blocker, or > even that it's all that important. You could do it reliably with the > schema_name + table_name, though I'm not strongly encouraging that you > do. > > So I guess we disagree on that, though I'm not *that* strongly opposed > to adding back in a constraint_schema field if the extra code is > deemed worth it. > > Does anyone else have an opinion? Tom? Having just constraint_schema and constraint_name feels horribly wrong as the definition of a constraint also includes a pg_class oid. Thanks, Stephen
2012/12/29 Stephen Frost <sfrost@snowman.net>: > * Peter Geoghegan (peter@2ndquadrant.com) wrote: >> So, unless someone adds a constraint name outside of these errcodes (I >> doubt that would make sense), there is exactly one case where a >> constraint_name could not have a schema_name (which, as I've said, is >> almost the same thing as constraint_schema, the exception being when >> referencing FKs on *other* tables are involved) > > To be honest, I expected the concern to be about FKs and RESTRICT-type > relationships, which I think we do need to figure out an answer for. Is > there a distinction between the errors thrown for violating an FK on an > insert vs. violating a FK on a delete? Perhaps with that we could > identify referring table vs referred table and provide all of that > information to the application in a structured way? > >> - that case is >> ERRCODE_CHECK_VIOLATION. >> >> That's because this SQL could cause ERRCODE_CHECK_VIOLATION: >> >> select '123'::upc_barcode; > > I'm surprised to see that as a constraint violation rather than a domain > violation..? ala: > > =*> select '3000000000'::int; > ERROR: value "3000000000" is out of range for type integer > >> What should schema_name be set to now? Surely not the schema of the >> type upc_barcode, since that would be inconsistent with a few other >> ERRCODE_CHECK_VIOLATION sites where we do know schema_name + >> table_name (those two are always either available together or not at >> all). > > I'm not sure that the schema of the type would be entirely wrong in that > specific case, along with the table name being set to the name of the > domain. I still think a domain violation-type error would be more > appropriate than calling it a constraint violation though. > >> The bottom line is that I'm not promising that you can reliably look >> up the constraint, and I don't think that that should be a blocker, or >> even that it's all that important. You could do it reliably with the >> schema_name + table_name, though I'm not strongly encouraging that you >> do. >> >> So I guess we disagree on that, though I'm not *that* strongly opposed >> to adding back in a constraint_schema field if the extra code is >> deemed worth it. >> >> Does anyone else have an opinion? Tom? > > Having just constraint_schema and constraint_name feels horribly wrong > as the definition of a constraint also includes a pg_class oid. but then TABLE_NAME and TABLE_SCHEMA will be defined. Pavel > > Thanks, > > Stephen
* Pavel Stehule (pavel.stehule@gmail.com) wrote: > > Having just constraint_schema and constraint_name feels horribly wrong > > as the definition of a constraint also includes a pg_class oid. > > but then TABLE_NAME and TABLE_SCHEMA will be defined. How are you going to look up the constraint? Using constraint_schema, table_name, and constraint_name? Or table_schema, table_name and constraint_name? When do you use constraint_schema instead of table_schema? None of those options is exactly clear or understandable... Thanks, Stephen
2012/12/29 Stephen Frost <sfrost@snowman.net>: > * Pavel Stehule (pavel.stehule@gmail.com) wrote: >> > Having just constraint_schema and constraint_name feels horribly wrong >> > as the definition of a constraint also includes a pg_class oid. >> >> but then TABLE_NAME and TABLE_SCHEMA will be defined. > > How are you going to look up the constraint? Using constraint_schema, > table_name, and constraint_name? Or table_schema, table_name and > constraint_name? When do you use constraint_schema instead of > table_schema? > > None of those options is exactly clear or understandable... probably there will be situation when TABLE_SCHEMA and CONSTRAINT_SCHEMA same values Hypothetically - if we define CONSTRAINT_TABLE - what is difference from TABLE_NAME ? Pavel > > Thanks, > > Stephen
On 29 December 2012 19:56, Stephen Frost <sfrost@snowman.net> wrote: >> - that case is >> ERRCODE_CHECK_VIOLATION. >> >> That's because this SQL could cause ERRCODE_CHECK_VIOLATION: >> >> select '123'::upc_barcode; > > I'm surprised to see that as a constraint violation rather than a domain > violation..? I was trying to convey that upc_barcode is a domain with a check constraint (i.e. that the checkdigit on UPC barcode domains must be correct). So yes, that would be an ERRCODE_CHECK_VIOLATION. See ExecEvalCoerceToDomain(). >> What should schema_name be set to now? Surely not the schema of the >> type upc_barcode, since that would be inconsistent with a few other >> ERRCODE_CHECK_VIOLATION sites where we do know schema_name + >> table_name (those two are always either available together or not at >> all). > > I'm not sure that the schema of the type would be entirely wrong in that > specific case, along with the table name being set to the name of the > domain. I still think a domain violation-type error would be more > appropriate than calling it a constraint violation though. Well, it is what it is. We can't change it now. > Having just constraint_schema and constraint_name feels horribly wrong > as the definition of a constraint also includes a pg_class oid. I think that it's probably sufficient *for error handling purposes*. Since it is non-trivial to get the schema of a constraint, and since we have that jarring inconsistency (the schema of the type or the schema of the table on which a check constraint is defined?) we might well be better off just not addressing it. It isn't as simple as you make out. Not all constraints appear within pg_constraint (consider NOT NULL constraints), and besides, pg_constraint.conrelid can be zero for non-table constraints. What's more, pg_constraint actually has three pg_class oid columns; conrelid, conindid and confrelid. That is all. -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services
* Peter Geoghegan (peter@2ndquadrant.com) wrote: > > Having just constraint_schema and constraint_name feels horribly wrong > > as the definition of a constraint also includes a pg_class oid. > > I think that it's probably sufficient *for error handling purposes*. > Since it is non-trivial to get the schema of a constraint, and since > we have that jarring inconsistency (the schema of the type or the > schema of the table on which a check constraint is defined?) we might > well be better off just not addressing it. I actually really like the idea of being able to programatically figure out what constraint caused a given error and then go look up that constraint definition and the comment associated with it, to be able to pass back a meaningful error to the user. Reducing the cases where users end up implementing their own application-level error checking before sending things to the DB is a worthwhile goal, as they often end up implementing slightly different checking that what the DB does and get upset when the DB throws an error that they can't do anything useful with. > It isn't as simple as you make out. Not all constraints appear within > pg_constraint (consider NOT NULL constraints), and besides, > pg_constraint.conrelid can be zero for non-table constraints. What's > more, pg_constraint actually has three pg_class oid columns; conrelid, > conindid and confrelid. Perhaps we can provide a bit more help to our application developers then by coming up with something which will work consistently- eg: we provide the data in a structured way to the client and then a function (or a few of them) which the application can then use to get the details. I understand that it's complicated and I'd hope that we can do something better. In general, I wouldn't recommend developers to query the catalogs directly, but I don't think there's really a better option currently. If we fix that, great. In the end, I may agree with you- the patch is a nice idea, but it needs more to be a complete and working solution and providing something that only gets people half-way there would result in developers hacking things together which will quitely break when they least expect it. Thanks, Stephen
2012/12/29 Stephen Frost <sfrost@snowman.net>: > * Peter Geoghegan (peter@2ndquadrant.com) wrote: >> > Having just constraint_schema and constraint_name feels horribly wrong >> > as the definition of a constraint also includes a pg_class oid. >> >> I think that it's probably sufficient *for error handling purposes*. >> Since it is non-trivial to get the schema of a constraint, and since >> we have that jarring inconsistency (the schema of the type or the >> schema of the table on which a check constraint is defined?) we might >> well be better off just not addressing it. > > I actually really like the idea of being able to programatically figure > out what constraint caused a given error and then go look up that > constraint definition and the comment associated with it, to be able to > pass back a meaningful error to the user. Reducing the cases where > users end up implementing their own application-level error checking > before sending things to the DB is a worthwhile goal, as they often end > up implementing slightly different checking that what the DB does and > get upset when the DB throws an error that they can't do anything useful > with. > >> It isn't as simple as you make out. Not all constraints appear within >> pg_constraint (consider NOT NULL constraints), and besides, >> pg_constraint.conrelid can be zero for non-table constraints. What's >> more, pg_constraint actually has three pg_class oid columns; conrelid, >> conindid and confrelid. > > Perhaps we can provide a bit more help to our application developers > then by coming up with something which will work consistently- eg: we > provide the data in a structured way to the client and then a function > (or a few of them) which the application can then use to get the > details. I understand that it's complicated and I'd hope that we can do > something better. In general, I wouldn't recommend developers to query > the catalogs directly, but I don't think there's really a better option > currently. If we fix that, great. > > In the end, I may agree with you- the patch is a nice idea, but it needs > more to be a complete and working solution and providing something that > only gets people half-way there would result in developers hacking > things together which will quitely break when they least expect it. it is a problem of this patch or not consistent constraints implementation ? Regards Pavel > > Thanks, > > Stephen
* Pavel Stehule (pavel.stehule@gmail.com) wrote: > it is a problem of this patch or not consistent constraints implementation ? Not sure, but I don't think it matters. You can blame the constraint implementation, but that doesn't change my feelings about what we need before we can accept a patch like this. Providing something which works only part of the time and then doesn't work for very unclear reasons isn't a good idea. Perhaps we need to fix the constraint implementation and perhaps we need to fix the error information being returned, or most likely we have to fix both, it doesn't change that we need to do something more than just ignore this problem. Thanks, Stephen
2012/12/29 Stephen Frost <sfrost@snowman.net>: > * Pavel Stehule (pavel.stehule@gmail.com) wrote: >> it is a problem of this patch or not consistent constraints implementation ? > > Not sure, but I don't think it matters. You can blame the constraint > implementation, but that doesn't change my feelings about what we need > before we can accept a patch like this. Providing something which works > only part of the time and then doesn't work for very unclear reasons > isn't a good idea. Perhaps we need to fix the constraint implementation > and perhaps we need to fix the error information being returned, or most > likely we have to fix both, it doesn't change that we need to do > something more than just ignore this problem. can we remove CONSTRAINT_NAME from this patch? Minimally TABLE_SCHEMA, TABLE_NAME and COLUMN_NAME works as expected. CONSTRAINT_NAME can be implemented after constraints refactoring Pavel > > Thanks, > > Stephen
On 29 December 2012 20:49, Stephen Frost <sfrost@snowman.net> wrote: > In the end, I may agree with you- the patch is a nice idea, but it needs > more to be a complete and working solution and providing something that > only gets people half-way there would result in developers hacking > things together which will quitely break when they least expect it. I certainly wouldn't go that far. I think that 95% of the value that could be delivered by this sort of thing will be realised by committing something that is along the lines of what we have here already. All I really want is to give users a well-principled way of getting a constraint name within their error handler, so that they can go do something in their application along the lines of displaying a custom error message in the domain of the application and/or error handler, not the domain of the database. That's it. Ascertaining the identity of the object in question perfectly unambiguously, so that you can safely do something like lookup a comment on the object, seems like something way beyond what I'd envisioned for this feature. Why should the comment be useful in an error handler anyway? At best, that seems like a nice-to-have extra to me. The vast majority are not even going to think about the ambiguity that may exist. They'll just write: if (constraint_name == "upc") MessageBox("That is not a valid barcode."); If it isn't the same constraint object as expected - if they have multiple schemas with the same object definitions, are they likely to care? We'll never promise that constraint_name is unambiguous, and in fact will warn that it may be ambiguous. It seems like what you're really looking for is a way of effectively changing the error message to a user-defined error message using some kind of DDL against a constraint object, without doing anything special in a handler. That would be less flexible, but more automated, and would probably entail using placeholders in our custom message that get expanded (as with FK constraint violations - *what* key was violated?). That has its appeal, but I don't see that it's something that we need to do first. Don't get me wrong - I think it's quite possible to pin down a way of unambiguously identifying constraint objects as they appear here. I just think that it isn't worth the effort to maintain the code in Postgres, and in particular, I think people will not care about any ambiguity, and even if they do, they could easily misunderstand the exact rules. The right thing to do is not have multiple constraints with the same name in flight within the same place that do different things. -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services
On 29 December 2012 21:24, Pavel Stehule <pavel.stehule@gmail.com> wrote: > can we remove CONSTRAINT_NAME from this patch? Minimally TABLE_SCHEMA, > TABLE_NAME and COLUMN_NAME works as expected. > > CONSTRAINT_NAME can be implemented after constraints refactoring This patch is almost completely pointless without a CONSTRAINT_NAME field. -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services
Peter, * Peter Geoghegan (peter@2ndquadrant.com) wrote: > if (constraint_name == "upc") > MessageBox("That is not a valid barcode."); So they'll quickly realize that a lookup-table based on constraint name would be useful, create it, and then have a primary key on it to make sure that they don't have any duplicates. Of course, this is all duplicating what we're *already* keeping track of, except that method won't be consistent with the catalog and they could end up with a single entry in their table which corresponds to multiple actual constraints in the system and begin subtly returning errors that don't make any sense to the end user. That's exactly the kind of subtly broken situation that I would hope we'd try to keep them from getting into. I'd almost rather return the OID and provide some lookup functions. Thanks, Stephen
On 29 December 2012 22:57, Stephen Frost <sfrost@snowman.net> wrote: > So they'll quickly realize that a lookup-table based on constraint name > would be useful, create it, and then have a primary key on it to make > sure that they don't have any duplicates. I don't find that terribly likely. There is nothing broken about the example. It's possible to misuse almost anything. In order for the problem you describe to happen, the user would have to ignore the warning in the documentation about constraint_name's ability to uniquely identify something, and then have two constraints in play at the same time with the same name but substantively different. That seems incredibly unlikely. Maybe you think that users cannot be trusted to take that warning on board, but then the same user could not be trusted to heed another warning about using a constraint_schema in the lookup table primary key. This whole lookup table idea presupposes that there'll only ever be one error message per constraint in the entire application. That usually isn't true for all sorts of reasons, in my experience. -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services
Peter, * Peter Geoghegan (peter@2ndquadrant.com) wrote: > In order for the problem you describe to happen, the user would have > to ignore the warning in the documentation about constraint_name's > ability to uniquely identify something, and then have two constraints > in play at the same time with the same name but substantively > different. That seems incredibly unlikely. I really don't think what I sketched out or something similar would happen. I do think it's incredibly frustrating as a user who is trying to develop an application which behaves correctly to be given only half the information. Thanks, Stephen
On 30 December 2012 02:01, Stephen Frost <sfrost@snowman.net> wrote: > I really don't think what I sketched out or something similar would > happen. I do think it's incredibly frustrating as a user who is trying > to develop an application which behaves correctly to be given only half > the information. I don't know what to say to that. Sometimes, when deciding how to address problems like this, it's possible to arrive at slightly surprising answers by process of elimination. -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services
* Peter Geoghegan (peter@2ndquadrant.com) wrote: > On 30 December 2012 02:01, Stephen Frost <sfrost@snowman.net> wrote: > > I really don't think what I sketched out or something similar would > > happen. I do think it's incredibly frustrating as a user who is trying > > to develop an application which behaves correctly to be given only half > > the information. > > I don't know what to say to that. Sometimes, when deciding how to > address problems like this, it's possible to arrive at slightly > surprising answers by process of elimination. Err. I intended to say "I really don't think what I sketched out, or something similar, would be that unlikely to happen", or something along those lines. Apologies for the confusion. Thanks, Stephen
On 30 December 2012 03:32, Stephen Frost <sfrost@snowman.net> wrote: > Err. I intended to say "I really don't think what I sketched out, or > something similar, would be that unlikely to happen", or something along > those lines. Apologies for the confusion. Almost anything can be misused. If you're going to insist that I hack a bunch of mechanism into this patch so that the user can unambiguously identify each constraint object, I'll do that. However, that's more code, and more complexity, that will have to be documented, for just next to no practical benefit that I can see. -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services
* Peter Geoghegan (peter@2ndquadrant.com) wrote: > On 30 December 2012 03:32, Stephen Frost <sfrost@snowman.net> wrote: > > Err. I intended to say "I really don't think what I sketched out, or > > something similar, would be that unlikely to happen", or something along > > those lines. Apologies for the confusion. > > Almost anything can be misused. I agree, almost anything can be. The 'misuse' in this case, however, is in expecting the information returned to be useful in a deterministic and consistent manner, as it's being returned in a programmatic fashion. > If you're going to insist that I hack a bunch of mechanism into this > patch so that the user can unambiguously identify each constraint > object, I'll do that. This is the part that I'm having trouble wrapping my head around- what's the additional complexity here? If we have the OID of the constraint, we should be able to unambiguoulsy return information which allows a user to get back to that specific constraint and OID. > However, that's more code, and more complexity, > that will have to be documented, for just next to no practical benefit > that I can see. I've certainly seen cases where constraints are duplicated by name across schemas. I would imagine it could happen across tables in the same schema also. I just don't like this notion of returning something ambiguous to the user and worry that they'd accept it as deterministic and dependable, regardless of the documentation. There's a certain amount of "read the docs, but also look at what information you really get back", which I think is, in general, a good thing, but it does mean individuals might come to believe that they can depend on the constraint name being unique in all cases. As a side-note, I've recently run into more cases than I care to think about where the 63-character limit on constraint names has been a problem- because we're trying to ensure that we create an unambiguous constraint name, and that's *with* various name shortening techniques being used. The discussion about having longer values possible for the 'name' data type was on a different thread and should probably stay there, but I bring it up because it has some impact on the possibility of name collisions which is relevent to this discussion. Thanks, Stephen
2012/12/30 Stephen Frost <sfrost@snowman.net>: > * Peter Geoghegan (peter@2ndquadrant.com) wrote: >> On 30 December 2012 03:32, Stephen Frost <sfrost@snowman.net> wrote: >> > Err. I intended to say "I really don't think what I sketched out, or >> > something similar, would be that unlikely to happen", or something along >> > those lines. Apologies for the confusion. >> >> Almost anything can be misused. > > I agree, almost anything can be. The 'misuse' in this case, however, is > in expecting the information returned to be useful in a deterministic > and consistent manner, as it's being returned in a programmatic fashion. > >> If you're going to insist that I hack a bunch of mechanism into this >> patch so that the user can unambiguously identify each constraint >> object, I'll do that. > > This is the part that I'm having trouble wrapping my head around- what's > the additional complexity here? If we have the OID of the constraint, > we should be able to unambiguoulsy return information which allows a > user to get back to that specific constraint and OID. > >> However, that's more code, and more complexity, >> that will have to be documented, for just next to no practical benefit >> that I can see. > > I've certainly seen cases where constraints are duplicated by name > across schemas. I would imagine it could happen across tables in the > same schema also. I just don't like this notion of returning something > ambiguous to the user and worry that they'd accept it as deterministic > and dependable, regardless of the documentation. There's a certain > amount of "read the docs, but also look at what information you really > get back", which I think is, in general, a good thing, but it does mean > individuals might come to believe that they can depend on the constraint > name being unique in all cases. > > As a side-note, I've recently run into more cases than I care to think > about where the 63-character limit on constraint names has been a > problem- because we're trying to ensure that we create an unambiguous > constraint name, and that's *with* various name shortening techniques > being used. The discussion about having longer values possible for the > 'name' data type was on a different thread and should probably stay > there, but I bring it up because it has some impact on the possibility > of name collisions which is relevent to this discussion. so - cannot be a solution define CONSTRAINT_TABLE field - constaint names in table are unique. sure there is a problem with long names, but I am thinking so it has solution - when constraint has no name, then we can try to generate name, and when this name is longer than 63 chars, then CREATE STATEMENT fails and users should be define name manually - this feature should be disabled by guc due compatibility issues. Regards Pavel > > Thanks, > > Stephen
* Pavel Stehule (pavel.stehule@gmail.com) wrote: > so - cannot be a solution define CONSTRAINT_TABLE field - constaint > names in table are unique. Adding a table column, and a schema column, would be ideal. Those would all be part of the PK and not null'able, but then we wouldn't necessairly always return all that information- that's the situation that we've been talking about. > sure there is a problem with long names, but I am thinking so it has > solution - when constraint has no name, then we can try to generate > name, and when this name is longer than 63 chars, then CREATE > STATEMENT fails and users should be define name manually - this > feature should be disabled by guc due compatibility issues. CREATE doesn't fail if the name is too long today, it truncates it instead. I continue to feel that's also the wrong thing to do. Thanks, Stephen
2012/12/30 Stephen Frost <sfrost@snowman.net>: > * Pavel Stehule (pavel.stehule@gmail.com) wrote: >> so - cannot be a solution define CONSTRAINT_TABLE field - constaint >> names in table are unique. > > Adding a table column, and a schema column, would be ideal. Those would > all be part of the PK and not null'able, but then we wouldn't > necessairly always return all that information- that's the situation > that we've been talking about. > >> sure there is a problem with long names, but I am thinking so it has >> solution - when constraint has no name, then we can try to generate >> name, and when this name is longer than 63 chars, then CREATE >> STATEMENT fails and users should be define name manually - this >> feature should be disabled by guc due compatibility issues. > > CREATE doesn't fail if the name is too long today, it truncates it > instead. I continue to feel that's also the wrong thing to do. probably it is far to ideal - but I have not any feedback about related problems in production. Regards Pavel Stehule > > Thanks, > > Stephen
On Fri, Dec 28, 2012 at 1:21 PM, Peter Geoghegan <peter@2ndquadrant.com> wrote: > Now, as to the question of whether we need to make sure that > everything is always fully qualified - I respectfully disagree with > Stephen, and maintain my position set out in the last round of > feedback [1], which is that we don't need to fully qualify everything, > because *for the purposes of error handling*, which is what I think we > should care about, these fields are sufficient: > > + column_name text, > + table_name text, > + constraint_name text, > + schema_name text, > > If you use the same constraint name everywhere, you might get the same > error message. The situations in which this scheme might fall down are > too implausible for me to want to bloat up all those ereport sites any > further (something that Stephen also complained about). > > I think that the major outstanding issues are concerning whether or > not I have the API here right. I make explicit guarantees as to the > availability of certain fields for certain errcodes (a small number of > "Class 23 - Integrity Constraint Violation" codes). No one has really > said anything about that, though I think it's important. > > I also continue to think that we shouldn't have "routine_name", > "routine_schema" and "trigger_name" fields - I think it's wrong-headed > to have an exception handler magically act on knowledge about where > the exception came from that has been baked into the exception - is > there any sort of precedent for this? Pavel disagrees here. Again, I > defer to others. I don't really agree with this. To be honest, my biggest concern about this patch is that it will make it take longer to report an error. At least in the cases where these additional fields are included, it will take CPU time to populate those fields, and maybe there will be some overhead even in the cases where they aren't even used (although I'd expect that to be too little to measure). Now, maybe that doesn't matter in the case where the error is being reported back to the client, because the overhead of shipping packets across even a local socket likely dwarfs the overhead, but I think it matters a lot where you are running a big exception-catching loop. That is already painfully slow, and -1 from me on anything that makes it significantly slower. I have a feeling this isn't the first time I'm ranting about this topic in relation to this patch, so apologies if this is old news. But if we decide that there is no performance issue or that we don't care about the hit there, then I think Stephen and Pavel are right to want a large amount of very precise detail. What's the use case for this feature? Presumably, it's for people for whom parsing the error message is not a practical option, so either they textual error message doesn't identify the target object sufficiently precisely, and they want to make sure they know what it applies to; or else it's for people who want any error that applies to a table to be handled the same way (without worrying about exactly which error they have got). Imagine, for example, someone writing a framework that will be used as a basis for many different applications. It might want to do something, like, I don't know, update the comment on a table every time an error involving that table occurs. Clearly, precise identification of the table is a must. In a particular application development environment, it's reasonable to rely on users to name things sensibly, but if you're shipping a tool that might be dropped into any arbitrary environment, that's significantly more dangerous. Similarly, for a function-related error, you'd need something like that looks like the output of pg_proc.oid::regprocedure, or individual fields with the various components of that output. That sounds like routine_name et. al. I'm not happy about the idea of shipping OIDs back to the client. OIDs are too user-visible as it is; we should try to make them less not moreso. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Sat, Dec 29, 2012 at 4:30 PM, Peter Geoghegan <peter@2ndquadrant.com> wrote: > Ascertaining the identity of the object in question perfectly > unambiguously, so that you can safely do something like lookup a > comment on the object, seems like something way beyond what I'd > envisioned for this feature. Why should the comment be useful in an > error handler anyway? At best, that seems like a nice-to-have extra to > me. The vast majority are not even going to think about the ambiguity > that may exist. They'll just write: > > if (constraint_name == "upc") > MessageBox("That is not a valid barcode."); The people who are content to do that don't need this patch at all. They can just apply a regexp to the message that comes back from the server and then set constraint_name based on what pops out of the regex. And then do just what you did there. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> schrieb: >On Sat, Dec 29, 2012 at 4:30 PM, Peter Geoghegan ><peter@2ndquadrant.com> wrote: >> Ascertaining the identity of the object in question perfectly >> unambiguously, so that you can safely do something like lookup a >> comment on the object, seems like something way beyond what I'd >> envisioned for this feature. Why should the comment be useful in an >> error handler anyway? At best, that seems like a nice-to-have extra >to >> me. The vast majority are not even going to think about the ambiguity >> that may exist. They'll just write: >> >> if (constraint_name == "upc") >> MessageBox("That is not a valid barcode."); > >The people who are content to do that don't need this patch at all. >They can just apply a regexp to the message that comes back from the >server and then set constraint_name based on what pops out of the >regex. And then do just what you did there. Easier said than done if you're dealing with pg installations with different lc_messages... Andres --- Please excuse the brevity and formatting - I am writing this on my mobile phone.
"anarazel@anarazel.de" <andres@anarazel.de> writes: > Robert Haas <robertmhaas@gmail.com> schrieb: >> The people who are content to do that don't need this patch at all. >> They can just apply a regexp to the message that comes back from the >> server and then set constraint_name based on what pops out of the >> regex. And then do just what you did there. > Easier said than done if you're dealing with pg installations with different lc_messages... Exactly. To my mind, the *entire* point of this patch is to remove the need for people to try to dig information out of potentially-localized message strings. It's not clear to me that we have to strain to provide information that isn't in the currently-reported messages --- we are only trying to make it easier for client-side code to extract the information it's likely to need. regards, tom lane
On 4 January 2013 18:07, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Exactly. To my mind, the *entire* point of this patch is to remove the > need for people to try to dig information out of potentially-localized > message strings. It's not clear to me that we have to strain to provide > information that isn't in the currently-reported messages --- we are > only trying to make it easier for client-side code to extract the > information it's likely to need. It seems that we're in agreement, then. I'll prepare a version of the patch very similar to the one I previously posted, but with some caveats about how reliably the values can be used. I think that that should be fine. -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services
On 4 January 2013 17:10, Robert Haas <robertmhaas@gmail.com> wrote: > I don't really agree with this. To be honest, my biggest concern > about this patch is that it will make it take longer to report an > error. At least in the cases where these additional fields are > included, it will take CPU time to populate those fields, and maybe > there will be some overhead even in the cases where they aren't even > used (although I'd expect that to be too little to measure). Now, > maybe that doesn't matter in the case where the error is being > reported back to the client, because the overhead of shipping packets > across even a local socket likely dwarfs the overhead, but I think it > matters a lot where you are running a big exception-catching loop. > That is already painfully slow, and -1 from me on anything that makes > it significantly slower. I have a feeling this isn't the first time > I'm ranting about this topic in relation to this patch, so apologies > if this is old news. You already brought it up. I measured the additional overhead, and found it to be present, but inconsequential. That was with Pavel's version of the patch, that had many more fields then what I've proposed. Please take a look upthread. -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services
Hello 2013/1/4 Robert Haas <robertmhaas@gmail.com>: > On Fri, Dec 28, 2012 at 1:21 PM, Peter Geoghegan <peter@2ndquadrant.com> wrote: >> Now, as to the question of whether we need to make sure that >> everything is always fully qualified - I respectfully disagree with >> Stephen, and maintain my position set out in the last round of >> feedback [1], which is that we don't need to fully qualify everything, >> because *for the purposes of error handling*, which is what I think we >> should care about, these fields are sufficient: >> >> + column_name text, >> + table_name text, >> + constraint_name text, >> + schema_name text, >> >> If you use the same constraint name everywhere, you might get the same >> error message. The situations in which this scheme might fall down are >> too implausible for me to want to bloat up all those ereport sites any >> further (something that Stephen also complained about). >> >> I think that the major outstanding issues are concerning whether or >> not I have the API here right. I make explicit guarantees as to the >> availability of certain fields for certain errcodes (a small number of >> "Class 23 - Integrity Constraint Violation" codes). No one has really >> said anything about that, though I think it's important. >> >> I also continue to think that we shouldn't have "routine_name", >> "routine_schema" and "trigger_name" fields - I think it's wrong-headed >> to have an exception handler magically act on knowledge about where >> the exception came from that has been baked into the exception - is >> there any sort of precedent for this? Pavel disagrees here. Again, I >> defer to others. > > I don't really agree with this. To be honest, my biggest concern > about this patch is that it will make it take longer to report an > error. At least in the cases where these additional fields are > included, it will take CPU time to populate those fields, and maybe > there will be some overhead even in the cases where they aren't even > used (although I'd expect that to be too little to measure). Now, > maybe that doesn't matter in the case where the error is being > reported back to the client, because the overhead of shipping packets > across even a local socket likely dwarfs the overhead, but I think it > matters a lot where you are running a big exception-catching loop. > That is already painfully slow, and -1 from me on anything that makes > it significantly slower. I have a feeling this isn't the first time > I'm ranting about this topic in relation to this patch, so apologies > if this is old news. We did these tests independently - Peter and me with same result - in use cases developed for highlighting impact of this patch you can see some slowdown - if I remember well - less than 3% - and it raised exceptions really intensively - and It can be visible only from stored procedures environment - if somebody use it from client side, then impact is zero. > > But if we decide that there is no performance issue or that we don't > care about the hit there, then I think Stephen and Pavel are right to > want a large amount of very precise detail. What's the use case for > this feature? Presumably, it's for people for whom parsing the error > message is not a practical option, so either they textual error > message doesn't identify the target object sufficiently precisely, and > they want to make sure they know what it applies to; or else it's for > people who want any error that applies to a table to be handled the > same way (without worrying about exactly which error they have got). > Imagine, for example, someone writing a framework that will be used as > a basis for many different applications. It might want to do > something, like, I don't know, update the comment on a table every > time an error involving that table occurs. Clearly, precise > identification of the table is a must. In a particular application > development environment, it's reasonable to rely on users to name > things sensibly, but if you're shipping a tool that might be dropped > into any arbitrary environment, that's significantly more dangerous. > > Similarly, for a function-related error, you'd need something like > that looks like the output of pg_proc.oid::regprocedure, or individual > fields with the various components of that output. That sounds like > routine_name et. al. Probably we don't need all fields mentioned in ANSI SQL, because some from these fields has no sense in pg, but routine name and trigger name is really basic for some little bit more sophisticate work with exception on PL level. Regards Pavel > > I'm not happy about the idea of shipping OIDs back to the client. > OIDs are too user-visible as it is; we should try to make them less > not moreso. > > -- > Robert Haas > EnterpriseDB: http://www.enterprisedb.com > The Enterprise PostgreSQL Company
2013/1/4 Peter Geoghegan <peter@2ndquadrant.com>: > On 4 January 2013 18:07, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Exactly. To my mind, the *entire* point of this patch is to remove the >> need for people to try to dig information out of potentially-localized >> message strings. It's not clear to me that we have to strain to provide >> information that isn't in the currently-reported messages --- we are >> only trying to make it easier for client-side code to extract the >> information it's likely to need. > > It seems that we're in agreement, then. I'll prepare a version of the > patch very similar to the one I previously posted, but with some > caveats about how reliably the values can be used. I think that that > should be fine. is there agreement of routine_name and trigger_name fields? Regards Pavel > > -- > Peter Geoghegan http://www.2ndQuadrant.com/ > PostgreSQL Development, 24x7 Support, Training and Services
On 5 January 2013 16:56, Pavel Stehule <pavel.stehule@gmail.com> wrote: >> It seems that we're in agreement, then. I'll prepare a version of the >> patch very similar to the one I previously posted, but with some >> caveats about how reliably the values can be used. I think that that >> should be fine. > > is there agreement of routine_name and trigger_name fields? Well, Tom and I are both opposed to including those fields. Peter E seemed to support it in some way, but didn't respond to Tom's criticisms (which were just a restatement of my own). So, it seems to me that we're not going to do that, assuming nothing changes. -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services
2013/1/5 Peter Geoghegan <peter@2ndquadrant.com>: > On 5 January 2013 16:56, Pavel Stehule <pavel.stehule@gmail.com> wrote: >>> It seems that we're in agreement, then. I'll prepare a version of the >>> patch very similar to the one I previously posted, but with some >>> caveats about how reliably the values can be used. I think that that >>> should be fine. >> >> is there agreement of routine_name and trigger_name fields? > > Well, Tom and I are both opposed to including those fields. Peter E > seemed to support it in some way, but didn't respond to Tom's > criticisms (which were just a restatement of my own). So, it seems to > me that we're not going to do that, assuming nothing changes. if I understand well Robert Haas is for including these fields - so score is still 2:2 - but this is not a match :) I have no more new arguments for these fields - yes, there are no change Pavel > > -- > Peter Geoghegan http://www.2ndQuadrant.com/ > PostgreSQL Development, 24x7 Support, Training and Services
On 5 January 2013 18:06, Pavel Stehule <pavel.stehule@gmail.com> wrote: > I have no more new arguments for these fields - yes, there are no change Attached revision is similar to the eelog4.diff patch (i.e. it has only a few fields, most of which are present within the original error message). Note that "schema_name" remains, because it was so easy to keep it around since we pass the new infrastructure a Relation (which also serves to sanitize). There are a few differences: * All logging infrastructure has been removed. If the sole purpose of this patch is to facilitate client code, it doesn't make sense to redundantly log what is indicated by an error message. My feelings on log-scraping tools are well known (that they shouldn't need to exist, even if they currently do), so it won't surprise anyone to learn that I think it's totally wrong-headed to actively facilitate them. * Miscellaneous minor tweaks. I think it's probably fine that we don't go into detail about how ERRCODE_NOT_NULL_VIOLATION errors won't have a schema_name and table_name for domains - it's obvious that they don't, because a name cannot exist in principle. This is documented here, for example: case DOM_CONSTRAINT_NOTNULL: if (isnull) + /* + * We don't provide errtable here, because it is + * unavailable, though we require it anywhere where it is + * available in principle. + */ ereport(ERROR, (errcode(ERRCODE_NOT_NULL_VIOLATION), errmsg("domain %s does not allow null values", We also go to extra lengths to get a table_name for certain domain-related ereport sites. So, with the question of what fields to include and whether constraint name needs to be unambiguously resolvable addressed (I think), it appears that I've brought this one as far as I can. I'd still like to get input from a Perl hacker, but I think a committer needs to pick this up now. -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services
Attachment
Hello 2012/12/29 Stephen Frost <sfrost@snowman.net>: > * Pavel Stehule (pavel.stehule@gmail.com) wrote: >> it is a problem of this patch or not consistent constraints implementation ? > > Not sure, but I don't think it matters. You can blame the constraint > implementation, but that doesn't change my feelings about what we need > before we can accept a patch like this. Providing something which works > only part of the time and then doesn't work for very unclear reasons > isn't a good idea. Perhaps we need to fix the constraint implementation > and perhaps we need to fix the error information being returned, or most > likely we have to fix both, it doesn't change that we need to do > something more than just ignore this problem. so we have to solve this issue first. Please, can you do resume, what is and where is current constraint implementation raise strange/unexpected messages? one question when we will fix constraints, maybe we can use some infrastructure for enhanced error fields. What about partial commit now - just necessary infrastructure without modification of other code - I am thinking so there is agreement on new fields: column_name, table_name, schema_name, constraint_name and constraint_schema? Regards Pavel > > Thanks, > > Stephen
On 13 January 2013 06:13, Pavel Stehule <pavel.stehule@gmail.com> wrote: >> Not sure, but I don't think it matters. You can blame the constraint >> implementation, but that doesn't change my feelings about what we need >> before we can accept a patch like this. Providing something which works >> only part of the time and then doesn't work for very unclear reasons >> isn't a good idea. Perhaps we need to fix the constraint implementation >> and perhaps we need to fix the error information being returned, or most >> likely we have to fix both, it doesn't change that we need to do >> something more than just ignore this problem. > > so we have to solve this issue first. Please, can you do resume, what > is and where is current constraint implementation raise > strange/unexpected messages? I felt that this was quite unnecessary because of the limited scope of the patch, and because this raises thorny issues of both semantics and implementation. Tom agreed with this general view - after all, this patch exists for the express purpose of having a well-principled way of obtaining the various fields across lc_messages settings. So I don't see that we have to do anything about making a constraint_schema available. -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services
Peter Geoghegan <peter@2ndquadrant.com> writes: > I felt that this was quite unnecessary because of the limited scope of > the patch, and because this raises thorny issues of both semantics and > implementation. Tom agreed with this general view - after all, this > patch exists for the express purpose of having a well-principled way > of obtaining the various fields across lc_messages settings. So I > don't see that we have to do anything about making a constraint_schema > available. Or in other words, there are two steps here: first, create infrastructure to expose the fields that we already provide within the regular message text; then two, consider adding new fields. The first part of that is a good deal less controversial than the second, so let's go ahead and get that part committed. regards, tom lane
Peter Geoghegan <peter@2ndquadrant.com> writes: > [ eelog6.patch ] > So, with the question of what fields to include and whether constraint > name needs to be unambiguously resolvable addressed (I think), it > appears that I've brought this one as far as I can. I'd still like to > get input from a Perl hacker, but I think a committer needs to pick > this up now. I started to look this patch over. I think we can get to something committable from here, but I'm having a problem with the concept that we're going to "guarantee" anything about which additional fields might be available for any given SQLSTATE. This is already quite broken for the ERRCODE_NOT_NULL_VIOLATION case, and it's not hard to envision that there will be other inconsistencies in future, even without the issue that third-party code might use these SQLSTATEs without having gotten the memo about additional fields. If we were doing this from scratch we could perhaps fix that by using, eg, two different SQLSTATEs for the column-not-null and something-else-not-null cases. But it's probably too late to change the SQLSTATEs for existing error cases; applications might be depending on those. Anyway our choice of SQLSTATE is often constrained by the standard. I'm inclined to remove the "requirements" business altogether and just document that these fields may be supplied, or words to that effect. In practice, an application is going to know whether the particular error case it's concerned about has the auxiliary fields, I should think. > We also go to extra lengths to get a table_name for certain > domain-related ereport sites. A lot of that looks pretty broken to me, eg the changes in ExecEvalCoerceToDomain are just hokum. (Even if the expression is executing in a statement that has a result table, there's no very good reason to think that the error has anything to do with the result table.) It's possible we could restructure things so that coercions performed as part of an assignment to a specific table column are processed differently from other coercions, and have knowledge available about what the target table/column is. But it would be a pretty invasive change for limited benefit. BTW, one thing that struck me in a quick look-through is that the ERRCODE_FOREIGN_KEY_VIOLATION patches seem to inconsistently send either the PK or FK rel as the "errtable". Is this really per spec? I'd have sort of expected that the reported table ought to be the one that the constraint belongs to, namely the FK table. regards, tom lane
On 26 January 2013 22:36, Tom Lane <tgl@sss.pgh.pa.us> wrote: > I started to look this patch over. I think we can get to something > committable from here, but I'm having a problem with the concept that > we're going to "guarantee" anything about which additional fields might > be available for any given SQLSTATE. This is already quite broken for > the ERRCODE_NOT_NULL_VIOLATION case, and it's not hard to envision that > there will be other inconsistencies in future, even without the issue > that third-party code might use these SQLSTATEs without having gotten > the memo about additional fields. > I'm inclined to remove the "requirements" business altogether and just > document that these fields may be supplied, or words to that effect. > In practice, an application is going to know whether the particular > error case it's concerned about has the auxiliary fields, I should think. I think we may be talking at cross purposes here. Guarantee may have been too strong a word, or the wrong word entirely. All that I really want here is for there to be a coding standard instituted, so that in future client code will not be broken by a failure to include some field in a new ereport site that related to what is effectively the same error as an existing ereport site. I'm sure you'll agree that subtly breaking client code when refactoring Postgres is unacceptable. I thought that the best way to do that was at the errcode granularity, and am perfectly willing to pull back on the set of fields where this coding standard applies in respect of certain errcodes. I think we can afford to be very conservative about limiting the scope of that guarantee. >> We also go to extra lengths to get a table_name for certain >> domain-related ereport sites. > > A lot of that looks pretty broken to me, eg the changes in > ExecEvalCoerceToDomain are just hokum. (Even if the expression is > executing in a statement that has a result table, there's no very good > reason to think that the error has anything to do with the result > table.) I must admit that that particular change wasn't very well thought out. I was trying to appease Stephen, who seemed to think that having a table_name where it was in principle available was particularly important. I myself do not, and the current structure of the relevant code (and difficulty in providing a table_name consistently) in a sense reflects that. I'm inclined to agree that it is not worth pursuing further. > BTW, one thing that struck me in a quick look-through is that the > ERRCODE_FOREIGN_KEY_VIOLATION patches seem to inconsistently send > either the PK or FK rel as the "errtable". Is this really per spec? > I'd have sort of expected that the reported table ought to be the one > that the constraint belongs to, namely the FK table. Personally, on the face of it I'd expect the "inconsistency" to simply reflect the fact that the error related to the referencing table or referenced table. Pavel's original patch followed the same convention (though it also had a constraint_table field). I'm having a hard time figuring out the standards intent here, and I'm not sure that we should even care, because that applies on to GET DIAGNOSTICS, which isn't really the same thing as what we have here. I defer to you, though - it's not as if I feel too strongly about it. As I've said, the vast majority of the value likely to be delivered by this patch comes from the constraint_name field. That's the really compelling one, to my mind. -- Regards, Peter Geoghegan
Hello From my perspective - the core of this patch has two parts a) necessary infrastructure b) enhancing current ereport calls @a point is important for me and plpgsql coders, because it allows using these fields in custom PL/pgSQL exception - and errors processing in this language can be more structurable and comfortable. But now we are too late - and this part can be commited probably in 9.4 - although I have this patch prepared. @b is important for application users - but there was agreement so we will coverage exceptions step by step - so in this context we can drop support for domains now. 2013/1/26 Tom Lane <tgl@sss.pgh.pa.us>: > Peter Geoghegan <peter@2ndquadrant.com> writes: >> [ eelog6.patch ] >> So, with the question of what fields to include and whether constraint >> name needs to be unambiguously resolvable addressed (I think), it >> appears that I've brought this one as far as I can. I'd still like to >> get input from a Perl hacker, but I think a committer needs to pick >> this up now. > > I started to look this patch over. I think we can get to something > committable from here, but I'm having a problem with the concept that > we're going to "guarantee" anything about which additional fields might > be available for any given SQLSTATE. This is already quite broken for > the ERRCODE_NOT_NULL_VIOLATION case, and it's not hard to envision that > there will be other inconsistencies in future, even without the issue > that third-party code might use these SQLSTATEs without having gotten > the memo about additional fields. > > If we were doing this from scratch we could perhaps fix that by using, > eg, two different SQLSTATEs for the column-not-null and > something-else-not-null cases. But it's probably too late to change the > SQLSTATEs for existing error cases; applications might be depending on > those. Anyway our choice of SQLSTATE is often constrained by the > standard. > > I'm inclined to remove the "requirements" business altogether and just > document that these fields may be supplied, or words to that effect. > In practice, an application is going to know whether the particular > error case it's concerned about has the auxiliary fields, I should think. > >> We also go to extra lengths to get a table_name for certain >> domain-related ereport sites. > > A lot of that looks pretty broken to me, eg the changes in > ExecEvalCoerceToDomain are just hokum. (Even if the expression is > executing in a statement that has a result table, there's no very good > reason to think that the error has anything to do with the result > table.) It's possible we could restructure things so that coercions > performed as part of an assignment to a specific table column are > processed differently from other coercions, and have knowledge available > about what the target table/column is. But it would be a pretty > invasive change for limited benefit. > > BTW, one thing that struck me in a quick look-through is that the > ERRCODE_FOREIGN_KEY_VIOLATION patches seem to inconsistently send > either the PK or FK rel as the "errtable". Is this really per spec? > I'd have sort of expected that the reported table ought to be the one > that the constraint belongs to, namely the FK table. > Today I'll to spec Pavel > regards, tom lane
Hello > > Personally, on the face of it I'd expect the "inconsistency" to simply > reflect the fact that the error related to the referencing table or > referenced table. Pavel's original patch followed the same convention > (though it also had a constraint_table field). I'm having a hard time > figuring out the standards intent here, and I'm not sure that we > should even care, because that applies on to GET DIAGNOSTICS, which > isn't really the same thing as what we have here. I defer to you, > though - it's not as if I feel too strongly about it. > These fields will be reused in GET DIAGNOSTICS statement in PL/pgSQL. It is was primary goal. Regards Pavel
Peter Geoghegan <peter.geoghegan86@gmail.com> writes: > On 26 January 2013 22:36, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> I'm inclined to remove the "requirements" business altogether and just >> document that these fields may be supplied, or words to that effect. > I think we may be talking at cross purposes here. Guarantee may have > been too strong a word, or the wrong word entirely. All that I really > want here is for there to be a coding standard instituted, so that in > future client code will not be broken by a failure to include some > field in a new ereport site that related to what is effectively the > same error as an existing ereport site. I'm sure you'll agree that > subtly breaking client code when refactoring Postgres is unacceptable. [ shrug... ] If you have a way of making a guarantee that future versions introduce no new bugs, patent it and you'll soon have all the money in the world. It's conceivable that we could adapt some static checker to look for ereport calls that mention particular ERRCODEs and lack particular helper functions, but even that wouldn't be a cast-iron guarantee because of the possibility of call sites using non-constant errcode values. It'd probably be good enough in practice though. However, this patch is not that, and mere documentation isn't going to buy a thing here IMO. Especially not user-facing documentation, as opposed to something that might be in a developers' face when he's copying-and-pasting code somewhere. This patch didn't even touch the one place in the documentation that might be somewhat useful from a developer's standpoint, which is 49.2. Reporting Errors Within the Server. >> A lot of that looks pretty broken to me, eg the changes in >> ExecEvalCoerceToDomain are just hokum. (Even if the expression is >> executing in a statement that has a result table, there's no very good >> reason to think that the error has anything to do with the result >> table.) > I must admit that that particular change wasn't very well thought out. > I was trying to appease Stephen, who seemed to think that having a > table_name where it was in principle available was particularly > important. Well, if we can get an accurate and relevant value then I'm all for adding it. But field values that are misleading or even plain wrong in corner cases are not an improvement. At some point we might want to undertake a round of refactoring that makes this type of information available; and once we do, we'd probably want to expose it in the regular message texts not just the auxiliary fields. I think that sort of work is out-of-scope for this patch though. IMO what we should be doing here is getting the infrastructure in place, and then decorating some basic set of messages with aux info in places where not a lot of new code is needed to make that happen. Extending the decoration beyond that is material for future work. >> BTW, one thing that struck me in a quick look-through is that the >> ERRCODE_FOREIGN_KEY_VIOLATION patches seem to inconsistently send >> either the PK or FK rel as the "errtable". Is this really per spec? >> I'd have sort of expected that the reported table ought to be the one >> that the constraint belongs to, namely the FK table. > Personally, on the face of it I'd expect the "inconsistency" to simply > reflect the fact that the error related to the referencing table or > referenced table. Pavel's original patch followed the same convention > (though it also had a constraint_table field). I'm having a hard time > figuring out the standards intent here, and I'm not sure that we > should even care, because that applies on to GET DIAGNOSTICS, which > isn't really the same thing as what we have here. I defer to you, > though - it's not as if I feel too strongly about it. A large part of the argument for doing this patch at all is to satisfy the standard's requirements for information reported to a client. (I believe that GET DIAGNOSTICS is in the end a client-side requirement, ie in principle ecpg or similar library should be able to implement it based on what the server reports.) So to the extent that the spec defines what should be in the fields, we need to follow that. regards, tom lane
On 27 January 2013 18:57, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Peter Geoghegan <peter.geoghegan86@gmail.com> writes: >> I think we may be talking at cross purposes here. Guarantee may have >> been too strong a word, or the wrong word entirely. All that I really >> want here is for there to be a coding standard instituted, so that in >> future client code will not be broken by a failure to include some >> field in a new ereport site that related to what is effectively the >> same error as an existing ereport site. I'm sure you'll agree that >> subtly breaking client code when refactoring Postgres is unacceptable. > > [ shrug... ] If you have a way of making a guarantee that future > versions introduce no new bugs, patent it and you'll soon have all the > money in the world. Is that kind of sarcasm really necessary? Certain sets of ereport call sites within Postgres relate to either very similar errors (i.e. ereports that have the same errcode) or arguably identical errors (e.g. the various ERRCODE_CHECK_VIOLATION sites within nbtinsert.c, that have identical error texts). It would seem quite unfortunate to me if client code was to break based only on an internal implementation detail that differed between Postgres versions, or based on the current phase of the moon. This is the kind of problem that I'd hoped to prevent by documenting a set of required fields for a small number of errcodes going forward. Now, you could take the view that all of this is only for the purposes of error handling, and it isn't terribly critical that things work very reliably. That isn't my view, though. > It's conceivable that we could adapt some static checker to look for > ereport calls that mention particular ERRCODEs and lack particular > helper functions, but even that wouldn't be a cast-iron guarantee > because of the possibility of call sites using non-constant errcode > values. It'd probably be good enough in practice though. I thought about ways of doing that, but it didn't seem worth pursuing right now. > However, this patch is not that, and mere documentation isn't going to buy a > thing here IMO. Especially not user-facing documentation, as opposed > to something that might be in a developers' face when he's > copying-and-pasting code somewhere. This patch didn't even touch the > one place in the documentation that might be somewhat useful from a > developer's standpoint, which is 49.2. Reporting Errors Within the > Server. Well, an entry should probably be added to 49.2 too, then. Why should documentation (of whatever kind deemed appropriate) not buy a thing? Don't we want to prevent the kind of problems that I describe above? How are people supposed to know about something that isn't written down anywhere? Surely documentation is better than nothing? > At some point we might want to undertake a round of refactoring that > makes this type of information available; and once we do, we'd probably > want to expose it in the regular message texts not just the auxiliary > fields. I think that sort of work is out-of-scope for this patch > though. IMO what we should be doing here is getting the infrastructure > in place, and then decorating some basic set of messages with aux info > in places where not a lot of new code is needed to make that happen. > Extending the decoration beyond that is material for future work. Fair enough. -- Regards, Peter Geoghegan
Peter Geoghegan <peter.geoghegan86@gmail.com> writes: > On 26 January 2013 22:36, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> BTW, one thing that struck me in a quick look-through is that the >> ERRCODE_FOREIGN_KEY_VIOLATION patches seem to inconsistently send >> either the PK or FK rel as the "errtable". Is this really per spec? >> I'd have sort of expected that the reported table ought to be the one >> that the constraint belongs to, namely the FK table. > Personally, on the face of it I'd expect the "inconsistency" to simply > reflect the fact that the error related to the referencing table or > referenced table. I looked in the spec a bit, and what I found seems to support my recollection about this. In SQL99, it's 19.1 <get diagnostics statement> that defines the usage of these fields, and I see f) If the value of RETURNED_SQLSTATE corresponds to integrity constraint violation, transaction rollback- integrity constraint violation, or a triggered data change violation that was caused bya violation of a referential constraint, then: i) The values of CONSTRAINT_CATALOG and CONSTRAINT_SCHEMA are the <catalog name> and the <unqualifiedschema name> of the <schema name> of the schema containing the constraint or assertion.The value of CONSTRAINT_NAME is the <qualified identifier> of the constraint or assertion. ii) Case: 1) If the violated integrity constraint is a table constraint, then the values of CATALOG_NAME,SCHEMA_ NAME, and TABLE_NAME are the <catalog name>, the <unqualified schemaname> of the <schema name>, and the <qualified identifier> or <local table name>, respectively, of the table in which the table constraint is contained. The notion of a constraint being "contained" in a table is a bit weird; I guess they mean contained in the table's schema description. Anyway it seems fairly clear to me that it's supposed to be the table that the constraint belongs to, and that has to be the FK table not the PK table. regards, tom lane
Peter Geoghegan <peter.geoghegan86@gmail.com> writes: > On 27 January 2013 18:57, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> However, this patch is not that, and mere documentation isn't going to buy a >> thing here IMO. Especially not user-facing documentation, as opposed >> to something that might be in a developers' face when he's >> copying-and-pasting code somewhere. This patch didn't even touch the >> one place in the documentation that might be somewhat useful from a >> developer's standpoint, which is 49.2. Reporting Errors Within the >> Server. > Well, an entry should probably be added to 49.2 too, then. Why should > documentation (of whatever kind deemed appropriate) not buy a thing? > Don't we want to prevent the kind of problems that I describe above? > How are people supposed to know about something that isn't written > down anywhere? Surely documentation is better than nothing? I don't think I said or implied that we should not have any documentation about this. What I'm trying to say is that I find this approach to documentation unhelpful, both to users and developers. It's based on the notion that there's a rigorous connection between particular SQLSTATEs and the auxiliary fields that should be provided; an assumption already proven false within the very tiny set of SQLSTATEs dealt with in this first patch. That's a connection that we probably could have made valid if we'd been assigning SQLSTATEs with that idea in mind from the beginning, but we didn't and now it's too late. Future development will almost surely expose even more inconsistencies, not be able to get rid of them. I think we'd be better off providing docs that say "this is what this auxiliary field means, if it's provided" and then encourage developers to provide it wherever that meaning applies. We can at the same time note something like "As of Postgres 9.3, only errors in Class 23 provide this information", so that users (a) don't have unrealistic expectations about what's provided, and (b) don't get the idea that the current set of auxiliary fields is fixed. But a SQLSTATE-by-SQLSTATE listing doesn't seem to me to be very helpful. regards, tom lane
A couple more things about this patch ... I went back through the thread and reviewed all the angst about which fields to provide, especially whether we need CONSTRAINT_SCHEMA. I agree with the conclusion that we don't. It's in the spec because the spec supposes that CONSTRAINT_SCHEMA+CONSTRAINT_NAME is a unique identification for a constraint --- but it is not in Postgres, for historical reasons that we aren't going to revisit in this patch. Rather what we've got is that constraints are uniquely named among those associated with a table, or with a domain. So the correct unique key for a table constraint is table schema + table name + constraint name, whereas for a domain constraint it's domain schema + domain name + constraint name. The current patch provides sufficient information to uniquely identify a table constraint, but not so much domain constraints. Should we fix that? I think it'd be legitimate to re-use SCHEMA_NAME for domain schema, but we'd need a new nonstandard field DOMAIN_NAME (or maybe better DATATYPE_NAME) if we want to fix it. Do we want to add that now? If we do fix this, either now or later, it's going to require some small support function very much like errtable() to perform the catalog lookups for a datatype and set the ErrorData fields. The thought of dropping such a thing into relerror.c exposes the fact that that "module" isn't actually very modular. We could choose a different name for the file but it'd still be pretty questionable as to what its boundaries are. So I'm a bit inclined to drop relerror.c as such, and go back to the early design that put errtable() and friends in relcache.c. The errconstraint() function, not being specific to either rels or datatypes, perhaps really does belong in elog.c. Thoughts? regards, tom lane
2013/1/28 Tom Lane <tgl@sss.pgh.pa.us>: > A couple more things about this patch ... > > I went back through the thread and reviewed all the angst about which > fields to provide, especially whether we need CONSTRAINT_SCHEMA. > I agree with the conclusion that we don't. It's in the spec because > the spec supposes that CONSTRAINT_SCHEMA+CONSTRAINT_NAME is a unique > identification for a constraint --- but it is not in Postgres, for > historical reasons that we aren't going to revisit in this patch. > Rather what we've got is that constraints are uniquely named among > those associated with a table, or with a domain. So the correct > unique key for a table constraint is table schema + table name + > constraint name, whereas for a domain constraint it's domain schema + > domain name + constraint name. The current patch provides sufficient > information to uniquely identify a table constraint, but not so much > domain constraints. Should we fix that? I think it'd be legitimate > to re-use SCHEMA_NAME for domain schema, but we'd need a new nonstandard > field DOMAIN_NAME (or maybe better DATATYPE_NAME) if we want to fix it. > Do we want to add that now? should be for me. one question - what do you thing about marking proprietary field with some prefix - like PG_DOMAIN_NAME ? > > If we do fix this, either now or later, it's going to require some small > support function very much like errtable() to perform the catalog > lookups for a datatype and set the ErrorData fields. The thought of > dropping such a thing into relerror.c exposes the fact that that > "module" isn't actually very modular. We could choose a different name > for the file but it'd still be pretty questionable as to what its > boundaries are. So I'm a bit inclined to drop relerror.c as such, and > go back to the early design that put errtable() and friends in > relcache.c. The errconstraint() function, not being specific to either > rels or datatypes, perhaps really does belong in elog.c. > +1 Pavel > Thoughts? > > regards, tom lane
Pavel Stehule <pavel.stehule@gmail.com> writes: > 2013/1/28 Tom Lane <tgl@sss.pgh.pa.us>: >> ... The current patch provides sufficient >> information to uniquely identify a table constraint, but not so much >> domain constraints. Should we fix that? I think it'd be legitimate >> to re-use SCHEMA_NAME for domain schema, but we'd need a new nonstandard >> field DOMAIN_NAME (or maybe better DATATYPE_NAME) if we want to fix it. >> Do we want to add that now? > should be for me. > one question - what do you thing about marking proprietary field with > some prefix - like PG_DOMAIN_NAME ? Don't particularly see the point of that. It seems quite unlikely that the ISO committee would invent a field with the same name and a conflicting definition. Anyway, these names aren't going to be exposed in any non "proprietary" interfaces AFAICS. Surely we don't, for instance, need to call the postgres_ext.h macro PG_DIAG_PG_DOMAIN_NAME. regards, tom lane
2013/1/28 Tom Lane <tgl@sss.pgh.pa.us>: > Pavel Stehule <pavel.stehule@gmail.com> writes: >> 2013/1/28 Tom Lane <tgl@sss.pgh.pa.us>: >>> ... The current patch provides sufficient >>> information to uniquely identify a table constraint, but not so much >>> domain constraints. Should we fix that? I think it'd be legitimate >>> to re-use SCHEMA_NAME for domain schema, but we'd need a new nonstandard >>> field DOMAIN_NAME (or maybe better DATATYPE_NAME) if we want to fix it. >>> Do we want to add that now? > >> should be for me. > >> one question - what do you thing about marking proprietary field with >> some prefix - like PG_DOMAIN_NAME ? > > Don't particularly see the point of that. It seems quite unlikely that > the ISO committee would invent a field with the same name and a > conflicting definition. Anyway, these names aren't going to be exposed > in any non "proprietary" interfaces AFAICS. Surely we don't, for > instance, need to call the postgres_ext.h macro PG_DIAG_PG_DOMAIN_NAME. ok Pavel > > regards, tom lane
On 1/5/13 12:48 PM, Peter Geoghegan wrote: >> is there agreement of routine_name and trigger_name fields? > Well, Tom and I are both opposed to including those fields. Peter E > seemed to support it in some way, but didn't respond to Tom's > criticisms (which were just a restatement of my own). So, it seems to > me that we're not going to do that, assuming nothing changes. Another point, in case someone wants to revisit this in the future, is that these fields were applied in a way that is contrary to the SQL standard, I think. The presented patch interpreted ROUTINE_NAME as: the error happened while executing this function. But according to the standard, the field is only set when the error was directly related to the function itself, for example when calling an INSERT statement in a non-volatile function.This is consistent with how, for example, TABLE_NAMEis set when the error is about the table, not just happened while reading the table.
On 28 January 2013 21:33, Peter Eisentraut <peter_e@gmx.net> wrote: > Another point, in case someone wants to revisit this in the future, is > that these fields were applied in a way that is contrary to the SQL > standard, I think. > > The presented patch interpreted ROUTINE_NAME as: the error happened > while executing this function. But according to the standard, the field > is only set when the error was directly related to the function itself, > for example when calling an INSERT statement in a non-volatile function. Right. It seems to me that ROUTINE_NAME is vastly less compelling than the fields that are likely to be present in the committed patch. GET DIAGNOSTICS, as implemented by DB2, allows clients /to poll/ for a large number of fields. I'm not really interested in that myelf, but if we were to add something in the same spirit, I think that extending errdata to support this would not be a sensible approach. Perhaps I'm mistaken, but I can't imagine that it would be terribly useful to anyone (including Pavel) to have a GET DIAGNOSTICS style ROUTINE_NAME. -- Regards, Peter Geoghegan
I wrote: > Rather what we've got is that constraints are uniquely named among > those associated with a table, or with a domain. So the correct > unique key for a table constraint is table schema + table name + > constraint name, whereas for a domain constraint it's domain schema + > domain name + constraint name. The current patch provides sufficient > information to uniquely identify a table constraint, but not so much > domain constraints. Should we fix that? I think it'd be legitimate > to re-use SCHEMA_NAME for domain schema, but we'd need a new nonstandard > field DOMAIN_NAME (or maybe better DATATYPE_NAME) if we want to fix it. I have hacked up the code (but not yet the documentation) to support this, but I found out that there's at least one place where this definition of the field semantics is a bit awkward. The issue is that this definition presupposes that we want to complain about a table or a domain, never both, because we're overloading both the SCHEMA_NAME and CONSTRAINT_NAME fields for both purposes. This is annoying in validateDomainConstraint(), where we know the domain constraint that we're complaining about and also the table/column containing the bad value. We can't fill in both TABLE_NAME and DATATYPE_NAME because they both want to set SCHEMA_NAME, and perhaps not to the same value. Since the error report is about a domain constraint, I don't have a big problem deciding that the domain has to win this tug-of-war. And it could easily be argued that if we had separate fields and filled in both, an application could get confused about whether we meant a table constraint or a domain constraint. (As submitted, the patch would definitely have made it look like we were complaining about a table constraint.) But it's still annoying that we can't represent all the info that is in the human-readable message. I'm not sure we should allow corner cases like this to drive the design; but if anyone has an idea about a cleaner way, let's hear it. regards, tom lane
On 1/28/13 11:08 PM, Tom Lane wrote: > The issue is that > this definition presupposes that we want to complain about a table or > a domain, never both, because we're overloading both the SCHEMA_NAME > and CONSTRAINT_NAME fields for both purposes. This is annoying in > validateDomainConstraint(), where we know the domain constraint that > we're complaining about and also the table/column containing the bad > value. We can't fill in both TABLE_NAME and DATATYPE_NAME because > they both want to set SCHEMA_NAME, and perhaps not to the same value. I think any error should only complain about one object, in this case the domain. The table, in this case, is more like a context stack item.
2013/1/27 Tom Lane <tgl@sss.pgh.pa.us>: > Peter Geoghegan <peter.geoghegan86@gmail.com> writes: >> On 26 January 2013 22:36, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> BTW, one thing that struck me in a quick look-through is that the >>> ERRCODE_FOREIGN_KEY_VIOLATION patches seem to inconsistently send >>> either the PK or FK rel as the "errtable". Is this really per spec? >>> I'd have sort of expected that the reported table ought to be the one >>> that the constraint belongs to, namely the FK table. > >> Personally, on the face of it I'd expect the "inconsistency" to simply >> reflect the fact that the error related to the referencing table or >> referenced table. > > I looked in the spec a bit, and what I found seems to support my > recollection about this. In SQL99, it's 19.1 <get diagnostics > statement> that defines the usage of these fields, and I see > > f) If the value of RETURNED_SQLSTATE corresponds to integrity > constraint violation, transaction rollback - integrity > constraint violation, or a triggered data change violation > that was caused by a violation of a referential constraint, > then: > > i) The values of CONSTRAINT_CATALOG and CONSTRAINT_SCHEMA are > the <catalog name> and the <unqualified schema name> of the > <schema name> of the schema containing the constraint or > assertion. The value of CONSTRAINT_NAME is the <qualified > identifier> of the constraint or assertion. > > ii) Case: > > 1) If the violated integrity constraint is a table > constraint, then the values of CATALOG_NAME, SCHEMA_ > NAME, and TABLE_NAME are the <catalog name>, the > <unqualified schema name> of the <schema name>, and > the <qualified identifier> or <local table name>, > respectively, of the table in which the table constraint > is contained. > > The notion of a constraint being "contained" in a table is a bit weird; > I guess they mean contained in the table's schema description. Anyway > it seems fairly clear to me that it's supposed to be the table that the > constraint belongs to, and that has to be the FK table not the PK table. +1 > > regards, tom lane
2013/1/28 Peter Geoghegan <peter.geoghegan86@gmail.com>: > On 28 January 2013 21:33, Peter Eisentraut <peter_e@gmx.net> wrote: >> Another point, in case someone wants to revisit this in the future, is >> that these fields were applied in a way that is contrary to the SQL >> standard, I think. >> >> The presented patch interpreted ROUTINE_NAME as: the error happened >> while executing this function. But according to the standard, the field >> is only set when the error was directly related to the function itself, >> for example when calling an INSERT statement in a non-volatile function. > > Right. It seems to me that ROUTINE_NAME is vastly less compelling than > the fields that are likely to be present in the committed patch. GET > DIAGNOSTICS, as implemented by DB2, allows clients /to poll/ for a > large number of fields. I'm not really interested in that myelf, but > if we were to add something in the same spirit, I think that extending > errdata to support this would not be a sensible approach. > > Perhaps I'm mistaken, but I can't imagine that it would be terribly > useful to anyone (including Pavel) to have a GET DIAGNOSTICS style > ROUTINE_NAME. I hoped so I can use it inside exception handler Regards Pavel > > -- > Regards, > Peter Geoghegan
On 29 January 2013 17:05, Pavel Stehule <pavel.stehule@gmail.com> wrote: >> Perhaps I'm mistaken, but I can't imagine that it would be terribly >> useful to anyone (including Pavel) to have a GET DIAGNOSTICS style >> ROUTINE_NAME. > > I hoped so I can use it inside exception handler Right, but is that really any use to you if it becomes available for a small subset of errors, specifically, errors that directly relate to the function? You're not going to be able to use it to trace the function where an arbitrary error occurred, if we do something consistent with GET DIAGNOSTICS as described by the SQL standard, it seems. I think that what the SQL standard intends here is actually consistent with what we're going to do with CONSTRAINT_NAME and so on. I just happen to think it's much less interesting, but am not opposed to it in principle (though I may oppose it in practice, if writing the feature means bloating up errdata). -- Regards, Peter Geoghegan
2013/1/29 Peter Geoghegan <peter.geoghegan86@gmail.com>: > On 29 January 2013 17:05, Pavel Stehule <pavel.stehule@gmail.com> wrote: >>> Perhaps I'm mistaken, but I can't imagine that it would be terribly >>> useful to anyone (including Pavel) to have a GET DIAGNOSTICS style >>> ROUTINE_NAME. >> >> I hoped so I can use it inside exception handler > > Right, but is that really any use to you if it becomes available for a > small subset of errors, specifically, errors that directly relate to > the function? You're not going to be able to use it to trace the > function where an arbitrary error occurred, if we do something > consistent with GET DIAGNOSTICS as described by the SQL standard, it > seems. in this meaning is not too useful as I expected. > > I think that what the SQL standard intends here is actually consistent > with what we're going to do with CONSTRAINT_NAME and so on. I just > happen to think it's much less interesting, but am not opposed to it > in principle (though I may oppose it in practice, if writing the > feature means bloating up errdata). I checked performance and Robert too, and there is not significant slowdown. if I do DROP FUNCTION inner(int); DROP FUNCTION middle(int); DROP FUNCTION outer(); CREATE OR REPLACE FUNCTION inner(int) RETURNS int AS $$ BEGIN RETURN 10/$1; END; $$ LANGUAGE plpgsql; CREATE OR REPLACE FUNCTION middle(int) RETURNS int AS $$ BEGIN RETURN inner($1); END; $$ LANGUAGE plpgsql; CREATE OR REPLACE FUNCTION outer() RETURNS int AS $$ DECLARE text_var1 text; BEGIN RETURN middle(0); EXCEPTION WHEN OTHERS THEN GET STACKED DIAGNOSTICS text_var1 = PG_EXCEPTION_CONTEXT; RAISE NOTICE '>>%<<', text_var1; RETURN-1; END; $$ LANGUAGE plpgsql; SELECT outer(); then output is psql:test.psql:34: NOTICE: >>PL/pgSQL function "inner"(integer) line 3 at RETURN PL/pgSQL function middle(integer) line 3 at RETURN PL/pgSQL function "outer"() line 5 at RETURN<< I have not any possibility to take information about source of exception without parsing context - and a string with context information can be changed, so it is not immutable and not easy accessible. Why I need this info - sometimes when I can log some info about handled exception I don't would log a complete context due size and due readability. Another idea - some adjusted parser of context message can live in GET STACKED DIAGNOSTICS implementation - so there can be some custom field (just for GET STACKED DIAG.) that returns expected value. But this value should be taken from context string with parsing - that is not nice - but possible - I did similar game in orafce. Regards Pavel > > -- > Regards, > Peter Geoghegan
I wrote: > Rather what we've got is that constraints are uniquely named among > those associated with a table, or with a domain. So the correct > unique key for a table constraint is table schema + table name + > constraint name, whereas for a domain constraint it's domain schema + > domain name + constraint name. The current patch provides sufficient > information to uniquely identify a table constraint, but not so much > domain constraints. Should we fix that? I think it'd be legitimate > to re-use SCHEMA_NAME for domain schema, but we'd need a new nonstandard > field DOMAIN_NAME (or maybe better DATATYPE_NAME) if we want to fix it. Here's an updated patch (code only, sans documentation) that fixes that and adds some other refactoring that I thought made for improvements. I think this is ready to commit except for the documentation. I eventually concluded that the two ALTER DOMAIN call sites in typecmds.c should report the table/column name (with errtablecol), even though in principle the auxiliary info ought to be about the domain. The reason is that the name of the domain is nearly useless here --- you probably know it already, if you're issuing an ALTER for it. In fact, we don't even bother to provide the name of the domain at all in either human-readable error message. On the other hand, the location of the offending value could be useful info. So I think usefulness should trump principle there. regards, tom lane
Attachment
I wrote: > Here's an updated patch (code only, sans documentation) that fixes that > and adds some other refactoring that I thought made for improvements. > I think this is ready to commit except for the documentation. Pushed with documentation. regards, tom lane