Thread: Second thoughts on CheckIndexCompatible() vs. operator families
In 367bc426a1c22b9f6badb06cd41fc438fd034639, I introduced a CheckIndexCompatible() that approves btree and hash indexes having changed to a different operator class within the same operator family. To make that valid, I also tightened the operator family contracts for those access methods to address casts. However, I've found two retained formal hazards. They're boring and perhaps unlikely to harm real users, but I'd rather nail them down just in case. First, opclasses like array_ops have polymorphic opcintype. Members of such operator classes could, in general, change behavior arbitrarily in response to get_fn_expr_argtype(). No core polymorphic operator family member does this. Nor could they, for no core index access method sets fn_expr. In the absence of responses to my previous inquiry[1] on the topic, I'm taking the position that the lack of fn_expr in these calls is an isolated implementation detail that could change at any time. Therefore, we should only preserve an index of polymorphic operator class when the old and new opcintype match exactly. This patch's test suite addition illustrates one ALTER TABLE ALTER TYPE affected by this new restriction: a conversion from an array type to a domain over that array type will now require an index rebuild. Second, as a thought experiment, consider a database with these three types: 1. int4, the core type. 2. int4neg, stores to disk as the negation of its logical value. Includes a complete set of operators in the integer_ops btree family, but defines no casts to other integer_ops-represented types. 3. int4other, stores to disk like int4. No operators. Has a implicit binary coercion cast to int4 and an explicit binary coercion cast to int4neg. Suppose a table in this database has a column of type int4other with an index of default operator class. By virtue of the implicit binary coercion and lack of other candidates, the index will use int4_ops. Now, change the type of that column from int4other to int4neg. The operator class changes to int4neg_ops, still within the integer_ops family, and we do not rebuild the index. However, the logical sign of each value just flipped, making the index invalid. Where did CheckIndexCompatible() miss? An operator family only promises cast-compatibility when there exists an implicit or binary coercion cast between the types in question. There's no int4->int4neg cast here, not even a multiple-step pathway. CheckIndexCompatible() assumes that our ability to perform a no-rewrite ALTER TABLE ALTER TYPE implies the existence of such a cast. It does imply that for the before-and-after _table column types_, but it's the before-and-after _opcintype_ that matter here. I think we could close this hazard by having CheckIndexCompatible() test for an actual implicit or binary coercion cast between the opcintype of the old and new operator classes. However, I'm not confident to say that no similar problem would remain. Therefore, I propose ceasing to optimize intra-family index operator transitions and simply requiring exact matches of operator classes and exclusion operators. This does not remove any actual optimization for changes among core types, and it will be simpler to validate. (I designed the current rules under the misapprehension that varchar_ops was the default operator class for varchar columns. However, varchar_ops is a copy of text_ops apart from the name and being non-default. We ship no operator with a varchar operand, relying instead on binary coercion to text.) This patch does not, however, remove the new terms from the operator family contracts. While core PostgreSQL will no longer depend on them, they may again prove handy for future optimizations like this. I remain of the opinion that they're already widely (perhaps even universally) obeyed. This patch conflicts trivially with my patch "Avoid FK validations for no-rewrite ALTER TABLE ALTER TYPE" in that they both add test cases to the same location in alter_table.sql. They're otherwise independent, albeit reflecting parallel principles. Thanks, nm [1] http://archives.postgresql.org/message-id/20111229211711.GA8085@tornado.leadboat.com
Attachment
New version that repairs a defective test case.
Attachment
On Sun, Jan 22, 2012 at 12:23 AM, Noah Misch <noah@leadboat.com> wrote: > New version that repairs a defective test case. Committed. I don't find this to be particularly good style: + for (i = 0; i < old_natts && ret; i++) + ret = (!IsPolymorphicType(get_opclass_input_type(classObjectId[i + irel->rd_att->attrs[i]->atttypid == typeObjectId[i]); ...but I am not sure whether we have any formal policy against it, so I just committed it as-is for now. I would have surrounded the loop with an if (ret) block and written the body of the loop as if (condition) { ret = false; break; }. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Excerpts from Robert Haas's message of mié ene 25 17:32:49 -0300 2012: > On Sun, Jan 22, 2012 at 12:23 AM, Noah Misch <noah@leadboat.com> wrote: > > New version that repairs a defective test case. > > Committed. I don't find this to be particularly good style: > > + for (i = 0; i < old_natts && ret; i++) > + ret = (!IsPolymorphicType(get_opclass_input_type(classObjectId[i > + irel->rd_att->attrs[i]->atttypid == typeObjectId[i]); > > ...but I am not sure whether we have any formal policy against it, so > I just committed it as-is for now. I would have surrounded the loop > with an if (ret) block and written the body of the loop as if > (condition) { ret = false; break; }. I find that code way too clever. -- Álvaro Herrera <alvherre@commandprompt.com> The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On Wed, Jan 25, 2012 at 3:52 PM, Alvaro Herrera <alvherre@commandprompt.com> wrote: > > Excerpts from Robert Haas's message of mié ene 25 17:32:49 -0300 2012: >> On Sun, Jan 22, 2012 at 12:23 AM, Noah Misch <noah@leadboat.com> wrote: >> > New version that repairs a defective test case. >> >> Committed. I don't find this to be particularly good style: >> >> + for (i = 0; i < old_natts && ret; i++) >> + ret = (!IsPolymorphicType(get_opclass_input_type(classObjectId[i >> + irel->rd_att->attrs[i]->atttypid == typeObjectId[i]); >> >> ...but I am not sure whether we have any formal policy against it, so >> I just committed it as-is for now. I would have surrounded the loop >> with an if (ret) block and written the body of the loop as if >> (condition) { ret = false; break; }. > > I find that code way too clever. The way he wrote it, or the way I proposed to write it? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Excerpts from Robert Haas's message of mié ene 25 19:05:44 -0300 2012: > > On Wed, Jan 25, 2012 at 3:52 PM, Alvaro Herrera > <alvherre@commandprompt.com> wrote: > > > > Excerpts from Robert Haas's message of mié ene 25 17:32:49 -0300 2012: > >> On Sun, Jan 22, 2012 at 12:23 AM, Noah Misch <noah@leadboat.com> wrote: > >> > New version that repairs a defective test case. > >> > >> Committed. I don't find this to be particularly good style: > >> > >> + for (i = 0; i < old_natts && ret; i++) > >> + ret = (!IsPolymorphicType(get_opclass_input_type(classObjectId[i > >> + irel->rd_att->attrs[i]->atttypid == typeObjectId[i]); > >> > >> ...but I am not sure whether we have any formal policy against it, so > >> I just committed it as-is for now. I would have surrounded the loop > >> with an if (ret) block and written the body of the loop as if > >> (condition) { ret = false; break; }. > > > > I find that code way too clever. > > The way he wrote it, or the way I proposed to write it? The code as committed. -- Álvaro Herrera <alvherre@commandprompt.com> The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On Wed, Jan 25, 2012 at 03:32:49PM -0500, Robert Haas wrote: > On Sun, Jan 22, 2012 at 12:23 AM, Noah Misch <noah@leadboat.com> wrote: > > New version that repairs a defective test case. > > Committed. I don't find this to be particularly good style: Thanks. > + for (i = 0; i < old_natts && ret; i++) > + ret = (!IsPolymorphicType(get_opclass_input_type(classObjectId[i > + irel->rd_att->attrs[i]->atttypid == typeObjectId[i]); > > ...but I am not sure whether we have any formal policy against it, so > I just committed it as-is for now. I would have surrounded the loop > with an if (ret) block and written the body of the loop as if > (condition) { ret = false; break; }. I value the savings in vertical space more than the lost idiomaticness. This decision is 90+% subjective, so I cannot blame you for concluding otherwise. I do know the feeling of looking at PostgreSQL source code and wishing the author had not attempted to conserve every line.
Alvaro Herrera <alvherre@commandprompt.com> writes: > Excerpts from Robert Haas's message of mié ene 25 17:32:49 -0300 2012: >> On Sun, Jan 22, 2012 at 12:23 AM, Noah Misch <noah@leadboat.com> wrote: > New version that repairs a defective test case. >> >> Committed. I don't find this to be particularly good style: >> >> + for (i = 0; i < old_natts && ret; i++) >> + ret = (!IsPolymorphicType(get_opclass_input_type(classObjectId[i >> + irel->rd_att->attrs[i]->atttypid == typeObjectId[i]); >> >> ...but I am not sure whether we have any formal policy against it, so >> I just committed it as-is for now. I would have surrounded the loop >> with an if (ret) block and written the body of the loop as if >> (condition) { ret = false; break; }. > I find that code way too clever. Not only is that code spectacularly unreadable, but has nobody noticed that this commit broke the buildfarm? regards, tom lane
On Wed, Jan 25, 2012 at 11:53:10PM -0500, Tom Lane wrote: > Not only is that code spectacularly unreadable, but has nobody noticed > that this commit broke the buildfarm? Thanks for reporting the problem. This arose because the new test case temporarily sets client_min_messages=DEBUG1. The default buildfarm configuration sets log_statement=all in its postgresql.conf, so the client gets those log_statement lines. I would fix this as attached, resetting the optional logging to defaults during the test cases in question. Not delightful, but that's what we have to work with. nm
Attachment
On Thu, Jan 26, 2012 at 6:55 AM, Noah Misch <noah@leadboat.com> wrote: > On Wed, Jan 25, 2012 at 11:53:10PM -0500, Tom Lane wrote: >> Not only is that code spectacularly unreadable, but has nobody noticed >> that this commit broke the buildfarm? > > Thanks for reporting the problem. This arose because the new test case > temporarily sets client_min_messages=DEBUG1. The default buildfarm > configuration sets log_statement=all in its postgresql.conf, so the client > gets those log_statement lines. I would fix this as attached, resetting the > optional logging to defaults during the test cases in question. Not > delightful, but that's what we have to work with. I'm just going to remove the test. This is not very future-proof and an ugly pattern if it gets copied to other places. We need to work on a more sensible way for ALTER TABLE to report what it did, but a solution based on what GUCs the build-farm happens to set doesn't seem like it's justified for the narrowness of the case we're testing here.Whether or not we allow this case to work without arewrite is in some sense arbitrary. There's no real reason it can't be done; rather, we're just exercising restraint to minimize the risk of future bugs. So I don't want to go to great lengths to test it. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Thu, Jan 26, 2012 at 08:23:34AM -0500, Robert Haas wrote: > On Thu, Jan 26, 2012 at 6:55 AM, Noah Misch <noah@leadboat.com> wrote: > > On Wed, Jan 25, 2012 at 11:53:10PM -0500, Tom Lane wrote: > >> Not only is that code spectacularly unreadable, but has nobody noticed > >> that this commit broke the buildfarm? > > > > Thanks for reporting the problem. ?This arose because the new test case > > temporarily sets client_min_messages=DEBUG1. ?The default buildfarm > > configuration sets log_statement=all in its postgresql.conf, so the client > > gets those log_statement lines. ?I would fix this as attached, resetting the > > optional logging to defaults during the test cases in question. ?Not > > delightful, but that's what we have to work with. > > I'm just going to remove the test. This is not very future-proof and It would deserve an update whenever we add a new optional-logging GUC pertaining to user backends. Other tests require similarly-infrequent refreshes in response to other changes. Of course, buildfarm members would not be setting those new GUCs the day they're available. Calling for an update to the test could reasonably fall to the first buildfarm member owner who actually decides to use a new GUC in his configuration. > an ugly pattern if it gets copied to other places. We need to work on I would rather folks introduce ugliness to the test suite than introduce behaviors having no test coverage. > a more sensible way for ALTER TABLE to report what it did, but a > solution based on what GUCs the build-farm happens to set doesn't seem > like it's justified for the narrowness of the case we're testing here. It's not based on what GUCs the buildfarm happens to set. I looked up all GUCs that might create problems such as the one observed here. They were the four I included in the patch, plus debug_pretty_print, debug_print_parse, debug_print_plan and debug_print_rewritten. I concluded that the four debug_* ones were materially less likely to ever get set in postgresql.conf for a "make installcheck" run, so I left them out for brevity. The setting changed *by default* for buildfarm clients is log_statement. Buildfarm member owners can do as they wish, though. > Whether or not we allow this case to work without a rewrite is in > some sense arbitrary. There's no real reason it can't be done; rather, > we're just exercising restraint to minimize the risk of future bugs. > So I don't want to go to great lengths to test it. I used the same strategy in another ALTER TABLE patch this CF: http://archives.postgresql.org/message-id/20120126033956.GC15670@tornado.leadboat.com If we pay its costs for those tests, we then may as well let this test case ride their coattails.
Noah Misch <noah@leadboat.com> writes: > On Thu, Jan 26, 2012 at 08:23:34AM -0500, Robert Haas wrote: >> I'm just going to remove the test. This is not very future-proof and > [ objections ] FWIW, I concur with Robert's choice here. This test method is ugly and fragile, and I'm not even thinking about the question of whether an installation might have GUC settings that affect it. My objection is that you're assuming that nowhere else, anywhere in the large amount of code executed by the queries under test, will anyone ever wish to insert another elog(DEBUG) message. > I used the same strategy in another ALTER TABLE patch this CF: > http://archives.postgresql.org/message-id/20120126033956.GC15670@tornado.leadboat.com That's going to need to be removed before commit too, then. regards, tom lane