Thread: Does rewriteTargetListIU still need to add UPDATE tlist entries?
The comments for rewriteTargetListIU say (or said until earlier today) * 2. For an UPDATE on a trigger-updatable view, add tlist entries for any * unassigned-to attributes, assigning them their old values. These will * later get expanded to the output values of the view. (This is equivalent * to what the planner's expand_targetlist() will do for UPDATE on a regular * table, but it's more convenient to do it here while we still have easy * access to the view's original RT index.) This is only necessary for * trigger-updatable views, for which the view remains the result relation of * the query. For auto-updatable views we must not do this, since it might * add assignments to non-updatable view columns. For rule-updatable views it * is unnecessary extra work, since the query will be rewritten with a * different result relation which will be processed when we recurse via * RewriteQuery. I noticed that this is referencing something that, in fact, expand_targetlist() doesn't do anymore, so that this is a poor justification for the behavior. My first thought was that we still need to do it to produce the correct row contents for the INSTEAD OF trigger, so I updated the comment (in 08a986966) to claim that. However, on closer inspection, that's nonsense. nodeModifyTable.c populates the trigger "OLD" row from the whole-row variable that is generated for the view, and then it computes the "NEW" row using that old row and the UPDATE tlist; there is no need there for the UPDATE tlist to compute all the columns. The regression tests still pass just fine if we take out the questionable logic (cf. attached patch). Moreover, if you poke into it a little closer than the tests do, you notice that the existing code is uselessly computing non-updated columns twice, in both the extra tlist item and the whole-row variable. As an example, consider create table base (a int, b int); create view v1 as select a+1 as a1, b+2 as b2 from base; -- you also need an INSTEAD OF UPDATE trigger, not shown here explain verbose update v1 set a1 = a1 - 44; With HEAD you get Update on public.v1 (cost=0.00..60.85 rows=0 width=0) -> Seq Scan on public.base (cost=0.00..60.85 rows=2260 width=46) Output: ((base.a + 1) - 44), (base.b + 2), ROW((base.a + 1), (base.b + 2)), base.ctid There's really no need to compute base.b + 2 twice, and with this patch we don't: Update on public.v1 (cost=0.00..55.20 rows=0 width=0) -> Seq Scan on public.base (cost=0.00..55.20 rows=2260 width=42) Output: ((base.a + 1) - 44), ROW((base.a + 1), (base.b + 2)), base.ctid I would think that this is a totally straightforward improvement, but there's one thing in the comments for rewriteTargetListIU that gives me a little pause: it says * We must do items 1,2,3 before firing rewrite rules, else rewritten * references to NEW.foo will produce wrong or incomplete results. As far as I can tell, though, references to NEW values still do the right thing. I'm not certain whether any of the existing regression tests really cover this point, but experimenting with the scenario shown in the attached SQL file says that the DO ALSO rule gets the right results. It's possible that the expansion sequence is a bit different than before, but we still end up with the right answer. So, as far as I can tell, this is an oversight in 86dc90056 and we ought to clean it up as attached. regards, tom lane diff --git a/src/backend/rewrite/rewriteHandler.c b/src/backend/rewrite/rewriteHandler.c index 4fa4ac2aef..4bba6eb805 100644 --- a/src/backend/rewrite/rewriteHandler.c +++ b/src/backend/rewrite/rewriteHandler.c @@ -679,18 +679,7 @@ adjustJoinTreeList(Query *parsetree, bool removert, int rt_index) * and UPDATE, replace explicit DEFAULT specifications with column default * expressions. * - * 2. For an UPDATE on a trigger-updatable view, add tlist entries for any - * unassigned-to attributes, assigning them their old values. These will - * later get expanded to the output values of the view. This is only needed - * for trigger-updatable views, for which the view remains the result relation - * of the query; without it, we would not have a complete "new tuple" to pass - * to triggers. For auto-updatable views we must not do this, since it might - * add assignments to non-updatable view columns. For rule-updatable views it - * is unnecessary extra work, since the query will be rewritten with a - * different result relation which will be processed when we recurse via - * RewriteQuery. - * - * 3. Merge multiple entries for the same target attribute, or declare error + * 2. Merge multiple entries for the same target attribute, or declare error * if we can't. Multiple entries are only allowed for INSERT/UPDATE of * portions of an array or record field, for example * UPDATE table SET foo[2] = 42, foo[4] = 43; @@ -698,11 +687,11 @@ adjustJoinTreeList(Query *parsetree, bool removert, int rt_index) * the expression we want to produce in this case is like * foo = array_set_element(array_set_element(foo, 2, 42), 4, 43) * - * 4. Sort the tlist into standard order: non-junk fields in order by resno, + * 3. Sort the tlist into standard order: non-junk fields in order by resno, * then junk fields (these in no particular order). * - * We must do items 1,2,3 before firing rewrite rules, else rewritten - * references to NEW.foo will produce wrong or incomplete results. Item 4 + * We must do items 1 & 2 before firing rewrite rules, else rewritten + * references to NEW.foo will produce wrong or incomplete results. Item 3 * is not needed for rewriting, but it is helpful for the planner, and we * can do it essentially for free while handling the other items. * @@ -984,29 +973,6 @@ rewriteTargetListIU(List *targetList, false); } - /* - * For an UPDATE on a trigger-updatable view, provide a dummy entry - * whenever there is no explicit assignment. - */ - if (new_tle == NULL && commandType == CMD_UPDATE && - target_relation->rd_rel->relkind == RELKIND_VIEW && - view_has_instead_trigger(target_relation, CMD_UPDATE)) - { - Node *new_expr; - - new_expr = (Node *) makeVar(result_rti, - attrno, - att_tup->atttypid, - att_tup->atttypmod, - att_tup->attcollation, - 0); - - new_tle = makeTargetEntry((Expr *) new_expr, - attrno, - pstrdup(NameStr(att_tup->attname)), - false); - } - if (new_tle) new_tlist = lappend(new_tlist, new_tle); } drop table base cascade; drop table log cascade; create table base (a int, b int); create view v1 as select a+1 as a1, b+2 as b2 from base; insert into base values (1,10),(11,13); create table log(ao int, an int, bo int, bn int); CREATE OR REPLACE FUNCTION vtrigger() RETURNS trigger LANGUAGE plpgsql AS $function$ begin raise notice '% % % %', TG_TABLE_NAME, TG_WHEN, TG_OP, TG_LEVEL; raise NOTICE 'OLD: %, NEW: %', OLD, NEW; UPDATE base SET a = NEW.a1-1, b = NEW.b2-2 WHERE a = OLD.a1-1 AND b = OLD.b2-2; if NOT FOUND then RETURN NULL; end if; RETURN NEW; end; $function$; CREATE TRIGGER instead_of_update_trig INSTEAD OF UPDATE ON v1 FOR EACH ROW EXECUTE PROCEDURE vtrigger(); create rule logit as on update to v1 do also insert into log values(old.a1, new.a1, old.b2, new.b2); update v1 set a1 = a1-44;
On Mon, Apr 26, 2021 at 9:40 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > The comments for rewriteTargetListIU say (or said until earlier today) > > * 2. For an UPDATE on a trigger-updatable view, add tlist entries for any > * unassigned-to attributes, assigning them their old values. These will > * later get expanded to the output values of the view. (This is equivalent > * to what the planner's expand_targetlist() will do for UPDATE on a regular > * table, but it's more convenient to do it here while we still have easy > * access to the view's original RT index.) This is only necessary for > * trigger-updatable views, for which the view remains the result relation of > * the query. For auto-updatable views we must not do this, since it might > * add assignments to non-updatable view columns. For rule-updatable views it > * is unnecessary extra work, since the query will be rewritten with a > * different result relation which will be processed when we recurse via > * RewriteQuery. > > I noticed that this is referencing something that, in fact, > expand_targetlist() doesn't do anymore, so that this is a poor > justification for the behavior. My first thought was that we still > need to do it to produce the correct row contents for the INSTEAD OF > trigger, so I updated the comment (in 08a986966) to claim that. Check. > However, on closer inspection, that's nonsense. nodeModifyTable.c > populates the trigger "OLD" row from the whole-row variable that is > generated for the view, and then it computes the "NEW" row using > that old row and the UPDATE tlist; there is no need there for the > UPDATE tlist to compute all the columns. The regression tests still > pass just fine if we take out the questionable logic (cf. attached > patch). Moreover, if you poke into it a little closer than the tests > do, you notice that the existing code is uselessly computing non-updated > columns twice, in both the extra tlist item and the whole-row variable. > > As an example, consider > > create table base (a int, b int); > create view v1 as select a+1 as a1, b+2 as b2 from base; > -- you also need an INSTEAD OF UPDATE trigger, not shown here > explain verbose update v1 set a1 = a1 - 44; > > With HEAD you get > > Update on public.v1 (cost=0.00..60.85 rows=0 width=0) > -> Seq Scan on public.base (cost=0.00..60.85 rows=2260 width=46) > Output: ((base.a + 1) - 44), (base.b + 2), ROW((base.a + 1), (base.b + 2)), base.ctid > > There's really no need to compute base.b + 2 twice, and with this > patch we don't: > > Update on public.v1 (cost=0.00..55.20 rows=0 width=0) > -> Seq Scan on public.base (cost=0.00..55.20 rows=2260 width=42) > Output: ((base.a + 1) - 44), ROW((base.a + 1), (base.b + 2)), base.ctid That makes sense to me, at least logically. > I would think that this is a totally straightforward improvement, > but there's one thing in the comments for rewriteTargetListIU that > gives me a little pause: it says > > * We must do items 1,2,3 before firing rewrite rules, else rewritten > * references to NEW.foo will produce wrong or incomplete results. > > As far as I can tell, though, references to NEW values still do the > right thing. I'm not certain whether any of the existing regression > tests really cover this point, but experimenting with the scenario shown > in the attached SQL file says that the DO ALSO rule gets the right > results. It's possible that the expansion sequence is a bit different > than before, but we still end up with the right answer. I also checked what the rewriter and the planner do for the following DO ALSO insert: create rule logit as on update to v1 do also insert into log values(old.a1, new.a1, old.b2, new.b2); and can see that the insert ends up with the right targetlist irrespective of whether or not rewriteTargetListIU() adds an item for NEW.b2. So, I attached a debugger to the update query in your shared script and focused on how ReplaceVarsFromTargetList(), running on the insert query added by the rule, handles the item for NEW.b2 no longer being added to the update's targetlist after your patch. Turns out the result (the insert's targetlist) is the same even if the path taken in ReplaceVarsFromTargetList_callback() is different after your patch. Before: explain verbose update v1 set a1 = a1-44; QUERY PLAN ----------------------------------------------------------------------------------------------- Insert on public.log (cost=0.00..60.85 rows=0 width=0) -> Seq Scan on public.base (cost=0.00..60.85 rows=2260 width=16) Output: (base.a + 1), ((base.a + 1) - 44), (base.b + 2), (base.b + 2) Update on public.v1 (cost=0.00..60.85 rows=0 width=0) -> Seq Scan on public.base (cost=0.00..60.85 rows=2260 width=46) Output: ((base.a + 1) - 44), (base.b + 2), ROW((base.a + 1), (base.b + 2)), base.ctid (7 rows) After: explain verbose update v1 set a1 = a1-44; QUERY PLAN --------------------------------------------------------------------------------- Insert on public.log (cost=0.00..60.85 rows=0 width=0) -> Seq Scan on public.base (cost=0.00..60.85 rows=2260 width=16) Output: (base.a + 1), ((base.a + 1) - 44), (base.b + 2), (base.b + 2) Update on public.v1 (cost=0.00..55.20 rows=0 width=0) -> Seq Scan on public.base (cost=0.00..55.20 rows=2260 width=42) Output: ((base.a + 1) - 44), ROW((base.a + 1), (base.b + 2)), base.ctid (7 rows) I didn't however study closely why REPLACEVARS_CHANGE_VARNO does the correct thing, so am not sure if there might be cases that would be broken. > So, as far as I can tell, this is an oversight in 86dc90056 and we > ought to clean it up as attached. Thanks for noticing this and the patch. If you are confident that REPLACEVARS_CHANGE_VARNO covers all imaginable cases, I suppose it makes sense to apply it. -- Amit Langote EDB: http://www.enterprisedb.com
Amit Langote <amitlangote09@gmail.com> writes: > On Mon, Apr 26, 2021 at 9:40 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: >> I would think that this is a totally straightforward improvement, >> but there's one thing in the comments for rewriteTargetListIU that >> gives me a little pause: it says >> >> * We must do items 1,2,3 before firing rewrite rules, else rewritten >> * references to NEW.foo will produce wrong or incomplete results. >> >> As far as I can tell, though, references to NEW values still do the >> right thing. I'm not certain whether any of the existing regression >> tests really cover this point, but experimenting with the scenario shown >> in the attached SQL file says that the DO ALSO rule gets the right >> results. It's possible that the expansion sequence is a bit different >> than before, but we still end up with the right answer. > I also checked what the rewriter and the planner do for the following > DO ALSO insert: > create rule logit as on update to v1 do also > insert into log values(old.a1, new.a1, old.b2, new.b2); > and can see that the insert ends up with the right targetlist > irrespective of whether or not rewriteTargetListIU() adds an item for > NEW.b2. Thanks for looking at that. On reflection I think this must be so, because those rewriter mechanisms were designed long before we had trigger-updatable views, and rewriteTargetListIU has never added tlist items like this for any other sort of view. So the ability to insert the original view output column has necessarily been there from the beginning. This is just getting rid of a weird implementation difference between trigger-updatable views and other views. regards, tom lane
On Mon, 26 Apr 2021 at 15:09, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Thanks for looking at that. On reflection I think this must be so, > because those rewriter mechanisms were designed long before we had > trigger-updatable views, and rewriteTargetListIU has never added > tlist items like this for any other sort of view. So the ability > to insert the original view output column has necessarily been there > from the beginning. This is just getting rid of a weird implementation > difference between trigger-updatable views and other views. > FWIW, I had a look at this too and came to much the same conclusion, so I think this is a safe change that makes the code a little neater and more efficient. Regards, Dean
Dean Rasheed <dean.a.rasheed@gmail.com> writes: > On Mon, 26 Apr 2021 at 15:09, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Thanks for looking at that. On reflection I think this must be so, >> because those rewriter mechanisms were designed long before we had >> trigger-updatable views, and rewriteTargetListIU has never added >> tlist items like this for any other sort of view. So the ability >> to insert the original view output column has necessarily been there >> from the beginning. This is just getting rid of a weird implementation >> difference between trigger-updatable views and other views. > FWIW, I had a look at this too and came to much the same conclusion, > so I think this is a safe change that makes the code a little neater > and more efficient. Again, thanks for looking! I checked into the commit history (how'd we ever survive without "git blame"?) and found that my argument above is actually wrong in detail. Before cab5dc5da of 2013-10-18, rewriteTargetListIU expanded non-updated columns for all views not only trigger-updatable ones. However, that behavior itself goes back only to 2ec993a7c of 2010-10-10, which added triggers on views; before that there was indeed no such expansion. Of course the view rewrite mechanisms are ten or so years older than that, so the conclusion that they weren't designed to need this still stands. regards, tom lane
On Mon, 26 Apr 2021 at 15:55, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > I checked into the commit history (how'd we ever survive without "git > blame"?) and found that my argument above is actually wrong in detail. > Before cab5dc5da of 2013-10-18, rewriteTargetListIU expanded non-updated > columns for all views not only trigger-updatable ones. However, that > behavior itself goes back only to 2ec993a7c of 2010-10-10, which added > triggers on views; before that there was indeed no such expansion. Ah, that makes sense. Before cab5dc5da, expanding non-updated columns of auto-updatable views was safe because until then a view was only auto-updatable if all it's columns were. It was still unnecessary work though, and with 20/20 hindsight, when triggers on views were first added in 2ec993a7c, it probably should have only expanded the targetlist for trigger-updatable views. > Of course the view rewrite mechanisms are ten or so years older than > that, so the conclusion that they weren't designed to need this still > stands. Yeah, I think that conclusion is right. The trickiest part I found was deciding whether any product queries from conditional rules would do the right thing if the main trigger-updatable query no longer expands its targetlist. But I think that has to be OK, because even before trigger-updatable views were added, it was possible to have product queries from conditional rules together with an unconditional do-nothing rule, so the product queries don't rely on the expanded targetlist, and never have. Regards, Dean