Re: Does rewriteTargetListIU still need to add UPDATE tlist entries? - Mailing list pgsql-hackers

From Amit Langote
Subject Re: Does rewriteTargetListIU still need to add UPDATE tlist entries?
Date
Msg-id CA+HiwqHupkTsaEKt5duXxSnH5fiMiv-kNtc5G=EVdncfuqzggA@mail.gmail.com
Whole thread Raw
In response to Does rewriteTargetListIU still need to add UPDATE tlist entries?  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Does rewriteTargetListIU still need to add UPDATE tlist entries?  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Mon, Apr 26, 2021 at 9:40 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> The comments for rewriteTargetListIU say (or said until earlier today)
>
>  * 2. For an UPDATE on a trigger-updatable view, add tlist entries for any
>  * unassigned-to attributes, assigning them their old values.  These will
>  * later get expanded to the output values of the view.  (This is equivalent
>  * to what the planner's expand_targetlist() will do for UPDATE on a regular
>  * table, but it's more convenient to do it here while we still have easy
>  * access to the view's original RT index.)  This is only necessary for
>  * trigger-updatable views, for which the view remains the result relation of
>  * the query.  For auto-updatable views we must not do this, since it might
>  * add assignments to non-updatable view columns.  For rule-updatable views it
>  * is unnecessary extra work, since the query will be rewritten with a
>  * different result relation which will be processed when we recurse via
>  * RewriteQuery.
>
> I noticed that this is referencing something that, in fact,
> expand_targetlist() doesn't do anymore, so that this is a poor
> justification for the behavior.  My first thought was that we still
> need to do it to produce the correct row contents for the INSTEAD OF
> trigger, so I updated the comment (in 08a986966) to claim that.

Check.

> However, on closer inspection, that's nonsense.  nodeModifyTable.c
> populates the trigger "OLD" row from the whole-row variable that is
> generated for the view, and then it computes the "NEW" row using
> that old row and the UPDATE tlist; there is no need there for the
> UPDATE tlist to compute all the columns.  The regression tests still
> pass just fine if we take out the questionable logic (cf. attached
> patch).  Moreover, if you poke into it a little closer than the tests
> do, you notice that the existing code is uselessly computing non-updated
> columns twice, in both the extra tlist item and the whole-row variable.
>
> As an example, consider
>
> create table base (a int, b int);
> create view v1 as select a+1 as a1, b+2 as b2 from base;
> -- you also need an INSTEAD OF UPDATE trigger, not shown here
> explain verbose update v1 set a1 = a1 - 44;
>
> With HEAD you get
>
>  Update on public.v1  (cost=0.00..60.85 rows=0 width=0)
>    ->  Seq Scan on public.base  (cost=0.00..60.85 rows=2260 width=46)
>          Output: ((base.a + 1) - 44), (base.b + 2), ROW((base.a + 1), (base.b + 2)), base.ctid
>
> There's really no need to compute base.b + 2 twice, and with this
> patch we don't:
>
>  Update on public.v1  (cost=0.00..55.20 rows=0 width=0)
>    ->  Seq Scan on public.base  (cost=0.00..55.20 rows=2260 width=42)
>          Output: ((base.a + 1) - 44), ROW((base.a + 1), (base.b + 2)), base.ctid

That makes sense to me, at least logically.

> I would think that this is a totally straightforward improvement,
> but there's one thing in the comments for rewriteTargetListIU that
> gives me a little pause: it says
>
>  * We must do items 1,2,3 before firing rewrite rules, else rewritten
>  * references to NEW.foo will produce wrong or incomplete results.
>
> As far as I can tell, though, references to NEW values still do the
> right thing.  I'm not certain whether any of the existing regression
> tests really cover this point, but experimenting with the scenario shown
> in the attached SQL file says that the DO ALSO rule gets the right
> results.  It's possible that the expansion sequence is a bit different
> than before, but we still end up with the right answer.

I also checked what the rewriter and the planner do for the following
DO ALSO insert:

create rule logit as on update to v1 do also
insert into log values(old.a1, new.a1, old.b2, new.b2);

and can see that the insert ends up with the right targetlist
irrespective of whether or not rewriteTargetListIU() adds an item for
NEW.b2.  So, I attached a debugger to the update query in your shared
script and focused on how ReplaceVarsFromTargetList(), running on the
insert query added by the rule, handles the item for NEW.b2 no longer
being added to the update's targetlist after your patch.  Turns out
the result (the insert's targetlist) is the same even if the path
taken in ReplaceVarsFromTargetList_callback() is different after your
patch.

Before:

explain verbose update v1 set a1 = a1-44;
                                          QUERY PLAN
-----------------------------------------------------------------------------------------------
 Insert on public.log  (cost=0.00..60.85 rows=0 width=0)
   ->  Seq Scan on public.base  (cost=0.00..60.85 rows=2260 width=16)
         Output: (base.a + 1), ((base.a + 1) - 44), (base.b + 2), (base.b + 2)

 Update on public.v1  (cost=0.00..60.85 rows=0 width=0)
   ->  Seq Scan on public.base  (cost=0.00..60.85 rows=2260 width=46)
         Output: ((base.a + 1) - 44), (base.b + 2), ROW((base.a + 1),
(base.b + 2)), base.ctid
(7 rows)

After:

explain verbose update v1 set a1 = a1-44;
                                   QUERY PLAN
---------------------------------------------------------------------------------
 Insert on public.log  (cost=0.00..60.85 rows=0 width=0)
   ->  Seq Scan on public.base  (cost=0.00..60.85 rows=2260 width=16)
         Output: (base.a + 1), ((base.a + 1) - 44), (base.b + 2), (base.b + 2)

 Update on public.v1  (cost=0.00..55.20 rows=0 width=0)
   ->  Seq Scan on public.base  (cost=0.00..55.20 rows=2260 width=42)
         Output: ((base.a + 1) - 44), ROW((base.a + 1), (base.b + 2)), base.ctid
(7 rows)

I didn't however study closely why REPLACEVARS_CHANGE_VARNO does the
correct thing, so am not sure if there might be cases that would be
broken.

> So, as far as I can tell, this is an oversight in 86dc90056 and we
> ought to clean it up as attached.

Thanks for noticing this and the patch.  If you are confident that
REPLACEVARS_CHANGE_VARNO covers all imaginable cases, I suppose it
makes sense to apply it.

-- 
Amit Langote
EDB: http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: Bharath Rupireddy
Date:
Subject: Re: Can a child process detect postmaster death when in pg_usleep?
Next
From: Amit Kapila
Date:
Subject: Re: Truncate in synchronous logical replication failed