Re: making update/delete of inheritance trees scale better - Mailing list pgsql-hackers

From Amit Langote
Subject Re: making update/delete of inheritance trees scale better
Date
Msg-id CA+HiwqFe=EZqyAQpzK=q4Rig1ZHgAtt94KKr973oR+nkVvarig@mail.gmail.com
Whole thread Raw
In response to Re: making update/delete of inheritance trees scale better  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: making update/delete of inheritance trees scale better  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Fri, Mar 26, 2021 at 3:07 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
> > - Until I went back and found your earlier remarks on this thread, I
> > was confused as to why you were replacing a JunkFilter with a
> > ProjectionInfo. I think it would be good to try to add some more
> > explicit comments about that design choice, perhaps as header comments
> > for ExecGetUpdateNewTuple, or maybe there's a better place. I'm still
> > not sure why we need to do the same thing for the insert case, which
> > seems to just be about removing junk columns.
>
> I wondered about that too; this patch allegedly isn't touching anything
> interesting about INSERT cases, so why should we modify that?  However,
> when I tried to poke at that, I discovered that it seems to be dead code
> anyway.  A look at coverage.postgresql.org will show you that no
> regression test reaches "junk_filter_needed = true" in
> ExecInitModifyTable's inspection of INSERT tlists, and I've been unable to
> create such a case manually either.

I noticed this too.

>  I think the reason is that the parser
> initially builds all INSERT ... SELECT cases with the SELECT as an
> explicit subquery, giving rise to a SubqueryScan node just below the
> ModifyTable, which will project exactly the desired columns and no more.
> We'll optimize away the SubqueryScan if it's a no-op projection, but not
> if it is getting rid of junk columns.  There is room for more optimization
> here: dropping the SubqueryScan in favor of making ModifyTable do the same
> projection would win by removing one plan node's worth of overhead.

Oh, so there could possibly be a case where ModifyTable would have to
do junk filtering for INSERTs, but IIUC only if the planner optimized
away junk-filtering-SubqueryScan nodes too?  I was thinking that
perhaps INSERTs used to need junk filtering in the past but no longer
and now it's just dead code.

>  But
> I don't think we need to panic about whether the projection is done with
> ExecProject or a junk filter --- we'd be strictly ahead of the current
> situation either way.

I would've liked to confirm that with a performance comparison, but no
test case exists to do so. :(

> Given that background, I agree with Amit's choice to change this,
> just to reduce the difference between how INSERT and UPDATE cases work.
>
> For now, there's no performance difference anyway, since neither the
> ProjectionInfo nor the JunkFilter code can be reached.
>  (Maybe a comment about that would be useful.)

I've added a comment in ExecInitModifyTable() around the block that
initializes new-tuple ProjectionInfo to say that INSERTs don't
actually need to use it today.

> BTW, in the version of the patch that I'm working on (not ready to
> post yet), I've thrown away everything that Amit did in setrefs.c
> and tlist.c, so I don't recommend he spend time improving the comments
> there ;-).

Oh, I removed filter_junk_tlist_entries() in my updated version too
prompted by Robert's comment, but haven't touched
set_update_tlist_references().

>  I did not much like the idea of building a full TargetList
> for each partition; that's per-partition cycles and storage space that
> we won't be able to reclaim with the 0002 patch.  And we don't really
> need it, because there are only going to be three cases at runtime:
> pull a column from the subplan result tuple, pull a column from the
> old tuple, or insert a NULL for a dropped column.  So I've replaced
> the per-target-table tlists with integer lists of the attnums of the
> UPDATE's target columns in that table.  These are compact and they
> don't require any further processing in setrefs.c, and the executor
> can easily build a projection expression from that data.

I remember that in the earliest unposted versions, I had made
ExecInitModifyTable() take up the burden of creating the targetlist
that the projection will compute, which is what your approach sounds
like.  However, I had abandoned it due to the concern that it possibly
hurt the prepared statements because we would build the targetlist on
every execution, whereas only once if the planner does it.

I'm just about done addressing Robert's comments, so will post an
update shortly.

--
Amit Langote
EDB: http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: Andy Fan
Date:
Subject: Re: Proposal for col LIKE $1 with generic Plan
Next
From: Peter Geoghegan
Date:
Subject: Re: New IndexAM API controlling index vacuum strategies