Re: making update/delete of inheritance trees scale better - Mailing list pgsql-hackers

From Amit Langote
Subject Re: making update/delete of inheritance trees scale better
Date
Msg-id CA+HiwqFaTia5D6sBQzNQhu9Vt-09Ad+TkUveX-oC-uu_sVSXFw@mail.gmail.com
Whole thread Raw
In response to Re: making update/delete of inheritance trees scale better  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: making update/delete of inheritance trees scale better
Re: making update/delete of inheritance trees scale better
List pgsql-hackers
On Tue, Mar 30, 2021 at 1:51 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Here's a v13 patchset that I feel pretty good about.

Thanks.  After staring at this for a day now, I do too.

> My original thought for replacing the "fake variable" design was to
> add another RTE holding the extra variables, and then have setrefs.c
> translate the placeholder variables to the real thing at the last
> possible moment.  I soon realized that instead of an actual RTE,
> it'd be better to invent a special varno value akin to INDEX_VAR
> (I called it ROWID_VAR, though I'm not wedded to that name).  Info
> about the associated variables is kept in a list of RowIdentityVarInfo
> structs, which are more suitable than a regular RTE would be.
>
> I got that and the translate-in-setrefs approach more or less working,
> but it was fairly messy, because the need to know about these special
> variables spilled into FDWs and a lot of other places; for example
> indxpath.c needed a special check for them when deciding if an
> index-only scan is possible.  What turns out to be a lot cleaner is
> to handle the translation in adjust_appendrel_attrs_mutator(), so that
> we have converted to real variables by the time we reach any
> relation-scan-level logic.
>
> I did end up having to break the API for FDW AddForeignUpdateTargets
> functions: they need to do things differently when adding junk columns,
> and they need different parameters.  This seems all to the good though,
> because the old API has been a backwards-compatibility hack for some
> time (e.g., in not passing the "root" pointer).

This all looks really neat.

I couldn't help but think that the RowIdentityVarInfo management code
looks a bit like SpecialJunkVarInfo stuff in my earliest patches, but
of course without all the fragility of assigning "fake" attribute
numbers to a "real" base relation(s).

> Some other random notes:
>
> * I was unimpressed with the idea of distinguishing different target
> relations by embedding integer constants in the plan.  In the first
> place, the implementation was extremely fragile --- there was
> absolutely NOTHING tying the counter you used to the subplans' eventual
> indexes in the ModifyTable lists.  Plus I don't have a lot of faith
> that setrefs.c will reliably do what you want in terms of bubbling the
> things up.  Maybe that could be made more robust, but the other problem
> is that the EXPLAIN output is just about unreadable; nobody will
> understand what "(0)" means.  So I went back to the idea of emitting
> tableoid, and installed a hashtable plus a one-entry lookup cache
> to make the run-time mapping as fast as I could.  I'm not necessarily
> saying that this is how it has to be indefinitely, but I think we
> need more work on planner and EXPLAIN infrastructure before we can
> get the idea of directly providing a list index to work nicely.

Okay.

> * I didn't agree with your decision to remove the now-failing test
> cases from postgres_fdw.sql.  I think it's better to leave them there,
> especially in the cases where we were checking the plan as well as
> the execution.  Hopefully we'll be able to un-break those soon.

Okay.

> * I updated a lot of hereby-obsoleted comments, which makes the patch
> a bit longer than v12; but actually the code is a good bit smaller.
> There's a noticeable net code savings in src/backend/optimizer/,
> which there was not before.

Agreed.  (I had evidently missed a bunch of comments referring to the
old ways of how inherited updates are performed.)

> I've not made any attempt to do performance testing on this,
> but I think that's about the only thing standing between us
> and committing this thing.

I reran some of the performance tests I did earlier (I've attached the
modified test running script for reference):

pgbench -n -T60 -M{simple|prepared} -f nojoin.sql

nojoin.sql:

\set a random(1, 1000000)
update test_table t set b = :a where a = :a;

...and here are the tps figures:

-Msimple

nparts  10cols      20cols      40cols

master:
64      10112       9878        10920
128     9662        10691       10604
256     9642        9691        10626
1024    8589        9675        9521

patched:
64      13493       13463       13313
128     13305       13447       12705
256     13190       13161       12954
1024    11791       11408       11786

No variation across various column counts, but the patched improves
the tps for each case by quite a bit.

-Mprepared (plan_cache_mode=force_generic_plan)

master:
64      2286        2285        2266
128     1163        1127        1091
256     531         519         544
1024    77          71          69

patched:
64      6522        6612        6275
128     3568        3625        3372
256     1847        1710        1823
1024    433         427         386

Again, no variation across columns counts.  tps drops as partition
count increases both before and after applying the patches, although
patched performs way better, which is mainly attributable to the
ability of UPDATE to now utilize runtime pruning (actually of the
Append under ModifyTable).  The drop as partition count increases can
be attributed to the fact that with a generic plan, there are a bunch
of steps that must be done across all partitions, such as
AcauireExecutorLocks(), ExecCheckRTPerms(), per-result-rel
initializations in ExecInitModifyTable(), etc., even with the patched.
As mentioned upthread, [1] can help with the last bit.

--
Amit Langote
EDB: http://www.enterprisedb.com

[1] https://commitfest.postgresql.org/32/2621/

Attachment

pgsql-hackers by date:

Previous
From: Ashutosh Bapat
Date:
Subject: Re: cursor already in use, UPDATE RETURNING bug?
Next
From: Dean Rasheed
Date:
Subject: Re: pgbench - add pseudo-random permutation function