Re: ModifyTable overheads in generic plans - Mailing list pgsql-hackers

From Amit Langote
Subject Re: ModifyTable overheads in generic plans
Date
Msg-id CA+HiwqEhiXt08QvNxTbnnDNDOtOfyfbyKGxhn4pfqzuMiCPQ=w@mail.gmail.com
Whole thread Raw
In response to Re: ModifyTable overheads in generic plans  (Amit Langote <amitlangote09@gmail.com>)
List pgsql-hackers
On Wed, Apr 7, 2021 at 5:18 PM Amit Langote <amitlangote09@gmail.com> wrote:
> On Wed, Apr 7, 2021 at 8:24 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > I also could not get excited about postponing initialization of RETURNING
> > or WITH CHECK OPTIONS expressions.  I grant that that can be helpful
> > when those features are used, but I doubt that RETURNING is used that
> > heavily, and WITH CHECK OPTIONS is surely next door to nonexistent
> > in performance-critical queries.  If the feature isn't used, the cost
> > of the existing code is about zero.  So I couldn't see that it was worth
> > the amount of code thrashing and risk of new bugs involved.
>
> Okay.
>
> > The bit you
> > noted about EXPLAIN missing a subplan is pretty scary in this connection;
> > I'm not at all sure that that's just cosmetic.
>
> Yeah, this and...

I looked into this and can't see why this isn't just cosmetic as far
as ModifyTable is concerned.

"EXPLAIN missing a subplan" here just means that
ModifyTableState.PlanState.subPlan is not set. Besides ExplainNode(),
only ExecReScan() looks at PlanState.subPlan, and that does not seem
relevant to ModifyTable, because it doesn't support rescanning.

I don't see any such problems with creating RETURNING projections
on-demand either.

> > (Having said that, I'm wondering if there are bugs in these cases for
> > cross-partition updates that target a previously-not-used partition.
> > So we might have things to fix anyway.)
>
> ...this would need to be looked at a bit more closely, which I'll try
> to do sometime later this week.

Given the above, I can't think of any undiscovered problems related to
WCO and RETURNING expression states in the cases where cross-partition
updates target partitions that need to be initialized by
ExecInitPartitionInfo().  Here is the result for the test case in
updatable_views.sql modified to use partitioning and cross-partition
updates:

CREATE TABLE base_tbl (a int) partition by range (a);
CREATE TABLE base_tbl1 PARTITION OF base_tbl FOR VALUES FROM (1) TO (6);
CREATE TABLE base_tbl2 PARTITION OF base_tbl FOR VALUES FROM (6) TO (11);
CREATE TABLE base_tbl3 PARTITION OF base_tbl FOR VALUES FROM (11) TO (15);
CREATE TABLE ref_tbl (a int PRIMARY KEY);
INSERT INTO ref_tbl SELECT * FROM generate_series(1,10);
CREATE VIEW rw_view1 AS
  SELECT * FROM base_tbl b
  WHERE EXISTS(SELECT 1 FROM ref_tbl r WHERE r.a = b.a)
  WITH CHECK OPTION;

INSERT INTO rw_view1 VALUES (1);
INSERT 0 1

INSERT INTO rw_view1 VALUES (11);
ERROR:  new row violates check option for view "rw_view1"
DETAIL:  Failing row contains (11).

-- Both are cross-partition updates where the target relation is
-- lazily initialized in ExecInitPartitionInfo(), along with the WCO
-- qual ExprState
UPDATE rw_view1 SET a = a + 5 WHERE a = 1;
UPDATE 1

UPDATE rw_view1 SET a = a + 5 WHERE a = 6;
ERROR:  new row violates check option for view "rw_view1"
DETAIL:  Failing row contains (11).

EXPLAIN (costs off) INSERT INTO rw_view1 VALUES (5);
      QUERY PLAN
----------------------
 Insert on base_tbl b
   ->  Result
(2 rows)

EXPLAIN (costs off) UPDATE rw_view1 SET a = a + 5 WHERE a = 1;
                       QUERY PLAN
--------------------------------------------------------
 Update on base_tbl b
   Update on base_tbl1 b_1
   ->  Nested Loop
         ->  Index Scan using ref_tbl_pkey on ref_tbl r
               Index Cond: (a = 1)
         ->  Seq Scan on base_tbl1 b_1
               Filter: (a = 1)
(7 rows)

EXPLAIN (costs off) UPDATE rw_view1 SET a = a + 5 WHERE a = 6;
                       QUERY PLAN
--------------------------------------------------------
 Update on base_tbl b
   Update on base_tbl2 b_1
   ->  Nested Loop
         ->  Index Scan using ref_tbl_pkey on ref_tbl r
               Index Cond: (a = 6)
         ->  Seq Scan on base_tbl2 b_1
               Filter: (a = 6)
(7 rows)

Patch attached.  I tested the performance benefit of doing this by
modifying the update query used in earlier benchmarks to have a
RETURNING * clause, getting the following TPS numbers:

-Mprepared (plan_cache_mode=force_generic_plan)

nparts  10cols      20cols      40cols

HEAD
64      10909       9067        7171
128     6903        5624        4161
256     3748        3056        2219
1024    953         738         427

Patched
64      13817       13395       12754
128     9271        9102        8279
256     5345        5207        5083
1024    1463        1443        1389

Also, I don't see much impact of checking if (node->returningLists) in
the per-result-rel initialization loop in the common cases where
there's no RETURNING.

--
Amit Langote
EDB: http://www.enterprisedb.com

Attachment

pgsql-hackers by date:

Previous
From: Amit Kapila
Date:
Subject: Re: Replication slot stats misgivings
Next
From: Tom Lane
Date:
Subject: Re: Converting contrib SQL functions to new style