Re: making update/delete of inheritance trees scale better - Mailing list pgsql-hackers

From Ashutosh Bapat
Subject Re: making update/delete of inheritance trees scale better
Date
Msg-id CAExHW5sR73nUAOr7ZAmEJWSpHVoCV5mpYERNO7Qy46ZpRH00PQ@mail.gmail.com
Whole thread Raw
In response to Re: making update/delete of inheritance trees scale better  (Amit Langote <amitlangote09@gmail.com>)
Responses Re: making update/delete of inheritance trees scale better  (David Rowley <dgrowleyml@gmail.com>)
Re: making update/delete of inheritance trees scale better  (Amit Langote <amitlangote09@gmail.com>)
List pgsql-hackers
On Mon, May 11, 2020 at 8:11 PM Amit Langote <amitlangote09@gmail.com> wrote:
> > Per row overhead would be incurred for every row whereas the plan time
> > overhead is one-time or in case of a prepared statement almost free.
> > So we need to compare it esp. when there are 2000 partitions and all
> > of them are being updated.
>
> I assume that such UPDATEs would be uncommon.

Yes, 2000 partitions being updated would be rare. But many rows from
the same partition being updated may not be that common. We have to
know how much is that per row overhead and updating how many rows it
takes to beat the planning time overhead. If the number of rows is
very large, we are good.

>
> > But generally I agree that this would be a
> > better approach. It might help using PWJ when the result relation
> > joins with other partitioned table.
>
> It does, because the plan below ModifyTable is same as if the query
> were SELECT instead of UPDATE; with my PoC:
>
> explain (costs off) update foo set a = foo2.a + 1 from foo foo2 where
> foo.a = foo2.a;
>                     QUERY PLAN
> --------------------------------------------------
>  Update on foo
>    Update on foo_1
>    Update on foo_2
>    ->  Append
>          ->  Merge Join
>                Merge Cond: (foo_1.a = foo2_1.a)
>                ->  Sort
>                      Sort Key: foo_1.a
>                      ->  Seq Scan on foo_1
>                ->  Sort
>                      Sort Key: foo2_1.a
>                      ->  Seq Scan on foo_1 foo2_1
>          ->  Merge Join
>                Merge Cond: (foo_2.a = foo2_2.a)
>                ->  Sort
>                      Sort Key: foo_2.a
>                      ->  Seq Scan on foo_2
>                ->  Sort
>                      Sort Key: foo2_2.a
>                      ->  Seq Scan on foo_2 foo2_2
> (20 rows)

Wonderful. That looks good.


> > Can we plan the scan query to add a sort node to order the rows by tableoid?
>
> Hmm, I am afraid that some piece of partitioning code that assumes a
> certain order of result relations, and that order is not based on
> sorting tableoids.

I am suggesting that we override that order (if any) in
create_modifytable_path() or create_modifytable_plan() by explicitly
ordering the incoming paths on tableoid. May be using MergeAppend.


>
> > * Tuple re-routing during UPDATE. For now it's disabled so your design
> > should work. But we shouldn't design this feature in such a way that
> > it comes in the way to enable tuple re-routing in future :).
>
> Sorry, what is tuple re-routing and why does this new approach get in its way?

An UPDATE causing a tuple to move to a different partition. It would
get in its way since the tuple will be located based on tableoid,
which will be the oid of the old partition. But I think this approach
has higher chance of being able to solve that problem eventually
rather than the current approach.
-- 
Best Wishes,
Ashutosh Bapat



pgsql-hackers by date:

Previous
From: Joe Conway
Date:
Subject: repeat() function, CHECK_FOR_INTERRUPTS(), and unlikely()
Next
From: Amit Langote
Date:
Subject: Re: making update/delete of inheritance trees scale better