Re: making update/delete of inheritance trees scale better - Mailing list pgsql-hackers
From | Ashutosh Bapat |
---|---|
Subject | Re: making update/delete of inheritance trees scale better |
Date | |
Msg-id | CAExHW5sR73nUAOr7ZAmEJWSpHVoCV5mpYERNO7Qy46ZpRH00PQ@mail.gmail.com Whole thread Raw |
In response to | Re: making update/delete of inheritance trees scale better (Amit Langote <amitlangote09@gmail.com>) |
Responses |
Re: making update/delete of inheritance trees scale better
Re: making update/delete of inheritance trees scale better |
List | pgsql-hackers |
On Mon, May 11, 2020 at 8:11 PM Amit Langote <amitlangote09@gmail.com> wrote: > > Per row overhead would be incurred for every row whereas the plan time > > overhead is one-time or in case of a prepared statement almost free. > > So we need to compare it esp. when there are 2000 partitions and all > > of them are being updated. > > I assume that such UPDATEs would be uncommon. Yes, 2000 partitions being updated would be rare. But many rows from the same partition being updated may not be that common. We have to know how much is that per row overhead and updating how many rows it takes to beat the planning time overhead. If the number of rows is very large, we are good. > > > But generally I agree that this would be a > > better approach. It might help using PWJ when the result relation > > joins with other partitioned table. > > It does, because the plan below ModifyTable is same as if the query > were SELECT instead of UPDATE; with my PoC: > > explain (costs off) update foo set a = foo2.a + 1 from foo foo2 where > foo.a = foo2.a; > QUERY PLAN > -------------------------------------------------- > Update on foo > Update on foo_1 > Update on foo_2 > -> Append > -> Merge Join > Merge Cond: (foo_1.a = foo2_1.a) > -> Sort > Sort Key: foo_1.a > -> Seq Scan on foo_1 > -> Sort > Sort Key: foo2_1.a > -> Seq Scan on foo_1 foo2_1 > -> Merge Join > Merge Cond: (foo_2.a = foo2_2.a) > -> Sort > Sort Key: foo_2.a > -> Seq Scan on foo_2 > -> Sort > Sort Key: foo2_2.a > -> Seq Scan on foo_2 foo2_2 > (20 rows) Wonderful. That looks good. > > Can we plan the scan query to add a sort node to order the rows by tableoid? > > Hmm, I am afraid that some piece of partitioning code that assumes a > certain order of result relations, and that order is not based on > sorting tableoids. I am suggesting that we override that order (if any) in create_modifytable_path() or create_modifytable_plan() by explicitly ordering the incoming paths on tableoid. May be using MergeAppend. > > > * Tuple re-routing during UPDATE. For now it's disabled so your design > > should work. But we shouldn't design this feature in such a way that > > it comes in the way to enable tuple re-routing in future :). > > Sorry, what is tuple re-routing and why does this new approach get in its way? An UPDATE causing a tuple to move to a different partition. It would get in its way since the tuple will be located based on tableoid, which will be the oid of the old partition. But I think this approach has higher chance of being able to solve that problem eventually rather than the current approach. -- Best Wishes, Ashutosh Bapat
pgsql-hackers by date: