Re: making update/delete of inheritance trees scale better - Mailing list pgsql-hackers
From | Amit Langote |
---|---|
Subject | Re: making update/delete of inheritance trees scale better |
Date | |
Msg-id | CA+HiwqEfEUk-F-DbffxCgZPDHqrFUKcjK=px--k8Si4-Uo-P2A@mail.gmail.com Whole thread Raw |
In response to | Re: making update/delete of inheritance trees scale better (Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>) |
Responses |
Re: making update/delete of inheritance trees scale better
|
List | pgsql-hackers |
On Tue, May 12, 2020 at 9:54 PM Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote: > On Mon, May 11, 2020 at 8:11 PM Amit Langote <amitlangote09@gmail.com> wrote: > > > Per row overhead would be incurred for every row whereas the plan time > > > overhead is one-time or in case of a prepared statement almost free. > > > So we need to compare it esp. when there are 2000 partitions and all > > > of them are being updated. > > > > I assume that such UPDATEs would be uncommon. > > Yes, 2000 partitions being updated would be rare. But many rows from > the same partition being updated may not be that common. We have to > know how much is that per row overhead and updating how many rows it > takes to beat the planning time overhead. If the number of rows is > very large, we are good. Maybe I am misunderstanding you, but the more the rows to update, the more overhead we will be paying with the new approach. > > > Can we plan the scan query to add a sort node to order the rows by tableoid? > > > > Hmm, I am afraid that some piece of partitioning code that assumes a > > certain order of result relations, and that order is not based on > > sorting tableoids. > > I am suggesting that we override that order (if any) in > create_modifytable_path() or create_modifytable_plan() by explicitly > ordering the incoming paths on tableoid. May be using MergeAppend. So, we will need to do 2 things: 1. Implicitly apply an ORDER BY tableoid clause 2. Add result relation RTIs to ModifyTable.resultRelations in the order of their RTE's relid. Maybe we can do that as a separate patch. Also, I am not sure if it will get in the way of someone wanting to have ORDER BY LIMIT for updates. > > > * Tuple re-routing during UPDATE. For now it's disabled so your design > > > should work. But we shouldn't design this feature in such a way that > > > it comes in the way to enable tuple re-routing in future :). > > > > Sorry, what is tuple re-routing and why does this new approach get in its way? > > An UPDATE causing a tuple to move to a different partition. It would > get in its way since the tuple will be located based on tableoid, > which will be the oid of the old partition. But I think this approach > has higher chance of being able to solve that problem eventually > rather than the current approach. Again, I don't think I understand. We do currently (as of v11) re-route tuples when UPDATE causes them to move to a different partition, which, gladly, continues to work with my patch. So how it works is like this: for a given "new" tuple, ExecUpdate() checks if the tuple would violate the partition constraint of the result relation that was passed along with the tuple. If it does, the new tuple will be moved, by calling ExecDelete() to delete it from the current relation, followed by ExecInsert() to find the new home for the tuple. The only thing that changes with the new approach is how ExecModifyTable() chooses a result relation to pass to ExecUpdate() for a given "new" tuple it has fetched from the plan, which is quite independent from the tuple re-routing mechanism proper. -- Amit Langote EnterpriseDB: http://www.enterprisedb.com
pgsql-hackers by date: