Re: ATTACH/DETACH PARTITION CONCURRENTLY - Mailing list pgsql-hackers
From: Robert Haas
Subject: Re: ATTACH/DETACH PARTITION CONCURRENTLY
Msg-id: CA+TgmoaoZS1aRu9tW3k5Wrqsti-gcFF9yBiiGwE2Mhu5Gpkp6g@mail.gmail.com
In response to: Re: ATTACH/DETACH PARTITION CONCURRENTLY (Alvaro Herrera <alvherre@2ndquadrant.com>)
List: pgsql-hackers
On Thu, Dec 20, 2018 at 3:58 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> Namely: how does this handle the case of partition pruning structure
> being passed from planner to executor, if an attach happens in the
> middle of it and puts a partition in between existing partitions?  Array
> indexes of any partitions that appear later in the partition descriptor
> will change.

I've finally gotten a little more time to work on this.  It took me a while to understand that PartitionedRelPruneInfo assumes that the indexes of partitions in the PartitionDesc don't change between planning and execution, because subplan_map[] and subpart_map[] are indexed by PartitionDesc offset.  I suppose the reason for this is so that we don't have to go to the expense of copying the partition bounds from the PartitionDesc into the final plan, but it somehow seems a little scary to me.  Perhaps I am too easily frightened, but it's certainly a problem from the point of view of this project, which wants to let the PartitionDesc change concurrently.  (The first sketch at the end of this mail shows roughly what I mean.)

I wrote a little patch that stores the relation OIDs of the partitions into the PartitionedRelPruneInfo and then, at execution time, does an Assert() that what it gets matches what existed at plan time (second sketch below).  I figured that a good start would be to find a test case where this fails with concurrent DDL allowed, but so far I haven't succeeded in devising one.  To make the Assert() fail, I need to come up with a case where concurrent DDL has caused the PartitionDesc to be rebuilt without causing an update to the plan.  If I use prepared queries inside of a transaction block, I can continue to run old plans after concurrent DDL has changed things, but I can't actually make the Assert() fail, because the queries continue to use the old plans precisely because they haven't processed invalidation messages, and therefore they also have the old PartitionDesc and everything works.  Maybe if I try it with CLOBBER_CACHE_ALWAYS...

I also had the idea of trying to use a cursor, because if I could start execution of a query, then force a relcache rebuild, then continue executing the query, maybe something would blow up somehow.  But that's not so easy, because I don't think we have any way using SQL to declare a cursor for a prepared query, so I'd need to get a query plan that involves run-time pruning without using parameters, which I'm pretty sure is possible but which I haven't figured out yet.  And even there, the PartitionDirectory concept might preserve us from any damage if the change happens after the executor is initialized, though I'm not sure whether there are any cases where we don't do the first PartitionDesc lookup for a particular table until mid-execution.

Anyway, I think this idea of passing a list of relation OIDs that we saw at planning time through to the executor and cross-checking them might have some legs.  If we only allowed concurrently *adding* partitions and not concurrently *removing* them, then even if we find the case(s) where the PartitionDesc can change under us, we can probably just adjust subplan_map and subpart_map to compensate, since we can iterate through the old and new arrays of relation OIDs and figure out which entries have shifted to higher indexes in the PartitionDesc (third sketch below).  This is all kind of hand-waving at the moment; tips appreciated.
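To make the first point concrete, here is a simplified picture of the dependency I'm worried about.  This is abbreviated from the real planner node, not the literal definition, and relid_map is the field my little patch proposes to add:

    /*
     * Simplified sketch of PartitionedRelPruneInfo.  Both maps are indexed
     * by the offset of the partition in the table's PartitionDesc as it
     * existed at *plan* time, so if a concurrent ATTACH shifts later
     * partitions to higher offsets, the maps silently point at the wrong
     * subplans.
     */
    typedef struct PruneInfoSketch
    {
        int         nparts;         /* length of the arrays below */
        int        *subplan_map;    /* PartitionDesc offset -> subplan index,
                                     * or -1 if the partition was pruned */
        int        *subpart_map;    /* PartitionDesc offset -> nested prune
                                     * info index, or -1 */
        Oid        *relid_map;      /* proposed: partition OIDs recorded at
                                     * plan time, for cross-checking */
    } PruneInfoSketch;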
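With that extra field, the cross-check in my patch boils down to something like the following at executor startup, where partdesc is the PartitionDesc the executor sees now (a sketch only, modulo details):

    /*
     * Verify that the PartitionDesc visible at execution time still matches
     * what the planner recorded.  If no DDL has intervened, the OID arrays
     * must be identical.
     */
    static void
    check_partdesc_matches_plan(PruneInfoSketch *pinfo, PartitionDesc partdesc)
    {
        int         i;

        Assert(partdesc->nparts == pinfo->nparts);
        for (i = 0; i < pinfo->nparts; i++)
            Assert(partdesc->oids[i] == pinfo->relid_map[i]);
    }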
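And here is roughly the compensation step I have in mind for the add-only case.  Because partitions could only have been added, every OID the planner saw must still appear in the new PartitionDesc in the same relative order, so a single lockstep walk over the two OID arrays suffices.  Again, this is an untested sketch, not working code:

    /*
     * Rebuild the pruning maps against a newer PartitionDesc, assuming
     * partitions can only have been added concurrently, never removed.
     * Surviving partitions keep their old subplans but move to their new
     * offsets; concurrently-added partitions get -1, since no subplan
     * exists for them.
     */
    static void
    remap_pruning_maps(PruneInfoSketch *pinfo, PartitionDesc new_desc,
                       int *new_subplan_map, int *new_subpart_map)
    {
        int         old_i = 0;
        int         new_i;

        for (new_i = 0; new_i < new_desc->nparts; new_i++)
        {
            if (old_i < pinfo->nparts &&
                pinfo->relid_map[old_i] == new_desc->oids[new_i])
            {
                /* A partition the planner knew about: carry entries over. */
                new_subplan_map[new_i] = pinfo->subplan_map[old_i];
                new_subpart_map[new_i] = pinfo->subpart_map[old_i];
                old_i++;
            }
            else
            {
                /* A concurrently attached partition: nothing to scan. */
                new_subplan_map[new_i] = -1;
                new_subpart_map[new_i] = -1;
            }
        }
        /* Since nothing can be dropped, every old partition must survive. */
        Assert(old_i == pinfo->nparts);
    }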
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company