Re: generic plans and "initial" pruning - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: generic plans and "initial" pruning |
Date | |
Msg-id | CA+TgmoZ=92LxPeVJ20vf4vJ4o8dr7Q2seDQR=xLY04puLjDs_A@mail.gmail.com |
In response to | generic plans and "initial" pruning (Amit Langote <amitlangote09@gmail.com>) |
Responses |
Re: generic plans and "initial" pruning
Re: generic plans and "initial" pruning |
List | pgsql-hackers |
On Fri, Dec 24, 2021 at 10:36 PM Amit Langote <amitlangote09@gmail.com> wrote:
> However, using an idea that Robert suggested to me off-list a little
> while back, it seems possible to determine the set of partitions that
> we can safely skip locking. The idea is to look at the "initial" or
> "pre-execution" pruning instructions contained in a given Append or
> MergeAppend node when AcquireExecutorLocks() is collecting the
> relations to lock and consider relations from only those sub-nodes
> that survive performing those instructions. I've attempted
> implementing that idea in the attached patch.

Hmm. The first question that occurs to me is whether this is fully
safe. Currently, AcquireExecutorLocks() calls LockRelationOid() for
every relation involved in the query. That means we will probably lock
at least one relation on which we previously had no lock and thus call
AcceptInvalidationMessages(). If anything relevant has changed, that
will end up marking the plan as no longer valid, and CheckCachedPlan()
will realize this and tell the caller to replan. In the corner case
where we already hold all the required locks, we will not accept
invalidation messages at this point, but we must have done so after
acquiring the last of the required locks, and if that didn't mark the
plan invalid, it can't be invalid now either. Either way, everything is
fine.

With the proposed patch, we might never lock some of the relations
involved in the query. Therefore, if one of those relations has been
modified in some way that would invalidate the plan, we will
potentially fail to discover this, and will use the plan anyway. For
instance, suppose there's one particular partition that has an extra
index and the plan involves an Index Scan using that index. Now suppose
that the scan of the partition in question is pruned, but meanwhile,
the index has been dropped. Now we're running a plan that scans a
nonexistent index. Admittedly, we're not running that part of the plan.
But is that enough for this to be safe?
There are things (like EXPLAIN or auto_explain) that we might try to do
even on a part of the plan tree that we don't try to run. Those things
might break, because for example we won't be able to look up the name
of an index in the catalogs for EXPLAIN output if the index is gone.
This is just a relatively simple example and I think there are probably
a bunch of others. There are a lot of kinds of DDL that could be
performed on a partition that gets pruned away: DROP INDEX is just one
example. The point is that to my knowledge we have no existing case
where we try to use a plan that might be only partly valid, so if we
introduce one, there's some risk there.

I thought for a while, too, about whether changes to some object in a
part of the plan that we're not executing could break things for the
rest of the plan even if we never do anything with the plan but execute
it. I can't quite see any actual hazard. For example, I thought about
whether we might try to get the tuple descriptor for the pruned-away
object and get a different tuple descriptor than we were expecting. I
think we can't, because (1) the pruned object has to be a partition,
and tuple descriptors have to match throughout the partitioning
hierarchy, except for column ordering, which currently can't be changed
after the fact, and (2) IIRC, the tuple descriptor is stored in the plan
and not reconstructed at runtime, and (3) if we don't end up opening the
relation because it's pruned, then we certainly can't do anything with
its tuple descriptor. But it might be worth giving more thought to the
question of whether there's any other way we could be depending on the
details of an object that ended up getting pruned.
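
For illustration only, the race described above can be mimicked with a
small toy model in Python. Nothing here is PostgreSQL code: the Cluster
and Session classes, the partition names p1 and p2, and the index names
are all hypothetical, and a lock conflict is modeled as an immediate
refusal rather than a wait. The sketch only tries to show why holding a
lock on every relation keeps the cached plan consistent, and what
changes once a pruned partition is left unlocked.

```python
class Cluster:
    """Hypothetical shared state: catalog, lock table, invalidation queue."""

    def __init__(self, indexes):
        self.indexes = dict(indexes)  # index name -> partition it belongs to
        self.locks = set()            # partitions locked by the session
        self.inval_queue = []         # partitions whose definition changed

    def drop_index(self, index):
        """Concurrent DDL. Refused (a real system would block) if locked."""
        partition = self.indexes[index]
        if partition in self.locks:
            return False
        del self.indexes[index]
        self.inval_queue.append(partition)
        return True


class Session:
    """Hypothetical backend holding one cached generic plan."""

    def __init__(self, cluster, plan_indexes):
        self.cluster = cluster
        self.plan_indexes = plan_indexes  # partition -> index the plan scans
        self.plan_valid = True
        self.seen = 0                     # invalidations already processed

    def accept_invalidation_messages(self):
        # Any unseen change to a relation in the plan marks the plan invalid.
        for partition in self.cluster.inval_queue[self.seen:]:
            if partition in self.plan_indexes:
                self.plan_valid = False
        self.seen = len(self.cluster.inval_queue)

    def acquire_executor_locks(self, pruned=()):
        # Taking a lock we did not hold also processes invalidations.
        for partition in self.plan_indexes:
            if partition in pruned:
                continue  # the proposed behavior: skip pruned partitions
            if partition not in self.cluster.locks:
                self.cluster.locks.add(partition)
                self.accept_invalidation_messages()
        return self.plan_valid

    def explain(self):
        # EXPLAIN names every index in the plan, pruned or not.
        lines = []
        for partition, index in self.plan_indexes.items():
            if index not in self.cluster.indexes:
                raise LookupError(f"index {index} no longer exists")
            lines.append(f"Index Scan using {index} on {partition}")
        return lines


def run(pruned):
    """Lock, then let DDL race in, then try EXPLAIN on the cached plan."""
    cluster = Cluster({"p1_idx": "p1", "p2_extra_idx": "p2"})
    session = Session(cluster, {"p1": "p1_idx", "p2": "p2_extra_idx"})
    still_valid = session.acquire_executor_locks(pruned=pruned)
    dropped = cluster.drop_index("p2_extra_idx")  # DDL arriving after locking
    try:
        session.explain()
        explain_ok = True
    except LookupError:
        explain_ok = False
    return still_valid, dropped, explain_ok
```

In this model, run(()) returns (True, False, True): with every
partition locked, the DROP INDEX cannot proceed and EXPLAIN still
works. run(("p2",)) returns (True, True, False): the plan is still
considered valid, the index on the unlocked partition is dropped
underneath it, and the EXPLAIN lookup fails.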

> Note that "initial" pruning steps are now performed twice when
> executing generic plans: once in AcquireExecutorLocks() to find
> partitions to be locked, and a 2nd time in ExecInit[Merge]Append() to
> determine the set of partition sub-nodes to be initialized for
> execution, though I wasn't able to come up with a good idea to avoid
> this duplication.

I think this is something that will need to be fixed somehow. Apart
from the CPU cost, it's scary to imagine that the set of nodes on which
we acquired locks might be different from the set of nodes that we
initialize. If we do the same computation twice, there must be some
non-zero probability of getting a different answer the second time,
even if the circumstances under which it would actually happen are
remote. Consider, for example, a function that is labeled IMMUTABLE but
is really VOLATILE. Now maybe you can get the system to lock one set of
partitions and then initialize a different set of partitions. I don't
think we want to try to reason about what consequences that might have
and prove that somehow it's going to be OK; I think we want to nail the
door shut very tightly to make sure that it can't.

--
Robert Haas
EDB: http://www.enterprisedb.com