Re: [HACKERS] path toward faster partition pruning - Mailing list pgsql-hackers
From | David Rowley |
---|---|
Subject | Re: [HACKERS] path toward faster partition pruning |
Date | |
Msg-id | CAKJS1f_sXkUEU=Ft_a_O5znpmUSShvvtTs9MYtOJV9kjX00Kcg@mail.gmail.com Whole thread Raw |
In response to | Re: [HACKERS] path toward faster partition pruning (Robert Haas <robertmhaas@gmail.com>) |
Responses |
Re: [HACKERS] path toward faster partition pruning
|
List | pgsql-hackers |
On 2 March 2018 at 08:13, Robert Haas <robertmhaas@gmail.com> wrote: > 2. All processing of clauses should happen at plan time, not run time, > so that we're not repeating work. If, for example, a prepared query > is executed repeatedly, the clauses should get fully processed when > it's planned, and then the only thing that should happen at execution > time is that we take the values to which we now have access and use > them to decide what to prune. With this design, we're bound to repeat > at least the portion of the work done by get_partitions_from_clauses() > at runtime, and as things stand, it looks like the current version of > the run-time partition pruning patch repeats the > generate_partition_clauses() work at runtime as well. > > It seems to me as though what we ought to be doing is extracting > clauses that are of the correct general form and produces a list of > <partition-column-index>, <partition-operator-strategy>, <expression> > triplets sorted by increasing partition column index and operator > strategy number and in a form that can be represented as a Node tree. > This can be attached to the plan for use at runtime or, if there are > at least some constants among the expressions, used at plan time for > partition exclusion. So the main interfaces would be something like > this: We did try this already as I also thought the same a while back and even wrote a broken patch to do that. The thing is, PartitionClauseInfo is designed to be optimal for having get_partitions_from_clauses() called on it multiple times over, as will likely happen when run-time pruning is pruning away unneeded partitioned during a Parameterized nested loop join. To make it a Node type, it's not quite as simple as changing arrays to Lists as the keyclauses List contains PartClause, which are also not a Node type, and that struct contains a FmgrInfo, which is also not a node type, so we'd need to go try to make all those Node types (I doubt that's going to happen) ... but ... It's probably not impossible to come up with some intermediate partially processed type that can be a Node type. Ideally, this could be quickly converted into a PartitionClauseInfo during execution. The problem with this is that this adds a rather silly additional step during query planning to convert the intermediate processed list into the fully processed PaitionClauseInfo that get_partitions_from_clauses would still need. I don't think building it is going to cost a huge amount. Presumably, there are not many partitioned tables with 10 rows, so probably having the get_partitions_from_clauses work as quickly as possible is better than saving 100 nanoseconds in executor startup. That being said, there's still a small issue with the run-time pruning patch which is caused by me not pre-processing the clauses during planning. Ideally, I'd be able to pre-process at least enough to determine if any Params match the partition key so that I know if run-time pruning can be used or not. As of now, I'm not doing that as it seems wasteful to pre-process during planning just to get the Param Ids out, then not be able to carry the pre-processed ones over to the executor. We also can't really reuse the pre-processed state that was generated during the planner's calls to generate_partition_clauses() since we'll additionally also be passing in the parameterized path clauses as well as the baserestrictinfo clauses. Given the above, I'm happy to listen to ideas on this, as I'm really not sure what's best here. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
pgsql-hackers by date: