Re: Should we add GUCs to allow partition pruning to be disabled? - Mailing list pgsql-hackers

From David G. Johnston
Subject Re: Should we add GUCs to allow partition pruning to be disabled?
Date
Msg-id CAKFQuwZTUZfXLc-jkQVaJ07x_pFJ95iHkVur+_VL_MVVEUD8hg@mail.gmail.com
Whole thread Raw
In response to Re: Should we add GUCs to allow partition pruning to be disabled?  (Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>)
List pgsql-hackers
On Wed, May 2, 2018 at 1:07 AM, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote:
Hi David.

On 2018/05/02 8:18, David Rowley wrote:
> On 1 May 2018 at 21:44, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote:
>
> I re-read the patch and it still looks fine to me. I'm sure it could
> be made better, but I just don't currently see how. I think it would
> be better if you commented on the specifics of what you think could be
> improved rather than a general comment that it could be improved.

Sorry, I may have been a bit vague.  I've read the patch one more time by
considering the phrase "partition pruning" as the name of the new feature
and that constraint exclusion is an optimization technique which doubled
as partition pruning until now.  The new feature achieves results faster
and can be used in more cases than constraint exclusion.  With that
reading, I don't see much to complain about with your patch at a high level.

Except some nitpicking:

+   <para>
+    Partition Pruning is also more powerful than constraint exclusion as
+    partition pruning is not something that is performed only during the
+    planning of a given query.

Maybe, don't repeat "partition pruning" again in the same sentence.  How
about:

​good thought

.. more powerful than constraint exclusion as *it* is not something..

​technically "it" refers to "constraint exclusion" when written this way.

Better would be:

Partition pruning, unlike constraint exclusion, may be performed during query execution.

Saying "not only planning" where there is only one other possible time it happens is unnecessarily vague.
+       If partition pruning can be
+       performed here then there is the added benefit of not having to
+       initialize partitions which are pruned.  Partitions which are pruned
+       during this stage will not show up in the query's
+       <command>EXPLAIN</command> or <command>EXPLAIN ANALYZE</command>.  It
+       is possible to determine the number of partitions which were removed
+       using this method by observing the <quote>Subplans Removed</quote>
+       property in the <command>EXPLAIN</command> output.

While it might be OK to keep the last two sentences, not sure about the
1st, which seems like it's spelling out an implementation detail -- that
there is an initialization step for partitions.  It's a nice performance
enhancement, sure, but might be irrelevant to the users reading this
documentation.

​I would concur with omitting the initialization implementation detail.
 

+       nested loop joins.  Since the value of these parameters may change
many
+       times during the execution of the query, partition pruning is
performed
+       whenever one of the execution parameters which is being compared to a
+       partition column or expression changes.

How about writing the last part as: whenever one of the execution
parameters relevant to pruning changes

​Is it when the values change or for each different value?  The difference being if values are not sorted an something like: 1,2,3,2,3,4,1,2 were to appear.
 

+   <note>
+    <para>
+     Currently, partition pruning of partitions during the planning of an
+     <command>UPDATE</command> or <command>DELETE</command> command are
+     internally implemented using the constraint exclusion method.  Only
+     <command>SELECT</command> uses the faster partition pruning method.
Also
+     partition pruning performed during execution is only done so for the
+     Append node type.  Both of these limitations are likely to be removed
+     in a future release of <productname>PostgreSQL</productname>.
+    </para>
+   </note>

Do we need to write this given that we decided to decouple even the
UPDATE/DELETE pruning from the constraint_exclusion configuration?  Also,
noting that only Append nodes can use execution-time pruning seems
unnecessary.  I don't see plan node names mentioned like this elsewhere in
the documentation.  But more to the point, it seems like spilling out
finer implementation details (and/or limitations thereof) in the
user-facing documentation.

​I suppose it would matter relative to what explain plans the user might see.  I do think the distinction between UPDATE/DELETE and SELECT can be removed here though.  The execution limitation seems potentially worthy though as written I am unable to convert the provided information into something I can use.  Knowing when it cannot happen, even if incomplete, would be more helpful to me.

David J.

pgsql-hackers by date:

Previous
From: Alvaro Herrera
Date:
Subject: Re: Should we add GUCs to allow partition pruning to be disabled?
Next
From: Tom Lane
Date:
Subject: Re: Is a modern build system acceptable for older platforms