Re: Should we add GUCs to allow partition pruning to be disabled? - Mailing list pgsql-hackers

From Amit Langote
Subject Re: Should we add GUCs to allow partition pruning to be disabled?
Date
Msg-id edaef8ab-37eb-8936-a68a-7981c39dca59@lab.ntt.co.jp
In response to Re: Should we add GUCs to allow partition pruning to be disabled?  (David Rowley <david.rowley@2ndquadrant.com>)
Responses Re: Should we add GUCs to allow partition pruning to be disabled?
Re: Should we add GUCs to allow partition pruning to be disabled?
List pgsql-hackers
Hi David.

On 2018/05/02 8:18, David Rowley wrote:
> On 1 May 2018 at 21:44, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote:
>> About the patch in general, it seems like the newly added documentation
>> talks about "Partition Pruning" as something that *replaces* constraint
>> exclusion.  But, I think "Partition Pruning" is not the thing that
>> replaces constraint exclusion.
> 
> Not sure where you see the mention partition pruning replacing
> constraint exclusion.
> 
>> We used to do partition pruning even
>> before and used constraint exclusion as the algorithm.
> 
> That depends on if you think of partition pruning as the new feature
> or the act of removing unneeded partitions. We seem to have settled on
> partition pruning being the new feature given that we named the GUC
> this way. So I don't quite understand what you mean here.
> 
>>  What's new is the
>> algorithm that we now use to perform partition pruning for declaratively
>> partitioned tables.  Also, the characteristics of the new algorithm are
>> such that it can now be used in more situations, thus making it more
>> useful than the earlier method of partition pruning, so that new features
>> like runtime pruning could be realized.  I like that the patch adds
>> various details about the new pruning features, but think that the wording
>> and the flow could be improved a bit.
>>
>> What do you think?
> 
> I re-read the patch and it still looks fine to me. I'm sure it could
> be made better, but I just don't currently see how. I think it would
> be better if you commented on the specifics of what you think could be
> improved rather than a general comment that it could be improved.

Sorry, I may have been a bit vague.  I've read the patch once more,
treating the phrase "partition pruning" as the name of the new feature and
constraint exclusion as an optimization technique that doubled as
partition pruning until now.  The new feature achieves results faster
and can be used in more cases than constraint exclusion.  With that
reading, I don't see much to complain about in your patch at a high level.

Just a few nitpicks:

+   <para>
+    Partition Pruning is also more powerful than constraint exclusion as
+    partition pruning is not something that is performed only during the
+    planning of a given query.

Maybe don't repeat "partition pruning" in the same sentence.  How
about:

.. more powerful than constraint exclusion as *it* is not something..

Or I'd suggest rewriting it as:

Partition pruning is also more powerful than constraint exclusion as it
can be performed not only during the planning of a given query, but also
during its execution.

If you accept the above rewrite, the next sentences in the paragraph:

+    In certain cases, partition pruning may also
+    be performed during execution of the query as well.  This allows pruning
+    to be performed using values which are unknown during query planning, for
+    example, using parameters defined in a <command>PREPARE</command>
+    statement, using a value obtained from a subquery or using parameters from
+    a parameterized nested loop join.

could be adjusted a bit to read as:

For example, this allows pruning to be performed using values which are
unknown during query planning but will be known during execution, such as
using parameters defined in a <command>PREPARE</command> statement (if a
generic plan is chosen), or using a value obtained from a subquery, or
using values from an outer row of a parameterized nested loop join.
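
As a rough illustration of that distinction, here is a toy Python sketch.
It is not PostgreSQL code: `Partition`, `prune`, and the bounds are made up
for the example, and `None` is just my stand-in for "value not known yet".
At plan time the comparison value is still a parameter, so nothing can be
pruned; once execution supplies the value, pruning is repeated and only the
matching partition survives.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Partition:
    name: str
    lower: int  # inclusive lower bound of the partition key range
    upper: int  # exclusive upper bound of the partition key range


# Hypothetical range partitions, chosen only for illustration.
PARTITIONS = [
    Partition("p1", 0, 100),
    Partition("p2", 100, 200),
    Partition("p3", 200, 300),
]


def prune(partitions: List[Partition], key: Optional[int]) -> List[Partition]:
    """Keep the partitions that could hold rows whose key equals `key`.

    `key is None` models a value that is not known yet, in which case
    no partition can be ruled out.
    """
    if key is None:
        return list(partitions)
    return [p for p in partitions if p.lower <= key < p.upper]


# Plan time: the comparison value is a parameter, so it is unknown.
planned = prune(PARTITIONS, None)

# Execution time: the parameter now has a value, so prune again.
executed = prune(planned, 150)

print([p.name for p in planned])
print([p.name for p in executed])
```

The point is only that the same pruning step is applied twice with
different amounts of information, which is what the proposed wording is
trying to say.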

+   <para>
+    The partition pruning which is performed during execution is done so at
+    either one or both of the following times:

done so at -> done at

+       If partition pruning can be
+       performed here then there is the added benefit of not having to
+       initialize partitions which are pruned.  Partitions which are pruned
+       during this stage will not show up in the query's
+       <command>EXPLAIN</command> or <command>EXPLAIN ANALYZE</command>.  It
+       is possible to determine the number of partitions which were removed
+       using this method by observing the <quote>Subplans Removed</quote>
+       property in the <command>EXPLAIN</command> output.

While it might be OK to keep the last two sentences, I'm not sure about the
first, which seems to spell out an implementation detail -- that
there is an initialization step for partitions.  It's a nice performance
enhancement, sure, but might be irrelevant to the users reading this
documentation.

+       nested loop joins.  Since the value of these parameters may change many
+       times during the execution of the query, partition pruning is performed
+       whenever one of the execution parameters which is being compared to a
+       partition column or expression changes.

How about writing the last part as: "whenever one of the execution
parameters relevant to pruning changes"?
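
To show what I mean by "relevant to pruning changes", here is another toy
Python sketch.  Again, this is only a model under my own assumptions, not
how the executor is written: `rescan_with_pruning` and the `bounds`
dictionary are hypothetical names.  The set of surviving partitions is
recomputed only when the value coming from the outer side differs from the
one used for the previous pruning.

```python
from typing import Dict, List, Tuple


def rescan_with_pruning(
    outer_values: List[int],
    bounds: Dict[str, Tuple[int, int]],
) -> List[Tuple[int, List[str], bool]]:
    """For each outer value, report (value, surviving partitions, repruned).

    `bounds` maps a partition name to its (inclusive lower, exclusive upper)
    key range.  Pruning is redone only when the outer value changes.
    """
    results = []
    last_value = object()  # sentinel that compares unequal to any int
    surviving: List[str] = []
    for value in outer_values:
        repruned = value != last_value
        if repruned:
            surviving = [
                name for name, (lo, hi) in bounds.items() if lo <= value < hi
            ]
            last_value = value
        results.append((value, surviving, repruned))
    return results


# Hypothetical partition bounds and outer-row values.
bounds = {"p1": (0, 100), "p2": (100, 200)}
for row in rescan_with_pruning([10, 10, 150], bounds):
    print(row)
```

With the values above, the second outer row reuses the result computed for
the first one, and pruning is only repeated for the third.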

+   <note>
+    <para>
+     Currently, partition pruning of partitions during the planning of an
+     <command>UPDATE</command> or <command>DELETE</command> command are
+     internally implemented using the constraint exclusion method.  Only
+     <command>SELECT</command> uses the faster partition pruning method.  Also
+     partition pruning performed during execution is only done so for the
+     Append node type.  Both of these limitations are likely to be removed
+     in a future release of <productname>PostgreSQL</productname>.
+    </para>
+   </note>

Do we need to write this given that we decided to decouple even the
UPDATE/DELETE pruning from the constraint_exclusion configuration?  Also,
noting that only Append nodes can use execution-time pruning seems
unnecessary.  I don't see plan node names mentioned like this elsewhere in
the documentation.  But more to the point, it seems like spelling out
finer implementation details (and/or limitations thereof) in the
user-facing documentation.

Thanks again.

Regards,
Amit


