Re: generic plans and "initial" pruning - Mailing list pgsql-hackers

From Amit Langote
Subject Re: generic plans and "initial" pruning
Date
Msg-id CA+HiwqH9R8tPwbxOd_uxTjjtHpd-yTU4eQW7ndDVWrbah_ACEQ@mail.gmail.com
Whole thread Raw
In response to Re: generic plans and "initial" pruning  (Alvaro Herrera <alvherre@alvh.no-ip.org>)
Responses Re: generic plans and "initial" pruning  (Amit Langote <amitlangote09@gmail.com>)
List pgsql-hackers
Thanks for the review.

On Sun, Apr 3, 2022 at 8:33 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
> I noticed a definitional problem in 0001 that's also a bug in some
> conditions -- namely that the bitmapset "validplans" is never explicitly
> initialized to NIL.  In the original coding, the BMS was always returned
> from somewhere; in the new code, it is passed from an uninitialized
> stack variable into the new ExecInitPartitionPruning function, which
> then proceeds to add new members to it without initializing it first.

Hmm, the following blocks in ExecInitPartitionPruning() define
*initially_valid_subplans:

    /*
     * Perform an initial partition prune pass, if required.
     */
    if (prunestate->do_initial_prune)
    {
        /* Determine which subplans survive initial pruning */
        *initially_valid_subplans = ExecFindInitialMatchingSubPlans(prunestate);
    }
    else
    {
        /* We'll need to initialize all subplans */
        Assert(n_total_subplans > 0);
        *initially_valid_subplans = bms_add_range(NULL, 0,
                                                  n_total_subplans - 1);
    }

AFAICS, both assign *initially_valid_subplans a value whose
computation is not dependent on reading it first, so I don't see a
problem.

Am I missing something?

> Indeed that function's header comment explicitly indicates that it is
> not initialized:
>
> + * Initial pruning can be done immediately, so it is done here if needed and
> + * the set of surviving partition subplans' indexes are added to the output
> + * parameter *initially_valid_subplans.
>
> even though this is not fully correct, because when prunestate->do_initial_prune
> is false, then the BMS *is* initialized.
>
> I have no opinion on where to initialize it, but it needs to be done
> somewhere and the comment needs to agree.

I can see that the comment is insufficient, so I've expanded it as follows:

- * Initial pruning can be done immediately, so it is done here if needed and
- * the set of surviving partition subplans' indexes are added to the output
- * parameter *initially_valid_subplans.
+ * On return, *initially_valid_subplans is assigned the set of indexes of
+ * child subplans that must be initialized along with the parent plan node.
+ * Initial pruning is performed here if needed and in that case only the
+ * surviving subplans' indexes are added.

> I think the names ExecCreatePartitionPruneState and
> ExecInitPartitionPruning are too confusingly similar.  Maybe the former
> should be renamed to somehow make it clear that it is a subroutine for
> the former.

Ah, yes.  I've taken out the "Exec" from the former.

> At the top of the file, there's a new comment that reads:
>
>   * ExecInitPartitionPruning:
>   *     Creates the PartitionPruneState required by each of the two pruning
>   *     functions.
>
> What are "the two pruning functions"?  I think here you mean "Append"
> and "MergeAppend".  Maybe spell that out explicitly.

Actually it meant: ExecFindInitiaMatchingSubPlans() and
ExecFindMatchingSubPlans().  They perform "initial" and "exec" set of
pruning steps, respectively.

I realized that both functions have identical bodies at this point,
except that they pass 'true' and 'false', respectively, for
initial_prune argument of the sub-routine
find_matching_subplans_recurse(), which is where the pruning using the
appropriate set of steps contained in PartitionPruneState
(initial_pruning_steps or exec_pruning_steps) actually occurs.  So,
I've updated the patch to just retain the latter, adding an
initial_prune parameter to it to pass to the aforementioned
find_matching_subplans_recurse().

I've also updated the run-time pruning module comment to describe this change:

  * ExecFindMatchingSubPlans:
- *     Returns indexes of matching subplans after evaluating all available
- *     expressions, that is, using execution pruning steps.  This function can
- *     can only be called during execution and must be called again each time
- *     the value of a Param listed in PartitionPruneState's 'execparamids'
- *     changes.
+ *     Returns indexes of matching subplans after evaluating the expressions
+ *     that are safe to evaluate at a given point.  This function is first
+ *     called during ExecInitPartitionPruning() to find the initially
+ *     matching subplans based on performing the initial pruning steps and
+ *     then must be called again each time the value of a Param listed in
+ *     PartitionPruneState's 'execparamids' changes.

> I think this comment needs to be reworded:
>
> + * Subplans would previously be indexed 0..(n_total_subplans - 1) should be
> + * changed to index range 0..num(initially_valid_subplans).

Assuming you meant to ask to write this without the odd notation, I've
expanded the comment as follows:

- * Subplans would previously be indexed 0..(n_total_subplans - 1) should be
- * changed to index range 0..num(initially_valid_subplans).
+ * Current values of the indexes present in PartitionPruneState count all the
+ * subplans that would be present before initial pruning was done.  If initial
+ * pruning got rid of some of the subplans, any subsequent pruning passes will
+ * will be looking at a different set of target subplans to choose from than
+ * those in the pre-initial-pruning set, so the maps in PartitionPruneState
+ * containing those indexes must be updated to reflect the new indexes of
+ * subplans in the post-initial-pruning set.

I've attached only the updated 0001, though I'm still working on the
others to address David's comments.

-- 
Amit Langote
EDB: http://www.enterprisedb.com

Attachment

pgsql-hackers by date:

Previous
From: Daniel Gustafsson
Date:
Subject: Re: Reword docs of feature "Remove temporary files after backend crash"
Next
From: Justin Pryzby
Date:
Subject: Re: CLUSTER on partitioned index