Re: [PoC] Reducing planning time when tables have many partitions - Mailing list pgsql-hackers

From Yuya Watari
Subject Re: [PoC] Reducing planning time when tables have many partitions
Date
Msg-id CAJ2pMkaLi_2vCSrxwWjXvpGd_EExy4sQFL5-3BAwALRKhsviGA@mail.gmail.com
Whole thread Raw
In response to Re: [PoC] Reducing planning time when tables have many partitions  (David Rowley <dgrowleyml@gmail.com>)
Responses Re: [PoC] Reducing planning time when tables have many partitions
List pgsql-hackers
Hello David,

Thank you very much for your thorough review and valuable comments.

I have refactored the patches based on your feedback and attached the
updated versions (v34). Additionally, I have included a diff between
v33 and v34 for your quick reference.

On Thu, Mar 13, 2025 at 1:53 PM David Rowley <dgrowleyml@gmail.com> wrote:
>
> 1) Can you describe the difference between
> PlannerInfo.top_parent_relid_array and RelOptInfo.top_parent_relids?
> If you've added the PlannerInfo field for performance reasons, then
> that needs to be documented. I think the bar for adding another field
> to do the same thing should be quite high.  The
> RelOptInfo.top_parent_relids field already is commented with
> "redundant, but handy", so having another field in another struct
> that's also redundant leads me to think that some design needs more
> thought.
>
> If you need a cheap way to take the same shortcut as you're doing in
> setup_eclass_child_member_iterator() with "if
> (root->top_parent_relid_array == NULL)", then maybe PlannerInfo should
> have a boolean field to record if there are any other member rels

Thank you for highlighting this. I initially introduced
PlannerInfo.top_parent_relid_array primarily for performance reasons
to quickly determine whether a relation is a parent or child,
particularly in setup_eclass_child_member_iterator(). As you
mentioned, earlier versions utilized the check "if
(root->top_parent_relid_array == NULL)" to skip unnecessary operations
when no child relations exist.

However, I have realized that the same behavior can be achieved by
using root->append_rel_array. Specifically, if a relation is a parent,
the corresponding AppendRelInfo is NULL, and if there are no child
relations at all, the entire array itself is NULL. So,
PlannerInfo.top_parent_relid_array is no longer necessary.

In v34-0001, I removed root->top_parent_relid_array and instead
utilized root->append_rel_array. However, this caused issues in
add_setop_child_rel_equivalences(), since this function adds a new
child EquivalenceMember without building a parent-child relationship
in root->append_rel_array. To address this, I have created a dummy
AppendRelInfo object in v34-0002. This is just a workaround, and there
may be a more elegant solution. I'd greatly appreciate any suggestions
or alternative approaches you might have.

> 2) I think the naming of setup_eclass_child_member_iterator() and
> dispose_eclass_child_member_iterator() is confusing. From the names,
> I'd expect these to only be returning em_is_child == true members, but
> that's not the case.

I agree the original naming was misleading. In v34-0001, I have
renamed these functions to
setup_eclass_all_member_iterator_for_relids() and
dispose_eclass_all_member_iterator_for_relids(). To align with this
change, I have also renamed EquivalenceChildMemberIterator to
EquivalenceAllMemberIterator. Does this new naming better address your
concern?

> 3) The header comment for setup_eclass_child_member_iterator() does
> not seem concise enough. It claims "so that it can iterate over
> EquivalenceMembers in 'ec'.", but what does that mean? The definition
> of "EquivalenceMembers in 'ec'" isn't clear. Is that just the class's
> ec_members, or also the child members that are stored somewhere else.
> Users of this function need to know what they'll get so they know
> which members they need to ignore or which they can assume won't be
> returned. If you don't document that, then it's quite hard to
> determine where the faulty code is when we get bugs. The "relids"
> parameter needs to be documented too.

I have clarified the header comment in v34-0001. It now explicitly
states that the iterator iterates over all parent members and child
members whose em_relids are subsets of the given 'relids'. I have also
clearly documented the parameters, including 'relids'.

> 4) add_transformed_child_version sounds like it does some
> transformation, but all it does is add the EMs for the given
> RelOptInfo to the iterator's list. I don't quite follow what's being
> "transformed". Maybe there's a better name?

Thank you for highlighting this. The original name was indeed
misleading. I have renamed this function to
add_eclass_child_members_to_iterator().

--
Best regards,
Yuya Watari

Attachment

pgsql-hackers by date:

Previous
From: Jakub Wartak
Date:
Subject: Re: Draft for basic NUMA observability
Next
From: vignesh C
Date:
Subject: Re: [PATCH] Fixed creation of empty .log files during log rotation