Re: [PoC] Reducing planning time when tables have many partitions - Mailing list pgsql-hackers

From Ashutosh Bapat
Subject Re: [PoC] Reducing planning time when tables have many partitions
Date
Msg-id CAExHW5tLCpbnHrd0Gch2p_JKf1THtEwU5b-BuyK-hyhRPSz3Ng@mail.gmail.com
Whole thread Raw
In response to Re: [PoC] Reducing planning time when tables have many partitions  (David Rowley <dgrowleyml@gmail.com>)
Responses Re: [PoC] Reducing planning time when tables have many partitions
List pgsql-hackers
On Mon, Apr 7, 2025 at 4:34 AM David Rowley <dgrowleyml@gmail.com> wrote:
>
> On Sat, 5 Apr 2025 at 16:55, David Rowley <dgrowleyml@gmail.com> wrote:
> > I am still thinking about the duplicate members being returned from
> > the iterator for child join rels due to them being duplicated into
> > each component relid element in ec_childmembers. I did consider if
> > these could just not be duplicated and instead just put into the
> > ec_childmember element according to their lowest component relid. For
> > that to work, all callers that need these would need to ensure they
> > never pass some subset of child_relids when setting up the
> > EquivalenceMemberIterator. I need to study a bit more to understand if
> > that's doable.
>
> It looks like the child members added by
> add_child_join_rel_equivalences() are only required for pathkey
> requirements.  The EquivalenceMember mentions:
>
>  * for an appendrel child.  These members are used for determining the
>  * pathkeys of scans on the child relation and for explicitly sorting the
>  * child when necessary to build a MergeAppend path for the whole appendrel
>  * tree.  An em_is_child member has no impact on the properties of the EC as a
>
> I used the attached .txt file to highlight the places where the
> iterator returned the same member twice and saw only that
> find_ec_member_matching_expr() does this.
>
> Because during createplan, we'll have the specific RelOptInfo that we
> need the EquivalenceMember for, we'll be passing the "relids" of that
> RelOptInfo to setup_eclass_member_iterator(), in which case, I think
> it's fine to store the member in just one of the ec_childmembers[]
> array slots for just one of the relids making up the
> RELOPT_OTHER_JOINREL's component relids as that means it'll be found
> once only due to how eclass_member_iterator_next() looks at all of the
> ec_childmembers[] elements for the given relids.
>
> Doing this also allows the code in add_child_eq_member() to be simplified.
>
> I made this happen in the attached v41 patch, and that's the last
> outstanding issue that I had for this.
>
> I think this is worthy of getting into v18. Does anyone else think
> differently? It'd be good to know that soon.

I took a quick look at patch 0001. I have some questions and cosmetic comments

- foreach(lc2, cur_ec->ec_members)
+ setup_eclass_member_iterator(&it, cur_ec, rel);
+ while ((cur_em = eclass_member_iterator_next(&it)) != NULL)
{
- EquivalenceMember *cur_em = (EquivalenceMember *) lfirst(lc2);
-
/*
* Ignore child members unless they match the request.
*/

Can this step be executed in the iterator itself?
find_ec_member_matching_expr() also does something similar but uses
bms_is_subset() instead of bms_equal(). The BMS comparison operation
to use itself could be an argument to the iterator. The iterator knows
when it has started looking for child members and it also scans the
ec_members by relids, so I guess we could eliminate many comparisons
and boolean variable check.

generate_join_implied_equalities_normal() also compares relids but it
performs three comparisons, probably one of the comparisons can be
pushed into the iterator.

Scanning the code further I see why we can't do relids comparison in
the iterator itself - because we don't store ec_member in the array
elements correspodning each of relids participating in it. Let's say
an EM has em_relids (10, 11, 12) all child rels, but the EM is stored
only in list of array element corresponding to 10. If someone searches
EMs with em_relids containing 11, 12 only, this EM will be missed.

* the iterator
* will return members where the em_relids overlaps the child_relids specified
* when calling setup_eclass_member_iterator(). The caller may wish to ensure
* the em_relids is a subset of the relids they're searching for.

em_relids should be subset of or equal to child_relids specified,
otherwise it may be missed as explained above. We should rephrase the
sentence as " The caller should ensure ...". "May wish to" indicates
an optional thing, but it doesn't look optional to me.

@@ -725,7 +819,9 @@ get_eclass_for_sort_expr(PlannerInfo *root,
newec = makeNode(EquivalenceClass);
newec->ec_opfamilies = list_copy(opfamilies);
newec->ec_collation = collation;
+ newec->ec_childmembers_size = 0;
newec->ec_members = NIL;

How about renaming ec_members to ec_parent_members to make it
explicit. That might break extensions that use ec_members but that way
they would know that the purpose of this list has changed and it will
nudge them to use ec_parent_member of EC iterator appropriately.

@@ -2461,6 +2577,9 @@ rebuild_eclass_attr_needed(PlannerInfo *root)
{
EquivalenceClass *ec = (EquivalenceClass *) lfirst(lc);
+ /* We don't expect any children yet */
+ Assert(ec->ec_childmembers == NULL);
+

This function's prologue or comments inside it do not mention this.
Also the loop over ec_members prior to your change didn't exclude
child ec_members explicitly. So, I guess, we need to add a comment
mentioning why child ec members are not expected here.

- /*
- * We don't use foreach() here because there's no point in scanning
- * newly-added child members, so we can stop after the last
- * pre-existing EC member.
- */
- num_members = list_length(cur_ec->ec_members);
- for (int pos = 0; pos < num_members; pos++)
+ foreach(lc, cur_ec->ec_members)

Since you are adding a new foreach loop, does it make sense to use
foreach_node() instead?

- /*
- * We don't use foreach() here because there's no point in scanning
- * newly-added child members, so we can stop after the last
- * pre-existing EC member.
- */
- num_members = list_length(cur_ec->ec_members);
- for (int pos = 0; pos < num_members; pos++)
+ foreach(lc, cur_ec->ec_members)

foreach_node instead of foreach?

+ /*
+ * We don't expect any EC child members to exist at this point. Ensure
+ * that's the case, otherwise we might be getting asked to do something
+ * this function hasn't been coded for.
+ */
+ Assert(ec->ec_childmembers == NULL);
+

In find_em_for_rel_target() we have Assert(!em->em_is_child), but here
it's different Assert for the same purpose. It would be good to be
consistent in all such places either Asserting !em->em_is_child in the
loop or Assert(ec->ec_childmembers) OR both as you have done in
process_equivalence().

I am also surprised that the function does not have any comment or
Assert about child ec_members. It might be good to explain why we
don't expect any child members in this function. Or do you think that
the explanation is unnecessary.

@@ -1509,6 +1516,13 @@ update_eclasses(EquivalenceClass *ec, int from, int to)
List *new_members = NIL;
List *new_sources = NIL;
+ /*
+ * We don't expect any EC child members to exist at this point. Ensure
+ * that's the case, otherwise we might be getting asked to do something
+ * this function hasn't been coded for.
+ */
+ Assert(ec->ec_childmembers == NULL);
+

Similar to above.


--
Best Wishes,
Ashutosh Bapat



pgsql-hackers by date:

Previous
From: Kirill Reshke
Date:
Subject: Re: speedup COPY TO for partitioned table.
Next
From: Kirill Reshke
Date:
Subject: Re: speedup COPY TO for partitioned table.