Re: [PoC] Reducing planning time when tables have many partitions - Mailing list pgsql-hackers

From David Rowley
Subject Re: [PoC] Reducing planning time when tables have many partitions
Date
Msg-id CAApHDvpf+2rUi3rOepS9jMjiS3ezH9VcZisS+A+Ob5G5YpBZxg@mail.gmail.com
Whole thread Raw
In response to Re: [PoC] Reducing planning time when tables have many partitions  (Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>)
Responses Re: [PoC] Reducing planning time when tables have many partitions
Re: [PoC] Reducing planning time when tables have many partitions
List pgsql-hackers
On Tue, 8 Apr 2025 at 04:54, Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:
> - foreach(lc2, cur_ec->ec_members)
> + setup_eclass_member_iterator(&it, cur_ec, rel);
> + while ((cur_em = eclass_member_iterator_next(&it)) != NULL)
> {
> - EquivalenceMember *cur_em = (EquivalenceMember *) lfirst(lc2);
> -
> /*
> * Ignore child members unless they match the request.
> */
>
> Can this step be executed in the iterator itself?

Yes, all the filtering requested *could* be done within the iterator.
The problem with doing that is that there's not even nearly a single
pattern to what callers need. Here's a summary of all 7 users of it:

1. get_eclass_for_sort_expr:  CHILD members must have em_relids EQUAL
to rel (TYPE 1)
2. generate_join_implied_equalities_normal: ALL members must be a
SUBSET of join_relids (TYPE 2)
3. generate_implied_equalities_for_column: ALL members must have
em_relids EQUAL to rel->relids (TYPE 3)
4. find_em_for_rel: ALL members must be a SUBSET of rel->relids (TYPE 2)
5. find_ec_member_matching_expr: CHILD members must have em_relids
SUBSET of relids (TYPE 4)
6. find_computable_ec_member: CHILD members must have em_relids SUBSET
of relids (TYPE 4)
7. match_pathkeys_to_index: ALL members must have em_relids EQUAL to
index->rel->relids (TYPE 3)

So, 4 distinct requirements and only types 3 and 4 are repeated.
Certainly, we could code up the iterator to handle all those different
requirements, but would that code be easy to follow? Do you think 1
iterator should handle all those requirements with a set of IF
statements? I think that would be more difficult to read than how the
patch has it now.

An alternative that I did consider, and even put a comment on the 2nd
paragraph of the header comment for EquivalenceMemberIterator about,
is the possibility of having multiple iterators for different
purposes. That would save the spaghetti code of IF statements doing it
in 1 iterator and alleviate the performance overhead of the spaghetti
code too.  The problem is that there's still not a common pattern
that's followed often enough. If I wrote 4 iterators, I'd need to give
them meaningful names. Names like
eclass_member_iterator_children_with_subset_of_next are a too long and
if I was writing new code that was to call some function like that,
I'd probably be happier just doing the filtering locally than trying
to find which of the 4 iterators is the one I want.

> find_ec_member_matching_expr() also does something similar but uses
> bms_is_subset() instead of bms_equal(). The BMS comparison operation
> to use itself could be an argument to the iterator. The iterator knows
> when it has started looking for child members and it also scans the
> ec_members by relids, so I guess we could eliminate many comparisons
> and boolean variable check.

Performance-wise, it would be nicer to filter within the iterator as
calling next() once per loop means an external function call on each
loop. Unfortunately, I don't think there's a common enough pattern to
warrant all the iterators that are needed. I feel we'd need to see
some demonstratable performance improvements to warrant adding
special-purpose iterators. I expect those will be quite small, but
happy if someone proves me wrong. I'd rather not hold up the patch for
a hypothetical 1-2% on some workload when the current patch as-is
gives 20x for some tested other workloads.

> generate_join_implied_equalities_normal() also compares relids but it
> performs three comparisons, probably one of the comparisons can be
> pushed into the iterator.

Could be. There's also the overhead of having to document each
iterator so that callers know what's handled and what they need to
handle themselves.

> Scanning the code further I see why we can't do relids comparison in
> the iterator itself - because we don't store ec_member in the array
> elements correspodning each of relids participating in it. Let's say
> an EM has em_relids (10, 11, 12) all child rels, but the EM is stored
> only in list of array element corresponding to 10. If someone searches
> EMs with em_relids containing 11, 12 only, this EM will be missed.

That currently won't happen as the only child members which have
multiple relids is em_relids are RELOPT_OTHER_JOINREL and they're only
searched during create plan. Those searches are always done with all
the relids belonging to that RELOPT_OTHER_JOINREL. This is the reason
I think it's ok to store the member for that in the slot for relid 10
(in this example)

> * the iterator
> * will return members where the em_relids overlaps the child_relids specified
> * when calling setup_eclass_member_iterator(). The caller may wish to ensure
> * the em_relids is a subset of the relids they're searching for.
>
> em_relids should be subset of or equal to child_relids specified,
> otherwise it may be missed as explained above. We should rephrase the
> sentence as " The caller should ensure ...". "May wish to" indicates
> an optional thing, but it doesn't look optional to me.

That comment does not know what the caller needs to do. That's up to
the caller. If every caller needed to do that, we might as well just
filter non-matching ones out in the iterator itself.

I did rewrite that comment quite a bit in the attached version as I
still wasn't quite happy with the wording. I still didn't make any
assumptions about what the caller wants, however.

> @@ -725,7 +819,9 @@ get_eclass_for_sort_expr(PlannerInfo *root,
> newec = makeNode(EquivalenceClass);
> newec->ec_opfamilies = list_copy(opfamilies);
> newec->ec_collation = collation;
> + newec->ec_childmembers_size = 0;
> newec->ec_members = NIL;
>
> How about renaming ec_members to ec_parent_members to make it
> explicit. That might break extensions that use ec_members but that way
> they would know that the purpose of this list has changed and it will
> nudge them to use ec_parent_member of EC iterator appropriately.

The reason I didn't rename that field is mainly because I'm fine with
the current name. In my view, I think of parent members as normal
members of the class and only the child members are 2nd class members.
You can see evidence of that line of thought in the patch with
add_eq_member() not being called add_parent_eq_member().

One reason that I did think about in favour of a rename was to
purposefully break any extension code using that field.  I didn't
look, but I imagine that Citus and maybe Timescale could be extensions
that might have code that needs to be updated. However, I do imagine
it won't take them very long to notice their stuff isn't working. Do
you think we need to make it so their compiler prompts them to fix the
code? The rename is permanent, and breaking extensions is just a
transient thing between now and whenever extension authors release
their v18-supported extension. If I thought the breakage was something
subtle, I'd be more inclined to agree to the rename.

I certainly could get on board with renaming if there's consensus to
do so. I don't think that's going to happen in the next 9 hours. Is it
worth pushing this out to v19 because we can't agree on the name of a
struct field? I'd be disappointed if we miss this because of something
so trivial.

> @@ -2461,6 +2577,9 @@ rebuild_eclass_attr_needed(PlannerInfo *root)
> {
> EquivalenceClass *ec = (EquivalenceClass *) lfirst(lc);
> + /* We don't expect any children yet */
> + Assert(ec->ec_childmembers == NULL);
> +
>
> This function's prologue or comments inside it do not mention this.
> Also the loop over ec_members prior to your change didn't exclude
> child ec_members explicitly. So, I guess, we need to add a comment
> mentioning why child ec members are not expected here.

I've added some comments there. I had already done something similar
in analyzejoins.c

> - /*
> - * We don't use foreach() here because there's no point in scanning
> - * newly-added child members, so we can stop after the last
> - * pre-existing EC member.
> - */
> - num_members = list_length(cur_ec->ec_members);
> - for (int pos = 0; pos < num_members; pos++)
> + foreach(lc, cur_ec->ec_members)
>
> Since you are adding a new foreach loop, does it make sense to use
> foreach_node() instead?

OK. I am in favour of new code using those. It's a nicer API. I've
edited the patch

> + /*
> + * We don't expect any EC child members to exist at this point. Ensure
> + * that's the case, otherwise we might be getting asked to do something
> + * this function hasn't been coded for.
> + */
> + Assert(ec->ec_childmembers == NULL);
> +
>
> In find_em_for_rel_target() we have Assert(!em->em_is_child), but here
> it's different Assert for the same purpose. It would be good to be
> consistent in all such places either Asserting !em->em_is_child in the
> loop or Assert(ec->ec_childmembers) OR both as you have done in
> process_equivalence().

I think this might not be obvious when reading the patch, but I think
where the confusion might be starting is that
"Assert(!cur_em->em_is_child);" isn't asserting what it used to. In
unpatched master, there are a few places that do that to verify the
function is never called after child members have been added.  The
patch changes the meaning of this Assert. The comment was updated in a
bid to try to make this more obvious. The Assert's new purpose is to
ensure no child members snuck into the ec_members list. For the
functions that fundamentally don't expect children to exist yet, I've
added an Assert(cur_ec->ec_childmembers == NULL) to ensure that
remains true. I believe I'm consistent with that and I believe that's
a worthwhile assert to make.

Occasionally, there are Asserts to check there are no child members in
ec_members elsewhere in places where ec_childmembers might contain
children. I believe all the Asserts doing this were all converted from
an "if (ec->em_is_child) continue;" statement. I don't feel the need
to add this assert everywhere that we loop over ec_members. It does
not seem likely that we'd get an em_is_child member in ec_members as
there's only a single place where the ec_members list is adjusted.
I'd be more easily convinced we can just delete all those Asserts than
I could be to add more.

> I am also surprised that the function does not have any comment or
> Assert about child ec_members. It might be good to explain why we
> don't expect any child members in this function. Or do you think that
> the explanation is unnecessary.
>
> @@ -1509,6 +1516,13 @@ update_eclasses(EquivalenceClass *ec, int from, int to)
> List *new_members = NIL;
> List *new_sources = NIL;
> + /*
> + * We don't expect any EC child members to exist at this point. Ensure
> + * that's the case, otherwise we might be getting asked to do something
> + * this function hasn't been coded for.
> + */
> + Assert(ec->ec_childmembers == NULL);
> +
>
> Similar to above.

The patch isn't adding any new reason for this. If you want this then
I'll need to go and figure out all the existing reasons and document
them. However, I think it's a simple case that these functions are
always called before children exist and are coded with that
assumption. I think the comment I added for this in analyzejoins.c
works ok.  I think for cases such as process_equivalence(), it's much
more fundamental that children can't exist yet, as we're still working
on figuring out what the parent members are going to be. So it's not a
case of the function just not being coded to handle children as it is
in analyzejoins.c.

I've attached a version with the foreach_node() changes included. I've
also done some extensive work on the comments to try to make the API
of the interator easier to understand.  There's also a small
optimisation made to eclass_member_iterator_next() to move a goto
label to save having to recheck if the ListCell is NULL when moving to
the next list. We're already pointing to the proven non-empty list
head at that point, so we don't need to check the ListCell is NULL
twice.

David

Attachment

pgsql-hackers by date:

Previous
From: Thomas Munro
Date:
Subject: Re: BitmapHeapScan streaming read user and prelim refactoring
Next
From: Ashutosh Bapat
Date:
Subject: Re: [PATCH] clarify palloc comment on quote_literal_cstr