Re: [PoC] Reducing planning time when tables have many partitions - Mailing list pgsql-hackers

From David Rowley
Subject Re: [PoC] Reducing planning time when tables have many partitions
Date
Msg-id CAApHDvpELk-FTdwrJx0820i6d7N8LCo-ss5a_UYE0ZOO-HEQLg@mail.gmail.com
Whole thread Raw
In response to Re: [PoC] Reducing planning time when tables have many partitions  (Yuya Watari <watari.yuya@gmail.com>)
Responses Re: [PoC] Reducing planning time when tables have many partitions  (Yuya Watari <watari.yuya@gmail.com>)
List pgsql-hackers
Thank you for running all the benchmarks on v10.

On Thu, 8 Dec 2022 at 00:31, Yuya Watari <watari.yuya@gmail.com> wrote:
> The above results show that the reverts I have made in v9-0002 and
> v9-0003 are very important in avoiding degradation. I think we should
> apply these changes again. It is unclear whether v9 or v10 + v9-0002 +
> v9-0003 is better, but the latter performed better in my experiments.

I was hoping to keep the logic which decides to loop over ec_members
or use the bitmap indexes all in equivclass.c, ideally in the iterator
code.

I've looked at the v9-0002 patch and I'm thinking maybe it's ok since
it always loops over ec_nonchild_indexes. We process the base
relations first, so all the EquivalenceMember in PlannerInfo for these
will be at the start of the eq_members list and the Bitmapset won't
have many bitmapwords to loop over.  Additionally, it's only looping
over the nonchild ones, so a large number of partitions existing has
no effect on the number of loops performed.

For v9-0003, I was really hoping to find some kind of workaround so we
didn't need the "if (root->simple_rel_array_size < 32)".  The problem
I have with that is; 1) why is 32 a good choice?, and 2)
simple_rel_array_size is just not a great thing to base the decision
off of.  For #1, we only need to look at the EquivalenceMembers
belonging to base relations here and simple_rel_array_size includes
all relations, including partitions, so even if there's just a few
members belonging to base rels, we may still opt to use the Bitmapset
method.  Additionally, it does look like this patch should be looping
over ec_nonchild_indexes rather than ec_member_indexes and filtering
out the !em->em_is_const && !em->em_is_child EquivalenceMembers.

Since both the changes made in v9-0002 and v9-0003 can just be made to
loop over ec_nonchild_indexes, which isn't going to get big with large
numbers of partitions, then I wonder if we're ok just to do the loop
in all cases rather than conditionally try to do something more
fanciful with counting bits like I had done in
select_outer_pathkeys_for_merge().  I've made v11 work like what
v9-0003 did and I've used v9-0002.  I also found a stray remaining
"bms_membership(eclass->ec_member_indexes) != BMS_MULTIPLE" in
eclass_useful_for_merging() that should have been put back to
"list_length(eclass->ec_members) <= 1".

I've still got a couple of things in mind that I'd like to see done to
this patch.

a) I think the iterator code should have some additional sanity checks
that the results of both methods match when building with
USE_ASSERT_CHECKING. I've got some concerns that we might break
something. The logic about what the em_relids is set to for child
members is a little confusing. See add_eq_member().
b) We still need to think about if adding a RelOptInfo to
PlannerInfo->simple_rel_array[0] is a good idea for solving the append
relation issue. Ideally, we'd have a proper varno for these Vars
instead of setting varno=0 per what's being done in
generate_append_tlist().

I've attached the v11 set of patches.

David

Attachment

pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: Checksum errors in pg_stat_database
Next
From: Zheng Li
Date:
Subject: Re: Support logical replication of DDLs