Re: [PoC] Reducing planning time when tables have many partitions - Mailing list pgsql-hackers
From | David Rowley |
---|---|
Subject | Re: [PoC] Reducing planning time when tables have many partitions |
Date | |
Msg-id | CAApHDvpvofFuGLyv0b_HsZXoRxa6rmG6=BSE65VbN5datyhAJQ@mail.gmail.com Whole thread Raw |
In response to | Re: [PoC] Reducing planning time when tables have many partitions (David Rowley <dgrowleyml@gmail.com>) |
Responses |
Re: [PoC] Reducing planning time when tables have many partitions
|
List | pgsql-hackers |
esOn Tue, 9 Aug 2022 at 19:10, David Rowley <dgrowleyml@gmail.com> wrote: > I've not had a chance to look at the 0003 patch yet. I've looked at the 0003 patch now. The performance numbers look quite impressive, however, there were a few things about the patch that I struggled to figure what they were done the way you did them: + root->eq_not_children_indexes = bms_add_member(root->eq_not_children_indexes, Why is that in PlannerInfo rather than in the EquivalenceClass? if (bms_equal(rel->relids, em->em_relids)) { rel->eclass_member_indexes = bms_add_member(rel->eclass_member_indexes, em_index); } Why are you only adding the eclass_member_index to the RelOptInfo when the em_relids contain a singleton relation? I ended up going and fixing the patch to be more how I imagined it. I've ended up with 3 Bitmapset fields in EquivalenceClass; ec_member_indexes, ec_nonchild_indexes, ec_norel_indexes. I also trimmed the number of helper functions down for obtaining the minimal set of matching EquivalenceMember indexes to just: Bitmapset * get_ecmember_indexes(PlannerInfo *root, EquivalenceClass *ec, Relids relids, bool with_children, bool with_norel_members) Bitmapset * get_ecmember_indexes_strict(PlannerInfo *root, EquivalenceClass *ec, Relids relids, bool with_children, bool with_norel_members) I'm not so much a fan of the bool parameters, but it seemed better than having 8 different functions with each combination of the bool paramters instead of 2. The "strict" version of the function takes the intersection of eclass_member_indexes for each rel mentioned in relids, whereas the non-strict version does a union of those. Each then intersect that with all members in the 'ec', or just the non-child members when 'with_children' is false. They both then optionally bms_add_members() the ec_norel_members if with_norel_members is true. I found it difficult to figure out the best order to do the intersection. That really depends on if the particular query has many EquivalenceClasses with few EquivalenceMembers or few EquivalenceClasses with many EquivalenceMembers. bms_int_members() always recycles the left input. Ideally, that would always be the smallest Bitmapset. Maybe it's worth inventing a new version of bms_int_members() that recycles the input with the least nwords. That would give the subsequent bms_next_member() calls an easier time. Right now they'll need to loop over a bunch of 0 words at the end for many queries. A few problems I ran into along the way: 1. generate_append_tlist() generates Vars with varno=0. That causes problems when we add Exprs from those in add_eq_member() as there is no element at root->simple_rel_array[0] to add eclass_member_indexes to. 2. The existing comment for EquivalenceMember.em_relids claims "all relids appearing in em_expr", but that's just not true when it comes to em_is_child members. So far, I fixed #1 by adding a hack to setup_simple_rel_arrays() to do "root->simple_rel_array[0] = makeNode(RelOptInfo);" I'm not suggesting that's the correct fix. It might be possible to set the varnos to the varnos from the first Append child instead. The fact that #2 is not true adds quite a bit of complexity to the patch and I think the patch might even misbehave as a result. It seems there are cases where a child em_relids can contain additional relids that are not present in the em_expr. For example, when a UNION ALL child has a Const in the targetlist, as explained in a comment in add_child_rel_equivalences(). However, there also seem to be cases where the opposite is true. I had to add the following code in add_eq_member() to stop a regression test failing: if (is_child) expr_relids = bms_add_members(expr_relids, relids); That's to make sure we add eclass_member_indexes to each RelOptInfo mentioned in the em_expr. After doing all that, I noticed that your benchmark was showing that create_join_clause() was the new bottleneck. This was due to having to loop so many times over the ec_sources to find an already built RestrictInfo. I went off and added some new code to optimize the lookup of those in a similar way by adding a new Bitmapset field in RelOptInfo to index which ec_sources it mentioned, which meant having to move ec_sources into PlannerInfo. I don't think this part of the patch is quite right yet as the code I have relies on em_relids being the same as the ones mentioned in the RestrictInfo. That seems not true for em_is_child EMs, so I think we probably need to add a new field to EquivalenceMember that truly is just pull_varnos from em_expr, or else look into some way to make em_relids mean that (like the comment claims). Here are my results from running your benchmark on master (@f6c750d31) with and without the attached patch. npart master (ms) patched (ms) speedup 2 0.28 0.29 95.92% 4 0.37 0.38 96.75% 8 0.53 0.56 94.43% 16 0.92 0.91 100.36% 32 1.82 1.70 107.57% 64 4.05 3.26 124.32% 128 10.83 6.69 161.89% 256 42.63 19.46 219.12% 512 194.31 42.60 456.14% 1024 1104.02 98.37 1122.33% This resulted in some good additional gains in planner performance. The 1024 partition case is now about 11x faster on my machine instead of 4x. The 2 partition does regress slightly. There might be a few things we can do about that, for example, move ec_collation up 1 to shrink EquivalenceClass back down closer to the size it was before. [1] might be enough to make up for the remainder. I've attached a draft patch with my revisions. David [1] https://commitfest.postgresql.org/39/3810/
Attachment
pgsql-hackers by date: