Re: [PoC] Reducing planning time when tables have many partitions - Mailing list pgsql-hackers
From | David Rowley |
---|---|
Subject | Re: [PoC] Reducing planning time when tables have many partitions |
Date | |
Msg-id | CAApHDvqp0UrCf9FLGyPWruiuU6BfPPquDxE2Wxw7ZS2S1FNH0A@mail.gmail.com Whole thread Raw |
In response to | Re: [PoC] Reducing planning time when tables have many partitions (Yuya Watari <watari.yuya@gmail.com>) |
Responses |
Re: [PoC] Reducing planning time when tables have many partitions
|
List | pgsql-hackers |
On Wed, 9 Aug 2023 at 20:15, Yuya Watari <watari.yuya@gmail.com> wrote: > I agree with your opinion that my patch lacks some explanations, so I > will consider adding more comments. However, I received the following > message from David in March. > > On Thu, Mar 9, 2023 at 6:23 AM David Rowley <dgrowleyml@gmail.com> wrote: > > For the main patch, I've been starting to wonder if it should work > > completely differently. Instead of adding members for partitioned and > > inheritance children, we could just translate the Vars from child to > > top-level parent and find the member that way. I wondered if this > > method might be even faster as it would forego > > add_child_rel_equivalences(). I think we'd still need em_is_child for > > UNION ALL children. So far, I've not looked into this in detail. I > > was hoping to find an idea that would allow some means to have the > > planner realise that a LIST partition which allows a single Datum > > could skip pushing base quals which are constantly true. i.e: > > > > create table lp (a int) partition by list(a); > > create table lp1 partition of lp for values in(1); > > explain select * from lp where a = 1; > > > > Seq Scan on lp1 lp (cost=0.00..41.88 rows=13 width=4) > > Filter: (a = 1) > > I am concerned that fixing the current patch will conflict with > David's idea. Of course, I am now trying to experiment with the above > idea, but I should avoid the conflict if he is working on this. David, > what do you think about this? Is it OK to post a new patch to address > the review comments? I am looking forward to your reply. So, I have three concerns with this patch. 1) I really dislike the way eclass_member_iterator_next() has to check bms_overlap() to filter out unwanted EMs. This is required because of how add_child_rel_equivalences() does not pass the "relids" parameter in add_eq_member() as equivalent to pull_varnos(expr). See this code in master: /* * Transform em_relids to match. Note we do *not* do * pull_varnos(child_expr) here, as for example the * transformation might have substituted a constant, but we * don't want the child member to be marked as constant. */ new_relids = bms_difference(cur_em->em_relids, top_parent_relids); new_relids = bms_add_members(new_relids, child_relids); I understand this is done to support Consts in UNION ALL parents, e.g the following query prunes the n=2 UNION ALL branch postgres=# explain select * from (select 1 AS n,* from pg_Class c1 union all select 2 AS n,* from pg_Class c2) where n=1; QUERY PLAN ---------------------------------------------------------------- Seq Scan on pg_class c1 (cost=0.00..18.13 rows=413 width=277) (1 row) ... but the following (existing) comment is just a lie: Relids em_relids; /* all relids appearing in em_expr */ This means that there's some weirdness on which RelOptInfos we set eclass_member_indexes. Do we just set the EM in the RelOptInfos mentioned in the em_expr, or should it be the ones in em_relids? You can see the following code I wrote in the 0001 patch which tries to work around this problem: + /* + * We must determine the exact set of relids in the expr for child + * EquivalenceMembers as what is given to us in 'relids' may differ from + * the relids mentioned in the expression. See add_child_rel_equivalences + */ + if (parent != NULL) + expr_relids = pull_varnos(root, (Node *) expr); + else + { + expr_relids = relids; + /* We expect the relids to match for non-child members */ + Assert(bms_equal(pull_varnos(root, (Node *) expr), relids)); + } So, you can see we go with the relids from the em_expr rather than what's mentioned in em_relids. I believe this means we need the following line: + /* + * Don't return members which have no common rels with with_relids + */ + if (!bms_overlap(em->em_relids, iter->with_relids)) + continue; I don't quite recall if the em_expr can mention relids that are not in em_relids or not or if em_expr's relids always is a subset of em_relids. I'm just concerned this adds complexity and the risk of mixing up the meaning (even more than it is already in master). I'm not sure I'm confident that all this is correct, and I wrote the 0001 patch. Maybe this can be fixed by changing master so that em_relids always matches pull_varnos(em_expr)? I'm unsure if there are any other complexities other than having to ensure we don't set em_is_const for child members. 2) The 2nd reason is what I hinted at that you quoted in the email I sent you in March. I think if it wasn't for UNION ALL and perhaps table inheritance and we only needed child EMs for partitions of partitioned tables, then I think we might be able to get away with just translating Exprs child -> parent before looking up the EM and likewise when asked to get join quals for child rels, we'd translate the child relids to their top level parents, find the quals then translate those back to child form again. EquivalenceClasses would then only contain a few members and there likely wouldn't be a great need to do any indexing like we are in the 0001 patch. I'm sure someone somewhere probably has a query that would go faster with them, but it's likely going to be rare therefore probably not worth it. Unfortunately, I'm not smart enough to just tell you this will or will not work just off hand. The UNION ALL branch pruning adds complexity that I don't recall the details of. To know, someone would either need to tell me, or I'd need to go try to make it work myself and then discover the reason it can't be made to work. I'm happy for you to try this, but if you don't I'm not sure when I can do it. I think it would need to be at least explored before I'd ever consider thinking about committing this patch. 3) I just don't like the way the patch switches between methods of looking up EMs as it means we could return EMs in a different order depending on something like how many partitions were pruned or after the DBA does ATTACH PARTITION. That could start causing weird problems like plan changes due to a change in which columns were selected in generate_implied_equalities_for_column(). I don't have any examples of actual problems, but it's pretty difficult to prove there aren't any. Of course, I do recall the complaint about the regression for more simple queries and that's why I wrote the iterator code to have it use the linear search when the number of EMs is small, so we can't exactly just delete the linear search method as we'd end up with that performance regression again. I think the best way to move this forward is to explore not putting partitioned table partitions in EMs and instead see if we can translate to top-level parent before lookups. This might just be too complex to translate the Exprs all the time and it may add overhead unless we can quickly determine somehow that we don't need to attempt to translate the Expr when the given Expr is already from the top-level parent. If that can't be made to work, then maybe that shows the current patch has merit. David
pgsql-hackers by date: