Re: [PoC] Reducing planning time when tables have many partitions - Mailing list pgsql-hackers

From Yuya Watari
Subject Re: [PoC] Reducing planning time when tables have many partitions
Date
Msg-id CAJ2pMkaNzmvMUm9igQwRH0AAo39gsjnE1VXupPGyLR2T7ENnUQ@mail.gmail.com
Whole thread Raw
In response to Re: [PoC] Reducing planning time when tables have many partitions  (Andrey Lepikhov <a.lepikhov@postgrespro.ru>)
Responses Re: [PoC] Reducing planning time when tables have many partitions
Re: [PoC] Reducing planning time when tables have many partitions
List pgsql-hackers
Hello Ashutosh and Andrey,

Thank you for your email, and I really apologize for my late response.

On Thu, Sep 7, 2023 at 3:43 PM Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:
> It seems that  you are still investigating and fixing issues. But the
> CF entry is marked as "needs review". I think a better status is
> "WoA". Do you agree with that?

Yes, I am now investigating and fixing issues. I agree with you and
changed the entry's status to "Waiting on Author". Thank you for your
advice.

On Tue, Sep 19, 2023 at 5:21 PM Andrey Lepikhov
<a.lepikhov@postgrespro.ru> wrote:
> Working on self-join removal in the thread [1] nearby, I stuck into the
> problem, which made an additional argument to work in this new direction
> than a couple of previous ones.
> With indexing positions in the list of equivalence members, we make some
> optimizations like join elimination more complicated - it may need to
> remove some clauses and equivalence class members.
> For changing lists of derives or ec_members, we should go through all
> the index lists and fix them, which is a non-trivial operation.

Thank you for looking into this and pointing that out. I understand
that this problem will occur somewhere like your patch [1] quoted
below because we need to modify RelOptInfo->eclass_child_members in
addition to ec_members. Is my understanding correct? (Of course, I
know ec_[no]rel_members, but I doubt we need them.)

=====
+static void
+update_eclass(EquivalenceClass *ec, int from, int to)
+{
+   List       *new_members = NIL;
+   ListCell   *lc;
+
+   foreach(lc, ec->ec_members)
+   {
+       EquivalenceMember  *em = lfirst_node(EquivalenceMember, lc);
+       bool                is_redundant = false;
+
        ...
+
+       if (!is_redundant)
+           new_members = lappend(new_members, em);
+   }
+
+   list_free(ec->ec_members);
+   ec->ec_members = new_members;
=====

I think we may be able to remove the eclass_child_members field by
making child members on demand. v20 makes child members at
add_[child_]join_rel_equivalences() and adds them into
RelOptInfo->eclass_child_members. Instead of doing that, if we
translate on demand when child members are requested,
RelOptInfo->eclass_child_members is no longer necessary. After that,
there is only ec_members, which consists of parent members, so
removing clauses will still be simple. Do you think this idea will
solve your problem? If so, I will experiment with this and share a new
patch version. The main concern with this idea is that the same child
member will be created many times, wasting time and memory. Some
techniques like caching might solve this.

[1] https://www.postgresql.org/message-id/flat/64486b0b-0404-e39e-322d-0801154901f3%40postgrespro.ru

--
Best regards,
Yuya Watari



pgsql-hackers by date:

Previous
From: Dilip Kumar
Date:
Subject: Re: logical decoding and replication of sequences, take 2
Next
From: Etsuro Fujita
Date:
Subject: Comment about set_join_pathlist_hook()