Re: [PoC] Reducing planning time when tables have many partitions - Mailing list pgsql-hackers

From Yuya Watari
Subject Re: [PoC] Reducing planning time when tables have many partitions
Date
Msg-id CAJ2pMkYR_X-=pq+39-W5kc0OG7q9u5YUwDBCHnkPur17DXnxuQ@mail.gmail.com
Whole thread Raw
In response to Re: [PoC] Reducing planning time when tables have many partitions  (David Rowley <dgrowleyml@gmail.com>)
Responses Re: [PoC] Reducing planning time when tables have many partitions
List pgsql-hackers
Dear David,

On Mon, Jan 30, 2023 at 9:14 PM David Rowley <dgrowleyml@gmail.com> wrote:
> I've attached what I worked on today.

I really appreciate your quick response and the v15 patches. The bug
fixes in the v15 look good to me.

After receiving your email, I realized that this version does not
apply to the current master. This conflict is caused by commits of
2489d76c49 [1] and related. I have attached the rebased version, v16,
to this email. Resolving many conflicts was a bit of hard work, so I
may have made some mistakes.

Unfortunately, the rebased version did not pass regression tests. This
failure is due to segmentation faults regarding a null reference to
RelOptInfo. I show the code snippet that leads to the segfault as
follows.

=====
@@ -572,9 +662,31 @@ add_eq_member(EquivalenceClass *ec, Expr *expr,
Relids relids,
+    i = -1;
+    while ((i = bms_next_member(expr_relids, i)) >= 0)
+    {
+        RelOptInfo *rel = root->simple_rel_array[i];
+
+        rel->eclass_member_indexes =
bms_add_member(rel->eclass_member_indexes, em_index);
+    }
=====

The segfault occurred because root->simple_rel_array[i] is sometimes
NULL. This issue is similar to the one regarding
root->simple_rel_array[0]. Before the commit of 2489d76c49, we only
had to consider the nullability of root->simple_rel_array[0]. We
overcame this problem by creating the RelOptInfo in the
setup_append_rel_entry() function. However, after the commit,
root->simple_rel_array[i] with non-zero 'i' can also be NULL. I'm not
confident with its cause, but is this because non-base relations
appear in the expr_relids? Seeing the commit, I found the following
change in pull_varnos_walker():

=====
@@ -153,7 +161,11 @@ pull_varnos_walker(Node *node,
pull_varnos_context *context)
        Var        *var = (Var *) node;

        if (var->varlevelsup == context->sublevels_up)
+       {
            context->varnos = bms_add_member(context->varnos, var->varno);
+           context->varnos = bms_add_members(context->varnos,
+                                             var->varnullingrels);
+       }
        return false;
    }
    if (IsA(node, CurrentOfExpr))
=====

We get the expr_relids by pull_varnos(). This commit adds
var->varnullingrels to its result. From my observations, indices 'i'
such that root->simple_rel_array[i] is null come from
var->varnullingrels. This change is probably related to the segfault.
I don't understand the commit well, so please let me know if I'm
wrong.

To address this problem, in v16-0003, I moved EquivalenceMember
indexes in RelOptInfo to PlannerInfo. This change allows us to store
indexes whose corresponding RelOptInfo is NULL.

> I'm happier
> that this simple_rel_array[0] entry now only exists when planning set
> operations, but I'd probably feel better if there was some other way
> that felt less like we're faking up a RelOptInfo to store
> EquivalenceMembers in.

Of course, I'm not sure if my approach in v16-0003 is ideal, but it
may help solve your concern above. Since simple_rel_array[0] is no
longer necessary with my patch, I removed the setup_append_rel_entry()
function in v16-0004. However, to work the patch, I needed to change
some assertions in v16-0005. For more details, please see the commit
message of v16-0005. After these works, the attached patches passed
all regression tests in my environment.

Instead of my approach, imitating the following change to
get_eclass_indexes_for_relids() is also a possible solution. Ignoring
NULL RelOptInfos enables us to avoid the segfault, but we have to
adjust EquivalenceMemberIterator to match the result, and I'm not sure
if this idea is correct.

=====
@@ -3204,6 +3268,12 @@ get_eclass_indexes_for_relids(PlannerInfo
*root, Relids relids)
    {
        RelOptInfo *rel = root->simple_rel_array[i];

+       if (rel == NULL)        /* must be an outer join */
+       {
+           Assert(bms_is_member(i, root->outer_join_rels));
+           continue;
+       }
+
        ec_indexes = bms_add_members(ec_indexes, rel->eclass_indexes);
    }
    return ec_indexes;
=====

[1] https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=2489d76c4906f4461a364ca8ad7e0751ead8aa0d

-- 
Best regards,
Yuya Watari

Attachment

pgsql-hackers by date:

Previous
From: "Jonathan S. Katz"
Date:
Subject: Re: First draft of back-branch release notes is done
Next
From: Tomas Vondra
Date:
Subject: Re: pglz compression performance, take two