Re: [PoC] Reducing planning time when tables have many partitions - Mailing list pgsql-hackers

From Tom Lane
Subject Re: [PoC] Reducing planning time when tables have many partitions
Date
Msg-id 1384699.1656883702@sss.pgh.pa.us
Whole thread Raw
In response to Re: [PoC] Reducing planning time when tables have many partitions  (Yuya Watari <watari.yuya@gmail.com>)
Responses Re: [PoC] Reducing planning time when tables have many partitions
Re: [PoC] Reducing planning time when tables have many partitions
List pgsql-hackers
Yuya Watari <watari.yuya@gmail.com> writes:
> On Thu, Mar 24, 2022 at 11:03 AM David Rowley <dgrowleyml@gmail.com> wrote:
>> I think a better way to solve this would be just to have a single hash
>> table over all EquivalenceClasses that allows fast lookups of
>> EquivalenceMember->em_expr.

> If the predicate were "em->em_expr == something", the hash table whose
> key is em_expr would be effective. However, the actual predicates are
> not of this type but the following.

> // Find EquivalenceMembers whose relids is equal to the given relids
> (1) bms_equal(em->em_relids, relids)

> // Find EquivalenceMembers whose relids is a subset of the given relids
> (2) bms_is_subset(em->em_relids, relids)

Yeah, that's a really interesting observation, and I agree that
David's suggestion doesn't address it.  Maybe after we fix this
problem, matching of em_expr would be the next thing to look at,
but your results say it isn't the first thing.

I'm not real thrilled with trying to throw hashtables at the problem,
though.  As David noted, they'd be counterproductive for simple
queries.  Sure, we could address that with duplicate code paths,
but that's a messy and hard-to-tune approach.  Also, I find the
idea of hashing on all subsets of relids to be outright scary.
"m is not so large in most cases" does not help when m *is* large.

For the bms_equal class of lookups, I wonder if we could get anywhere
by adding an additional List field to every RelOptInfo that chains
all EquivalenceMembers that match that RelOptInfo's relids.
The trick here would be to figure out when to build those lists.
The simple answer would be to do it lazily on-demand, but that
would mean a separate scan of all the EquivalenceMembers for each
RelOptInfo; I wonder if there's a way to do better?

Perhaps the bms_is_subset class could be handled in a similar
way, ie do a one-time pass to make a List of all EquivalenceMembers
that use a RelOptInfo.

            regards, tom lane



pgsql-hackers by date:

Previous
From: Przemysław Sztoch
Date:
Subject: Re: [PATCH] Completed unaccent dictionary with many missing characters
Next
From: Andrew Dunstan
Date:
Subject: Re: JSON/SQL: jsonpath: incomprehensible error message