Home > mailing lists

Re: Speeding up parts of the planner using a binary search treestructure for nodes - Mailing list pgsql-hackers

From	Amit Langote
Subject	Re: Speeding up parts of the planner using a binary search treestructure for nodes
Date	June 3, 2020 03:14:51
Msg-id	CA+HiwqG8OxgxeQNyAvzwcSKXME8gg=xh-wVnyXNNhw9UNPi0VA@mail.gmail.com Whole thread
In response to	Speeding up parts of the planner using a binary search tree structurefor nodes (David Rowley <dgrowleyml@gmail.com>)
List	pgsql-hackers

Tree view

Hi David,

On Fri, May 29, 2020 at 2:16 PM David Rowley <dgrowleyml@gmail.com> wrote:
> In [1] I mentioned that I'd like to look into coding a data structure
> to allow Node types to be looked up more efficiently than what List
> allows.  There are many places in the planner where we must determine
> if a given Node is in some List of Nodes. This is an O(n) operation.
> We could reduce these to as low as O(log n) with a binary search tree.
>
> A few places in the code that does this which can become a problem
> when the lists being searched are large:
>
> get_eclass_for_sort_expr()
> tlist_member()
>
> For the former, it's quite easy to see this come up in the profiles by
> querying a partitioned table with many partitions. e.g
>
> create table listp (a int) partition by list(a);
> select 'create table listp'||x::Text || ' partition of listp for
> values in('||x::text||');' from generate_Series(1,10000)x;
> \gexec
> create index on listp(a);
> explain select * from listp order by a;
>
> There are a few places that this new data structure could be used. My
> primary interest at the moment is EquivalenceClass.ec_members. There
> are perhaps other interesting places, such as planner targetlists, but
> obviously, any changes would need to be proved worthwhile before
> they're made.

I'm glad to see you mention this problem.

I have often wondered if we could do something about these data
structures growing bigger with the number of partitions in the first
place.  I could imagine that when "child" EC members were designed,
Tom maybe didn't think we'd continue relying on the abstraction for
the 10000 partitions case.  Making the lookup algorithm smarter with
binary or hashed search, as you are trying to do, might help us
somewhat, but would it be interesting to consider alternative designs
for the underlying abstraction?  Sorry, I don't have any concrete
ideas to offer at the moment, but thought sharing this perspective of
the problem might inspire others whose may have some.

-- 
Amit Langote
EnterpriseDB: http://www.enterprisedb.com

pgsql-hackers by date:

From: Kyotaro Horiguchi
Date: 03 June 2020, 03:06:22
Subject: Re: Should we remove a fallback promotion? take 2

From: Amit Kapila
Date: 03 June 2020, 03:22:08
Subject: Re: elog(DEBUG2 in SpinLocked section.

Re: Speeding up parts of the planner using a binary search treestructure for nodes - Mailing list pgsql-hackers

Previous

Next