Re: Memory consumed by paths during partitionwise join planning - Mailing list pgsql-hackers

From Ashutosh Bapat
Subject Re: Memory consumed by paths during partitionwise join planning
Date
Msg-id CAExHW5s_y_VxHcGZ1XX5_J5Yf34DWHU=Czj6BRf23Qx3LhE6jg@mail.gmail.com
Whole thread Raw
In response to Re: Memory consumed by paths during partitionwise join planning  (Andrei Lepikhov <a.lepikhov@postgrespro.ru>)
Responses Re: Memory consumed by paths during partitionwise join planning
List pgsql-hackers
On Fri, Feb 16, 2024 at 8:42 AM Andrei Lepikhov
<a.lepikhov@postgrespro.ru> wrote:
> Live example: right now, I am working on the code like MSSQL has - a
> combination of NestLoop and HashJoin paths and switching between them in
> real-time. It requires both paths in the path list at the moment when
> extensions are coming. Even if one of them isn't referenced from the
> upper pathlist, it may still be helpful for the extension.

There is no guarantee that every path presented to add_path will be
preserved. Suboptimal paths are freed as and when add_path discovers
that they are suboptimal. So I don't think an extension can rely on
existence of a path. But having a refcount makes it easy to preserve
the required paths by referencing them.

>
> >> About partitioning. As I discovered planning issues connected to
> >> partitions, the painful problem is a rule, according to which we are
> >> trying to use all nomenclature of possible paths for each partition.
> >> With indexes, it quickly increases optimization work. IMO, this can help
> >> a 'symmetrical' approach, which could restrict the scope of possible
> >> pathways for upcoming partitions if we filter some paths in a set of
> >> previously planned partitions.
> >
> > filter or free?
> Filter.
> I meant that Postres tries to apply IndexScan, BitmapScan,
> IndexOnlyScan, and other strategies, passing throughout the partition
> indexes. The optimizer spends a lot of time doing that. So, why not
> introduce a symmetrical strategy and give away from the search some
> indexes of types of scan based on the pathifying experience of previous
> partitions of the same table: if you have dozens of partitions, Is it
> beneficial for the system to find a bit more optimal IndexScan on one
> partition having SeqScans on 999 other?
>
IIUC, you are suggesting that instead of planning each
partition/partitionwise join, we only create paths with the strategies
which were found to be optimal with previous partitions. That's a good
heuristic but it won't work if partition properties - statistics,
indexes etc. differ between groups of partitions.

--
Best Wishes,
Ashutosh Bapat



pgsql-hackers by date:

Previous
From: Ashutosh Bapat
Date:
Subject: Re: partitioning and identity column
Next
From: Daniel Gustafsson
Date:
Subject: Re: numeric_big in make check?