Re: How to retain lesser paths at add_path()? - Mailing list pgsql-hackers

From Dmitry Dolgov
Subject Re: How to retain lesser paths at add_path()?
Date
Msg-id 20201105154131.4q3sp2iwqru4hgv2@localhost
Whole thread Raw
In response to Re: How to retain lesser paths at add_path()?  (Kohei KaiGai <kaigai@heterodb.com>)
List pgsql-hackers
> On Tue, Jan 14, 2020 at 12:46:02AM +0900, Kohei KaiGai wrote:
> The v2 patch is attached.
>
> This adds two dedicated lists on the RelOptInfo to preserve lesser paths
> if extension required to retain the path-node to be removed in usual manner.
> These lesser paths are kept in the separated list, so it never expand the length
> of pathlist and partial_pathlist. That was the arguable point in the discussion
> at the last October.
>
> The new hook is called just before the path-node removal operation, and
> gives extension a chance for extra decision.
> If extension considers the path-node to be removed can be used in the upper
> path construction stage, they can return 'true' as a signal to preserve this
> lesser path-node.
> In case when same kind of path-node already exists in the preserved_pathlist
> and the supplied lesser path-node is cheaper than the old one, extension can
> remove the worse one arbitrarily to keep the length of preserved_pathlist.
> (E.g, PG-Strom may need one GpuJoin path-node either pathlist or preserved-
> pathlist for further opportunity of combined usage with GpuPreAgg path-node.
> It just needs "the best GpuJoin path-node" somewhere, not two or more.)
>
> Because PostgreSQL core has no information which preserved path-node can
> be removed, extensions that uses path_removal_decision_hook() has responsibility
> to keep the length of preserved_(partial_)pathlist reasonable.

Hi,

Thanks for the patch! I had a quick look at it and have a few questions:

* What would be the exact point/hook at which an extension can use
  preserved pathlists? I guess it's important, since I can imagine it's
  important for one of the issues mentioned in the thread about such an
  extension have to re-do significant part of the calculations from
  add_path.

* Do you have any benchmark results with some extension using this
  hook? The idea with another pathlist of "discarded" paths sounds like
  a lightweight solution, and indeed I've performed few tests with two
  workloads (simple queries, queries with joins of 10 tables) and the
  difference between master and patched versions is rather small (no
  stable difference for the former, couple of percent for the latter).
  But it's of course with an empty hook, so it would be good to see
  other benchmarks as well.

* Does it make sense to something similar with add_path_precheck,
  which also in some situations excluding paths?

* This part sounds dangerous for me:

> Because PostgreSQL core has no information which preserved path-node can
> be removed, extensions that uses path_removal_decision_hook() has responsibility
> to keep the length of preserved_(partial_)pathlist reasonable.

  since an extension can keep limited number of paths in the list, but
  then the same hook could be reused by another extension which will
  also try to limit such paths, but together they'll explode.



pgsql-hackers by date:

Previous
From: Seino Yuki
Date:
Subject: Re: enable pg_stat_statements to track rows processed by REFRESH MATERIALIZED VIEW
Next
From: Alvaro Herrera
Date:
Subject: Re: redundant error messages