Re: Parameter for planner estimate of recursive queries - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Parameter for planner estimate of recursive queries
Date
Msg-id 2770446.1648056989@sss.pgh.pa.us
Whole thread Raw
In response to Re: Parameter for planner estimate of recursive queries  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Parameter for planner estimate of recursive queries
List pgsql-hackers
Robert Haas <robertmhaas@gmail.com> writes:
> On Tue, Jan 25, 2022 at 4:44 AM Peter Eisentraut
> <peter.eisentraut@enterprisedb.com> wrote:
>> On the one hand, this smells like a planner hint.  But on the other
>> hand, it doesn't look like we will come up with proper graph-aware
>> selectivity estimation system any time soon, so just having all graph
>> OLTP queries suck until then because the planner hint is hardcoded
>> doesn't seem like a better solution.  So I think this setting can be ok.

> I agree. It's a bit lame, but seems pretty harmless, and I can't see
> us realistically doing a lot better with any reasonable amount of
> work.

Yeah, agreed on all counts.  The thing that makes it lame is that
there's no reason to expect that the same multiplier is good for
every recursive query done in an installation, or even in a session.

One could imagine dealing with that by adding custom syntax to WITH,
as we have already done once:

WITH RECURSIVE cte1 AS SCALE 1.0 (SELECT ...

But I *really* hesitate to go there, mainly because once we do
something like that we can't ever undo it.  I think Simon's
proposal is a reasonable low-effort compromise.

Some nitpicks:

* The new calculation needs clamp_row_est(), since the float
GUC could be fractional or even zero.

* Do we want to prevent the GUC value from being zero?  It's not
very sensible, plus I think we might want to reserve that value
to mean "use the built-in calculation", in case we ever do put
in some smarter logic here.  But I'm not sure what a reasonable
non-zero lower bound would be.

* The proposed docs claim that a smaller setting works by biasing
the planner towards fast-start plans, but I don't think I believe
that explanation.  I'd venture that we want text more along the
lines of "This may help the planner choose the most appropriate
method for joining the work table to the query's other tables".

            regards, tom lane



pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: ubsan
Next
From: Andres Freund
Date:
Subject: Re: shared-memory based stats collector - v67