Re: Should work_mem be stable for a prepared statement? - Mailing list pgsql-hackers
From: James Hunter
Subject: Re: Should work_mem be stable for a prepared statement?
Msg-id: CAJVSvF4jK4V7Ci_RA_911-BFJzJXNBM-9M2BxNaysBuXMZHbsw@mail.gmail.com
In response to: Re: Should work_mem be stable for a prepared statement? (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-hackers
On Fri, Feb 28, 2025 at 2:34 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Sami Imseih <samimseih@gmail.com> writes:
> > However, I think any GUC that can influence the planner
> > should be considered for consistency in behavior.
> > It was mentioned above with the enable_* GUCs, but another
> > one I can think of is the max_parallel_workers_per_gather which
> > should then force a re-plan if changed. I have seen users need to turn
> > that off in a hurry when it impacts their oltp workload.
>
> The argument for treating work_mem specially is that it has effects at
> both plan time and run time, so that the planner's cost assumptions
> are invalidated if the executor uses a different value than the
> planner did. I don't believe that argument applies to anything else;
> certainly it doesn't apply to the enable_* flags.

I take the view:

* for the plan time effects, the point of a prepared statement is to
plan exactly once, at the time the statement is prepared. So I don't
think it makes sense to replan when a GUC changes. (So, no to option
(b).)

* for the run time effects, the current run time state of the world
should control. (So, no to option (a).) But I would leave (a) open as
an option for extensions.

> I'm also not
> convinced that the argument requires us to solve the problem by
> re-planning. It would work just as well to stamp each PlannedStmt
> with the value that the planner used and refer to that in the
> executor instead of looking directly at the GUC.

I propose something similar, in [1], except that instead of stamping
the PlannedStmt with a single (pair of) GUC value, I stamp it with a
separate value for every Plan or SubPlan that needs it. And then a
hook / extension can impose Jeff's option (a), which is what you
describe here.

> This is all kind of moot though, now that Jeff has revealed that
> what he's really interested in is some sort of refactoring.
> Maybe that refactoring is one that would conveniently apply to
> other GUCs, or maybe it isn't. I'm content to await details
> before arguing about what we "should" do.

I think refactoring is good (as you'd expect, given my proposal!),
but -- expanding on what you wrote above -- I think it's also
important to consider "plan time effects" and "run time effects"
separately. The same GUC / field / whatever might sometimes be used at
both plan and run time, but conceptually these are different.

A plan-time setting is intended to bias the planner so that it chooses
certain classes of plans over others. A plan-time setting is "wrong"
only so far as it causes the planner to choose a sub-optimal plan. A
run-time setting, by contrast, reflects the current state of the
PostgreSQL instance at the time the query is run.

PQ is a good example. At plan time, we choose PQ, and place a Gather
node on the Plan, based on an assumption
(max_parallel_workers_per_gather) about how much parallelism we'll
actually get at run time. And then, at run time, we assign actual
workers based on max_parallel_workers and other queries running at the
same time.

It may turn out that we can't actually get *any* workers at run time,
in which case we have added a Gather node for nothing. This has a
non-zero cost. If we had known, at plan time, that we wouldn't get any
parallel workers, then we should have chosen a serial plan.

But, so what? Planning is always an estimate. Yeah, "Gather" overhead
is lower than the cost of choosing a sub-optimal join method, but I
think the same principle applies:

* at plan time, we make the best estimate and prediction we can, based
on info available at that time;

* at run time, we make the best decision we can, based on the actual
state of the database instance.

And in cases where we think, at run time, that the planner's
assumptions were so wrong that they'll lead us to execute a
sub-optimal plan, then maybe we can re-plan. But I explicitly wouldn't
re-plan a prepared statement, since the customer's expectation is that
it has already been prepared.
James Hunter

[1] https://www.postgresql.org/message-id/flat/CAJVSvF5n3_uEGW5GZSRehDuTfz7XVDohbn7tVJ%2B2ZnweQEVFrQ%40mail.gmail.com#abc6e69a396bb9f6505bf33260670a1f