Re: Disable parallel query by default - Mailing list pgsql-hackers

From Laurenz Albe
Subject Re: Disable parallel query by default
Date
Msg-id ea48b093a5237d77b461cf82260f1e6f8548a10f.camel@cybertec.at
In response to Disable parallel query by default  ("Scott Mead" <scott@meads.us>)
List pgsql-hackers
On Tue, 2025-05-13 at 17:53 -0400, Scott Mead wrote:
> On Tue, May 13, 2025, at 5:07 PM, Greg Sabino Mullane wrote:
> > On Tue, May 13, 2025 at 4:37 PM Scott Mead <scott@meads.us> wrote:
> > > I'll open by proposing that we prevent the planner from automatically
> > > selecting parallel plans by default
> >
> > That seems a pretty heavy hammer, when we have things like
> > parallel_setup_cost that should be tweaked first.
>
> I agree it's a big hammer and I thought through parallel_setup_cost
> quite a bit myself.  The problem with parallel_setup_cost is that it
> doesn't actually represent the overhead of setting up a parallel
> query for a busy system.  It does define the cost of setup for a
> *single* parallel session, but it cannot accurately express the
> cost of CPU and other overhead associated with the second, third,
> fourth, etc. query that is executed in parallel.  The expense to
> the operating system is a function of the _rate_ of parallel query
> executions being issued.  Without new infrastructure, there's no way
> to define something that will give me a true representation of the
> cost of issuing a query with parallelism.

There is currently no way for the optimizer to know that your system
is under CPU overload.  But I agree with Greg that
parallel_setup_cost is the setting that should be adjusted.
If PostgreSQL is more reluctant to even consider a parallel plan,
that would be a move in the right direction in a case like this:

> > > What is the fallout?  When a high-volume, low-latency query flips to
> > > parallel execution on a busy system, we end up in a situation where
> > > the database is effectively DDOSing itself with a very high rate of
> > > connection establishment and tear-down requests.  Even if the query ends
> > > up being faster (it generally does not), the CPU requirements for the
> > > same workload rapidly double or worse, with most of it being spent
> > > in the OS (context switch, fork(), destroy()).  When looking at the
> > > database, you'll see a high load average, and high wait for CPU with
> > > very little actual work being done within the database.
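
Such a tuning is usually done through the parallel cost GUCs rather than
by disabling the feature outright.  A sketch (the GUC names are real; the
values are illustrative, not recommendations):

```ini
# postgresql.conf - make the planner more reluctant to choose parallel plans
# (defaults: parallel_setup_cost = 1000, parallel_tuple_cost = 0.1,
#  min_parallel_table_scan_size = 8MB; the values below are examples only)
parallel_setup_cost = 100000          # much higher bar for launching workers
parallel_tuple_cost = 1.0             # penalize shipping tuples from workers
min_parallel_table_scan_size = 64MB   # don't consider parallel scans of small tables
```

With settings along these lines, cheap high-frequency queries should stay
serial, while genuinely expensive analytical queries can still go parallel.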

You are painting a bleak picture indeed.  I get to see PostgreSQL databases
in trouble regularly, but I have not seen anything like what you describe.
If a rather cheap, very frequent query is suddenly estimated to be
expensive enough to warrant a parallel plan, I'd suspect that the estimates
must be seriously off.

With an argument like that, you may as well disable nested loop joins.
I have seen enough cases where disabling nested loop joins, without any
deeper analysis, made very slow queries reasonably fast.

To be sure, I often see systems where I recommend disabling parallel
query - in fact, whenever throughput is more important than response time.
But I also see many cases where parallel query works just like it should
and leads to a better user experience.
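
For reference, the usual way to disable parallel query plans on such
throughput-oriented systems is:

```ini
# postgresql.conf - prevent the planner from generating parallel plans
max_parallel_workers_per_gather = 0
```

Since this is a user-settable GUC, it can also be set per session or per
role, so that only the latency-sensitive workload is forced to run serially.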

I have come to disable JIT by default, but not parallel query.
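
That default amounts to a one-line setting, which can likewise be
overridden per session for the queries that do benefit from JIT compilation:

```ini
# postgresql.conf - disable JIT compilation by default
jit = off
```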

The primary problem that I encounter with parallel query is that dynamic
shared memory segments grow to a size where they cause OOM errors.
That's the most frequent reason for me to recommend disabling parallel query.

Yours,
Laurenz Albe


