From: Markus Wanner
Subject: Re: bg worker: general purpose requirements
Msg-id: 4C986B80.1020207@bluegap.ch
In response to: Re: bg worker: general purpose requirements (Robert Haas <robertmhaas@gmail.com>)
List: pgsql-hackers
On 09/21/2010 02:49 AM, Robert Haas wrote:
> OK.  At least for me, what is important is not only how many GUCs
> there are but how likely they are to require tuning and how easy it
> will be to know what the appropriate value is.  It seems fairly easy
> to tune the maximum number of background workers, and it doesn't seem
> hard to tune an idle timeout, either.  Both of those are pretty
> straightforward trade-offs between, on the one hand, consuming more
> system resources, and on the other hand, better throughput and/or
> latency.

Hm.. I thought of it the other way around. To me it's more obvious and
direct to determine a minimum and maximum for the number of parallel
jobs I want to run at once, based on the number of spindles, CPUs
and/or nodes in the cluster (in the case of Postgres-R). Admittedly,
not necessarily per database, but at least overall.
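
For illustration only (the GUC names are the ones under discussion
here, the values are made up): on an 8-core machine serving a single
database, I'd expect settings along the lines of:

    min_spare_background_workers = 2
    max_background_workers = 16

The point being that I can derive such numbers from the hardware,
which I cannot do for a timeout.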

I wouldn't know what to set a timeout to. And you didn't make a good
argument for any specific value so far, nor offer any reasoning for
how to find one. It's certainly very workload and feature specific.

> On the other hand, the minimum number of workers to keep
> around per-database seems hard to tune.  If performance is bad, do I
> raise it or lower it?

The same applies to the timeout value.

> And it's certainly not really a hard minimum
> because it necessarily bumps up against the limit on overall number of
> workers if the number of databases grows too large; one or the other
> has to give.

I'd consider the case of min_spare_background_workers * number of
databases > max_background_workers to be a configuration error, about
which the coordinator should warn.
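
A minimal sketch of the check I have in mind, in plain C so it
compiles standalone (the function name and the hard-coded values are
made up; in the real coordinator the variables would be actual GUCs
and the warning would go through ereport()):

    #include <stdio.h>

    /* stand-ins for the proposed GUCs (values made up) */
    static int min_spare_background_workers = 2;  /* per database */
    static int max_background_workers = 8;        /* cluster-wide */

    /* warn if the per-database minimums cannot all be satisfied */
    static void
    check_spare_worker_config(int ndatabases)
    {
        if (min_spare_background_workers * ndatabases > max_background_workers)
            fprintf(stderr,
                    "WARNING: min_spare_background_workers (%d) times "
                    "databases (%d) exceeds max_background_workers (%d)\n",
                    min_spare_background_workers, ndatabases,
                    max_background_workers);
    }

    int
    main(void)
    {
        check_spare_worker_config(5);   /* 2 * 5 > 8, so this warns */
        return 0;
    }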

> I think we need to look for a way to eliminate the maximum number of
> workers per database, too.

Okay, might make sense, yes.

Dropping both of these per-database GUCs, we'd simply be left with
just max_background_workers.

A timeout would mainly limit the maximum amount of time workers sit
around idle. I fail to see how that's more helpful than the proposed
min/max; quite the opposite, it gives no useful guarantees about how
many workers are actually available.

A timeout assumes that the workload remains the same over time and
doesn't cope well with sudden spikes and changes. The proposed min/max
combination, by contrast, forks new bgworkers in advance, even if the
database is already using lots of them, and after a spike it quickly
reduces the number of spare bgworkers back down to the configured
maximum. While not perfect, it's definitely more adaptive to the
workload (at least in the usual case of having only a few databases).
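
To make the comparison concrete, here is a rough sketch of the min/max
policy as I picture it (names and signature invented for illustration);
the coordinator would run something like this per database whenever
the worker counts change:

    #include <stdio.h>

    /*
     * Decide how many workers to fork (> 0) or retire (< 0) for one
     * database, given its current idle count, the per-database min/max
     * of spare workers, and the cluster-wide cap.
     */
    static int
    spare_pool_delta(int idle, int min_spare, int max_spare,
                     int total, int max_total)
    {
        if (idle < min_spare)
        {
            int     want = min_spare - idle;
            int     room = max_total - total;

            return (want < room) ? want : room;  /* fork, up to the cap */
        }
        if (idle > max_spare)
            return -(idle - max_spare);          /* retire the excess */
        return 0;                                /* within bounds */
    }

    int
    main(void)
    {
        /* 1 idle, min 2, max 4 spare, 5 of 8 workers exist: fork 1 */
        printf("%d\n", spare_pool_delta(1, 2, 4, 5, 8));
        /* 6 idle, min 2, max 4 spare: retire 2 */
        printf("%d\n", spare_pool_delta(6, 2, 4, 6, 8));
        return 0;
    }

During a spike, idle drops below the minimum immediately and we fork
ahead of demand; afterwards, idle exceeds the maximum and the excess
is retired right away, with no timeout to wait for.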

Maybe we need a more sophisticated algorithm in the coordinator: for
example, measuring the average number of concurrent jobs per database
over time and adjusting the number of idle backends based on that, the
current workload and max_background_workers, or some such. The min/max
GUCs were simply easier to implement, but I'm open to something more
sophisticated.
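
As a strawman of such an algorithm (all names and the smoothing
constant are made up), the coordinator could keep an exponential
moving average of concurrent jobs per database and size the idle pool
from that:

    #include <stdio.h>
    #include <math.h>

    #define EMA_ALPHA 0.2       /* smoothing factor; larger reacts faster */

    typedef struct DbWorkerStats
    {
        double  avg_jobs;       /* moving average of concurrent jobs */
    } DbWorkerStats;

    /* called periodically with the currently running job count */
    static void
    update_stats(DbWorkerStats *stats, int jobs_now)
    {
        stats->avg_jobs = EMA_ALPHA * jobs_now
                        + (1.0 - EMA_ALPHA) * stats->avg_jobs;
    }

    /* target number of idle workers, capped by the cluster-wide limit */
    static int
    target_idle_workers(const DbWorkerStats *stats, int max_total)
    {
        int     target = (int) ceil(stats->avg_jobs);

        return (target < max_total) ? target : max_total;
    }

    int
    main(void)
    {
        DbWorkerStats stats = { 0.0 };
        int     minute;

        /* simulate a spike of 10 concurrent jobs over a few minutes */
        for (minute = 0; minute < 5; minute++)
            update_stats(&stats, 10);
        printf("target idle workers: %d\n",
               target_idle_workers(&stats, 8));
        return 0;
    }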

Regards

Markus Wanner

