From: Josh Berkus
Subject: Re: Auto-tuning work_mem and maintenance_work_mem
Msg-id: 525589B3.6080604@agliodbs.com
In response to: Auto-tuning work_mem and maintenance_work_mem (Bruce Momjian <bruce@momjian.us>)
List: pgsql-hackers

On 10/09/2013 09:30 AM, Stephen Frost wrote:
>>> I went with shared_buffers because unlike the others, it is a fixed
>>> allocation quantity, while the others are much more variable and harder
>>> to set.  I figured we could keep our 25% estimate of shared_buffers and
>>> everything else would fall in line.
>>
>> I understand, but your proposal changes the logic in the opposite
>> direction.  Maybe it is better to wait for the new GUC parameter and
>> then implement this feature, so it is logical and simply understandable.
>
> I disagree- having a better default than what we have now is going to
> almost certainly be a huge improvement in the vast majority of cases.
> How we arrive at the default isn't particularly relevant as long as we
> document it.  Users who end up using the default don't do so because
> they read the docs and said "oh, yeah, the way they calculated the
> default makes a lot of sense", they end up using it because they never
> open the config file, at all.

FWIW,  I've been using the following calculations as "starting points"
for work_mem with both clients and students.  In 80-90% of cases, the
user never adjusts the thresholds again, so I'd say that passes the test
for a "good enough" setting.

The main goals are (a) not to impose a low default ceiling on work_mem
for people who have lots of RAM, and (b) to lower the limit for users who
have way too many connections on a low-RAM machine.

# Most web applications should use the formula below, because their
# queries often require no work_mem.
# work_mem = ( AvRAM / max_connections ) ROUND DOWN to 2^x
# work_mem = 4MB  # for 2GB server with 300 connections
# Solaris: cut the above in half.

# Formula for most BI/DW applications, or others running many complex
# queries:
# work_mem = ( AvRAM / ( 2 * max_connections ) ) ROUND DOWN to 2^x
# work_mem = 128MB   # DW server with 32GB RAM and 40 connections

AvRAM is "available ram", which for purposes of this approach would be
4X shared_buffers.  So the final formula would be:

shared_buffers * 4 / max_connections = work_mem
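
Just to make the arithmetic concrete, here's a rough sketch in Python
(nothing to do with core; the helper names are my own) of the two
formulas above, including the ROUND DOWN to 2^x step:

def round_down_pow2(n):
    # largest power of two <= n
    p = 1
    while p * 2 <= n:
        p *= 2
    return p

def suggested_work_mem(av_ram, max_connections, dw=False):
    # web formula: AvRAM / max_connections
    # DW formula:  AvRAM / (2 * max_connections)
    divisor = max_connections * (2 if dw else 1)
    return round_down_pow2(av_ram // divisor)

MB = 1024 * 1024
GB = 1024 * MB

# 2GB server with 300 connections -> 4MB, as in the example above
print(suggested_work_mem(2 * GB, 300) // MB, "MB")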

*However*, there are a couple of problems with autotuning the above:

1) it's strongly workload-dependent; we need to know if the user is
doing DW or OLTP.

2) few users adjust their max_connections downwards, even when it's
warranted.

3) we also need to know if the user is on a platform like Solaris or
FreeBSD which doesn't overcommit RAM allocations per-backend.

BTW, in extensive testing of DW workloads, I've never seen an individual
backend allocate more than 3X work_mem total.

So if we want a completely generic limit, I would say:

1MB << (shared_buffers * 2 / max_connections) << 256MB

That is: double shared_buffers and divide by max_connections.  If the
result is over 1MB, raise work_mem to that value, but no further than 256MB.
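
Or, expressed as a clamp in a small Python sketch (purely illustrative;
the helper names and the sample numbers are mine):

MB = 1024 * 1024

def clamp(value, low, high):
    return max(low, min(high, value))

def generic_work_mem(shared_buffers, max_connections):
    # 1MB << (shared_buffers * 2 / max_connections) << 256MB
    return clamp(shared_buffers * 2 // max_connections, 1 * MB, 256 * MB)

# e.g. shared_buffers = 512MB and max_connections = 100 -> about 10MB
print(generic_work_mem(512 * MB, 100) // MB, "MB")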

Overall, our real answer to autotuning work_mem is to have work_mem
admission control, per Kevin's proposal from a couple of years ago.

maintenance_work_mem is easier, because we only really care about the
number of autovacuum daemons, which is usually 3.  So:

8MB << (shared_buffers / autovacuum_workers) << 256MB

... would do it.
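
To put illustrative numbers on that (mine, not measured): with
shared_buffers = 2GB and the default three workers, 2GB / 3 is about
683MB, so the 256MB cap applies; with shared_buffers = 512MB it comes out
around 170MB and neither bound kicks in.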

Note that I'd expect to adjust the upper limits of these ranges each
year, as larger and larger RAM becomes commonplace and as we work out
PG's issues with using large RAM blocks.

I'm not sure that temp_buffers can be autotuned at all.  We'd have to
make assumptions about how many temp tables a particular application
uses, which is going to be either "a lot" or "none at all".  However, at
a stab:

1MB << (shared_buffers * 4 / max_connections) << 512MB
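
As a purely illustrative example: shared_buffers = 1GB with 100
connections gives 1GB * 4 / 100, or roughly 41MB, comfortably inside the
1MB-512MB range.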

QUESTION: at one time (7.2?), we allocated work_mem purely by doubling
RAM requests, which meant that setting work_mem to any value that wasn't
a power of two actually got you the next lower power of two.  Is that no
longer true?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


