From: Robert Haas
Subject: Re: vacuum_cost_page_miss default value and modern hardware
Msg-id: CA+TgmoZCb3L8w=B2=0nVnM1Hy9FwtigTPMDz=36k+3rBr99Pbw@mail.gmail.com
In response to: Re: vacuum_cost_page_miss default value and modern hardware (Magnus Hagander <magnus@hagander.net>)
Responses: Re: vacuum_cost_page_miss default value and modern hardware (Peter Geoghegan <pg@bowt.ie>)
List: pgsql-hackers
On Thu, Jan 14, 2021 at 12:29 PM Magnus Hagander <magnus@hagander.net> wrote:
> +1 for this in principle.

I'm not opposed to this change and I agree that the relative expense
of dirtying a page is higher than what the current defaults suggest.
So I also think Peter is going in the right general direction, though
like you I am not sure about the specifics.
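
For anyone who doesn't have the numbers in their head: if I have the
current defaults right, the page costs are vacuum_cost_page_hit = 1,
vacuum_cost_page_miss = 10, and vacuum_cost_page_dirty = 20, so the
model already charges a dirtied page twice what it charges a miss, and
lowering vacuum_cost_page_miss would only widen that gap.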

In practice, most users get hosed by not vacuuming aggressively
enough, rather than by vacuuming too aggressively. For instance,
suppose you have a table and workload such that the table needs to be
vacuumed once per hour to maintain good performance. As you make the
table bigger and bigger, you will eventually reach a size where the
configured cost limits aren't high enough to permit this to happen.
The system has no option to disregard the configured limit, even for
an emergency autovacuum. Eventually the user is forced into an outage
either by the table becoming so bloated that VACUUM FULL is required,
or by running out of XIDs. It seems bad that we ship a set of default
settings that are guaranteed to hose any database with a reasonable
number of updates once the database size exceeds some threshold. The
fact that we decreased autovacuum_vacuum_cost_delay by 10x raised that
threshold by 10x, which is good, but the problem remains. I don't know exactly how
to do better, and any proposal in that area would be much more
complicated than what Peter is proposing here, but it's something to
think about.
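
To put rough numbers on the hourly-vacuum example, here's a
back-of-the-envelope sketch (assuming the stock settings,
vacuum_cost_limit = 200, autovacuum_vacuum_cost_delay = 2ms,
vacuum_cost_page_miss = 10, and 8kB pages, and charitably ignoring
index passes and dirty-page costs, which only make the ceiling lower):

    # Largest table a single autovacuum worker can read through in one
    # hour under the stock cost-based delay settings.
    BLCKSZ = 8192                         # bytes per heap page
    vacuum_cost_limit = 200               # cost accrued before each sleep
    autovacuum_vacuum_cost_delay = 0.002  # seconds slept per limit reached
    vacuum_cost_page_miss = 10            # cost charged per page read

    cost_units_per_sec = vacuum_cost_limit / autovacuum_vacuum_cost_delay
    pages_per_sec = cost_units_per_sec / vacuum_cost_page_miss  # 10,000
    gb_per_hour = pages_per_sec * BLCKSZ * 3600 / 1e9           # ~295 GB

    print(f"max table vacuumable per hour: ~{gb_per_hour:.0f} GB")

Once the table gets much bigger than that (or about a tenth of that
with the old 20ms delay), once-per-hour simply stops being possible,
no matter how the workers are scheduled.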

There are also users I've seen get hosed by vacuuming too
aggressively, and that happens in two ways. One is generating too much
dirty data at once. The other is too much read I/O, which pushes hot
data out of the cache and leads to a storm of random I/O later, when
the foreground workload needs that data again, basically killing the
system, sometimes for hours, while it reads everything it lost back in.
That might seem like an argument against further raising the possible
I/O rate, which would be the effect of the change Peter is proposing,
but that's not really my position. I think the bigger problem with all
this is that it's too hard to configure; almost nobody can work out
what a given set of configuration parameters actually means in MB/s or
GB/hour. In the past I've proposed that maybe we should redesign this
whole system to work in those kinds of units, which people actually
understand, but I don't know if that's the right idea. Still another
approach would be to try to reduce the degree to which the cache gets
trashed in the first place, or to soften the effect on future
performance by reading the evicted data back in more efficiently. I
don't really know.
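
Just to illustrate the kind of translation I have in mind, here's a
toy sketch; vacuum_io_budget() is purely hypothetical, and the
formulas just mirror how the delay loop spends its budget of
cost_limit units per cost_delay seconds:

    # Translate the cost-based knobs into approximate MB/s, assuming
    # 8kB pages. Nothing like this helper exists today.
    BLCKSZ = 8192

    def vacuum_io_budget(cost_limit=200, cost_delay=0.002,
                         page_miss=10, page_dirty=20):
        units_per_sec = cost_limit / cost_delay
        read_mb_s = units_per_sec / page_miss * BLCKSZ / 1e6
        # worst case: every page vacuum dirties also had to be read in
        dirty_mb_s = units_per_sec / (page_miss + page_dirty) * BLCKSZ / 1e6
        return read_mb_s, dirty_mb_s

    reads, dirties = vacuum_io_budget()
    print(f"reads:   ~{reads:.0f} MB/s")    # ~82 MB/s at the defaults
    print(f"dirties: ~{dirties:.0f} MB/s")  # ~27 MB/s at the defaults

If the knobs themselves were expressed in those units, nobody would
have to do this arithmetic in the first place.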

TL;DR: This change is fine with me, but this whole system has much
deeper issues.

-- 
Robert Haas
EDB: http://www.enterprisedb.com


