Re: Should we update the random_page_cost default value? - Mailing list pgsql-hackers

From Robert Treat
Subject Re: Should we update the random_page_cost default value?
Msg-id CABV9wwNNggGGTon39CSHasiQTkmUNyNk_g2ufRU6mU0B5EKpWA@mail.gmail.com
In response to Re: Should we update the random_page_cost default value?  (Andres Freund <andres@anarazel.de>)
Responses Re: Should we update the random_page_cost default value?
List pgsql-hackers
On Mon, Oct 6, 2025 at 1:06 PM Andres Freund <andres@anarazel.de> wrote:
>
> Hi,
>
> On 2025-10-06 12:57:20 -0400, Bruce Momjian wrote:
> > On Mon, Oct  6, 2025 at 11:14:13AM -0400, Andres Freund wrote:
> > > > It obviously contradicts the advice to set the value closer to 1.0. But
> > > > why is that? SSDs are certainly better with random I/O, even if the I/O
> > > > is not concurrent and the SSD is not fully utilized. So the 4.0 seems
> > > > off, the value should be higher than what we got for SSDs ...
> > >
> > > I'd guess that the *vast* majority of PG workloads these days run on networked
> > > block storage. For those typically the actual latency at the storage level is
> > > a rather small fraction of the overall IO latency, which is instead dominated
> > > by network and other related cost (like the indirection to which storage
> > > system to go to and crossing VM/host boundaries).  Because the majority of the
> > > IO latency is not affected by the storage latency, but by network latency, the
> > > random IO/non-random IO difference will play less of a role.
> >
> > Yes, the last time we discussed changing the default random page cost,
> > September 2024, the argument was that while SSDs should be < 4, cloud
> > storage might be > 4, so 4 was still a good value:
> >
> > https://www.postgresql.org/message-id/flat/877caxaxt6.fsf%40wibble.ilmari.org#8a10b7b8cf05410291d076f8def58c29
>
> I think it's exactly the other way round. The difference between random and
> sequential IO is *smaller* on cloud storage than on local storage, due to
> network IO being the biggest component of IO latency on cloud storage - and
> network latency is the same for random and sequential IO.
>

One of the interesting things about Tomas' work, if you look at the
problem from the other end, is that it exposes a line of thinking that
I suspect is almost completely untested "in the field": the idea of
*raising* random_page_cost as a means to improve performance.
Given we have literal decades of anecdata that says lowering it to
something closer to 1 is the right answer, one could make the argument
that perhaps the right default is actually 1, and the recommended
tuning advice would simply become to raise it depending on specifics
of your workload (with some help in explaining how larger numbers are
likely to affect planning). As a default, we "know" (based on
anecdata) that this would improve performance for some large number of
workloads out of the box, and to the degree that others are not
helped, everyone would now be tuning in the same direction. I'll grant
you that this is a rather counterintuitive suggestion.
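
To make the quoted point about networked storage concrete, here is a
rough sketch of how a fixed per-request network cost could compress the
random/sequential latency ratio. The function name and all latency
figures are hypothetical and chosen only for illustration; they are not
measurements, and the model ignores readahead, caching, and concurrency.

```python
# Illustrative model only: every latency below is an assumed number,
# not a benchmark result.

def random_to_sequential_ratio(seq_us, rand_us, network_us=0.0):
    """Ratio of random to sequential per-page latency when each request
    also pays the same fixed network overhead (all in microseconds)."""
    return (rand_us + network_us) / (seq_us + network_us)

# Assumed device latencies: random reads 4x slower than sequential.
local = random_to_sequential_ratio(seq_us=25.0, rand_us=100.0)

# Same device behind an assumed 500 us network round trip.
networked = random_to_sequential_ratio(seq_us=25.0, rand_us=100.0,
                                       network_us=500.0)

print(f"local ratio:     {local:.2f}")
print(f"networked ratio: {networked:.2f}")
```

With these assumed inputs the ratio drops from 4 on the local device to
about 1.14 once the network cost is added, which is the direction of
the effect described upthread. It says nothing about what the right
default actually is.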


Robert Treat
https://xzilla.net


