On 10/6/25 11:02, Michael Banck wrote:
> Hi,
>
> On Mon, Oct 06, 2025 at 02:59:16AM +0200, Tomas Vondra wrote:
>> I started looking at how we calculated the 4.0 default back in 2000.
>> Unfortunately, there's not a lot of info, as Tom pointed out in 2024 [2].
>> But he outlined how the experiment worked:
>>
>> - generate large table (much bigger than RAM)
>> - measure runtime of seq scan
>> - measure runtime of full-table index scan
>> - calculate how much more expensive a random page access is
>
> Ok, but I also read somewhere (I think it might have been Bruce, in a
> discussion of random_page_cost in the last few years) that on top of
> that, we assumed 90% (or was it 95%?) of the queries were cached in
> shared_buffers (probably preferably the indexes), so the fact that
> random access is massively slower than sequential access (surely not
> just 4x by 2000) is offset by that. I only quickly read your mail, but
> I didn't see any discussion of caching at first glance. Or do you think
> it does not matter much?
>

I think you're referring to this:

https://www.postgresql.org/message-id/1156772.1730397196%40sss.pgh.pa.us

As Tom points out, that's not really how we calculated the 4.0 default.
We should probably remove that claim from the docs.
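
FWIW, the arithmetic behind the experiment quoted above can be sketched
roughly like this. It is only an illustration: the function name, its
parameters and the numbers are made up, and it assumes the only thing
that matters is wall-clock time per page fetched, ignoring caching and
CPU cost entirely.

```python
def estimate_random_page_cost(seq_seconds, seq_pages,
                              index_seconds, index_page_fetches):
    """Ratio of per-page time for random fetches vs. sequential reads.

    Hypothetical helper, not part of PostgreSQL. It compares the
    average time per page of a full-table index scan against the
    average time per page of a sequential scan.
    """
    if min(seq_seconds, seq_pages, index_seconds, index_page_fetches) <= 0:
        raise ValueError("all inputs must be positive")
    # per-page times:
    #   sequential = seq_seconds / seq_pages
    #   random     = index_seconds / index_page_fetches
    # the estimate is random / sequential, rearranged into one division
    return (index_seconds * seq_pages) / (seq_seconds * index_page_fetches)


# Made-up timings, only to show how the function is called.
print(estimate_random_page_cost(seq_seconds=100, seq_pages=1_000_000,
                                index_seconds=2000,
                                index_page_fetches=5_000_000))
```

Note there is no caching term anywhere in this calculation, which is
the point: the table is much bigger than RAM, so the ratio comes
straight from the two measured runtimes.
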
regards
--
Tomas Vondra