On Tuesday 2025-10-21 00:23, Tom Lane wrote:
> HEAD repeats
>
> read(4k)
> lseek(~128k forward)
>
> which is to be expected if we have to read data block headers
> that are ~128K apart; while patched repeats
>
> read(4k)
> read(~128k)
>
> which is a bit odd in itself, why isn't it merging the reads better?
The read(4k) happens because of the getc() calls that read the next
block's length.
As noted in an earlier message [1], glibc seems to use a 4KB stdio
buffer by default, for some reason. Calling setvbuf() with a larger
buffer can mitigate this.
[1] https://www.postgresql.org/message-id/1po8os49-r63o-2923-p37n-12698o1qn7p0%40tzk.arg
I'm attaching a patch that sets the glibc buffer size to 1MB, just as a
proof of concept. It's obviously WIP: it allocates the buffer and never
frees it. :-)
Feel free to pick it up and change it as you see fit.
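
For readers without the attachment, here is a minimal sketch of the idea.
This is not the attached patch: the helper name open_with_large_buffer()
and the ARCHIVE_BUF_SIZE constant are made up for illustration. Unlike
the proof of concept, it hands the buffer back to the caller so it can be
freed after fclose().

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical constant: 1MB stdio buffer, as in the proof of concept. */
#define ARCHIVE_BUF_SIZE (1024 * 1024)

/*
 * Open "path" for reading and give the stream a 1MB fully-buffered
 * stdio buffer.  On success, returns the stream and stores the buffer
 * in *buf_out; the caller must free() it after fclose().  Returns NULL
 * on failure.
 */
FILE *
open_with_large_buffer(const char *path, char **buf_out)
{
    FILE *fp;
    char *buf;

    assert(path != NULL && buf_out != NULL);

    fp = fopen(path, "rb");
    if (fp == NULL)
        return NULL;

    buf = malloc(ARCHIVE_BUF_SIZE);
    if (buf == NULL)
    {
        fclose(fp);
        return NULL;
    }

    /* setvbuf() must be called before any other operation on the stream. */
    if (setvbuf(fp, buf, _IOFBF, ARCHIVE_BUF_SIZE) != 0)
    {
        free(buf);
        fclose(fp);
        return NULL;
    }

    *buf_out = buf;
    return fp;
}
```

The buffer must stay allocated for as long as the stream is open, which
is why it is freed only after fclose().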
With this patch, the read() calls are unified in strace. The lseek()
calls remain, even though they do not actually read anything.
It seems to me that glibc could implement an optimisation for fseeko():
store the current position in the file, and do not issue the lseek()
system call if the position does not change.
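
The same idea can be approximated at the application level. The sketch
below is only an illustration, not glibc code and not part of the patch:
TrackedFile, tracked_seek() and tracked_read() are hypothetical names,
and it assumes that every read and seek on the stream goes through these
wrappers so that the remembered position stays accurate.

```c
#define _POSIX_C_SOURCE 200809L   /* for fseeko() and off_t */
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical wrapper: a stream plus the position we believe it is at. */
typedef struct TrackedFile
{
    FILE *fp;
    off_t pos;           /* current offset, maintained by the wrappers */
    int   seeks_issued;  /* number of fseeko() calls actually made */
} TrackedFile;

/* Seek to an absolute offset, skipping fseeko() if we are already there. */
int
tracked_seek(TrackedFile *tf, off_t target)
{
    if (tf->pos == target)
        return 0;        /* position unchanged: no library call needed */

    if (fseeko(tf->fp, target, SEEK_SET) != 0)
        return -1;

    tf->pos = target;
    tf->seeks_issued++;
    return 0;
}

/* Read up to len bytes and advance the remembered position. */
size_t
tracked_read(TrackedFile *tf, void *dst, size_t len)
{
    size_t n = fread(dst, 1, len, tf->fp);

    tf->pos += (off_t) n;
    return n;
}
```

A caller would initialize the struct with the freshly opened stream,
pos = 0 and seeks_issued = 0, then use only these two functions on it.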
>> I was using an HDD,
>
> Ah. Your original message mentioned NVMe so I was assuming you
> were also looking at solid-state drives. I can imagine that
> seeking is more painful on HDDs ...
Sorry for the confusion; over all this time I've run tests on many
different hardware combinations. Not the best way to draw conclusions,
but it's what I had available at the time.
Dimitris