Re: Add time spent in posix_fadvise() in I/O read time ? - Mailing list pgsql-hackers

From Cédric Villemain
Subject Re: Add time spent in posix_fadvise() in I/O read time ?
Date
Msg-id ae066b11-7c9f-4521-9e7d-f04825f73ccb@Data-Bene.io
In response to Re: Add time spent in posix_fadvise() in I/O read time ?  (Frédéric Yhuel <frederic.yhuel@dalibo.com>)
List pgsql-hackers
On 19/03/2025 14:25, Frédéric Yhuel wrote:
>
>
> On 3/14/25 09:43, Frédéric Yhuel wrote:
>> One thing I've noticed is that posix_fadvise(,,POSIX_FADV_WILLNEED) 
>> isn't always non-blocking on Linux. As Ted Ts'o explains in this old 
>> thread[1], it blocks when the request queue fills up.
>
> When posix_fadvise() blocks, it doesn't seem to be completely off-cpu 
> (but mostly, at least on my machine), and I assume that this is the 
> reason for the sentence "A value higher than needed to keep the disks 
> busy will only result in extra CPU overhead" in the documentation for 
> effective_io_concurrency? My guess is that Linux uses a spinlock 
> somewhere, and that this explains the cpu overhead.
>
> Also, on Linux, it seems we can control the size of the request queue 
> with the /sys/block/XXX/queue/nr_requests setting. On my machine, it 
> is set to 64 by default. When I set it up to 128, postgres spends less 
> time on posix_fadvise(), and correspondingly more on pread().
>
>
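
To check on a given machine whether posix_fadvise(POSIX_FADV_WILLNEED) blocks as described in the quoted text, one option is to time the call directly. The following is only a minimal sketch, not what PostgreSQL does: the helper names elapsed_ns() and timed_prefetch() are hypothetical, and it measures wall-clock time around a single call with CLOCK_MONOTONIC.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: nanoseconds elapsed between two monotonic timestamps. */
int64_t elapsed_ns(const struct timespec *start, const struct timespec *end)
{
    return (int64_t)(end->tv_sec - start->tv_sec) * 1000000000LL
         + (int64_t)(end->tv_nsec - start->tv_nsec);
}

/*
 * Hypothetical helper: issue a WILLNEED hint for [offset, offset + len)
 * and store the wall-clock duration of the call in *ns_out.
 * Returns posix_fadvise()'s result: 0 on success, an error number otherwise.
 */
int timed_prefetch(int fd, off_t offset, off_t len, int64_t *ns_out)
{
    struct timespec t0, t1;
    int rc;

    assert(ns_out != NULL);

    clock_gettime(CLOCK_MONOTONIC, &t0);
    rc = posix_fadvise(fd, offset, len, POSIX_FADV_WILLNEED);
    clock_gettime(CLOCK_MONOTONIC, &t1);

    *ns_out = elapsed_ns(&t0, &t1);
    return rc;
}
```

Calling timed_prefetch() in a loop over the blocks of a large, uncached file and summing the durations should give a rough idea of how much time is spent inside the hint itself, as opposed to the later pread().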

If you're still on this topic, some additions:

* io_uring_prep_fadvise() may be interesting to look at. It is a liburing 
helper rather than a syscall: it prepares an fadvise request to be 
submitted through io_uring, so it is supposed to be an "async 
posix_fadvise". I don't know what has been done here in Andres's work, 
nor whether it is applicable (I haven't looked at that code yet), but 
"git grep" didn't turn up anything.

* nr_requests: I have 1023 (nvme) with Linux 6.12.22-amd64 #1 SMP 
PREEMPT_DYNAMIC Debian 6.12.22-1 (2025-04-10) x86_64.

Note that the I/O scheduler also changes the behavior of the queue: with 
mq-deadline, 25% of the queue is reserved for sync requests.
See 
https://github.com/torvalds/linux/commit/07757588e5076748308dd95ee2e3cd0b82ebb8c4

* https://wiki.postgresql.org/wiki/Readahead - if you want to 
read/check/enhance its content.
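
For comparing machines, the queue depth and the active scheduler mentioned above can be read from sysfs. This is a rough sketch under a few assumptions: the function names are hypothetical, the device name passed in (for example "nvme0n1") is only an illustration, and the scheduler file is assumed to list the active scheduler in square brackets, as in "none [mq-deadline] kyber".

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build "/sys/block/<dev>/queue/<attr>"; returns 0, or -1 if buf is too small. */
int queue_attr_path(char *buf, size_t buflen, const char *dev, const char *attr)
{
    int n;

    assert(buf != NULL && dev != NULL && attr != NULL);
    n = snprintf(buf, buflen, "/sys/block/%s/queue/%s", dev, attr);
    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}

/* Read the first line of a queue attribute into out, without the newline. */
int read_queue_attr(const char *dev, const char *attr, char *out, size_t outlen)
{
    char path[256];
    FILE *f;

    if (outlen == 0 || queue_attr_path(path, sizeof path, dev, attr) != 0)
        return -1;
    f = fopen(path, "r");
    if (f == NULL)
        return -1;
    if (fgets(out, (int)outlen, f) == NULL) {
        fclose(f);
        return -1;
    }
    fclose(f);
    out[strcspn(out, "\n")] = '\0';
    return 0;
}

/* Extract the bracketed (active) entry from a scheduler line. */
int active_scheduler(const char *line, char *out, size_t outlen)
{
    const char *lb = strchr(line, '[');
    const char *rb = lb ? strchr(lb, ']') : NULL;
    size_t len;

    if (lb == NULL || rb == NULL)
        return -1;              /* no bracketed entry found */
    len = (size_t)(rb - lb - 1);
    if (len + 1 > outlen)
        return -1;              /* output buffer too small */
    memcpy(out, lb + 1, len);
    out[len] = '\0';
    return 0;
}
```

For instance, read_queue_attr("nvme0n1", "nr_requests", buf, sizeof buf) would fill buf with the queue depth as text, and passing the contents of the "scheduler" attribute to active_scheduler() would yield the scheduler currently in use.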

I hope it helps.

---
Cédric Villemain +33 6 20 30 22 52
https://www.Data-Bene.io
PostgreSQL Support, Expertise, Training, R&D



