Thread: macOS prefetching support

macOS prefetching support

From
Peter Eisentraut
Date:
It seems to me that we could implement prefetching support 
(USE_PREFETCH) on macOS using the fcntl() command F_RDADVISE.  The man 
page description is a bit terse:

      F_RDADVISE       Issue an advisory read async with no copy to user.

But it seems to be the right idea.  Was this looked into before?  I 
couldn't find anything in the archives.

Attached is a patch to implement this.  It seems to work, but of course 
it's kind of hard to tell whether it actually does anything useful.

(Even if the performance effects were negligible, this would be useful 
to get the prefetch code some exercise on this platform.)

Thoughts?
Attachment

Re: macOS prefetching support

From
Thomas Munro
Date:
On Wed, Aug 14, 2024 at 7:04 PM Peter Eisentraut <peter@eisentraut.org> wrote:
> Attached is a patch to implement this.  It seems to work, but of course
> it's kind of hard to tell whether it actually does anything useful.

Header order problem: pg_config_os.h defines __darwin__, but
pg_config_manual.h is included first, and tests __darwin__.  I hacked
my way around that, and then made a table of 40,000,000 integers in a
2GB buffer pool.  I used "select count(pg_buffercache_evict(buffered))
from pg_buffer_cache", and "sudo purge", to clear the two layers of
cache for each test, and then measured:

maintenance_io_concurrency=0,  ANALYZE: 2311ms
maintenance_io_concurrency=10, ANALYZE:  652ms
maintenance_io_concurrency=25, ANALYZE:  389ms

It works!



Re: macOS prefetching support

From
Peter Eisentraut
Date:
On 14.08.24 14:36, Thomas Munro wrote:
> On Wed, Aug 14, 2024 at 7:04 PM Peter Eisentraut <peter@eisentraut.org> wrote:
>> Attached is a patch to implement this.  It seems to work, but of course
>> it's kind of hard to tell whether it actually does anything useful.
> 
> Header order problem: pg_config_os.h defines __darwin__, but
> pg_config_manual.h is included first, and tests __darwin__.  I hacked
> my way around that, and then made a table of 40,000,000 integers in a
> 2GB buffer pool.  I used "select count(pg_buffercache_evict(buffered))
> from pg_buffer_cache", and "sudo purge", to clear the two layers of
> cache for each test, and then measured:
> 
> maintenance_io_concurrency=0,  ANALYZE: 2311ms
> maintenance_io_concurrency=10, ANALYZE:  652ms
> maintenance_io_concurrency=25, ANALYZE:  389ms
> 
> It works!

Cool!  I'll work on a more polished patch.




Re: macOS prefetching support

From
Thomas Munro
Date:
On Sat, Aug 17, 2024 at 6:58 AM Peter Eisentraut <peter@eisentraut.org> wrote:
> What to do about the order of the symbols and include files.  I threw
> something into src/include/port/darwin.h, but I'm not sure if that's
> good.  Alternatively, we could not use __darwin__ but instead the more
> standard and predefined defined(__APPLE__) && defined(__MACH__).

Hmm.  fd.h and fd.c test for F_NOCACHE, which is pretty closely
related.  Now I'm wondering why we actually need this in
pg_config_manual.h at all.  Who would turn it off at compile time, and
why would they not be satisfied with setting relevant GUCs to 0?  Can
we just teach fd.h to define USE_PREFETCH if
defined(POSIX_FADV_WILLNEED) || defined(F_RDADVISE)?

(I have also thought multiple times about removing the configure
probes for F_FULLFSYNC, and just doing #ifdef.  Oh, that's in my patch
for CF #4453.)

> How to document it.  The current documentation makes references mainly
> to the availability of posix_fadvise().  That seems quite low-level.
> How could a user of a prepared package even find out about that?  Should
> we just say "requires OS support" (kind of like I did here) and you can
> query the effective state by looking at the *_io_concurrency settings?
> Or do we need a read-only parameter that shows whether prefetch support
> exists (kind of along the lines of huge pages)?

I think that's fine.  I don't really like the word "prefetch", could
mean many different things.  What about "requires OS support for
issuing read-ahead advice", which uses a word that appears in both of
those interfaces?  And yeah the GUCs being nailed to zero means you
don't have it.

> (There is also the possibility that we provide an implementation of
> posix_fadvise() for macOS that wraps the platform-specific code in this
> patch.  And then Apple could just take that. ;-) )

Yeah might be simpler... as long as we make sure that we test for
having the function AND the relevant POSIX_FADV_xxx in places, which I
see that we do.



Re: macOS prefetching support

From
Tom Lane
Date:
Thomas Munro <thomas.munro@gmail.com> writes:
> Hmm.  fd.h and fd.c test for F_NOCACHE, which is pretty closely
> related.  Now I'm wondering why we actually need this in
> pg_config_manual.h at all.  Who would turn it off at compile time, and
> why would they not be satisfied with setting relevant GUCs to 0?

+1 for not bothering with a pg_config_manual.h knob.

            regards, tom lane



Re: macOS prefetching support

From
Thomas Munro
Date:
On Sat, Aug 17, 2024 at 6:58 AM Peter Eisentraut <peter@eisentraut.org> wrote:
> solaris         fake

I'm half tempted to suggest that we take this exception out.  If it's
there, we call it.  It doesn't do anything right now, but it's a cheap
empty user space function, and I heard they are thinking about adding
a real syscall.  (In a burst of enthusiasm for standards and stuff, I
talked to people involved in all the OSes using ZFS, to suggest
hooking that up; they added some other stuff I asked about so I think
it's a real threat.)  But I haven't noticed any users on this list in
many years, to opine either way.

It doesn't do anything on Cygwin either.  Actually, it's worse than
nothing, it looks like it makes two useless system calls adjusting the
"sequential" flag on the file handle. But really, of the 3 ways to
build on Windows, only MSVC has real users, so it makes no useless
system calls at all, so I don't propose to change anything due to this
observation.

> win32           no

I guess you could implement this in two ways:

* CreateFileMapping(), MapViewOfFile(), PrefetchVirtualMemory(),
UnmapViewOfFile(), Closehandle().  That's a lot system calls and maybe
 expensive VM stuff, who knows.

* ReadFileEx() in OVERLAPPED mode (async), into a dummy buffer, with a
completion callback that frees the buffer and OVERLAPPED object.
That's a lot of extra work allocating memory, copying to user space,
scheduling a callback, and freeing memory for nothing.

Both are terrible, but likely still better than an I/O stall, I dunno.
I think the VMS, NT, Unix-hater view would be: why would you want such
a stupid programming interface anyway, when you could use async I/O
properly?  I love Unix, but they'd have a point.



Re: macOS prefetching support

From
Peter Eisentraut
Date:
On 17.08.24 00:01, Thomas Munro wrote:
> On Sat, Aug 17, 2024 at 6:58 AM Peter Eisentraut <peter@eisentraut.org> wrote:
>> What to do about the order of the symbols and include files.  I threw
>> something into src/include/port/darwin.h, but I'm not sure if that's
>> good.  Alternatively, we could not use __darwin__ but instead the more
>> standard and predefined defined(__APPLE__) && defined(__MACH__).
> 
> Hmm.  fd.h and fd.c test for F_NOCACHE, which is pretty closely
> related.  Now I'm wondering why we actually need this in
> pg_config_manual.h at all.  Who would turn it off at compile time, and
> why would they not be satisfied with setting relevant GUCs to 0?  Can
> we just teach fd.h to define USE_PREFETCH if
> defined(POSIX_FADV_WILLNEED) || defined(F_RDADVISE)?

I thought USE_PREFETCH existed so that we don't have the run-time 
overhead for all the bookkeeping code if we don't have any OS-level 
prefetch support at the end.  But it looks like most of that bookkeeping 
code is skipped anyway if the *_io_concurrency settings are at 0.  So 
yes, getting rid of USE_PREFETCH globally would be useful.

> (I have also thought multiple times about removing the configure
> probes for F_FULLFSYNC, and just doing #ifdef.  Oh, that's in my patch
> for CF #4453.)

Understandable, but we should be careful here that we don't create 
setups that can cause bugs like 
<https://www.postgresql.org/message-id/48da4a1f-ccd9-4988-9622-24f37b1de2b4@eisentraut.org>.

> I think that's fine.  I don't really like the word "prefetch", could
> mean many different things.  What about "requires OS support for
> issuing read-ahead advice", which uses a word that appears in both of
> those interfaces?

I like that term.




Re: macOS prefetching support

From
Thomas Munro
Date:
On Sat, Aug 24, 2024 at 12:28 AM Peter Eisentraut <peter@eisentraut.org> wrote:
> In terms of $subject, this patch seems sufficient for now.

WFM.  I noticed you don't have an EINTR retry loop, but the man page
doesn't say you need one, so overall this patch LGTM.

+ * posix_fadvise() is the simplest standardized interface that accomplishes
+ * this.  We could add an implementation using libaio in the future; but note
+ * that this API is inappropriate for libaio, which wants to have a buffer
+ * provided to read into.

I would consider just dropping that comment about libaio, if touching
this paragraph.  Proposals exist for AIO of course, but not with
libaio, and better predictions with useful discussion of the topic
seem unlikely to fit in this margin.



Re: macOS prefetching support

From
Peter Eisentraut
Date:
On 26.08.24 07:54, Thomas Munro wrote:
> On Sat, Aug 24, 2024 at 12:28 AM Peter Eisentraut <peter@eisentraut.org> wrote:
>> In terms of $subject, this patch seems sufficient for now.
> 
> WFM.  I noticed you don't have an EINTR retry loop, but the man page
> doesn't say you need one, so overall this patch LGTM.
> 
> + * posix_fadvise() is the simplest standardized interface that accomplishes
> + * this.  We could add an implementation using libaio in the future; but note
> + * that this API is inappropriate for libaio, which wants to have a buffer
> + * provided to read into.
> 
> I would consider just dropping that comment about libaio, if touching
> this paragraph.  Proposals exist for AIO of course, but not with
> libaio, and better predictions with useful discussion of the topic
> seem unlikely to fit in this margin.

committed with that change




Re: macOS prefetching support

From
Thomas Munro
Date:
Oh, I missed something: I think we're missing FileAcces(), in case the
vfd has to be re-opened, no?



Re: macOS prefetching support

From
Peter Eisentraut
Date:
On 29.08.24 03:22, Thomas Munro wrote:
> Oh, I missed something: I think we're missing FileAcces(), in case the
> vfd has to be re-opened, no?

Fixed, thanks.




Re: macOS prefetching support

From
Peter Eisentraut
Date:
On 03.09.24 03:47, Thomas Munro wrote:
> On Mon, Aug 19, 2024 at 1:35 AM Peter Eisentraut <peter@eisentraut.org> wrote:
>> On 17.08.24 00:01, Thomas Munro wrote:
>>> I think that's fine.  I don't really like the word "prefetch", could
>>> mean many different things.  What about "requires OS support for
>>> issuing read-ahead advice", which uses a word that appears in both of
>>> those interfaces?
>>
>> I like that term.
> 
> A couple of other places still use the old specific terminology.  PSA.

looks good to me