Re: [PATCH 1/1] Initial mach based shared memory support. - Mailing list pgsql-hackers

From James Hilliard
Subject Re: [PATCH 1/1] Initial mach based shared memory support.
Date
Msg-id CADvTj4ocq2M-PFg=3PFb7qqJp+KoGP+VcVVcUY+SiwrKLzp+bg@mail.gmail.com
Whole thread Raw
In response to Re: [PATCH 1/1] Initial mach based shared memory support.  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: [PATCH 1/1] Initial mach based shared memory support.
List pgsql-hackers
On Mon, Jan 18, 2021 at 5:29 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> James Hilliard <james.hilliard1@gmail.com> writes:
> > OSX implements sysv shmem support via a mach wrapper, however the mach
> > sysv shmem wrapper has some severe restrictions that prevent us from
> > allocating enough memory segments in some cases.
> > ...
> > In order to avoid hitting these limits we can bypass the wrapper layer
> > and just use mach directly.
>
> I wanted to review this, but it's impossible because the kernel calls
> you're using have you've-got-to-be-kidding documentation like this:
>
> https://developer.apple.com/documentation/kernel/1402504-mach_vm_page_info?language=objc
FYI there's a good chance I'm not using some of these info API's correctly,
from my understanding due to mach semantics a number of the sanity checks
we are doing for sysv shmem are probably unnecessary when using mach
API's due to the mach port task bindings(mach_task_self()) effectively
ensuring our maps are already task bound and won't conflict with other tasks.

The mach vm API for our use case does appear to be superior to the shmem API
on top of not being bound by the sysv wrapper limits due to it being
natively task
bound as opposed to normal sysv shared memory maps which appears to be
system wide global maps and thus requires lots of key conflict checks which
should be entirely unnecessary when using task bound mach vm maps.

For example PGSharedMemoryIsInUse() from my understanding should always
return false for the mach vm version as it is already task scoped.

I'm not 100% sure here as I'm not very familiar with all the ways postgres uses
shared memory however. Is it true that postgresql shared memory is strictly used
by child processes fork()'d from the parent postmaster? If that is the
case I think
as long as we ensure all our map operations are bound to the parent postmaster
mach_task_self() port then we should never conflict with another postmaster's
process tree's memory maps.

I tried to largely copy existing sysv shmem semantics in this first
proof of concept as
I'm largely unfamiliar with postgresql internals but I suspect we can
safely remove
much of the map key conflict checking code entirely.

Here's a few additional references I found which may be helpful.

This one is more oriented for kernel programming, however much of the info is
relevant to the userspace interface as well it would appear:

https://developer.apple.com/library/archive/documentation/Darwin/Conceptual/KernelProgramming/vm/vm.html#//apple_ref/doc/uid/TP30000905-CH210-TPXREF109

This book gives an overview of the mach userspace interface as well, although
I don't think it's a fully complete API reference, but it covers some
of the basic use
cases:
https://flylib.com/books/en/3.126.1.89/1/

Apparently the hurd kernel uses a virtual memory interface that is based on
the mach kernel, and it has more detailed API docs, note that these tend to
not have the "mach_" prefix but appear to largely function the same:
https://www.gnu.org/software/hurd/gnumach-doc/Virtual-Memory-Interface.html

There is additional documentation from the OSFMK kernel(which was combined with
XNU by apple from my understanding) here related to the virtual memory
interface,
like the herd docs these tend to not have the "mach_" prefix:
http://web.mit.edu/darwin/src/modules/xnu/osfmk/man/
>
> Google finds the source code too, but that's equally bereft of useful
> documentation.  So I wonder where you obtained the information necessary
> to write this patch.
>
> I did notice however that mach_shmem.c seems to be 95% copy-and-paste from
> sysv_shmem.c, including a lot of comments that don't feel particularly
> true or relevant here.  I wonder whether we need a separate file as
> opposed to a few #ifdef's around the kernel calls.
>
>                         regards, tom lane



pgsql-hackers by date:

Previous
From: Amit Kapila
Date:
Subject: Re: Parallel INSERT (INTO ... SELECT ...)
Next
From: Bharath Rupireddy
Date:
Subject: Re: [patch] Help information for pg_dump