Re: DSM robustness failure (was Re: Peripatus/failures) - Mailing list pgsql-hackers

From Thomas Munro
Subject Re: DSM robustness failure (was Re: Peripatus/failures)
Date
Msg-id CAEepm=2Jc0F50Q_HVrOCzBAJakogC96FkPL57V7H9C=NRgv3bg@mail.gmail.com
In response to Re: DSM robustness failure (was Re: Peripatus/failures)  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: DSM robustness failure (was Re: Peripatus/failures)  (Larry Rosenman <ler@lerctr.org>)
Re: DSM robustness failure (was Re: Peripatus/failures)  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Thu, Oct 18, 2018 at 1:10 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> ... However, I'm still slightly interested in how it
> was that that broke DSM so thoroughly ...

Me too.  Frustratingly, that VM object might still exist on Larry's
machine if it hasn't been rebooted (since we failed to shm_unlink()
it).  If we knew its name, we could write a program to shm_open() it,
mmap() it and dump it to a file for analysis, then work out which of
the sanity tests it failed and maybe get some clues.  Unfortunately
the name isn't in any of our logs AFAIK, and I can't see any way to
get a list of existing shm_open() objects from the kernel.
From sys/kern/uipc_shm.c:

 * TODO:
 *
 * (1) Need to export data to a userland tool via a sysctl.  Should ipcs(1)
 *     and ipcrm(1) be expanded or should new tools to manage both POSIX
 *     kernel semaphores and POSIX shared memory be written?

Gah.  So basically that's hiding in shm_dictionary in the kernel and I
don't know a way to look at it from userspace (other than trying to
open all 2^32 random paths we're capable of generating).
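
For reference, the dump program described above could look roughly like
the sketch below.  It is only an illustration under assumptions: the
function name dump_shm_object() and the DUMP_SHM_MAIN guard are made up
for this example, the object's name has to be known already (which is
exactly what is missing here), and the caller needs permission to open
the object read-only.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: copy the contents of the POSIX shared memory
 * object "name" into the regular file "out_path".
 * Returns 0 on success, -1 on failure.
 */
int
dump_shm_object(const char *name, const char *out_path)
{
    struct stat st;
    void       *addr;
    FILE       *out;
    size_t      size;
    int         fd;
    int         rc = -1;

    /* Open the existing object read-only; no O_CREAT, so it must exist. */
    fd = shm_open(name, O_RDONLY, 0);
    if (fd < 0)
    {
        perror("shm_open");
        return -1;
    }

    /* The object's size tells us how much to map and dump. */
    if (fstat(fd, &st) != 0)
    {
        perror("fstat");
        close(fd);
        return -1;
    }
    if (st.st_size <= 0)
    {
        fprintf(stderr, "%s: object is empty\n", name);
        close(fd);
        return -1;
    }
    size = (size_t) st.st_size;

    addr = mmap(NULL, size, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);                  /* the mapping stays valid after close */
    if (addr == MAP_FAILED)
    {
        perror("mmap");
        return -1;
    }

    /* Write the whole mapping out in one go. */
    out = fopen(out_path, "wb");
    if (out == NULL)
        perror("fopen");
    else
    {
        if (fwrite(addr, 1, size, out) == size)
            rc = 0;
        else
            perror("fwrite");
        if (fclose(out) != 0)
            rc = -1;
    }

    munmap(addr, size);
    return rc;
}

#ifdef DUMP_SHM_MAIN
/* Optional command-line wrapper: dump_shm /name output-file */
int
main(int argc, char **argv)
{
    if (argc != 3)
    {
        fprintf(stderr, "usage: %s /shm-name output-file\n", argv[0]);
        return 2;
    }
    return dump_shm_object(argv[1], argv[2]) == 0 ? 0 : 1;
}
#endif
```

Compiled with -DDUMP_SHM_MAIN (and -lrt on systems that still need it
for shm_open()), this gives a small standalone tool.  The resulting
file could then be compared by hand against the sanity tests mentioned
above.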

-- 
Thomas Munro
http://www.enterprisedb.com

