Re: DSM robustness failure (was Re: Peripatus/failures) - Mailing list pgsql-hackers

From Larry Rosenman
Subject Re: DSM robustness failure (was Re: Peripatus/failures)
Msg-id 20181018050231.d4xt3or5wg2g2npo@ler-imac.local
In response to Re: DSM robustness failure (was Re: Peripatus/failures)  (Larry Rosenman <ler@lerctr.org>)
Responses Re: DSM robustness failure (was Re: Peripatus/failures)  (Thomas Munro <thomas.munro@enterprisedb.com>)
List pgsql-hackers
On Wed, Oct 17, 2018 at 08:19:52PM -0500, Larry Rosenman wrote:
> On Thu, Oct 18, 2018 at 02:17:14PM +1300, Thomas Munro wrote:
> > On Thu, Oct 18, 2018 at 1:10 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > > ... However, I'm still slightly interested in how it
> > > was that that broke DSM so thoroughly ...
> >
> > Me too.  Frustratingly, that vm object might still exist on Larry's
> > machine if it hasn't been rebooted (since we failed to shm_unlink()
> > it), so if we knew its name we could write a program to shm_open(),
> > mmap(), dump out to a file for analysis and then we could work out
> > which of the sanity tests it failed and maybe get some clues.
> > Unfortunately it's not in any of our logs AFAIK, and I can't see any
> > way to get a list of existing shm_open() objects from the kernel.
> > From sys/kern/uipc_shm.c:
> >
> >  * TODO:
> >  *
> >  * (1) Need to export data to a userland tool via a sysctl.  Should ipcs(1)
> >  *     and ipcrm(1) be expanded or should new tools to manage both POSIX
> >  *     kernel semaphores and POSIX shared memory be written?
> >
> > Gah.  So basically that's hiding in shm_dictionary in the kernel and I
> > don't know a way to look at it from userspace (other than trying to
> > open all 2^32 random paths we're capable of generating).
>
> It has *NOT* been rebooted.  I can give y'all id's if you want to go
> poking around.
Let me know soon(ish) if any of you want to poke at this machine, as I'm
likely to forget and reboot it.
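
For whoever does want to poke at it, the dump program Thomas describes
(shm_open(), mmap(), write out to a file) might look roughly like the
sketch below. It is not from the PostgreSQL tree: dump_shm_object() and
its arguments are made-up names, and it assumes the object's name is
already known, which is exactly the piece we are missing. It opens the
object read-only so that nothing in it is disturbed.

```c
#include <errno.h>
#include <fcntl.h>      /* O_RDONLY, O_WRONLY, O_CREAT, O_TRUNC */
#include <stdio.h>      /* perror */
#include <string.h>
#include <sys/mman.h>   /* shm_open, mmap, munmap */
#include <sys/stat.h>   /* fstat */
#include <unistd.h>     /* write, close */

/*
 * Copy the contents of an existing POSIX shared memory object into a
 * regular file.  Hypothetical helper: returns 0 on success, -1 on failure.
 * (Older glibc needs -lrt at link time; FreeBSD has shm_open in libc.)
 */
int
dump_shm_object(const char *shm_name, const char *out_path)
{
    struct stat st;
    void       *addr = MAP_FAILED;
    int         shm_fd = -1;
    int         out_fd = -1;
    int         rc = -1;
    size_t      off = 0;

    /* Read-only, no O_CREAT: fail rather than create a new object. */
    shm_fd = shm_open(shm_name, O_RDONLY, 0);
    if (shm_fd < 0) { perror("shm_open"); goto done; }

    /* The object's size is not known in advance, so ask the kernel. */
    if (fstat(shm_fd, &st) < 0) { perror("fstat"); goto done; }

    if (st.st_size > 0)
    {
        addr = mmap(NULL, (size_t) st.st_size, PROT_READ, MAP_SHARED,
                    shm_fd, 0);
        if (addr == MAP_FAILED) { perror("mmap"); goto done; }
    }

    out_fd = open(out_path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (out_fd < 0) { perror("open"); goto done; }

    /* write() may be short or interrupted, so loop until all is written. */
    while (off < (size_t) st.st_size)
    {
        ssize_t n = write(out_fd, (const char *) addr + off,
                          (size_t) st.st_size - off);

        if (n < 0)
        {
            if (errno == EINTR)
                continue;
            perror("write");
            goto done;
        }
        off += (size_t) n;
    }
    rc = 0;

done:
    if (addr != MAP_FAILED)
        munmap(addr, (size_t) st.st_size);
    if (out_fd >= 0)
        close(out_fd);
    if (shm_fd >= 0)
        close(shm_fd);
    return rc;
}
```

A caller would do something like dump_shm_object("/PostgreSQL.1234567890",
"dsm.bin"). The handle there is invented, and the "/PostgreSQL.<handle>"
naming is my reading of dsm_impl.c, so check it against the source before
relying on it.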


> > --
> > Thomas Munro
> > http://www.enterprisedb.com



--
Larry Rosenman                     http://www.lerctr.org/~ler
Phone: +1 214-642-9640                 E-Mail: ler@lerctr.org
US Mail: 5708 Sabbia Drive, Round Rock, TX 78665-2106
