Re: DSM robustness failure (was Re: Peripatus/failures) - Mailing list pgsql-hackers

From Larry Rosenman
Subject Re: DSM robustness failure (was Re: Peripatus/failures)
Msg-id 20181017225030.hbyktqed2t4vig34@ler-imac.local
In response to Re: DSM robustness failure (was Re: Peripatus/failures)  (Thomas Munro <thomas.munro@enterprisedb.com>)
Responses Re: DSM robustness failure (was Re: Peripatus/failures)  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Thu, Oct 18, 2018 at 11:08:33AM +1300, Thomas Munro wrote:
> On Thu, Oct 18, 2018 at 9:43 AM Thomas Munro
> <thomas.munro@enterprisedb.com> wrote:
> > On Thu, Oct 18, 2018 at 9:00 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > > I would argue that both dsm_postmaster_shutdown and dsm_postmaster_startup
> > > are broken here; the former because it makes no attempt to unmap
> > > the old control segment (which it oughta be able to do no matter how badly
> > > broken the contents are), and the latter because it should not let
> > > garbage old state prevent it from establishing a valid new segment.
> >
> > Looking.
>
> (CCing Amit Kapila)
>
> To reproduce this, I attached lldb to a backend and did "mem write
> &dsm_control->magic 42", and then delivered SIGKILL to the backend.
> Here's one way to fix it.  I think we have no choice but to leak the
> referenced segments, but we can free the control segment.  See
> comments in the attached patch for rationale.
>
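[A minimal sketch of the idea quoted above, not the attached patch: if the
old control segment fails validation, its contents cannot be trusted, so
the segments it references are leaked, but the control segment itself is
still released. All names here (control_segment, cleanup_old_control,
note_destroyed, CONTROL_MAGIC, MAX_ITEMS) are hypothetical, and heap
memory stands in for a real shared memory mapping.]

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define CONTROL_MAGIC 0x9a503d32u  /* assumed value, for illustration only */
#define MAX_ITEMS 4                /* hypothetical capacity of the table */

/* Hypothetical stand-in for a control segment header. */
typedef struct {
    uint32_t magic;
    uint32_t nitems;
    uint32_t handles[MAX_ITEMS];
} control_segment;

/* Example callback: counts how many referenced segments were destroyed. */
static int destroyed_count = 0;
static void note_destroyed(uint32_t handle)
{
    (void) handle;
    destroyed_count++;
}

/*
 * Clean up an old control segment.
 *
 * Returns the number of referenced segments destroyed, or -1 if the
 * header looked corrupt and the referenced segments had to be leaked.
 * In both cases the control segment itself is freed and *ctl is reset,
 * so stale state cannot block creation of a fresh segment.
 */
static int cleanup_old_control(control_segment **ctl,
                               void (*destroy)(uint32_t))
{
    assert(ctl != NULL && destroy != NULL);

    control_segment *old = *ctl;
    int result = 0;

    if (old == NULL)
        return 0;               /* nothing to clean up */

    if (old->magic == CONTROL_MAGIC && old->nitems <= MAX_ITEMS) {
        /* Header looks sane: walk the table and destroy each segment. */
        for (uint32_t i = 0; i < old->nitems; i++) {
            destroy(old->handles[i]);
            result++;
        }
    } else {
        /* Header is garbage: do not trust nitems or handles at all. */
        result = -1;
    }

    free(old);                  /* always release the control segment */
    *ctl = NULL;
    return result;
}
```

[With the magic field overwritten as in the lldb reproduction, the call
returns -1 and invokes no destroy callback, yet the control segment is
still released.]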
On the original failure: I recompiled and reinstalled the two Pythons I
have on this box, and at least 9.3 went back to OK.


> --
> Thomas Munro
> http://www.enterprisedb.com



--
Larry Rosenman                     http://www.lerctr.org/~ler
Phone: +1 214-642-9640                 E-Mail: ler@lerctr.org
US Mail: 5708 Sabbia Drive, Round Rock, TX 78665-2106
