Re: "could not reattach to shared memory" on buildfarm member dory - Mailing list pgsql-hackers

From Tom Lane
Subject Re: "could not reattach to shared memory" on buildfarm member dory
Msg-id 17897.1525108069@sss.pgh.pa.us
In response to Re: "could not reattach to shared memory" on buildfarm member dory  (Noah Misch <noah@leadboat.com>)
List pgsql-hackers
[ Thanks to Stephen for cranking up a continuous build loop on dory ]

Noah Misch <noah@leadboat.com> writes:
> On Tue, Apr 24, 2018 at 11:37:33AM +1200, Thomas Munro wrote:
>> Maybe try asking what's mapped there with VirtualQueryEx() on failure?

> +1.  An implementation of that:
> https://www.postgresql.org/message-id/20170403065106.GA2624300%40tornado.leadboat.com

So I tried putting in that code, and it turns the problem from something
that maybe happens in every third buildfarm run or so, to something that
happens at least a dozen times in a single "make check" step.  This seems
to mean that either EnumProcessModules or GetModuleFileNameEx is itself
allocating memory, and sometimes that allocation comes out of the space
VirtualFree just freed :-(.

So we can't use those functions.  We have however proven that no new
module gets loaded during VirtualFree or MapViewOfFileEx, so there
doesn't seem to be anything more to be learned from them anyway.

What it looks like to me is that MapViewOfFileEx allocates some memory and
sometimes that comes out of the wrong place.  This is, um, unfortunate.
It also appears that VirtualFree might sometimes allocate some memory,
and that'd be even more unfortunate, but it's hard to be certain; the
blame might well fall on VirtualQuery instead.  (Ain't Heisenbugs fun?)

The solution I was thinking about last night was to have
PGSharedMemoryReAttach call MapViewOfFileEx to map the shared memory
segment at an unspecified address, then unmap it, then call VirtualFree,
and finally call MapViewOfFileEx with the real target address.  The idea
here is to get these various DLLs to set up any memory allocation pools
they're going to set up before we risk doing VirtualFree.  I am not,
at this point, convinced this will fix it :-( ... but I'm not sure what
else to try.
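
The ordering proposed above can be sketched in C roughly as follows.  This
only illustrates the sequence of calls and is not the actual
PGSharedMemoryReAttach code: the function name reattach_with_warmup, its
parameters, and the error reporting are hypothetical.  The non-Windows
stand-ins are also an assumption; they exist only so the sketch compiles
elsewhere, and they just record the call order.

```c
#include <stdio.h>
#include <stddef.h>

#ifdef _WIN32
#include <windows.h>
#else
/*
 * Hypothetical stand-ins so the sketch compiles off Windows.  They map
 * nothing; each one only appends a letter to call_trace.
 */
typedef void *HANDLE;
typedef void *LPVOID;
typedef size_t SIZE_T;
typedef unsigned long DWORD;
typedef int BOOL;
#define FILE_MAP_ALL_ACCESS 0xF001FUL
#define MEM_RELEASE 0x8000UL

static char call_trace[16];
static int  call_count = 0;
static char fake_view[1];

static LPVOID
MapViewOfFileEx(HANDLE h, DWORD access, DWORD off_hi, DWORD off_lo,
                SIZE_T len, LPVOID base)
{
    (void) h; (void) access; (void) off_hi; (void) off_lo; (void) len;
    call_trace[call_count++] = base ? 'M' : 'm';
    return base ? base : (LPVOID) fake_view;
}

static BOOL
UnmapViewOfFile(const void *addr)
{
    (void) addr;
    call_trace[call_count++] = 'U';
    return 1;
}

static BOOL
VirtualFree(LPVOID addr, SIZE_T size, DWORD type)
{
    (void) addr; (void) size; (void) type;
    call_trace[call_count++] = 'F';
    return 1;
}

static DWORD
GetLastError(void)
{
    return 0;
}
#endif

/*
 * Hypothetical helper: returns target_addr on success, NULL on failure.
 * target_addr is assumed to be a region the child still holds reserved.
 */
static LPVOID
reattach_with_warmup(HANDLE seg_handle, LPVOID target_addr)
{
    LPVOID scratch;
    LPVOID mapped;

    /* 1. Map at a system-chosen address, so that any allocation pools
     *    the mapping path wants to create get created now. */
    scratch = MapViewOfFileEx(seg_handle, FILE_MAP_ALL_ACCESS, 0, 0, 0, NULL);
    if (scratch == NULL)
    {
        fprintf(stderr, "warm-up map failed: %lu\n", GetLastError());
        return NULL;
    }

    /* 2. Drop the warm-up view again. */
    if (!UnmapViewOfFile(scratch))
    {
        fprintf(stderr, "warm-up unmap failed: %lu\n", GetLastError());
        return NULL;
    }

    /* 3. Only now release the reservation at the real address. */
    if (!VirtualFree(target_addr, 0, MEM_RELEASE))
    {
        fprintf(stderr, "VirtualFree failed: %lu\n", GetLastError());
        return NULL;
    }

    /* 4. Map the segment at the address the postmaster used. */
    mapped = MapViewOfFileEx(seg_handle, FILE_MAP_ALL_ACCESS, 0, 0, 0,
                             target_addr);
    if (mapped != target_addr)
    {
        fprintf(stderr, "reattach failed: %lu\n", GetLastError());
        return NULL;
    }
    return mapped;
}
```

The sketch fixes only the order of operations.  Whether the warm-up mapping
really makes the DLLs do their allocations before VirtualFree is the open
question.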

In any case, it's still pretty unclear why dory is showing this problem
and other buildfarm members are not.  whelk for instance seems to be
loading all the same DLLs and more besides.

            regards, tom lane
