Re: [HACKERS] OK, so culicidae is *still* broken - Mailing list pgsql-hackers

From Tom Lane
Subject Re: [HACKERS] OK, so culicidae is *still* broken
Date
Msg-id 32559.1492278226@sss.pgh.pa.us
In response to Re: [HACKERS] OK, so culicidae is *still* broken  (Andres Freund <andres@anarazel.de>)
Responses Re: [HACKERS] OK, so culicidae is *still* broken
List pgsql-hackers

Andres Freund <andres@anarazel.de> writes:
> On April 14, 2017 9:42:41 PM PDT, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> 2017-04-15 04:31:21.657 GMT [16792] FATAL:  could not reattach to
>> shared memory (key=6280001, addr=0x7f692fece000): Invalid argument
>>
>> Presumably, this is the same issue we've seen on Windows where the
>> shmem address range gets overlapped by code loaded at a randomized
>> address.  Is there any real hope of making that work?

> Seems to work reasonably regularly on other branches... On phone only,
> so can't dig into details, but it seems there's some chance involved.
> Let's see what the next few runs will do.  Will crank frequency once home.

I poked at this on a Fedora 25 box, and was able to reproduce failures at
a rate of one every half dozen or so runs of the core regression tests,
which seems to about match what is happening on culicidae.

Looking at the postmaster's memory map, it seems that shmem segments
get mapped in the same part of the address space as shared libraries,
i.e., they all end up in 0x00007Fxxxxxxxxxx.  So it's not terribly
surprising that there's a risk of collision with a shared library.

I think what may be the most effective way to proceed is to provide
a way to force the shmem segment to be mapped at a chosen address.
It looks like, at least on x86_64 Linux, mapping shmem at
0x00007E0000000000 would work reliably.

Since we only care about this for testing purposes, I don't think
it has to be done in any very clean or even documented way.
I'm inclined to propose that we put something into sysv_shmem.c
that will check for an environment variable named, say, PG_SHMEM_ADDR,
and if it's set will use the value as the address in the initial
shmat() call.  For a bit of extra safety we could do that only in
EXEC_BACKEND builds.
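
As a rough illustration of the proposal above, the following C sketch reads
PG_SHMEM_ADDR and passes the result to shmat().  The helper names
requested_shmem_addr() and attach_shmem() are hypothetical and are not taken
from sysv_shmem.c.  The sketch also leaves out the EXEC_BACKEND guard, and it
treats an unparsable value as "unset", which is an assumption rather than
anything settled in this thread.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/*
 * Hypothetical helper: return the attach address requested through the
 * PG_SHMEM_ADDR environment variable, or NULL to let the kernel choose.
 * strtoull() with base 0 accepts "0x..." hex as well as decimal input.
 */
void *
requested_shmem_addr(void)
{
    const char *s = getenv("PG_SHMEM_ADDR");
    char       *end;
    unsigned long long val;

    if (s == NULL || *s == '\0')
        return NULL;

    errno = 0;
    val = strtoull(s, &end, 0);
    if (errno != 0 || end == s || *end != '\0')
        return NULL;            /* assumption: unparsable means "unset" */

    return (void *) (uintptr_t) val;
}

/*
 * Hypothetical helper: attach segment shmid at the requested address, if
 * any.  Returns (void *) -1 on failure, exactly as shmat() does.
 */
void *
attach_shmem(int shmid)
{
    assert(shmid >= 0);
    return shmat(shmid, requested_shmem_addr(), 0);
}
```

With PG_SHMEM_ADDR=0x7E0000000000 in the environment, attach_shmem() would
ask the kernel for that address.  shmat() can still fail (for example with
EINVAL on Linux if the address is misaligned or the range is already in
use), so a caller would need to check for (void *) -1 as usual.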

Then you'd just need to add PG_SHMEM_ADDR=0x7E0000000000 to
culicidae's build_env and you'd be good to go.

        regards, tom lane


