retry shm attach for windows (WAS: Re: [HACKERS] OK, so culicidae is*still* broken) - Mailing list pgsql-hackers

From Amit Kapila
Subject retry shm attach for windows (WAS: Re: [HACKERS] OK, so culicidae is*still* broken)
Date
Msg-id CAA4eK1+R6hSx6t_yvwtx+NRzneVp+MRqXAdGJZChcau8Uij-8g@mail.gmail.com
Whole thread Raw
Responses Re: retry shm attach for windows (WAS: Re: [HACKERS] OK, so culicidaeis *still* broken)  (Michael Paquier <michael.paquier@gmail.com>)
List pgsql-hackers
On Sat, May 20, 2017 at 5:56 PM, Noah Misch <noah@leadboat.com> wrote:
> On Sat, Apr 15, 2017 at 02:30:18PM -0700, Andres Freund wrote:
>> On 2017-04-15 17:24:54 -0400, Tom Lane wrote:
>> > Andres Freund <andres@anarazel.de> writes:
>> > > On 2017-04-15 17:09:38 -0400, Tom Lane wrote:
>> > >> Why doesn't Windows' ability to map the segment into the new process
>> > >> before it executes take care of that?
>> >
>> > > Because of ASLR of the main executable (i.e. something like PIE).
>> >
>> > Not following.  Are you saying that the main executable gets mapped into
>> > the process address space immediately, but shared libraries are not?
>
> At the time of the pgwin32_ReserveSharedMemoryRegion() call, the child process
> contains only ntdll.dll and the executable.
>
>> Without PIE/ASLR we can somewhat rely on pgwin32_ReserveSharedMemoryRegion
>> to find the space that PGSharedMemoryCreate allocated still unoccupied.
>
> I've never had access to a Windows system that can reproduce the fork
> failures.  My best theory is that antivirus or similar software injects an
> additional DLL at that early stage.
>
>> > I wonder whether we could work around that by just destroying the created
>> > process and trying again if we get a collision.  It'd be a tad
>> > inefficient, but hopefully collisions wouldn't happen often enough to be a
>> > big problem.
>>
>> That might work, although it's obviously not pretty.
>
> I didn't like that idea when Michael proposed it in 2015.  Since disabling
> ASLR on the exe proved insufficient, I do like it now.  It degrades nicely; if
> something raises the collision rate from 1% to 10%, that just looks like fork
> latency degrading.
>

So it seems both you and Tom are leaning towards some sort of retry
mechanism for shm reattach on Windows.  I also think that is a viable
option to negate the impact of ASLR.  Attached patch does that.  Note
that, as I have mentioned above I think we need to do it for shm
reserve operation as well.  I think we need to decide how many retries
are sufficient before bailing out.  As of now, I have used 10 to have
some similarity with PGSharedMemoryCreate(), but we can choose some
different count as well.  One might say that we can have "number of
retries" as a guc parameter, but I am not sure about it, so not used.
Another point to consider is that do we want the same retry mechanism
for EXEC_BACKEND builds (function
PGSharedMemoryReAttach is different on Windows and EXEC_BACKEND
builds).  I think it makes sense to have retry mechanism for
EXEC_BACKEND builds, so done that way in the patch.  Yet another point
which needs some thought is for reattach operation, before retrying do
we want to reserve the shm by using VirtualAllocEx?

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Attachment

pgsql-hackers by date:

Previous
From: Rafia Sabih
Date:
Subject: Re: [HACKERS] Increasing parallel workers at runtime
Next
From: Mahi Gurram
Date:
Subject: Re: [HACKERS] Regarding Postgres Dynamic Shared Memory (DSA)