Thread: A note about testing EXEC_BACKEND on recent Linuxen

From: Tom Lane
I just wasted a couple hours trying to determine why an EXEC_BACKEND
build would randomly fail on Fedora Core 4.  It seems the reason is that
by default, recent Linux kernels randomize the stack base address ---
not by a lot, but enough to cause child processes to sometimes be unable
to attach to the shared memory segment at the same place the postmaster
did.

You can work around this by doing (as root)
    echo 0 >/proc/sys/kernel/randomize_va_space
before starting the postmaster.  You'll probably want to set it back to
1 when done experimenting with EXEC_BACKEND, since address randomization
is a useful security hack.

Just seems like something that should be in our archives ...
        regards, tom lane
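
Before starting an EXEC_BACKEND postmaster it can be handy to check the current setting rather than assume it. The following is a minimal sketch, not anything in the PostgreSQL tree; the function names are made up for illustration. It assumes the usual meaning of the file's contents: 0 means randomization is off, while 1 and 2 mean it is on to increasing degrees.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Read a single integer from a /proc-style file.
 * Returns the value, or -1 if the file cannot be read or parsed.
 */
int
read_int_setting(const char *path)
{
    FILE *fp;
    int   value;

    assert(path != NULL);

    fp = fopen(path, "r");
    if (fp == NULL)
        return -1;
    if (fscanf(fp, "%d", &value) != 1)
        value = -1;
    fclose(fp);
    return value;
}

/*
 * Returns 1 if address randomization is off (file contains 0),
 * 0 if it is on, and -1 if the setting could not be determined.
 */
int
aslr_is_disabled(const char *path)
{
    int value = read_int_setting(path);

    if (value < 0)
        return -1;
    return value == 0;
}
```

A test script would call aslr_is_disabled("/proc/sys/kernel/randomize_va_space") and refuse to start the EXEC_BACKEND run unless it returns 1.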


Re: A note about testing EXEC_BACKEND on recent Linuxen

From: Mitchell Skinner
On Thu, 2006-01-26 at 18:40 -0500, Tom Lane wrote:
> You can work around this by doing (as root)
>     echo 0 >/proc/sys/kernel/randomize_va_space
> before starting the postmaster.  You'll probably want to set it back to
> 1 when done experimenting with EXEC_BACKEND, since address randomization
> is a useful security hack.

I haven't fiddled with this myself, but according to Arjan van de Ven's
post to fedora-devel on July 30th 2005,
> setarch has an -R option to start the binary without randomisation.

...so that you can turn it off per-execution rather than system-wide.
I'm not sure if children inherit the setting.

Mitch



Re: A note about testing EXEC_BACKEND on recent Linuxen

From: Bruce Momjian
Added code comment:
        /*
         *  Attach process to shared data structures.  If testing
         *  EXEC_BACKEND on Linux, you must run this as root
         *  before starting the postmaster:
         *
         *      echo 0 >/proc/sys/kernel/randomize_va_space
         *
         *  This prevents a randomized stack base address that causes
         *  child shared memory to be at a different address than
         *  the parent, making it impossible to attach to shared
         *  memory.  Return the value to '1' when finished.
         */
        CreateSharedMemoryAndSemaphores(false, 0);


---------------------------------------------------------------------------

Tom Lane wrote:
> I just wasted a couple hours trying to determine why an EXEC_BACKEND
> build would randomly fail on Fedora Core 4.  It seems the reason is that
> by default, recent Linux kernels randomize the stack base address ---
> not by a lot, but enough to cause child processes to sometimes be unable
> to attach to the shared memory segment at the same place the postmaster
> did.
> 
> You can work around this by doing (as root)
>     echo 0 >/proc/sys/kernel/randomize_va_space
> before starting the postmaster.  You'll probably want to set it back to
> 1 when done experimenting with EXEC_BACKEND, since address randomization
> is a useful security hack.
> 
> Just seems like something that should be in our archives ...
> 
>             regards, tom lane

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
 


Re: A note about testing EXEC_BACKEND on recent Linuxen

From: Martijn van Oosterhout
On Wed, Feb 01, 2006 at 10:59:39AM -0500, Bruce Momjian wrote:
>          *  Attach process to shared data structures.  If testing
>          *  EXEC_BACKEND on Linux, you must run this as root
>          *  before starting the postmaster:
>          *
>          *      echo 0 >/proc/sys/kernel/randomize_va_space
>          *
>          *  This prevents a randomized stack base address that causes
>          *  child shared memory to be at a different address than
>          *  the parent, making it impossible to attach to shared
>          *  memory.  Return the value to '1' when finished.

Hmm, are there no other ways that this problem can manifest itself?
ISTM that we're relying completely on the kernel to map it in the same
place each time. Maybe one day someone changes the startup procedure to
allocate some more memory and it gets mapped somewhere else.

A better solution would be to explicitly map it in the child processes.
Basically, store the shared memory base at the beginning of the shared
memory block. Then when the child maps it, it can verify the location.
If it doesn't match, unmap and remap at the right place.

The first half should probably at least be implemented to detect the
situation (ERROR: Shared memory block mapped at wrong location). But
given that you could probably solve the problem completely... Of course,
there's the risk the child has already allocated memory where it should
be (the randomize space flag might cause this).

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.
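
The detection half of the idea above (record the creator's base address at the start of the block, then compare after mapping) can be sketched as follows. This is an illustration only: the struct and function names are hypothetical and are not PostgreSQL's actual shared memory header. Note that it only detects a mismatch; it does nothing to free the wanted address range if something else in the child already occupies it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical header kept at the very start of the shared block. */
typedef struct ShmHeaderSketch
{
    void   *creator_base;   /* address at which the creator mapped the block */
    size_t  total_size;     /* size of the whole block in bytes */
} ShmHeaderSketch;

/* Called once by the process that creates the block. */
void
shm_header_init(void *base, size_t total_size)
{
    ShmHeaderSketch hdr;

    assert(base != NULL && total_size >= sizeof(hdr));

    hdr.creator_base = base;
    hdr.total_size = total_size;
    memcpy(base, &hdr, sizeof(hdr));
}

/*
 * Called by a process after it has mapped the block at 'base'.
 * Returns 1 if this is the address the creator used, 0 otherwise.
 */
int
shm_mapped_where_expected(const void *base)
{
    ShmHeaderSketch hdr;

    assert(base != NULL);

    memcpy(&hdr, base, sizeof(hdr));
    return hdr.creator_base == base;
}
```

A child that gets 0 back could report something like the suggested "Shared memory block mapped at wrong location" error instead of crashing later on a stale pointer.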

Re: A note about testing EXEC_BACKEND on recent Linuxen

From: Tom Lane
Martijn van Oosterhout <kleptog@svana.org> writes:
> Hmm, are there no other ways that this problem can manifest itself?
> ISTM that we're relying completely on the kernel to map it in the same
> place each time. Maybe one day someone changes the startup procedure to
> allocate some more memory and it gets mapped somewhere else.

In the normal non-EXEC_BACKEND scenario, there's no issue, so I see no
great need to worry about this unduly.

> A better solution would be to explicitly map it in the child processes.
> Basically, store the shared memory base at the beginning of the shared
> memory block. Then when the child maps it, it can verify the location.
> If it doesn't match, unmap and remap at the right place.

That is utterly irrelevant to the problem, unfortunately; the shmat
request already specifies where we need to map it, and the problem
arises when that bit of address space is already taken in the child
process.
        regards, tom lane
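
The failure mode described above can be reproduced inside a single process, without EXEC_BACKEND at all. The sketch below is an assumption-laden illustration, not PostgreSQL code, and the function name is made up. It creates a private System V segment, lets the kernel choose an address, and then asks shmat for that same address a second time. If the first attachment has been released the request succeeds; if the range is still occupied the request is refused (on Linux this is typically EINVAL unless SHM_REMAP is given).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/*
 * Attach a fresh segment at a kernel-chosen address, then request the
 * same address again.  If keep_first_attachment is nonzero the address
 * range is still occupied when the second request is made.
 *
 * Returns 0 if the second attach succeeded, the errno value if it was
 * refused, or -1 if the segment could not be set up at all.
 */
int
reattach_result(int keep_first_attachment)
{
    int   shmid;
    void *base;
    void *again;
    int   result;

    shmid = shmget(IPC_PRIVATE, 65536, IPC_CREAT | 0600);
    if (shmid < 0)
        return -1;

    base = shmat(shmid, NULL, 0);
    if (base == (void *) -1)
    {
        shmctl(shmid, IPC_RMID, NULL);
        return -1;
    }

    /* A kernel-chosen address is suitably aligned to be requested again. */
    assert(((uintptr_t) base % SHMLBA) == 0);

    if (!keep_first_attachment)
        shmdt(base);

    /* Like the child's request: "map it exactly here, please". */
    again = shmat(shmid, base, 0);
    result = (again == (void *) -1) ? errno : 0;

    /* Clean up whatever is still attached, then remove the segment. */
    if (again != (void *) -1)
        shmdt(again);
    if (keep_first_attachment)
        shmdt(base);
    shmctl(shmid, IPC_RMID, NULL);

    return result;
}
```

In the EXEC_BACKEND case the thing occupying the range is not a second attachment but whatever the freshly exec'd child happens to have mapped there, such as a stack that landed at a slightly different base.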


Re: A note about testing EXEC_BACKEND on recent Linuxen

From: Bruce Momjian
Tom Lane wrote:
> Martijn van Oosterhout <kleptog@svana.org> writes:
> > Hmm, are there no other ways that this problem can manifest itself?
> > ISTM that we're relying completely on the kernel to map it in the same
> > place each time. Maybe one day someone changes the startup procedure to
> > allocate some more memory and it gets mapped somewhere else.
> 
> In the normal non-EXEC_BACKEND scenario, there's no issue, so I see no
> great need to worry about this unduly.
> 
> > A better solution would be to explicitly map it in the child processes.
> > Basically, store the shared memory base at the beginning of the shared
> > memory block. Then when the child maps it, it can verify the location.
> > If it doesn't match, unmap and remap at the right place.
> 
> That is utterly irrelevant to the problem, unfortunately; the shmat
> request already specifies where we need to map it, and the problem
> arises when that bit of address space is already taken in the child
> process.

FYI, the shared memory address was originally relocatable by using
offsets to address it from the child, but now that we use fork() on
Unix, it isn't an issue, and Win32 seems to be OK.

 


Re: A note about testing EXEC_BACKEND on recent Linuxen

From: Tom Lane
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> FYI, the shared memory address was originally relocatable by using
> offsets to address it from the child, but now that we use fork() on
> Unix, it isn't an issue, and Win32 seems to be OK.

In the worst case we could go back to using offsets everywhere, but I'm
really reluctant to do that for reasons of code clarity and reliability.
The main problem with it is that you have to explicitly cast to and from
the correct pointer type, which is not only ugly but completely defeats
any chance of the compiler catching wrong-type errors.

What I am seeing on Fedora 4 (I suppose it's common to most recent Linux
versions) is that the system preferentially maps the shared memory
segment just below the stack, which is good from the point of view of
preserving maximum address space for the heap, but not so great if stack
size changes or the stack moves a bit.  It'd be possible to get a little
more flexibility by requesting a non-default attach location for the
postmaster's initial shmat call.  I'm not interested in getting into
that if we don't absolutely have to, but it might be the best answer if
we find ourselves seeing similar issues on specific platforms (ie
Windows) in the future.
        regards, tom lane
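
To make the offsets trade-off concrete, here is a small sketch of a linked list that stores offsets from the block's base instead of pointers. All names are hypothetical and chosen for illustration; this is not the old PostgreSQL code. The list keeps working when the block is mapped at a different address, but every traversal step goes through an untyped conversion, so the compiler cannot notice if the offset actually refers to some other kind of object.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Offset in bytes from the start of the block; 0 means "no node". */
typedef size_t ShmOffsetSketch;

typedef struct NodeSketch
{
    int             value;
    ShmOffsetSketch next;
} NodeSketch;

/* Turn an offset back into an address.  The result is untyped. */
void *
offset_to_ptr(void *base, ShmOffsetSketch off)
{
    return (char *) base + off;
}

/* Turn an address inside the block into an offset from its base. */
ShmOffsetSketch
ptr_to_offset(void *base, void *ptr)
{
    return (ShmOffsetSketch) ((char *) ptr - (char *) base);
}

/* Sum the values of a list that starts at offset 'first'. */
int
sum_list(void *base, ShmOffsetSketch first)
{
    int             total = 0;
    ShmOffsetSketch off = first;

    assert(base != NULL);

    while (off != 0)
    {
        /* Unchecked: nothing verifies that 'off' really names a NodeSketch. */
        NodeSketch *node = (NodeSketch *) offset_to_ptr(base, off);

        total += node->value;
        off = node->next;
    }
    return total;
}
```

With real pointers the field would simply be declared as a pointer to NodeSketch and a wrong-type assignment would draw a compiler diagnostic, which is the clarity and reliability argument made above.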


Re: A note about testing EXEC_BACKEND on recent Linuxen

From: Bruce Momjian
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > FYI, the shared memory address was originally relocatable by using
> > offsets to address it from the child, but now that we use fork() on
> > Unix, it isn't an issue, and Win32 seems to be OK.
> 
> In the worst case we could go back to using offsets everywhere, but I'm
> really reluctant to do that for reasons of code clarity and reliability.
> The main problem with it is that you have to explicitly cast to and from
> the correct pointer type, which is not only ugly but completely defeats
> any chance of the compiler catching wrong-type errors.

As I remember, the offsets also had a performance and/or storage impact
that we don't want either.

> What I am seeing on Fedora 4 (I suppose it's common to most recent Linux
> versions) is that the system preferentially maps the shared memory
> segment just below the stack, which is good from the point of view of
> preserving maximum address space for the heap, but not so great if stack
> size changes or the stack moves a bit.  It'd be possible to get a little
> more flexibility by requesting a non-default attach location for the
> postmaster's initial shmat call.  I'm not interested in getting into
> that if we don't absolutely have to, but it might be the best answer if
> we find ourselves seeing similar issues on specific platforms (ie
> Windows) in the future.

Agreed.
