Re: 9.4 beta1 crash on Debian sid/i386 - Mailing list pgsql-hackers

From Andres Freund
Subject Re: 9.4 beta1 crash on Debian sid/i386
Date
Msg-id 20140519124313.GA5098@alap3.anarazel.de
In response to Re: 9.4 beta1 crash on Debian sid/i386  (Christoph Berg <christoph.berg@credativ.de>)
Responses Re: 9.4 beta1 crash on Debian sid/i386  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On 2014-05-19 13:53:18 +0200, Christoph Berg wrote:
> I've done some more digging. The problem exists also on plain 32bit
> kernels, not only 64bit running a 32bit userland. (Tested on Debian
> Wheezy's 3.2.57 kernel.)

Too bad.

> Debian/Ubuntu have been using hardened PostgreSQL builds for years
> now, including running the regression tests - apparently we were
> always close to a crash, it just had not happened yet.

There might be some user defined workloads triggering it as well...

> So there's a few points to consider:
> * ASLR leaves only 125MB for brk()-style heap plus stack
> * RLIMIT_STACK is treated as an upper limit, not a reservation
> * PostgreSQL thinks max_stack_depth=2MB plus check_stack_depth() is
>   safe, instead of having a SIGBUS handler
> * PostgreSQL allocates lots of heap using brk() instead of mmap()

* postgres on Debian is built with -pie.

> If any of that wouldn't hold, the problem wouldn't appear.
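
To make the RLIMIT_STACK and max_stack_depth points concrete, here is a minimal C sketch, not PostgreSQL's actual code. The helper names (stack_rlimit_bytes, depth_fits_rlimit, stack_depth_from) and the 512kB margin are assumptions made for illustration. It shows why the check is only as good as its premise: getrlimit() reports an upper limit, so a budget that fits under it can still fail if the address space below the stack is already occupied, for example by a brk() heap.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <sys/resource.h>

/* Hypothetical safety margin kept free below the stack limit. */
#define STACK_SLOP_BYTES (512 * 1024L)

/* Soft RLIMIT_STACK in bytes, or -1 if unlimited or unknown. */
long stack_rlimit_bytes(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_STACK, &rl) != 0 || rl.rlim_cur == RLIM_INFINITY)
        return -1;
    if (rl.rlim_cur > (rlim_t) LONG_MAX)
        return LONG_MAX;
    return (long) rl.rlim_cur;
}

/*
 * Does a configured depth budget (plus margin) fit under the rlimit?
 * A "yes" only means the kernel will not refuse growth because of the
 * limit; it does not mean that much address space is actually free.
 */
int depth_fits_rlimit(long max_depth_bytes, long rlimit_bytes)
{
    assert(max_depth_bytes >= 0);
    if (rlimit_bytes < 0)
        return 1;               /* unlimited or unknown: nothing to compare */
    return max_depth_bytes + STACK_SLOP_BYTES <= rlimit_bytes;
}

/*
 * Distance in bytes between a recorded base address and the current
 * frame, independent of the direction in which the stack grows.
 */
long stack_depth_from(const char *base)
{
    char here;
    uintptr_t b = (uintptr_t) base;
    uintptr_t h = (uintptr_t) &here;

    return (long) (b > h ? b - h : h - b);
}
```

A caller would record the address of a local variable early in main() as the base, then compare stack_depth_from(base) against the configured budget before recursing further.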

> I'm not sure where to go from here. Getting the kernel (or the libc)
> changed seems hard, and that would probably only affect future
> distributions anyway.

Hm, this certainly looks like the kind of bug whose fix should get
backported to -stable et al.

> A short-term fix might be to reduce
> max_stack_depth for the regression tests, which tests the
> functionality, but leaves the problem open for production.
> Implementing a SIGBUS/SIGSEGV handler would probably mean that the
> whole ouch-lets-restart-on-error logic would become ineffective,
> unless we go check which address caused the error and decide if it was
> part of the stack or not.

Meh. I am pretty staunchly set against trying this. This is putting
complex tape over the problem. And we'd have significant problems
discerning the different kinds of SIGBUS errors or such.

Isn't the far more obvious thing to just not build postgres with -pie on
32bit? It's hardly a security benefit if it allows a plain user to crash
the server.

Besides the stack problem, have you measured whether it's viable to use
-pie on 32bit performance-wise? That stuff isn't that cheap, especially
on 32bit.
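
For anyone wanting to confirm how a given binary was linked, the following is a small C sketch, assuming Linux and <elf.h>. The function name elf_type_of is made up for illustration. It reads the e_type field of the ELF header: a -pie executable is reported as ET_DYN, a traditional fixed-address executable as ET_EXEC. Shared libraries are also ET_DYN, so this is only meaningful when pointed at an executable.

```c
#include <elf.h>
#include <stdio.h>
#include <string.h>

/*
 * Return the ELF e_type of the file at path (ET_EXEC, ET_DYN, ...),
 * or -1 if the file cannot be read or is not an ELF file.
 */
int elf_type_of(const char *path)
{
    unsigned char hdr[EI_NIDENT + 2];
    FILE *f = fopen(path, "rb");
    size_t n;

    if (f == NULL)
        return -1;
    n = fread(hdr, 1, sizeof(hdr), f);
    fclose(f);

    if (n != sizeof(hdr) || memcmp(hdr, ELFMAG, SELFMAG) != 0)
        return -1;

    /* e_type is the 16-bit field right after e_ident, in file byte order. */
    if (hdr[EI_DATA] == ELFDATA2LSB)
        return hdr[EI_NIDENT] | (hdr[EI_NIDENT + 1] << 8);
    if (hdr[EI_DATA] == ELFDATA2MSB)
        return (hdr[EI_NIDENT] << 8) | hdr[EI_NIDENT + 1];
    return -1;
}
```

Calling it on the path of the installed postgres binary and comparing the result against ET_DYN would show whether that build is position-independent.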

Greetings,

Andres Freund

-- 
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


