Re: doPickSplit stack buffer overflow in XLogInsert? - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: doPickSplit stack buffer overflow in XLogInsert?
Date
Msg-id 53903F83.30107@vmware.com
Whole thread Raw
In response to Re: doPickSplit stack buffer overflow in XLogInsert?  (Andres Freund <andres@2ndquadrant.com>)
List pgsql-hackers
On 05/06/2014 07:36 PM, Andres Freund wrote:
> On 2014-05-06 13:33:01 +0300, Heikki Linnakangas wrote:
>> On 03/31/2014 09:08 PM, Robert Haas wrote:
>>> On Wed, Mar 26, 2014 at 9:45 PM, Peter Geoghegan <pg@heroku.com> wrote:
>>>> On Wed, Nov 27, 2013 at 9:10 AM, Noah Misch <noah@leadboat.com> wrote:
>>>>> The threat is that rounding the read size up to the next MAXALIGN would cross
>>>>> into an unreadable memory page, resulting in a SIGSEGV.  Every palloc chunk
>>>>> has MAXALIGN'd size under the hood, so the excess read of "toDelete" cannot
>>>>> cause a SIGSEGV.  For a stack variable, it depends on the ABI.  I'm not aware
>>>>> of an ABI where the four bytes past the end of this stack variable could be
>>>>> unreadable, which is not to claim I'm well-read on the topic.  We should fix
>>>>> this in due course on code hygiene grounds, but I would not back-patch it.
>>>>
>>>> Attached patch silences the "Invalid read of size n" complaints of
>>>> Valgrind. I agree with your general thoughts around backpatching. Note
>>>> that the patch addresses a distinct complaint from Kevin's, as
>>>> Valgrind doesn't take issue with the invalid reads past the end of
>>>> spgxlogPickSplit variables on the stack.
>>>
>>> Is the needless zeroing this patch introduces apt to cause a
>>> performance problem?
>>>
>>> This function is actually pretty wacky.  If we're stuffing bytes with
>>> undefined contents into the WAL record, maybe the answer isn't to
>>> force the contents of those bytes to be defined, but rather to elide
>>> them from the WAL record.
>>
>> Agreed. I propose the attached, which removes the MAXALIGNs. It's not
>> suitable for backpatching, though, as it changes the format of the WAL
>> record.
> y
> Not knowing anything about spgist this looks sane to me.

Since we did a catversion bump anyway, I revisited this. Committed an 
expanded version of the patch I posted earlier. I went through all the 
SP-GiST WAL record types and removed the alignment requirements of 
tuples from all of them. Most of them didn't have a problem because the 
structs happened to have suitable alignment by accident, but seems best 
to not rely on that, for the sake of consistency and robustness if the 
structs are modified later.

I ran this through my testing tool that compares page image on master 
and standby after every WAL record replay, with full_page_writes on and 
off, and found no errors.

> Do we need a
> backpatchable solution given that we seem to agree that these bugs
> aren't likely to cause harm?

I don't think we do.

- Heikki



pgsql-hackers by date:

Previous
From: Amit Langote
Date:
Subject: Re: Need to backpatch 2985e16 to 9.3 and further (HS regression test out)
Next
From:
Date:
Subject: Re: pg_receivexlog add synchronous mode