All-zero page in GIN index causes assertion failure - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject All-zero page in GIN index causes assertion failure
Date
Msg-id 55ACADCD.6020206@iki.fi
Responses Re: All-zero page in GIN index causes assertion failure
List pgsql-hackers
This is a continuation of the discussion at 
http://www.postgresql.org/message-id/CAMkU=1zUc=h0oCZntaJaqqW7gxxVxCWsYq8DD2t7oHgsgVEsgA@mail.gmail.com. 
I'm starting a new thread because this is a separate issue from the 
original LWLock bug.

> On Thu, Jul 16, 2015 at 12:03 AM, Jeff Janes <jeff.janes@gmail.com> wrote:
>
>> On Wed, Jul 15, 2015 at 8:44 AM, Heikki Linnakangas <hlinnaka@iki.fi>
>> wrote:
>>
>> I don't see how this is related to the LWLock issue, but I didn't see it
>> without your patch.  Perhaps the system just didn't survive long enough to
>> uncover it without the patch (although it shows up pretty quickly).  It
>> could just be an overzealous Assert, since the casserts off didn't show
>> problems.
>
>> bt and bt full are shown below.
>>
>> Cheers,
>>
>> Jeff
>>
>> #0  0x0000003dcb632625 in raise () from /lib64/libc.so.6
>> #1  0x0000003dcb633e05 in abort () from /lib64/libc.so.6
>> #2  0x0000000000930b7a in ExceptionalCondition (
>>     conditionName=0x9a1440 "!(((PageHeader) (page))->pd_special >=
>> (__builtin_offsetof (PageHeaderData, pd_linp)))", errorType=0x9a12bc
>> "FailedAssertion",
>>     fileName=0x9a12b0 "ginvacuum.c", lineNumber=713) at assert.c:54
>> #3  0x00000000004947cf in ginvacuumcleanup (fcinfo=0x7fffee073a90) at
>> ginvacuum.c:713
>>
>
> It now looks like this *is* unrelated to the LWLock issue.  The assert that
> it is tripping over was added just recently (302ac7f27197855afa8c) and so I
> had not been testing under its presence until now.  It looks like it is
> finding all-zero pages (index extended but then a crash before initializing
> the page?) and it doesn't like them.
>
> (gdb) f 3
> (gdb) p *(char[8192]*)(page)
> $11 = '\000' <repeats 8191 times>
>
> Presumably before this assert, such pages would just be permanently
> orphaned.

Yeah, so it seems. It's normal to have all-zero pages in the index, if 
you crash immediately after the relation has been extended, but before 
the new page has been WAL-logged. What is your test case like; did you 
do crash-testing?

ISTM ginvacuumcleanup should check for PageIsNew, and put such pages in 
the FSM. That's what btvacuumpage() and gistvacuumcleanup() do. 
spgvacuumpage() seems to also check for PageIsNew(), but it seems broken 
in a different way: it initializes the page and marks the page as dirty, 
but the change is not WAL-logged. That is a problem at least if checksums 
are enabled: if you crash you might have a torn page on disk, with an 
invalid checksum.
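
To illustrate the check proposed for ginvacuumcleanup, here is a minimal, 
self-contained sketch. It is not PostgreSQL code: SketchPageHeader, 
sketch_page_is_new(), sketch_init_page() and sketch_vacuum_cleanup() are 
hypothetical names, the header is reduced to a few fields, and a plain 
array of block numbers stands in for the FSM. The only thing it assumes 
about the real code is that a new page is recognized by pd_upper == 0, 
which an all-zero page satisfies.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_BLCKSZ 8192

/* Hypothetical, heavily simplified stand-in for a page header. */
typedef struct SketchPageHeader
{
    uint16_t pd_lower;
    uint16_t pd_upper;
    uint16_t pd_special;
    uint16_t pd_linp[1];    /* start of the line pointer array */
} SketchPageHeader;

/* Assumption: a never-initialized page is recognized by pd_upper == 0. */
static int
sketch_page_is_new(const unsigned char *page)
{
    SketchPageHeader hdr;

    memcpy(&hdr, page, sizeof(hdr));
    return hdr.pd_upper == 0;
}

/* Initialize a page so that it passes the sanity check below. */
static void
sketch_init_page(unsigned char *page)
{
    SketchPageHeader hdr = {0};

    memset(page, 0, SKETCH_BLCKSZ);
    hdr.pd_lower = (uint16_t) offsetof(SketchPageHeader, pd_linp);
    hdr.pd_upper = SKETCH_BLCKSZ;
    hdr.pd_special = SKETCH_BLCKSZ;
    memcpy(page, &hdr, sizeof(hdr));
}

/*
 * Scan npages consecutive pages stored in "pages". Block numbers of
 * all-zero (new) pages are appended to free_blocks, which stands in for
 * the FSM here. Returns the number of blocks recorded.
 */
static size_t
sketch_vacuum_cleanup(const unsigned char *pages, size_t npages,
                      size_t *free_blocks)
{
    size_t nfree = 0;

    for (size_t blkno = 0; blkno < npages; blkno++)
    {
        const unsigned char *page = pages + blkno * SKETCH_BLCKSZ;
        SketchPageHeader hdr;

        if (sketch_page_is_new(page))
        {
            /* Skip the header sanity check and remember the page as free. */
            free_blocks[nfree++] = blkno;
            continue;
        }

        /* Same shape as the assertion in the backtrace; only valid on
         * initialized pages. */
        memcpy(&hdr, page, sizeof(hdr));
        assert(hdr.pd_special >= offsetof(SketchPageHeader, pd_linp));
    }
    return nfree;
}
```

With three pages where only the middle one is all zeros, the scan 
records block 1 as free and never reaches the assertion for it. Without 
the new-page check, that page would trip the assertion, since its 
pd_special is 0.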

- Heikki


