Re: PANIC in GIN code - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: PANIC in GIN code
Msg-id 5591B3DC.2080304@iki.fi
In response to Re: PANIC in GIN code  (Jeff Janes <jeff.janes@gmail.com>)
Responses Re: PANIC in GIN code  (Jeff Janes <jeff.janes@gmail.com>)
List pgsql-hackers
On 06/29/2015 07:20 PM, Jeff Janes wrote:
> On Mon, Jun 29, 2015 at 1:37 AM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>
> Another piece of info here that might be relevant.  Almost all
> UPDATE_META_PAGE xlog records other than the last one have two backup
> blocks.  The last UPDATE_META_PAGE record only has one backup block.
>
> And the metapage is mostly zeros:
>>>
>>> head -c 8192 /tmp/data2_invalid_page/base/16384/16420 | od
>>> 0000000 000000 000000 161020 073642 000000 000000 000000 000000
>>> 0000020 000000 000000 000000 000000 053250 000000 053250 000000
>>> 0000040 006140 000000 000001 000000 000001 000000 000000 000000
>>> 0000060 031215 000000 000452 000000 000000 000000 000000 000000
>>> 0000100 025370 000000 000000 000000 000002 000000 000000 000000
>>> 0000120 000000 000000 000000 000000 000000 000000 000000 000000
>>> *
>>> 0020000
>>>
>>
>> Hmm. Looking at ginRedoUpdateMetapage, I think I see the problem: it
>> doesn't initialize the page. It copies the metapage data, but it doesn't
>> touch the page headers. The only way I can see that that would cause
>> trouble is if the index somehow got truncated away or removed in the
>> standby. That could happen in crash recovery, if you drop the index and then
>> crash, but that should be harmless, because crash recovery doesn't try to
>> read the metapage, only update it (by overwriting it), and by the time
>> crash recovery has completed, the index drop is replayed too.
>>
>> But AFAICS that bug is present in earlier versions too.

Nope, looking closer, in previous versions the page was always read from
disk, even though we're overwriting it. That was made smarter in 9.5, by
using the RBM_ZERO_AND_LOCK mode, but that means that the page headers
indeed need to be initialized.
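
To make the difference concrete, here is a toy model of the two replay
behaviours. It is only a sketch, not the GIN redo code: the 24-byte header
layout, the 8-byte special area, the layout version 4 and the helper names
(init_header, redo_update_metapage) are all assumptions made for the
illustration.

```python
import struct

BLCKSZ = 8192
# Assumed page header layout: lsn (2 x uint32), checksum, flags, lower,
# upper, special, pagesize_version (6 x uint16), prune_xid (uint32).
HEADER_FMT = "<IIHHHHHHI"
HEADER_SIZE = struct.calcsize(HEADER_FMT)  # 24 bytes


def init_header(page, special_size):
    """Hypothetical stand-in for page initialization: fills in lower,
    upper, special and pagesize_version, leaving everything else zero."""
    special = BLCKSZ - special_size
    struct.pack_into("<HHHH", page, 12, HEADER_SIZE, special, special, BLCKSZ | 4)


def redo_update_metapage(metadata, init):
    """Replay a metapage update onto a zeroed buffer (page not read from disk)."""
    page = bytearray(BLCKSZ)
    if init:
        init_header(page, special_size=8)  # assumed opaque-area size
    # Copy the metapage data right after the header, as the redo routine does.
    page[HEADER_SIZE:HEADER_SIZE + len(metadata)] = metadata
    return page


def header_fields(page):
    return struct.unpack_from(HEADER_FMT, page, 0)


broken = redo_update_metapage(b"\x01" * 16, init=False)
fixed = redo_update_metapage(b"\x01" * 16, init=True)
print("without init:", header_fields(broken))
print("with init:   ", header_fields(fixed))
```

Without the initialization step, only the copied metadata is non-zero and
every header field stays at zero. When the old page image is read from disk
first, the existing header survives the overwrite, which would explain why
the omission did not matter before 9.5.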

> Yes, I did see this error reported previously but it was always after the
> first appearance of the PANIC, so I assumed it was a sequela to that and
> didn't investigate it further at that time.
>
>> Can you reproduce this easily? How?
>
> I can reproduce it fairly easily.
>
> [instructions to reproduce]

I was actually not able to reproduce it that way, but came up with a
much simpler method. The problem occurs when the metapage update WAL
record is replayed, and the metapage is not in the page cache yet.
Usually it is, because if you do pretty much anything at all with the
index, the metapage stays in cache. But VACUUM produces a metapage update
record to update the stats, and it's pretty easy to arrange things so
that that's the first record after a checkpoint, and start recovery from
that checkpoint:

postgres=# create table foo (t text[]);
CREATE TABLE
postgres=# insert into foo values ('{foo}');
INSERT 0 1
postgres=# create index i_foo on foo using gin (t);
CREATE INDEX
postgres=# vacuum foo;
VACUUM
postgres=# vacuum foo;
VACUUM
postgres=# checkpoint;
CHECKPOINT
postgres=# vacuum foo;
VACUUM

Now kill -9 postmaster, and restart. Voila, the page headers are all zeros:

postgres=# select * from page_header(get_raw_page('i_foo', 0));
    lsn    | checksum | flags | lower | upper | special | pagesize | version | prune_xid
-----------+----------+-------+-------+-------+---------+----------+---------+-----------
 0/1891270 |        0 |     0 |     0 |     0 |       0 |        0 |       0 |         0
(1 row)

postgres=# select * from gin_metapage_info(get_raw_page('i_foo', 0));
ERROR:  input page is not a GIN metapage
DETAIL:  Flags 0189, expected 0008
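
The same symptom can be read out of the od dump quoted earlier in this
message. The sketch below decodes its first two lines, assuming od's
default output of octal 16-bit words, a little-endian machine, and the
usual 24-byte page header layout (lsn, checksum, flags, lower, upper,
special, pagesize/version, prune_xid). The helper name od_to_bytes is made
up for the illustration.

```python
import struct

# First two lines of the quoted dump (offset column, then octal 16-bit words).
OD_LINES = """\
0000000 000000 000000 161020 073642 000000 000000 000000 000000
0000020 000000 000000 000000 000000 053250 000000 053250 000000
"""


def od_to_bytes(text):
    """Rebuild the raw bytes from od's default octal-word output
    (assumes little-endian words)."""
    out = bytearray()
    for line in text.splitlines():
        for word in line.split()[1:]:  # skip the offset column
            out += struct.pack("<H", int(word, 8))
    return bytes(out)


raw = od_to_bytes(OD_LINES)
(xlogid, xrecoff, checksum, flags, lower, upper,
 special, pagesize_version, prune_xid) = struct.unpack_from("<IIHHHHHHI", raw, 0)

print(f"lsn = {xlogid:X}/{xrecoff:X}")
print("lower, upper, special, pagesize_version =",
      (lower, upper, special, pagesize_version))
```

Under those assumptions the LSN is set but lower, upper, special and the
pagesize/version word are all zero, which matches the page_header() output
above: the metapage data was copied in, but the header was never
initialized.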

I just pushed a fix for this, but unfortunately it didn't make it into
9.5alpha1.

- Heikki


