
Re: [COMMITTERS] pgsql: Properly set relpersistence for fake relcache entries. - Mailing list pgsql-hackers

From	Andres Freund
Subject	Re: [COMMITTERS] pgsql: Properly set relpersistence for fake relcache entries.
Date	September 21, 2012 00:55:11
Msg-id	201209202355.05332.andres@2ndquadrant.com
In response to	Re: Re: [COMMITTERS] pgsql: Properly set relpersistence for fake relcache entries. (Tom Lane <tgl@sss.pgh.pa.us>)
Responses	Re: [COMMITTERS] pgsql: Properly set relpersistence for fake relcache entries. (Marko Tiikkaja <pgmail@joh.to>)
List	pgsql-hackers


On Monday, September 17, 2012 03:58:37 PM Tom Lane wrote:
> Andres Freund <andres@2ndquadrant.com> writes:
> > Btw, I played with this some more on Saturday and I think, while
> > definitely a bad bug, the actual consequences aren't as bad as at least
> > I initially feared.
> > 
> > Fake relcache entries are currently set in 3 scenarios during recovery:
> > 1. removal of ALL_VISIBLE in heapam.c
> > 2. incomplete splits and incomplete deletions in nbtxlog.c
> > 3. incomplete splits in ginxlog.c
> > [ #1 doesn't really hurt in 9.1, and the others are low probability ]
> 
> OK, that explains why we've not seen a blizzard of trouble reports.
> Still seems like a good idea to fix it ASAP, though.
Btw, I think RhodiumToad/Andrew Gierth and I helped a user on the IRC 
channel some time ago who had symptoms matching this bug.

The situation was that he started to get very high IO and xid wraparound 
shutdown warnings due to autovacuums that never finished and could not be 
cancelled. After some investigation it turned out that btree indexes were 
being processed at that time. We found they had cyclic btpo_next pointers, 
leading to an endless loop in _bt_pagedel.
We solved the issue by forcing leftsib = P_NONE inside the
while (P_ISDELETED(opaque) || opaque->btpo_next != target)
loop, which let a queued DROP INDEX get the necessary locks.
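
To illustrate why cyclic btpo_next pointers make a walk of that shape spin 
forever, here is a minimal toy model. It is not PostgreSQL code: ToyPage, 
TOY_P_NONE and toy_find_left_sibling are hypothetical names invented for this 
sketch, and the hop limit is just one possible guard, not what _bt_pagedel 
does. The idea is that a chain of npages distinct pages can be followed for at 
most npages hops, so anything longer must be a cycle.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for "no sibling"; slot 0 is never a real page here. */
#define TOY_P_NONE 0u

/* Toy page: only the right-sibling link matters for this sketch. */
typedef struct ToyPage
{
    unsigned next;      /* index of the right sibling, or TOY_P_NONE */
} ToyPage;

/*
 * Walk right from `start` until a page whose right link is `target` is found.
 * Returns true if such a page exists. Sets *cycle_found and returns false if
 * more hops were taken than there are pages, which can only happen when the
 * sibling links form a cycle.
 */
static bool
toy_find_left_sibling(const ToyPage *pages, size_t npages,
                      unsigned start, unsigned target, bool *cycle_found)
{
    unsigned cur = start;
    size_t   hops = 0;

    assert(pages != NULL && cycle_found != NULL);
    *cycle_found = false;

    while (cur != TOY_P_NONE && cur < npages && pages[cur].next != target)
    {
        if (++hops > npages)
        {
            /* Without this guard the loop would never terminate. */
            *cycle_found = true;
            return false;
        }
        cur = pages[cur].next;
    }

    return cur != TOY_P_NONE && cur < npages;
}
```

With links 1 -> 2 -> 3 -> 1 and a target that no page points to, the unguarded 
form of the loop never exits; the guarded form reports the cycle instead.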

Unfortunately this was on a busy production system nearing a wraparound 
shutdown, so not much was kept for further diagnosis.

After this bug was discovered I asked the user, and indeed they had 
previously shut down the database twice in quick succession during heavy 
activity with -m immediate, which could lead to exactly such a problem due 
to incompletely processed page splits.

Greetings,

Andres
-- 
Andres Freund                       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
