Re: Database corruption help - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: Database corruption help
Date
Msg-id 49968023.8030404@enterprisedb.com
Whole thread Raw
In response to Re: Database corruption help  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Tom Lane wrote:
> Aside from the "how did this happen" puzzle, the real point of any
> investigation of course ought to be whether we can make heap_page_prune
> more robust.  At the very least it's undesirable to be leaving the page
> in a state where VACUUM FULL will decide it can't shrink.

I'm as puzzled as you are on how it happened.

The fundamental problem here is that we have HOT-updated tuples that are 
missing the root tuple. heap_prune_chain doesn't know how to remove the 
dead tuples from such chains, and neither does vacuum. What's worse is 
that there's no index pointer to the live tuple in the chain, and even 
VACUUM FULL doesn't fix that.

Could we modify heap_page_prune so that it detects such orphaned HOT 
tuples, and clears the HOT_UPDATED flag for them? That would at least 
let you recover. I'm pretty wary of trying to "fix" things after 
corruption in general since you can't be sure what's correct and what's 
not, but clearing the HOT_UPDATED flag seems safe and more likely to 
help than hurt. Clearing the flag would let the tuple become indexed 
again in a VACUUM FULL or REINDEX, so it would fix the inconsistency 
that otherwise a sequential scan can see the tuple but an index scan can 
not. It would also allow the next heap_page_prune operation to remove 
the remaining dead tuples in the chain.

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com


pgsql-hackers by date:

Previous
From: KaiGai Kohei
Date:
Subject: Re: Updates of SE-PostgreSQL 8.4devel patches (r1530)
Next
From: Zdenek Kotala
Date:
Subject: Re: [patch] fix for regression tests (locale cs_CZ)