On 14 December 2012 20:15, Greg Smith <greg@2ndquadrant.com> wrote:
> On 12/14/12 3:00 PM, Jeff Davis wrote:
>>
>> After some thought, I don't see much value in introducing multiple
>> instances of corruption at a time. I would think that the smallest unit
>> of corruption would be the hardest to detect, so introducing many of
>> them in one pass makes it easier to detect.
>
>
> That seems reasonable. It would eliminate a lot of issues with reproducing
> a fault too. I can just print the impacted block number presuming it will
> show up in a log, and make it possible to override picking one at random
> with a command line input.

Discussing this makes me realise that we need a more useful response
than just "your data is corrupt", so the user can respond "yes, I know,
I'm trying to save what's left".

We'll need a way of expressing some form of corruption tolerance.
zero_damaged_pages is just insane; it would be much better to set
corruption_tolerance = N, allowing us to skip up to N corrupt pages
before failing, with -1 meaning keep skipping forever. Settable by
superuser only.
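
As a rough sketch of the semantics I have in mind (Python, purely
illustrative; scan_with_tolerance, is_corrupt and
CorruptionLimitExceeded are made-up names for this example, not
anything in the PostgreSQL source, and corruption_tolerance is only a
proposed setting):

```python
class CorruptionLimitExceeded(Exception):
    """Raised when a corrupt page is found and the tolerance is used up."""


def scan_with_tolerance(pages, is_corrupt, corruption_tolerance=0):
    """Scan pages, skipping corrupt ones until the tolerance is exhausted.

    corruption_tolerance = N  -> skip up to N corrupt pages, then fail
    corruption_tolerance = -1 -> keep skipping forever
    Returns (good, skipped): the readable (block number, page) pairs and
    the block numbers of the pages that were skipped.
    """
    good = []
    skipped = []
    for blkno, page in enumerate(pages):
        if not is_corrupt(page):
            good.append((blkno, page))
            continue
        # Hypothetical policy: fail on the (N+1)th corrupt page.
        if corruption_tolerance != -1 and len(skipped) >= corruption_tolerance:
            raise CorruptionLimitExceeded(
                f"block {blkno} is corrupt and "
                f"corruption_tolerance={corruption_tolerance} is exhausted"
            )
        skipped.append(blkno)
    return good, skipped
```

With the default of 0 the first corrupt page is an error, as now. The
skipped block numbers are returned so they can be reported, which
tells the user exactly what was lost rather than silently zeroing it.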

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services