Home > mailing lists

Re: DropRelFileNodeBuffers API change (was Re: [BUGS] BUG #5599: Vacuum fails due to index corruption issues) - Mailing list pgsql-hackers

From	Robert Haas
Subject	Re: DropRelFileNodeBuffers API change (was Re: [BUGS] BUG #5599: Vacuum fails due to index corruption issues)
Date	August 15, 2010 22:58:26
Msg-id	AANLkTik3=YZTCcuZ_feJ0vrTdN1rN3gf4dR0Z6950wi0@mail.gmail.com Whole thread
In response to	Re: DropRelFileNodeBuffers API change (was Re: [BUGS] BUG #5599: Vacuum fails due to index corruption issues) (Tom Lane <tgl@sss.pgh.pa.us>)
Responses	Re: DropRelFileNodeBuffers API change (was Re: [BUGS] BUG #5599: Vacuum fails due to index corruption issues)
List	pgsql-hackers

Tree view

On Sun, Aug 15, 2010 at 9:25 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> Oh, sorry.  I was thinking we were talking about complete truncation
>> rather than partial truncation.  I'm still pretty unhappy with the
>> proposed fix, though, because it gives up performance in a broad range
>> of cases to cater to an extremely narrow failure case.
>
> It doesn't matter: broken is broken, and failure to recover from a
> truncate() error is broken.  You mustn't think that this is a
> Windows-only problem.
>
>> Considering
>> the rarity of the proposed problem, are we sure that it isn't better
>> to adopt a solution like what Heikki proposed?  If truncation fails,
>> try to zero the pages; if that also fails, PANIC.
>
> ... and PANIC is absolutely, entirely, 100% unacceptable here.  I don't
> think you understand the context.  We've already written the truncate
> action to WAL (as we must: WAL before data change).  If we PANIC, that
> means we'll PANIC again during WAL replay.  So that solution means a
> down, and perhaps unrecoverably broken, database.

All right, that would be bad.

> There's also the
> little problem that "zero the page" is only a safe recovery action for
> heap pages not index pages; smgrtruncate is certainly not entitled to
> assume its working on a heap.

This doesn't seem right to me; we don't WAL log relation extension,
even on an index.

> I think the performance argument is based on fear not facts anyway.
> In practice, in modern installations, you're only going to be talking
> about marginally more work in autovacuum.  ISTM we should fix the bug,
> in a simple/reliable/backpatchable way, and then anyone who's worried
> about performance can measure the size of the problem, and try to
> think of a workable solution if he doesn't like the results.

So, what's the worst case for this bug?  "DELETE FROM table; VACUUM table;"?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

pgsql-hackers by date:

From: Charles Pritchard
Date: 15 August 2010, 22:39:29
Subject: Re: JSON Patch for PostgreSQL - BSON Support?

From: Tom Lane
Date: 15 August 2010, 23:20:22
Subject: Re: patch: utf8_to_unicode (trivial)

Re: DropRelFileNodeBuffers API change (was Re: [BUGS] BUG #5599: Vacuum fails due to index corruption issues) - Mailing list pgsql-hackers

Previous

Next