On Sun, Aug 15, 2010 at 9:25 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> Oh, sorry. I was thinking we were talking about complete truncation
>> rather than partial truncation. I'm still pretty unhappy with the
>> proposed fix, though, because it gives up performance in a broad range
>> of cases to cater to an extremely narrow failure case.
>
> It doesn't matter: broken is broken, and failure to recover from a
> truncate() error is broken. You mustn't think that this is a
> Windows-only problem.
>
>> Considering
>> the rarity of the proposed problem, are we sure that it isn't better
>> to adopt a solution like what Heikki proposed? If truncation fails,
>> try to zero the pages; if that also fails, PANIC.
>
> ... and PANIC is absolutely, entirely, 100% unacceptable here. I don't
> think you understand the context. We've already written the truncate
> action to WAL (as we must: WAL before data change). If we PANIC, that
> means we'll PANIC again during WAL replay. So that solution means a
> down, and perhaps unrecoverably broken, database.
All right, that would be bad.
> There's also the
> little problem that "zero the page" is only a safe recovery action for
> heap pages not index pages; smgrtruncate is certainly not entitled to
> assume its working on a heap.
This doesn't seem right to me; we don't WAL log relation extension,
even on an index.
> I think the performance argument is based on fear not facts anyway.
> In practice, in modern installations, you're only going to be talking
> about marginally more work in autovacuum. ISTM we should fix the bug,
> in a simple/reliable/backpatchable way, and then anyone who's worried
> about performance can measure the size of the problem, and try to
> think of a workable solution if he doesn't like the results.
So, what's the worst case for this bug? "DELETE FROM table; VACUUM table;"?
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company