Robert Haas <robertmhaas@gmail.com> writes:
> Oh, sorry. I was thinking we were talking about complete truncation
> rather than partial truncation. I'm still pretty unhappy with the
> proposed fix, though, because it gives up performance in a broad range
> of cases to cater to an extremely narrow failure case.
It doesn't matter: broken is broken, and failure to recover from a
truncate() error is broken. You mustn't think that this is a
Windows-only problem.
> Considering
> the rarity of the proposed problem, are we sure that it isn't better
> to adopt a solution like what Heikki proposed? If truncation fails,
> try to zero the pages; if that also fails, PANIC.
... and PANIC is absolutely, entirely, 100% unacceptable here. I don't
think you understand the context. We've already written the truncate
action to WAL (as we must: WAL before data change). If we PANIC, that
means we'll PANIC again during WAL replay. So that solution means a
down, and perhaps unrecoverably broken, database. There's also the
little problem that "zero the page" is only a safe recovery action for
heap pages, not index pages; smgrtruncate is certainly not entitled to
assume it's working on a heap.
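
To make the replay argument concrete, here is a toy model of it. It is only a sketch and not PostgreSQL code: the names `Panic`, `apply_truncate`, and `start_server`, and the "tolerate" policy, are all hypothetical stand-ins. It assumes the truncate record is already in WAL and that the underlying truncate keeps failing across restarts.

```python
class Panic(Exception):
    """Stands in for a PANIC: the server goes down and must run recovery."""


def apply_truncate(truncate_works, policy):
    """Apply one logged truncation under a given failure policy (hypothetical)."""
    if truncate_works:
        return "truncated"
    if policy == "panic":
        raise Panic("truncate failed")
    # Hypothetical non-fatal handling: report the failure and carry on.
    return "failed, left for later cleanup"


def start_server(wal, truncate_works, policy, max_restarts=3):
    """Replay the WAL on each start attempt.

    Returns the attempt number on which the server comes up,
    or None if every attempt ends in a PANIC.
    """
    for attempt in range(1, max_restarts + 1):
        try:
            for record in wal:
                if record == "truncate":
                    apply_truncate(truncate_works, policy)
            return attempt
        except Panic:
            # The WAL record is still there, so the next start replays it again.
            continue
    return None


wal = ["truncate"]  # already logged: WAL before data change
print(start_server(wal, truncate_works=False, policy="panic"))
print(start_server(wal, truncate_works=False, policy="tolerate"))
```

Under these assumptions the "panic" policy never gets the server back up, because every restart replays the same record and fails the same way, while a non-fatal policy comes up on the first attempt.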
I think the performance argument is based on fear, not facts, anyway.
In practice, in modern installations, you're only going to be talking
about marginally more work in autovacuum. ISTM we should fix the bug,
in a simple/reliable/backpatchable way, and then anyone who's worried
about performance can measure the size of the problem, and try to
think of a workable solution if he doesn't like the results.
regards, tom lane