Re: DropRelFileNodeBuffers API change (was Re: [BUGS] BUG #5599: Vacuum fails due to index corruption issues) - Mailing list pgsql-hackers

From Robert Haas
Subject Re: DropRelFileNodeBuffers API change (was Re: [BUGS] BUG #5599: Vacuum fails due to index corruption issues)
Msg-id AANLkTimcna8WUNfEGpAzpp5PEJ5pX+zP072NJXWTaN-D@mail.gmail.com
In response to DropRelFileNodeBuffers API change (was Re: [BUGS] BUG #5599: Vacuum fails due to index corruption issues)  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: DropRelFileNodeBuffers API change (was Re: [BUGS] BUG #5599: Vacuum fails due to index corruption issues)
List pgsql-hackers
On Sun, Aug 15, 2010 at 2:58 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> In the discussion of bug #5599 we pretty much agreed to do this:
>> Seems like we need to think harder about recovering from a truncate
>> failure.  A few random ideas:
>> 1. Write the dirty buffers before dropping them.  Kind of ugly from a
>> performance viewpoint, but simple and safe.
>
> I looked at making this happen, and noted that DropRelFileNodeBuffers
> is used both for the truncation case and for dropping relation buffers
> during smgrdounlink.  In the latter case, it's still appropriate to
> drop dirty buffers without writing them, both for performance reasons
> and because we don't really care about any errors: we have already
> committed the relation DROP, and are not going to look at the file
> contents again in any case.  So this means that two different behaviors
> are now required for truncation and dropping.
>
> The cleanest fix is an API change to add a boolean write-or-not
> parameter to DropRelFileNodeBuffers.  That's what I want to do in HEAD
> and 9.0, but I'm unsure whether it's a safe fix in the back branches.
> Does anyone have an opinion whether it's likely that any third-party
> code is calling DropRelFileNodeBuffers directly?  If there is, then
> changing its API in a minor release would be an unfriendly thing to do.
> We could avoid that by some ugly expedient like inserting a second copy
> of the function in back branches.
>
> Comments?

I really hate this solution, because writing out data that we're about
to throw away just in case we can't actually throw it away seems like
a real waste from a performance standpoint.  Could we avoid this
altogether by allocating a new relfilenode on truncate?
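
For reference, the write-or-not parameter proposed in the quoted message might look roughly like the sketch below. It is a toy model only: ToyBuffer, pool, toy_write and toy_drop_rel_buffers are hypothetical names invented for illustration, and none of this is the real DropRelFileNodeBuffers signature or PostgreSQL's buffer manager.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NBUFFERS 8

/* Toy buffer descriptor (assumption: the real descriptor is far richer). */
typedef struct ToyBuffer {
    int  relnode;   /* owning relation file; 0 marks an unused slot */
    int  blockno;   /* block number within that relation */
    bool dirty;     /* page modified since it was last written */
} ToyBuffer;

static ToyBuffer pool[NBUFFERS];
static int writes_issued = 0;

/* Hypothetical stand-in for flushing one page; returns false on I/O error.
 * In this toy it always succeeds and only counts the write. */
static bool toy_write(ToyBuffer *buf)
{
    writes_issued++;
    buf->dirty = false;
    return true;
}

/*
 * Drop every buffer of `relnode` at or beyond `first_del_block`.
 * write_first = true:  flush dirty pages before dropping (truncation case).
 * write_first = false: discard dirty pages unwritten (relation drop case).
 * Returns the number of buffers dropped, or -1 if a write failed, in which
 * case the failing buffer and all later ones are left in the pool.
 */
static int toy_drop_rel_buffers(int relnode, int first_del_block,
                                bool write_first)
{
    int dropped = 0;

    assert(relnode != 0);
    for (size_t i = 0; i < NBUFFERS; i++) {
        ToyBuffer *buf = &pool[i];

        if (buf->relnode != relnode || buf->blockno < first_del_block)
            continue;           /* other relation, or block is kept */
        if (write_first && buf->dirty && !toy_write(buf))
            return -1;          /* keep the data; caller can report the error */
        buf->relnode = 0;       /* slot is free again */
        buf->dirty = false;
        dropped++;
    }
    return dropped;
}
```

In this model a truncation would call toy_drop_rel_buffers(rel, new_nblocks, true), paying for the extra writes objected to above, while the smgrdounlink path would pass false and skip them.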

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

