Re: Why does PostgreSQL ftruncate before unlink? - Mailing list pgsql-general

From Jon Nelson
Subject Re: Why does PostgreSQL ftruncate before unlink?
Msg-id CAKuK5J0ydiqPHsSQFPoMJAzb_fKLRoY9xgzAi5T8v8uAJdya-A@mail.gmail.com
In response to Re: Why does PostgreSQL ftruncate before unlink?  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Why does PostgreSQL ftruncate before unlink?  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-general
On Sun, Feb 23, 2014 at 9:49 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Jeff Janes <jeff.janes@gmail.com> writes:
>> On Sunday, February 23, 2014, Scott Marlowe <scott.marlowe@gmail.com> wrote:
>>> I'm guessing that this is so that it can be rolled back. Unlink is
>>> likely issued at commit;
>
>> I would hope that ftruncate is issued at commit as well.  That doesn't
>> sound undoable.
>
> It's more subtle than that.  I'm too lazy to look at the comments in md.c
> right now, but basically the reason for not doing an instant unlink is
> to ensure that if a relation is truncated and then re-extended, open file
> pointers held by other backends will still be valid.  The ftruncate is
> done to ensure that allocated disk space goes away as soon as that's safe
> (i.e., at commit of the truncation); but immediate unlink would require
> forcing more cross-backend synchronization than we want to have.
>
> If memory serves, the inode should get removed during the next checkpoint.

I was moments away from commenting to say that I had traced the flow
of the code to md.c and found the comments there quite illuminating. I
wonder if there is a different way to solve the underlying issue
without relying on ftruncate (which seems to be somewhat expensive).
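
For anyone following along, here is a minimal sketch of the behaviour Tom
describes. It is not the code in md.c; the function name and the path passed
to it are made up for illustration, and the two descriptors merely stand in
for two backends. It shows that ftruncate releases the file's data while a
descriptor held elsewhere stays valid and still refers to the same named
file, so a later re-extension through that descriptor lands in the right
place. The unlink is deferred to the end.

```c
#define _XOPEN_SOURCE 700   /* expose ftruncate() and pwrite() under strict -std modes */
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical demo, not PostgreSQL code.
 * Returns 0 if every step behaves as described, otherwise a negative
 * number identifying the step that failed.
 */
int truncate_then_unlink_demo(const char *path)
{
    const char data[] = "relation segment contents";
    struct stat st;
    int rc = 0;

    /* "writer" plays the backend doing the truncation. */
    int writer = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (writer < 0)
        return -1;

    /* "other" plays a second backend that already has the file open. */
    int other = open(path, O_RDWR);
    if (other < 0) {
        close(writer);
        unlink(path);
        return -2;
    }

    if (write(writer, data, sizeof data) != (ssize_t) sizeof data)
        rc = -3;
    /* Release the file's contents right away, but keep the name and inode. */
    else if (ftruncate(writer, 0) != 0)
        rc = -4;
    /* The other descriptor is still valid and sees the empty file. */
    else if (fstat(other, &st) != 0 || st.st_size != 0)
        rc = -5;
    /* Re-extend through the other descriptor... */
    else if (pwrite(other, data, sizeof data, 0) != (ssize_t) sizeof data)
        rc = -6;
    /* ...and the first descriptor sees it: both refer to the same file. */
    else if (fstat(writer, &st) != 0 || st.st_size != (off_t) sizeof data)
        rc = -7;
    /* Only now drop the directory entry (the deferred unlink). */
    else if (unlink(path) != 0)
        rc = -8;
    /* An already-open descriptor keeps working even after the unlink. */
    else if (fstat(other, &st) != 0)
        rc = -9;

    close(other);
    close(writer);
    unlink(path);   /* cleanup on early failure; harmless if already removed */
    return rc;
}
```

Calling truncate_then_unlink_demo("/tmp/demo_segment") from a small main()
should return 0 on a local POSIX filesystem; a negative value tells you which
step did not behave as expected.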

--
Jon
