Re: Avoiding unnecessary writes during relation drop and - Mailing list pgsql-hackers

From Simon Riggs
Subject Re: Avoiding unnecessary writes during relation drop and
Date
Msg-id 1111332420.11750.359.camel@localhost.localdomain
In response to Avoiding unnecessary writes during relation drop and truncate  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Avoiding unnecessary writes during relation drop and truncate  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Sat, 2005-03-19 at 18:53 -0500, Tom Lane wrote:
> Currently, in places like heap_drop_with_catalog, we issue a
> FlushRelationBuffers() call followed by smgrscheduleunlink().
> The latter doesn't actually do anything right away, but schedules
> a file unlink to occur after transaction commit.
> 
> It strikes me that the FlushRelationBuffers call is unnecessary and
> causes useless I/O, namely writing out pages into a file that's
> about to be deleted anyway.  If we simply removed it then any buffers
> belonging to the victim relation would stay in memory until commit;
> then they'd be dropped *without* write by the smgr unlink operation
> (which already calls DropRelFileNodeBuffers).
> 
> This doesn't cause any problems with rolling back the transaction before
> commit; we can perfectly well leave dirty pages in the buffer pool in
> that case.  About the only downside I can see is that the Flush allows
> buffer pages to be freed slightly sooner, and hence possibly used for
> something else later in the same transaction ... but that's hardly worth
> the cost of writing data that might not need to be written at all.
> 
> Similar remarks apply to the partial FlushRelationBuffers calls that are
> currently done just before partial or full truncation of a relation ---
> except that those are even sillier, because we are writing data that we
> are definitely going to tell the kernel to forget about immediately
> afterward.  We should just drop any buffers that are past the truncation
> point.  smgrtruncate isn't roll-back-able anyway, so the caller already
> has to be certain that the pages aren't going to be needed anymore
> regardless of any subsequent rollback.
> 
> Can anyone see a flaw in this logic?
> 
> I think that the FlushRelationBuffers calls associated with deletion
> are leftover from a time when we actually deleted the target file
> immediately (ie, back when DROP TABLE wasn't rollback-safe).  The
> ones associated with truncation were probably just modeled on the
> deletion logic without sufficient thought.

Yes, I think FlushRelationBuffers can be simply removed. I'd wanted to
get rid of it before, but hadn't seen how to.

Not sure I understand all of your other comments though. If all you mean
to do is to simply remove the call, then please ignore this:

ISTM that buffers belonging to the victim relation would not necessarily
stay in memory until commit. If they were still pinned, there would be a
lock that would have prevented the DROP TABLE from going through. So the
buffers are not pinned, and they will stay in memory only until they are
aged out by the dirty write process. You're right, there's no particular
benefit in writing them earlier - we have a bgwriter now that can do that
for us when it comes to it.

Removing FlushRelationBuffers in those circumstances will save a scan of
shared_buffers, but will it save I/O? Perhaps not, but I care more about
the O(N) operation on shared_buffers than I do about the I/O.
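
To make the trade-off concrete, here is a toy model, not PostgreSQL code. The names Buffer, BufferPool, flush_relation, drop_relation and background_write are hypothetical stand-ins for the roles played by FlushRelationBuffers, DropRelFileNodeBuffers and the bgwriter, and the pool contents are made up. It only counts page writes and buffer headers visited under the two approaches:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Buffer:
    rel: Optional[str]   # owning relation, or None if the slot is free
    dirty: bool = False


class BufferPool:
    """Hypothetical stand-in for shared_buffers; counts writes and scans."""

    def __init__(self, buffers):
        self.buffers = list(buffers)
        self.writes = 0            # pages written out
        self.entries_scanned = 0   # buffer headers visited by full scans

    def flush_relation(self, rel):
        # Plays the role of FlushRelationBuffers: scan all, write dirty pages.
        for buf in self.buffers:
            self.entries_scanned += 1
            if buf.rel == rel and buf.dirty:
                self.writes += 1
                buf.dirty = False

    def drop_relation(self, rel):
        # Plays the role of DropRelFileNodeBuffers: scan all, discard unwritten.
        for buf in self.buffers:
            self.entries_scanned += 1
            if buf.rel == rel:
                buf.rel = None
                buf.dirty = False

    def background_write(self, max_pages):
        # Plays the role of the bgwriter aging out dirty pages before commit.
        # Its own scan is not counted; it runs regardless of the DROP.
        written = 0
        for buf in self.buffers:
            if written >= max_pages:
                break
            if buf.rel is not None and buf.dirty:
                self.writes += 1
                buf.dirty = False
                written += 1


def make_pool():
    # Made-up contents: 3 dirty and 1 clean page of "victim", 4 clean of "other".
    buffers = [Buffer("victim", dirty=True) for _ in range(3)]
    buffers += [Buffer("victim")]
    buffers += [Buffer("other") for _ in range(4)]
    return BufferPool(buffers)


def drop_with_flush(pool):
    pool.flush_relation("victim")   # current behaviour: write first...
    pool.drop_relation("victim")    # ...then drop at commit
    return pool.writes, pool.entries_scanned


def drop_without_flush(pool, bgwriter_pages=0):
    pool.background_write(bgwriter_pages)  # may or may not happen before commit
    pool.drop_relation("victim")           # drop without writing
    return pool.writes, pool.entries_scanned


print("with flush:      ", drop_with_flush(make_pool()))
print("without flush:   ", drop_without_flush(make_pool()))
print("bgwriter got 2:  ", drop_without_flush(make_pool(), bgwriter_pages=2))
```

In this model the scan saving is unconditional (one pass over the pool instead of two), while the I/O saving depends on how many dirty pages the background writer reached before commit.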

Best Regards, Simon Riggs






