Re: Avoiding unnecessary writes during relation drop and - Mailing list pgsql-hackers
From | Simon Riggs |
---|---|
Subject | Re: Avoiding unnecessary writes during relation drop and |
Date | |
Msg-id | 1111332420.11750.359.camel@localhost.localdomain |
In response to | Avoiding unnecessary writes during relation drop and truncate (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses | Re: Avoiding unnecessary writes during relation drop and truncate |
List | pgsql-hackers |
On Sat, 2005-03-19 at 18:53 -0500, Tom Lane wrote:
> Currently, in places like heap_drop_with_catalog, we issue a
> FlushRelationBuffers() call followed by smgrscheduleunlink().
> The latter doesn't actually do anything right away, but schedules
> a file unlink to occur after transaction commit.
>
> It strikes me that the FlushRelationBuffers call is unnecessary and
> causes useless I/O, namely writing out pages into a file that's
> about to be deleted anyway. If we simply removed it then any buffers
> belonging to the victim relation would stay in memory until commit;
> then they'd be dropped *without* write by the smgr unlink operation
> (which already calls DropRelFileNodeBuffers).
>
> This doesn't cause any problems with rolling back the transaction before
> commit; we can perfectly well leave dirty pages in the buffer pool in
> that case. About the only downside I can see is that the Flush allows
> buffer pages to be freed slightly sooner, and hence possibly used for
> something else later in the same transaction ... but that's hardly worth
> the cost of writing data that might not need to be written at all.
>
> Similar remarks apply to the partial FlushRelationBuffers calls that are
> currently done just before partial or full truncation of a relation ---
> except that those are even sillier, because we are writing data that we
> are definitely going to tell the kernel to forget about immediately
> afterward. We should just drop any buffers that are past the truncation
> point. smgrtruncate isn't roll-back-able anyway, so the caller already
> has to be certain that the pages aren't going to be needed anymore
> regardless of any subsequent rollback.
>
> Can anyone see a flaw in this logic?
>
> I think that the FlushRelationBuffers calls associated with deletion
> are leftover from a time when we actually deleted the target file
> immediately (ie, back when DROP TABLE wasn't rollback-safe). The
> ones associated with truncation were probably just modeled on the
> deletion logic without sufficient thought.

Yes, I think FlushRelationBuffers can be simply removed. I'd wanted to
get rid of it before, but hadn't seen how to.

Not sure I understand all of your other comments though. If all you mean
to do is to simply remove the call, then please ignore this:

ISTM that buffers belonging to the victim relation would not necessarily
stay in memory. If they were pinned still, then there would be a lock
that would have prevented the DROP TABLE from going through. The buffers
are not pinned and so will stay in memory until aged out by the dirty
write process. You're right, there's no particular benefit of doing them
earlier - we have a bgwriter now that can do that for us when it comes
to it.

Removing FlushRelationBuffers in those circumstances will save a scan of
shared_buffers, but will it save I/O? Perhaps not, but I care more about
the O(N) operation on shared_buffers than I do about the I/O.

Best Regards, Simon Riggs
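
The trade-off discussed in this message (flush-then-drop versus dropping buffers without writing them) can be illustrated with a toy model. This is only a minimal sketch and not PostgreSQL source: the `ToyPool` type and the `toy_*` functions are hypothetical stand-ins for the shared buffer pool, `FlushRelationBuffers`, and `DropRelFileNodeBuffers`. It also assumes that nothing else (such as the bgwriter) writes the dirty pages before commit, which is exactly the point questioned in the reply.

```c
#include <assert.h>
#include <stdbool.h>

#define POOL_SIZE 8

/* Toy buffer slot: owning relation and whether it holds unwritten changes. */
typedef struct {
    int  rel_id;   /* 0 means the slot is unused */
    bool dirty;
} ToyBuffer;

/* Toy buffer pool with counters for simulated I/O and pool scans. */
typedef struct {
    ToyBuffer slots[POOL_SIZE];
    int       writes;  /* simulated page writes */
    int       scans;   /* full passes over the pool, each O(POOL_SIZE) */
} ToyPool;

/* Fill the pool: the first n_victim slots are dirty pages of `victim`,
 * the remaining slots are clean pages of `other`. */
void toy_pool_init(ToyPool *pool, int victim, int other, int n_victim)
{
    assert(n_victim >= 0 && n_victim <= POOL_SIZE);
    pool->writes = 0;
    pool->scans = 0;
    for (int i = 0; i < POOL_SIZE; i++) {
        pool->slots[i].rel_id = (i < n_victim) ? victim : other;
        pool->slots[i].dirty = (i < n_victim);
    }
}

/* Hypothetical stand-in for a flush: write every dirty page of rel_id. */
void toy_flush_relation(ToyPool *pool, int rel_id)
{
    pool->scans++;
    for (int i = 0; i < POOL_SIZE; i++) {
        if (pool->slots[i].rel_id == rel_id && pool->slots[i].dirty) {
            pool->writes++;
            pool->slots[i].dirty = false;
        }
    }
}

/* Hypothetical stand-in for the drop at commit: discard pages of rel_id
 * without writing them. */
void toy_drop_relation(ToyPool *pool, int rel_id)
{
    pool->scans++;
    for (int i = 0; i < POOL_SIZE; i++) {
        if (pool->slots[i].rel_id == rel_id) {
            pool->slots[i].rel_id = 0;
            pool->slots[i].dirty = false;
        }
    }
}

/* Behaviour described as current: flush first, then drop at commit. */
void toy_drop_with_flush(ToyPool *pool, int rel_id)
{
    toy_flush_relation(pool, rel_id);
    toy_drop_relation(pool, rel_id);
}

/* Proposed behaviour: skip the flush and only drop at commit. */
void toy_drop_without_flush(ToyPool *pool, int rel_id)
{
    toy_drop_relation(pool, rel_id);
}
```

In this model, with five dirty pages belonging to the dropped relation, `toy_drop_with_flush` performs five writes and two passes over the pool, while `toy_drop_without_flush` performs no writes and one pass. Whether the real system saves the I/O as well as the scan depends on whether the pages are written out by other means before commit.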