VACUUM and spoiling the buffer manager cache - Mailing list pgsql-hackers

From Simon Riggs
Subject VACUUM and spoiling the buffer manager cache
Date
Msg-id 1172660341.3760.902.camel@silverbirch.site
Responses Re: VACUUM and spoiling the buffer manager cache  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Around v.161 of src/backend/storage/buffer/bufmgr.c, during the
development of 8.0, a change was introduced to prevent VACUUM from
changing the state of the Adaptive Replacement Cache buffer management
strategy. At the time that change made a lot of sense. Since then we
have changed the buffer management strategy, and this behaviour of
VACUUM may no longer make as much sense as it did then.

VACUUM's current behaviour is to take blocks it has touched and place
them on the head of the freelist, allowing them to be reused. This is a
good strategy for clean blocks, but a poor one for dirty blocks. Once a
dirty block has been placed on the freelist, the very next request for a
free buffer has to write that block to disk, *and* that write will
typically require a WAL flush as well.
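
As a rough illustration of that cost, here is a minimal sketch in C. It
is a toy model, not code from bufmgr.c: ToyBuffer, ToyCost,
toy_flushed_lsn and toy_reuse_buffer() are all hypothetical names, and
the only rule it tries to capture is that WAL must be flushed up to a
page's LSN before the page itself is written.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins; not the real buffer descriptors. */
typedef struct ToyBuffer
{
    bool     dirty;     /* page modified since it was last written */
    uint64_t page_lsn;  /* LSN of the last WAL record touching the page */
} ToyBuffer;

typedef struct ToyCost
{
    int data_writes;    /* data-file writes paid by the requesting backend */
    int wal_flushes;    /* WAL flushes paid by the requesting backend */
} ToyCost;

/* Toy global: WAL is assumed to be durable up to this LSN. */
static uint64_t toy_flushed_lsn = 0;

/*
 * Model of a backend reusing a buffer it was handed from the freelist.
 * A clean buffer costs nothing.  A dirty buffer must be written out
 * first, and if its LSN is beyond what has been flushed, the WAL flush
 * has to happen before the data write.
 */
static void
toy_reuse_buffer(ToyBuffer *buf, ToyCost *cost)
{
    assert(buf != NULL && cost != NULL);

    if (!buf->dirty)
        return;                     /* clean: reuse immediately */

    if (buf->page_lsn > toy_flushed_lsn)
    {
        toy_flushed_lsn = buf->page_lsn;
        cost->wal_flushes++;        /* the expensive part */
    }

    cost->data_writes++;
    buf->dirty = false;
}
```

In this model a recently VACUUMed dirty page is the worst case, since its
LSN is likely to be ahead of the flushed WAL position, so whoever asks
for the next free buffer pays for both the flush and the write.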

The WAL flushing behaviour has been described in detail on this thread:
http://archives.postgresql.org/pgsql-hackers/2006-12/msg00674.php
though this proposal has nothing to do with FREEZEing rows.

The effect of this behaviour is that when VACUUM is running alone it
has to make more WAL flushes than it really needs to, so it is slightly
slower. That could be improved, but it isn't my priority in this post.

When VACUUM operates alongside a concurrent workload the other
non-VACUUM backends become involved in cleaning the VACUUM's dirty
blocks. This slows the non-VACUUM backends down and effectively favours
the VACUUM rather than masking its effects, as we were trying to
achieve. This behaviour noticeably increases normal transaction response
time for extended periods, with noticeable WAL spikes as the WAL drive
repeatedly fsyncs, much more than without the VACUUM workload.

The proposal would be to stop VACUUM from putting its blocks onto the
freelist if they are dirty. This then allows the bgwriter to write the
VACUUM's dirty blocks, which avoids the increased response times due to
WAL flushing. It also incidentally improves a lone VACUUM, since the
bgwriter is able to help write out the dirty blocks. VACUUM pays the
cost of testing whether they are dirty, but that cost is minor.
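
The decision being proposed can be sketched as a standalone predicate.
This is only an illustration under assumptions: ToyBufferDesc,
TOY_BM_DIRTY (including its bit value) and toy_vacuum_should_free() are
made-up names that mirror the fields used in the patch, not the real
definitions from the buffer manager headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed bit value, for illustration only. */
#define TOY_BM_DIRTY (1 << 0)

/* Hypothetical cut-down buffer descriptor. */
typedef struct ToyBufferDesc
{
    uint16_t flags;        /* state bits, including the dirty bit */
    unsigned refcount;     /* number of pins currently held */
    uint16_t usage_count;  /* clock-sweep usage counter */
} ToyBufferDesc;

/*
 * Should VACUUM hand this buffer straight back to the freelist?
 * Only if nobody has it pinned, nobody has used it recently, and
 * (the proposed extra test) it is not dirty.  Dirty buffers are left
 * in place for the bgwriter to write out.
 */
static bool
toy_vacuum_should_free(const ToyBufferDesc *buf)
{
    assert(buf != NULL);

    return buf->refcount == 0 &&
           buf->usage_count == 0 &&
           !(buf->flags & TOY_BM_DIRTY);
}
```

With this test a clean, unpinned, unused buffer is still freed at once,
while the same buffer with the dirty bit set is left alone.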

The clock cycle buffer management strategy is less prone to
cache-spoiling behaviour than were the earlier LRU methods, fixed or
adaptive. A simple solution does effectively smooth out the poor
response times seen while a VACUUM is in progress.
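
For readers less familiar with the clock approach, here is a minimal
sketch of victim selection. It is a generic textbook-style model, not
the actual freelist code: toy_pool, toy_hand, TOY_NBUFFERS and
toy_clock_sweep() are hypothetical names. It shows why a buffer that
VACUUM leaves with usage_count == 0 is picked up quickly by the sweep
anyway, without needing to be pushed onto the freelist.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NBUFFERS 4

/* Hypothetical cut-down buffer state for the sweep. */
typedef struct ToyClockBuf
{
    unsigned refcount;     /* pins held; pinned buffers are skipped */
    unsigned usage_count;  /* bumped on normal access, decayed by the sweep */
} ToyClockBuf;

static ToyClockBuf toy_pool[TOY_NBUFFERS];
static size_t toy_hand = 0;    /* next buffer the clock hand will inspect */

/*
 * Advance the clock hand until an unpinned buffer with usage_count == 0
 * is found.  Unpinned buffers with a non-zero count get one more chance:
 * the count is decremented and the hand moves on.
 * Returns the victim's index, or -1 if every buffer is pinned.
 */
static int
toy_clock_sweep(void)
{
    size_t pinned_in_a_row = 0;

    while (pinned_in_a_row < TOY_NBUFFERS)
    {
        size_t       idx = toy_hand;
        ToyClockBuf *buf;

        assert(idx < TOY_NBUFFERS);
        buf = &toy_pool[idx];
        toy_hand = (toy_hand + 1) % TOY_NBUFFERS;

        if (buf->refcount > 0)
        {
            pinned_in_a_row++;
            continue;
        }
        pinned_in_a_row = 0;

        if (buf->usage_count > 0)
        {
            buf->usage_count--;     /* decay and give it another lap */
            continue;
        }
        return (int) idx;           /* unpinned and unused: evict */
    }
    return -1;
}
```

Since VACUUM accesses do not bump usage_count, its buffers are early
candidates for the sweep in this model, which is the sense in which a
simple change should be enough.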

The in-line patch is a one-line change to the buffer manager code, and
is one of a few versions experimented with. The additional line is a
simple test to see whether the VACUUM'd block is dirty before deciding
what to do with it. [A separate patch is available, if requested,
identified as vacstrategy.v2.patch]

Independent verification of test results is requested. 



Index: src/backend/storage/buffer/bufmgr.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/storage/buffer/bufmgr.c,v
retrieving revision 1.215
diff -c -r1.215 bufmgr.c
*** src/backend/storage/buffer/bufmgr.c	1 Feb 2007 19:10:27 -0000	1.215
--- src/backend/storage/buffer/bufmgr.c	26 Feb 2007 13:09:35 -0000
***************
*** 907,913 ****
  			else
  			{
  				/* VACUUM accesses don't bump usage count, instead... */
! 				if (buf->refcount == 0 && buf->usage_count == 0)
  					immed_free_buffer = true;
  			}
  		}
--- 907,914 ----
  			else
  			{
  				/* VACUUM accesses don't bump usage count, instead... */
! 				if (buf->refcount == 0 && buf->usage_count == 0 &&
! 					!(buf->flags & BM_DIRTY))
  					immed_free_buffer = true;
  			}
  		}
 


--  Simon Riggs              EnterpriseDB   http://www.enterprisedb.com



