Re: bgwriter changes - Mailing list pgsql-hackers

From Neil Conway
Subject Re: bgwriter changes
Msg-id 41C16C73.1020007@samurai.com
In response to Re: bgwriter changes  ("Zeugswetter Andreas DAZ SD" <ZeugswetterA@spardat.at>)
Responses Re: bgwriter changes
List pgsql-hackers
Zeugswetter Andreas DAZ SD wrote:
> This has the disadvantage of converging against 0 dirty pages.
> A system that has less than maxpages dirty will write every page with 
> every bgwriter run.

Yeah, I'm concerned about the bgwriter being overly aggressive if we 
disable bgwriter_percent. If we leave the settings as they are (delay = 
200, maxpages = 100, shared_buffers = 1000 by default), we will be 
writing all the dirty pages to disk every 2 seconds, which seems far too 
much.
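
To spell out where the 2 second figure comes from, here is a rough 
back-of-the-envelope sketch in Python. The function name is made up for 
illustration, and it assumes that every round writes up to maxpages dirty 
pages, that no page is re-dirtied during the sweep, and that the writes 
themselves take no time:

```python
import math

def ms_to_write_all_dirty(dirty_pages, maxpages=100, delay_ms=200):
    """Estimate how long the bgwriter needs to write every dirty page.

    Hypothetical helper: assumes each round writes up to `maxpages`
    dirty pages and that rounds are `delay_ms` apart.
    """
    if dirty_pages <= 0:
        return 0
    rounds = math.ceil(dirty_pages / maxpages)
    return rounds * delay_ms

# Worst case with the defaults: all 1000 shared buffers are dirty.
worst_case_ms = ms_to_write_all_dirty(1000)

# Fewer than maxpages dirty: everything goes out in a single round.
light_load_ms = ms_to_write_all_dirty(50)
```

So even a completely dirty buffer pool is flushed in 10 rounds (2000 ms), 
and a lightly dirtied one is flushed on every single round.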

It might also be good to reduce the delay, in order to more proactively 
keep the LRUs clean (e.g. scanning to find N dirty pages once per second 
is likely to reach farther away from the LRU than scanning for N/M pages 
once per 1/M seconds). On the other hand the more often the bgwriter 
scans the buffer pool, the more times the BufMgrLock needs to be 
acquired -- and in a system in which pages aren't being dirtied very 
rapidly (or the dirtied pages tend to be very hot), each of those scans 
is going to take a while to find enough dirty pages using #2. So perhaps 
it is best to leave the delay as is for 8.0.
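
As a toy illustration of the scan-depth argument (not the real buffer 
manager code), the sketch below models the buffer pool as a list ordered 
from the LRU end and counts how many buffers a scan must examine to find a 
given number of dirty pages. The function, the pool layout (every fourth 
buffer dirty) and the values N = 100, M = 10 are all assumptions made up 
for the example:

```python
def scan_depth(dirty_flags, wanted):
    """Return how many buffers are examined, starting at the LRU end,
    before `wanted` dirty pages have been found.

    `dirty_flags` is a list of booleans ordered from the LRU end.
    If fewer than `wanted` dirty pages exist, the whole list is scanned.
    """
    if wanted <= 0:
        return 0
    found = 0
    for depth, is_dirty in enumerate(dirty_flags, start=1):
        if is_dirty:
            found += 1
            if found == wanted:
                return depth
    return len(dirty_flags)

# Toy pool of 1000 buffers in which every fourth buffer is dirty.
pool = [i % 4 == 3 for i in range(1000)]

# One scan per second looking for N = 100 dirty pages.
depth_one_scan = scan_depth(pool, 100)

# M = 10 scans per second, each looking for N/M = 10 dirty pages.
depth_frequent_scan = scan_depth(pool, 10)
```

In this toy pool the single large scan examines 400 buffers, while each of 
the smaller scans examines only 40, so the more frequent scans stay closer 
to the LRU end. The price is ten lock acquisitions per second instead of 
one.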

> This might have the disadvantage of either leaving too much for the 
> checkpoint or writing too many dirty pages in one run. Is writing a lot 
> in one run actually a problem though ? Or does the bgwriter pause
> periodically while writing the pages of one run ?

The bgwriter does not pause between writing pages. What would be the 
point of doing that? The kernel is going to be caching the write() anyway.

> If this is expressed in pages it would naturally need to be more than the 
> current maxpages (to accommodate for clean pages). The suggested 2% sounded 
> way too low for me (that leaves 98% to the checkpoint).

I agree this might be a problem, but it doesn't necessarily leave 98% to 
be written at checkpoint: if the buffers in the LRU change over time, 
the set of pages searched by the bgwriter will also change. I'm not sure 
how quickly the pages near the LRU change in a "typical workload"; 
moreover, I think this would vary between different workloads.

-Neil

