Re: Bgwriter strategies - Mailing list pgsql-hackers

From: Heikki Linnakangas
Subject: Re: Bgwriter strategies
Date:
Msg-id: 468E12FA.70603@enterprisedb.com
In response to: Re: Bgwriter strategies (Tom Lane <tgl@sss.pgh.pa.us>)
Responses: Re: Bgwriter strategies (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-hackers
Tom Lane wrote:
> Heikki Linnakangas <heikki@enterprisedb.com> writes:
>>                           imola-336    imola-337    imola-340
>> writes by checkpoint          38302        30410        39529
>> writes by bgwriter           350113      2205782      1418672
>> writes by backends          1834333       265755       787633
>> writes total                2222748      2501947      2245834
>> allocations                 2683170      2657896      2699974
> 
>> It looks like Tom's idea is not a winner; it leads to more writes than 
>> necessary.
> 
> The incremental number of writes is not that large; only about 10% more.
> The interesting thing is that those "extra" writes must represent
> buffers that were re-touched after their usage_count went to zero, but
> before they could be recycled by the clock sweep.  While you'd certainly
> expect some of that, I'm surprised it is as much as 10%.  Maybe we need
> to play with the buffer allocation strategy some more.
> 
> The very small difference in NOTPM among the three runs says that either
> this whole area is unimportant, or DBT2 isn't a good test case for it;
> or maybe that there's something wrong with the patches?
> 
>> On imola-340, there's still a significant number of backend writes. I'm 
>> still not sure what we should be aiming at. Is 0 backend writes our goal?
> 
> Well, the lower the better, but not at the cost of a very large increase
> in total writes.
> 
>> Imola-340 was with a patch along the lines of Itagaki's original 
>> patch, ensuring that there are as many clean pages in front of the 
>> clock head as were consumed by backends since the last bgwriter 
>> iteration.
> 
> This seems intuitively wrong, since in the presence of bursty request
> behavior it'll constantly be getting caught short of buffers.  I think
> you need a safety margin and a moving-average decay factor.  Possibly
> something like
> 
>     buffers_to_clean = Max(buffers_used * 1.1,
>                            buffers_to_clean * 0.999);
> 
> where buffers_used is the current observation of demand.  This would
> give us a safety margin such that buffers_to_clean is not less than
> the largest demand observed in the last 100 iterations (0.999 ^ 100
> is about 0.90, cancelling out the initial 10% safety margin), and it
> takes quite a while for the memory of a demand spike to be forgotten
> completely.
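
For concreteness, here's that rule as a small standalone C function (just a 
sketch: the function and variable names are mine, the Max macro is spelled 
out rather than taken from c.h, and this is not actual bgwriter code):

#define Max(a, b) ((a) > (b) ? (a) : (b))

/* Smoothed cleaning target, carried over between bgwriter rounds */
static double buffers_to_clean = 0.0;

/*
 * Recompute the target once per round.  buffers_used is the current
 * observation of demand: buffers allocated by backends since the
 * previous round.
 */
static void
update_cleaning_target(int buffers_used)
{
    /*
     * 10% safety margin over current demand, decaying by 0.1% per
     * round: 0.999^100 is about 0.90, so the target stays at or above
     * roughly the largest demand seen in the last ~100 rounds.
     */
    buffers_to_clean = Max(buffers_used * 1.1,
                           buffers_to_clean * 0.999);
}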

That would be overly aggressive on a workload that's steady on average 
but consists of small bursts. Like this: 0 0 0 0 100 0 0 0 0 100 0 0 0 0 
100. You'd end up writing ~100 pages on every bgwriter round, even 
though you only need an average of 20 pages per round. That'd be 
effectively the same as keeping all buffers with usage_count=0 clean.
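
A quick standalone simulation shows the effect (hypothetical; it reuses 
the Max sketch above, with the demand pattern 0 0 0 0 100 repeating):

#include <stdio.h>

#define Max(a, b) ((a) > (b) ? (a) : (b))

int
main(void)
{
    /* Steady on average (20 pages/round), but arriving in bursts */
    int     demand[] = {0, 0, 0, 0, 100};
    double  target = 0.0;

    for (int round = 0; round < 50; round++)
    {
        int     buffers_used = demand[round % 5];

        target = Max(buffers_used * 1.1, target * 0.999);
        if (round >= 45)
            printf("round %2d: demand %3d, target %.1f\n",
                   round, buffers_used, target);
    }
    return 0;
}

Once the first burst has been seen, the target never decays below ~109 
between bursts, so the bgwriter aims to clean ~110 pages every round 
for an average demand of 20.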

BTW, I believe that kind of workload is actually very common. That's 
what you get if one transaction causes, say, 10-100 buffer allocations, 
and you execute one such transaction every few seconds.

--
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com

