Re: Bgwriter strategies - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Bgwriter strategies
Msg-id 21957.1183670880@sss.pgh.pa.us
In response to Bgwriter strategies  (Heikki Linnakangas <heikki@enterprisedb.com>)
Responses Re: Bgwriter strategies  (Greg Smith <gsmith@gregsmith.com>)
Re: Bgwriter strategies  (Heikki Linnakangas <heikki@enterprisedb.com>)
List pgsql-hackers
Heikki Linnakangas <heikki@enterprisedb.com> writes:
>                         imola-336    imola-337    imola-340
> writes by checkpoint        38302        30410        39529
> writes by bgwriter         350113      2205782      1418672
> writes by backends        1834333       265755       787633
> writes total              2222748      2501947      2245834
> allocations               2683170      2657896      2699974

> It looks like Tom's idea is not a winner; it leads to more writes than 
> necessary.

The incremental number of writes is not that large; only about 10% more.
The interesting thing is that those "extra" writes must represent
buffers that were re-touched after their usage_count went to zero, but
before they could be recycled by the clock sweep.  While you'd certainly
expect some of that, I'm surprised it is as much as 10%.  Maybe we need
to play with the buffer allocation strategy some more.

The very small difference in NOTPM among the three runs suggests that
either this whole area is unimportant, or DBT2 isn't a good test case
for it, or perhaps there's something wrong with the patches.

> On imola-340, there's still a significant amount of backend writes. I'm 
> still not sure what we should be aiming at. Is 0 backend writes our goal?

Well, the lower the better, but not at the cost of a very large increase
in total writes.

> Imola-340 was with a patch along the lines of 
> Itagaki's original patch, ensuring that there are as many clean pages in 
> front of the clock head as were consumed by backends since the last 
> bgwriter iteration.

This seems intuitively wrong, since in the presence of bursty request
behavior it'll constantly be getting caught short of buffers.  I think
you need a safety margin and a moving-average decay factor.  Possibly
something like
    buffers_to_clean = Max(buffers_used * 1.1,
                           buffers_to_clean * 0.999);

where buffers_used is the current observation of demand.  This would
give us a safety margin such that buffers_to_clean is not less than
the largest demand observed in the last 100 iterations (0.999 ^ 100
is about 0.90, cancelling out the initial 10% safety margin), and it
takes quite a while for the memory of a demand spike to be forgotten
completely.
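
To make the decay behavior concrete, here is a minimal standalone sketch
of that update rule. It is not taken from any patch: the function names
update_buffers_to_clean and run_iterations, the two macros, and the use
of double rather than an integer counter are all assumptions made for
illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical constants, taken from the formula in the message above. */
#define SAFETY_MARGIN 1.1
#define DECAY_FACTOR  0.999

/*
 * One bgwriter iteration: the new cleaning target is the larger of the
 * current demand plus a 10% margin and the slowly decaying previous target.
 */
double
update_buffers_to_clean(double buffers_to_clean, double buffers_used)
{
    double with_margin = buffers_used * SAFETY_MARGIN;
    double decayed = buffers_to_clean * DECAY_FACTOR;

    assert(buffers_used >= 0.0);
    return (with_margin > decayed) ? with_margin : decayed;
}

/* Apply the update rule to n successive demand observations. */
double
run_iterations(double buffers_to_clean, const double *demand, size_t n)
{
    for (size_t i = 0; i < n; i++)
        buffers_to_clean = update_buffers_to_clean(buffers_to_clean,
                                                   demand[i]);
    return buffers_to_clean;
}
```

Starting from zero, a single observation of 1000 buffers sets the target
to 1100. After 100 further iterations with no demand at all, the target
has decayed only to roughly 995 (1100 * 0.999^100), which is about the
size of the original spike.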
        regards, tom lane

