On Thu, Jul 18, 2013 at 10:40 AM, Josh Berkus <josh@agliodbs.com> wrote:
>
> Right, that's what I just tested. The results are interesting. I
> changed the defaults as follows:
>
> bgwriter_delay = 100ms
> bgwriter_lru_maxpages = 512
> bgwriter_lru_multiplier = 3.0
>
> ... and the number of buffers being written by the bgwriter went *down*,
> almost to zero. Mind you, I wanna gather a full week of data, but there
> seems to be something counterintuitive going on here.
>
> One potential factor is that they have their shared_buffers set
> unusually high (5GB out of 16GB).
>
> Here's the stats:
>
> postgres=# select * from pg_stat_bgwriter;
> -[ RECORD 1 ]---------+------------------------------
> checkpoints_timed     | 330
> checkpoints_req       | 47
> checkpoint_write_time | 55504727
> checkpoint_sync_time  | 286743
> buffers_checkpoint    | 2809031
> buffers_clean         | 789
> maxwritten_clean      | 0
> buffers_backend       | 457456
> buffers_backend_fsync | 0
> buffers_alloc         | 943734
> stats_reset           | 2013-07-17 17:09:18.945194-07
>
> So we're not hitting maxpages anymore, at all. So why isn't the
> bgwriter doing any work?

Does their workload include a lot of bulk operations? Those use a
ring-buffer strategy and so intentionally evict their own buffers,
and as far as I know those writes are counted under buffers_backend
rather than buffers_clean.

Do you have a plain "select * from pg_stat_bgwriter" from the period
before the change? You posted the query that does averaging and
aggregation, but I couldn't figure out how to back out the original
numbers from it.
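
For what it's worth, here is a rough sketch of how I would read the
counters you posted: the share of buffer writes done by checkpoints, by
the bgwriter, and by backends. It is plain Python arithmetic on the
three buffers_* values quoted above; write_shares is a made-up helper
name for illustration, not anything that ships with PostgreSQL.

```python
# Counters copied from the quoted pg_stat_bgwriter output.
stats = {
    "buffers_checkpoint": 2809031,
    "buffers_clean": 789,
    "buffers_backend": 457456,
}


def write_shares(counters):
    """Return each writer's fraction of all buffers written.

    Hypothetical helper for illustration only.
    """
    total = sum(counters.values())
    if total == 0:
        # Avoid dividing by zero right after a stats reset.
        return {name: 0.0 for name in counters}
    return {name: count / total for name, count in counters.items()}


shares = write_shares(stats)
for name, share in shares.items():
    # Print each counter name with its share as a percentage.
    print(f"{name:20s} {share:8.3%}")
```

On these numbers checkpoints account for roughly 86% of the writes,
backends for roughly 14%, and the bgwriter for well under 0.1%. Running
the same arithmetic on a snapshot from before the change would show
whether the backend share moved.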
Cheers,
Jeff