Thread: Re: : PG9.0 - Checkpoint tuning and pg_stat_bgwriter

Re: : PG9.0 - Checkpoint tuning and pg_stat_bgwriter

From

Venkat Balaji

Date:

04 October 2011, 10:50:45

Hello,

Sorry. I should have put some more details in the email.

I have got a situation where in i see the production system is loaded with the checkpoints and at-least 1000+ buffers are being written for every checkpoint.

Checkpoint occurs every 3 to 4 minutes and every checkpoint takes 150 seconds minimum to write off the buffers and 150+ seconds for checkpoint syncing. A warning messages can be seen in the dbserver logs "checkpoint occuring too frequently".

I had a look at the pg_stat_bgwriter as well. Below is what i see.

select * from pg_stat_bgwriter;

-------------------+-----------------+--------------------+---------------+------------------+-----------------+---------------------------------------------------------------

9785 | 36649 | 493002109 | 282600872 | 1276056 | 382124461 | 7417638175

(1 row)

I am thinking of increasing the checkpoint_segments.

Below are our current settings -

checkpoint_segments = 8

checkpoint_timeout = 5 mins

checkpoint_completion_target = 0.5

bgwriter_delay = 200ms

bgwriter_lru_maxpages = 100

bgwriter_lru_multiplier = 2

Looking forward for suggestions.

Thanks

On Thu, Sep 29, 2011 at 12:40 PM, Venkat Balaji <venkat.balaji@verse.in> wrote:

Hello Everyone,

We are experience a huge drop in performance for one of our production servers.

I suspect this is because of high IO due to frequent Checkpoints. Attached is the excel sheet with checkpoint information we tracked.

Below is the configuration we have

checkpoint_segments = default
checkpoint_timeout = default

I suspect archive data generation to be around 250 MB.

Please share your thoughts !

Thanks
VB

Re: : PG9.0 - Checkpoint tuning and pg_stat_bgwriter

From

Heikki Linnakangas

Date:

04 October 2011, 11:09:10

On 04.10.2011 13:50, Venkat Balaji wrote:
> I have got a situation where in i see the production system is loaded with
> the checkpoints and at-least 1000+ buffers are being written for every
> checkpoint.

1000 buffers isn't very much, that's only 8 MB, so that's not alarming
itself.

> I am thinking of increasing the checkpoint_segments.
>
> Below are our current settings -
>
> checkpoint_segments = 8
> checkpoint_timeout = 5 mins
> checkpoint_completion_target = 0.5
> bgwriter_delay = 200ms
> bgwriter_lru_maxpages = 100
> bgwriter_lru_multiplier = 2
>
> Looking forward for suggestions.

Yep, increase checkpoint_segments. And you probably want to raise
checkpoint_timeout too.

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com

Re: : PG9.0 - Checkpoint tuning and pg_stat_bgwriter

From

Venkat Balaji

Date:

04 October 2011, 11:23:22

Thanks Heikki !

Regards,

On Tue, Oct 4, 2011 at 4:38 PM, Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote:

On 04.10.2011 13:50, Venkat Balaji wrote:
I have got a situation where in i see the production system is loaded with
the checkpoints and at-least 1000+ buffers are being written for every
checkpoint.

1000 buffers isn't very much, that's only 8 MB, so that's not alarming itself.

I am thinking of increasing the checkpoint_segments.

Below are our current settings -

checkpoint_segments = 8
checkpoint_timeout = 5 mins
checkpoint_completion_target = 0.5
bgwriter_delay = 200ms
bgwriter_lru_maxpages = 100
bgwriter_lru_multiplier = 2

Looking forward for suggestions.

Yep, increase checkpoint_segments. And you probably want to raise checkpoint_timeout too.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

Re: : PG9.0 - Checkpoint tuning and pg_stat_bgwriter

From

pasman pasmański

Date:

04 October 2011, 14:16:06

8 checkpoint segments is very small, try 50

2011/10/4, Venkat Balaji <venkat.balaji@verse.in>:
> Hello,
>
> Sorry. I should have put some more details in the email.
>
> I have got a situation where in i see the production system is loaded with
> the checkpoints and at-least 1000+ buffers are being written for every
> checkpoint.
>
> Checkpoint occurs every 3 to 4 minutes and every checkpoint takes 150
> seconds minimum to write off the buffers and 150+ seconds for checkpoint
> syncing. A warning messages can be seen in the dbserver logs "checkpoint
> occuring too frequently".
>
> I had a look at the pg_stat_bgwriter as well. Below is what i see.
>
>  select * from pg_stat_bgwriter;
>
>  checkpoints_timed | checkpoints_req | buffers_checkpoint | buffers_clean |
> maxwritten_clean | buffers_backend | buffers_alloc
>
-------------------+-----------------+--------------------+---------------+------------------+-----------------+---------------------------------------------------------------
>               9785 |           36649 |                 493002109      |
> 282600872 |        1276056         |      382124461    | 7417638175
> (1 row)
>
> I am thinking of increasing the checkpoint_segments.
>
> Below are our current settings -
>
> checkpoint_segments = 8
> checkpoint_timeout = 5 mins
> checkpoint_completion_target = 0.5
> bgwriter_delay = 200ms
> bgwriter_lru_maxpages = 100
> bgwriter_lru_multiplier = 2
>
> Looking forward for suggestions.
>
> Thanks
> VB
>
>
>
>
> On Thu, Sep 29, 2011 at 12:40 PM, Venkat Balaji
> <venkat.balaji@verse.in>wrote:
>
>> Hello Everyone,
>>
>> We are experience a huge drop in performance for one of our production
>> servers.
>>
>> I suspect this is because of high IO due to frequent Checkpoints. Attached
>> is the excel sheet with checkpoint information we tracked.
>>
>> Below is the configuration we have
>>
>> checkpoint_segments = default
>> checkpoint_timeout = default
>>
>> I suspect archive data generation to be around 250 MB.
>>
>> Please share your thoughts !
>>
>> Thanks
>> VB
>>
>>
>>
>>
>


--
------------
pasman

Re: : PG9.0 - Checkpoint tuning and pg_stat_bgwriter

From

Greg Smith

Date:

04 October 2011, 22:37:14

On 10/04/2011 03:50 AM, Venkat Balaji wrote:
> I had a look at the pg_stat_bgwriter as well.

Try saving it like this instead:

select now(),* from pg_stat_bgwriter;

And collect two data points, space a day or more apart.  That gives a
lot more information about the rate at which things are actually
happening.  The long-term totals are less interesting than that.

Generally the first round of tuning work here is to increase
checkpoint_segments until most checkpoints appear in checkpoints_timed
rather than checkpoints_req.  After that, increasing checkpoint_timeout
might also be useful.

--
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us

Re: : PG9.0 - Checkpoint tuning and pg_stat_bgwriter

From

Scott Marlowe

Date:

04 October 2011, 22:47:19

On Tue, Oct 4, 2011 at 4:32 PM, Greg Smith <greg@2ndquadrant.com> wrote:
> On 10/04/2011 03:50 AM, Venkat Balaji wrote:
>>
>> I had a look at the pg_stat_bgwriter as well.
>
> Try saving it like this instead:
>
> select now(),* from pg_stat_bgwriter;
>
> And collect two data points, space a day or more apart.  That gives a lot
> more information about the rate at which things are actually happening.  The
> long-term totals are less interesting than that.
>
> Generally the first round of tuning work here is to increase
> checkpoint_segments until most checkpoints appear in checkpoints_timed
> rather than checkpoints_req.  After that, increasing checkpoint_timeout
> might also be useful.

That last paragraph should be printed out and posted on every pgsql
admin's cubicle wall.

Re: : PG9.0 - Checkpoint tuning and pg_stat_bgwriter

From

Venkat Balaji

Date:

05 October 2011, 02:50:26

I was thinking to increase checkpoint_segments to around 16 or 20.

I think 50 is a bit higher.

Greg,

Sure. I would collect the info from pg_stat_bgwriter on regular intervals.

As we have too many transactions going on I am thinking to collect the info every 6 or 8 hrs.

Thanks

On Wed, Oct 5, 2011 at 4:02 AM, Greg Smith <greg@2ndquadrant.com> wrote:

On 10/04/2011 03:50 AM, Venkat Balaji wrote:
I had a look at the pg_stat_bgwriter as well.

Try saving it like this instead:

select now(),* from pg_stat_bgwriter;

And collect two data points, space a day or more apart. That gives a lot more information about the rate at which things are actually happening. The long-term totals are less interesting than that.

Generally the first round of tuning work here is to increase checkpoint_segments until most checkpoints appear in checkpoints_timed rather than checkpoints_req. After that, increasing checkpoint_timeout might also be useful.

--
Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: : PG9.0 - Checkpoint tuning and pg_stat_bgwriter

From

Greg Smith

Date:

05 October 2011, 05:32:20

On 10/04/2011 07:50 PM, Venkat Balaji wrote:
> I was thinking to increase checkpoint_segments to around 16 or 20.
>
> I think 50 is a bit higher.
>

Don't be afraid to increase that a lot.  You could set it to 1000 and
that would be probably turn out fine; checkpoints will still happen
every 5 minutes.

Checkpoints represent a lot of the I/O in a PostgreSQL database.  The
main downside to making them less frequent is that recovery after a
crash will take longer; a secondary one is that WAL files in pg_xlog
will take up more space.  Most places don't care much about either of
those things.  The advantage to making them happen less often is that
you get less total writes.  People need to be careful about going a long
*time* between checkpoints.  But there's very few cases where you need
to worry about the segment count going too high before another one is
triggered.

--
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us

Re: : PG9.0 - Checkpoint tuning and pg_stat_bgwriter

From

Venkat Balaji

Date:

24 October 2011, 17:35:45

Thanks Greg !

Sorry for delayed response.

We are actually waiting to change the checkpoint_segments in our production systems (waiting for the downtime).

Thanks

On Wed, Oct 5, 2011 at 11:02 AM, Greg Smith <greg@2ndquadrant.com> wrote:

On 10/04/2011 07:50 PM, Venkat Balaji wrote:
I was thinking to increase checkpoint_segments to around 16 or 20.

I think 50 is a bit higher.

Don't be afraid to increase that a lot. You could set it to 1000 and that would be probably turn out fine; checkpoints will still happen every 5 minutes.

Checkpoints represent a lot of the I/O in a PostgreSQL database. The main downside to making them less frequent is that recovery after a crash will take longer; a secondary one is that WAL files in pg_xlog will take up more space. Most places don't care much about either of those things. The advantage to making them happen less often is that you get less total writes. People need to be careful about going a long *time* between checkpoints. But there's very few cases where you need to worry about the segment count going too high before another one is triggered.

--
Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us

Re: : PG9.0 - Checkpoint tuning and pg_stat_bgwriter

From

Robert Haas

Date:

25 October 2011, 02:17:36

On Oct 24, 2011, at 8:16 AM, Venkat Balaji <venkat.balaji@verse.in> wrote:
> Thanks Greg !
>
> Sorry for delayed response.
>
> We are actually waiting to change the checkpoint_segments in our production systems (waiting for the downtime).

That setting can be changed without downtime.

...Robert

Re: : PG9.0 - Checkpoint tuning and pg_stat_bgwriter

From

Venkat Balaji

Date:

25 October 2011, 05:05:30

Oh yes.

Thanks a lot Robert !

Regards

On Tue, Oct 25, 2011 at 7:47 AM, Robert Haas <robertmhaas@gmail.com> wrote:

On Oct 24, 2011, at 8:16 AM, Venkat Balaji <venkat.balaji@verse.in> wrote:
> Thanks Greg !
>
> Sorry for delayed response.
>
> We are actually waiting to change the checkpoint_segments in our production systems (waiting for the downtime).

That setting can be changed without downtime.

...Robert