
Large values for duration of COMMITs and slow queries. Due to large WAL config values? - Mailing list pgsql-general

From	Cody Caughlan
Subject	Large values for duration of COMMITs and slow queries. Due to large WAL config values?
Date	November 12, 2011 03:04:52
Msg-id	CAPVp=gbKVbNr1zQM_LKauNY-U1PHB++y=Xq26K-dXdDsffv_PQ@mail.gmail.com
Responses	Re: Large values for duration of COMMITs and slow queries. Due to large WAL config values? Re: Large values for duration of COMMITs and slow queries. Due to large WAL config values?
List	pgsql-general


Postgres 9.1.1, master with 2 slaves via streaming replication.

I've enabled slow query logging with a 150 ms threshold and am seeing
a large number of slow COMMITs:

2011-11-12 06:55:02 UTC pid:30897 (28/0-0) LOG:  duration: 232.398 ms  statement: COMMIT
2011-11-12 06:55:08 UTC pid:30896 (27/0-0) LOG:  duration: 1078.789 ms  statement: COMMIT
2011-11-12 06:55:09 UTC pid:30842 (15/0-0) LOG:  duration: 2395.432 ms  statement: COMMIT
2011-11-12 06:55:09 UTC pid:30865 (23/0-0) LOG:  duration: 2395.153 ms  statement: COMMIT
2011-11-12 06:55:09 UTC pid:30873 (17/0-0) LOG:  duration: 2390.106 ms  statement: COMMIT
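
For anyone who wants to see how these slow COMMITs cluster in time,
here is a minimal Python sketch that groups them by the second in which
they were logged. It assumes each log entry sits on a single line with
the prefix format shown above; the names ENTRY, LOG_LINES and
slow_commits_per_second are made up for illustration and are not part
of any PostgreSQL tooling.

```python
import re
from collections import defaultdict

# Sample entries copied from the log excerpt above, one entry per line.
LOG_LINES = [
    "2011-11-12 06:55:02 UTC pid:30897 (28/0-0) LOG:  duration: 232.398 ms  statement: COMMIT",
    "2011-11-12 06:55:08 UTC pid:30896 (27/0-0) LOG:  duration: 1078.789 ms  statement: COMMIT",
    "2011-11-12 06:55:09 UTC pid:30842 (15/0-0) LOG:  duration: 2395.432 ms  statement: COMMIT",
    "2011-11-12 06:55:09 UTC pid:30865 (23/0-0) LOG:  duration: 2395.153 ms  statement: COMMIT",
    "2011-11-12 06:55:09 UTC pid:30873 (17/0-0) LOG:  duration: 2390.106 ms  statement: COMMIT",
]

# Assumed layout: "<date> <time> <tz> ... duration: <ms> ms  statement: <sql>"
ENTRY = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \S+ .*?"
    r"duration: (?P<ms>\d+(?:\.\d+)?) ms\s+statement: (?P<stmt>.*)$"
)


def slow_commits_per_second(lines):
    """Group logged COMMIT durations (in ms) by timestamp, to the second."""
    buckets = defaultdict(list)
    for line in lines:
        match = ENTRY.match(line)
        if match and match.group("stmt").strip().upper() == "COMMIT":
            buckets[match.group("ts")].append(float(match.group("ms")))
    return dict(buckets)


if __name__ == "__main__":
    for ts, durations in sorted(slow_commits_per_second(LOG_LINES).items()):
        print(ts, len(durations), "slow COMMIT(s), max", max(durations), "ms")
```

In the excerpt, three COMMITs are logged in the same second with nearly
identical durations, which might mean they were all waiting on the same
event, though the log alone does not prove that.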

The machine has 16GB of RAM and plenty of disk space. The settings I
think might be relevant are:

wal_buffers = 16MB
checkpoint_segments = 32
max_wal_senders = 10
checkpoint_completion_target = 0.9
wal_keep_segments = 1024
maintenance_work_mem = 256MB
work_mem = 88MB
shared_buffers = 3584MB
effective_cache_size = 10GB
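
As a rough sanity check on what these values imply for WAL disk usage
on the master, the sketch below applies the rule of thumb from the 9.1
documentation on WAL configuration for the normal upper bound on
segment files. It assumes the default 16 MB segment size; the function
name max_wal_segments is made up for illustration.

```python
# Default WAL segment size in MB; an assumption, since it can be
# changed when the server is built.
SEGMENT_MB = 16


def max_wal_segments(checkpoint_segments, completion_target, wal_keep_segments):
    """Approximate normal upper bound on WAL segment files in pg_xlog.

    Takes the larger of the two limits given in the 9.1 documentation:
      (2 + checkpoint_completion_target) * checkpoint_segments + 1
      checkpoint_segments + wal_keep_segments + 1
    """
    by_checkpoints = (2 + completion_target) * checkpoint_segments + 1
    by_keep = checkpoint_segments + wal_keep_segments + 1
    return max(by_checkpoints, by_keep)


if __name__ == "__main__":
    # Values taken from the settings listed above.
    segments = max_wal_segments(32, 0.9, 1024)
    print(segments, "segments, about", segments * SEGMENT_MB / 1024, "GB")
```

With the settings above, wal_keep_segments dominates: roughly 1057
segments, or about 16.5 GB of retained WAL. This is only an estimate of
disk footprint and says nothing by itself about commit latency.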

Recently we bumped up wal_keep_segments and checkpoint_segments
because we wanted to run long-running queries on the slaves and were
receiving cancellation errors there. I think the master was recycling
WAL segments out from underneath the slaves and thus canceling the
queries, so I believed I needed to crank up those values. It seems to
work: I can now run long queries (for statistics and reports) on the
slaves just fine.

But I now wonder if it's having an adverse effect on the master, such
as these slow commit times and other slow queries (e.g. primary key
lookups on tables with not that many records), which seem to have
increased since the configuration change.

I am watching iostat and, sure enough, when %iowait rises above 15 or
so, a bunch more slow queries get logged. So I can see it's disk
related.

I just don't know what the underlying cause is.

Any pointers would be appreciated. Thank you.
