Re: Large values for duration of COMMITs and slow queries. Due to large WAL config values? - Mailing list pgsql-general

From Cody Caughlan
Subject Re: Large values for duration of COMMITs and slow queries. Due to large WAL config values?
Date
Msg-id CAPVp=gacjQhiV0-_UXYbfXvHr7E1J1fMiF2CF7_3b+XtTmG8XQ@mail.gmail.com
Whole thread Raw
In response to Large values for duration of COMMITs and slow queries. Due to large WAL config values?  (Cody Caughlan <toolbag@gmail.com>)
Responses Re: Re: Large values for duration of COMMITs and slow queries. Due to large WAL config values?  (Igor Polishchuk <igor@powerreviews.com>)
List pgsql-general
I've run VACUUM ANALYZE on all my tables to make sure the house has
been cleaned. I still see a lot of slow queries / commits, even on
primary key lookups and well indexed tables.

/Cody

On Fri, Nov 11, 2011 at 11:04 PM, Cody Caughlan <toolbag@gmail.com> wrote:
> Postgres 9.1.1, master with 2 slaves via streaming replication.
>
> I've enabled slow query logging of 150ms and am seeing a large number
> of slow COMMITs:
>
> 2011-11-12 06:55:02 UTC pid:30897 (28/0-0) LOG:  duration: 232.398 ms
> statement: COMMIT
> 2011-11-12 06:55:08 UTC pid:30896 (27/0-0) LOG:  duration: 1078.789 ms
>  statement: COMMIT
> 2011-11-12 06:55:09 UTC pid:30842 (15/0-0) LOG:  duration: 2395.432 ms
>  statement: COMMIT
> 2011-11-12 06:55:09 UTC pid:30865 (23/0-0) LOG:  duration: 2395.153 ms
>  statement: COMMIT
> 2011-11-12 06:55:09 UTC pid:30873 (17/0-0) LOG:  duration: 2390.106 ms
>  statement: COMMIT
>
> The machine has 16GB of RAM and plenty of disk space. What I think
> might be relevant settings are:
>
> wal_buffers = 16MB
> checkpoint_segments = 32
> max_wal_senders = 10
> checkpoint_completion_target = 0.9
> wal_keep_segments = 1024
> maintenance_work_mem = 256MB
> work_mem = 88MB
> shared_buffers = 3584MB
> effective_cache_size = 10GB
>
> Recently we have bumped up wal_keep_segments and checkpoint_segments
> because we wanted to run long running queries on the slaves and we're
> receiving cancellation errors on the slaves. I think the master was
> recycling WAL logs from underneath the slave and thus canceling the
> queries. Hence, I believed I needed to crank up those values. It seems
> to work, I can run long queries (for statistics / reports) on the
> slaves just fine.
>
> But I now wonder if its having an adverse effect on the master, ala
> these slow commit times and other slow queries (e.g. primary key
> lookups on tables with not that many records), which seem to have
> increased since the configuration change.
>
> I am watching iostat and sure enough, when %iowait gets > 15 or so
> then a bunch more slow queries get logged. So I can see its disk
> related.
>
> I just dont know what the underlying cause is.
>
> Any pointers would be appreciated. Thank you.
>

pgsql-general by date:

Previous
From: "Clark C. Evans"
Date:
Subject: CLONE DATABASE (with copy on write?)
Next
From: Tom Lane
Date:
Subject: Re: Re: [GENERAL] Need Help Installing Dblink…(Desperately…)