Debugging writing load

From: Borodin Vladimir
Hi all.

I have a database with a quite heavy write load (about 20k tps, 5k of which do writes). And I see a lot of write I/O (I mean the amount of data, not iops) to this database, much more than I expect. My question is: how can I debug why the backend processes do so many writes to the $PGDATA/base directory? Below are some details.

The database runs on a machine with 128 GB of RAM and an md RAID10 of 8 SSDs (INTEL SSDSC2BB480G4, 480 GB). It runs PostgreSQL 9.3.4 on Red Hat 6.5 with the following postgresql.conf: http://pastebin.com/LNLHppcb. The sysctl parameters for the page cache are:

# sysctl -a | grep vm.dirty
vm.dirty_background_ratio = 0
vm.dirty_background_bytes = 104857600
vm.dirty_ratio = 40
vm.dirty_bytes = 0
vm.dirty_writeback_centisecs = 100
vm.dirty_expire_centisecs = 300
#

Total database size is now a bit more than 500 GB.

I have separate RAID10 arrays for PGDATA and the pg_xlog directory (with different mount options). Under load, iostat shows about 20 MB/s written to the xlog array and about 200 MB/s to the PGDATA array. Iotop shows that ~80-100 MB/s of that is written by pdflush (which is expected behavior for me). The other ~100 MB/s is written by backend processes (varying from 1 MB/s to 30 MB/s each). The checkpointer process, bgwriter process and autovacuum workers do very little work (3-5 MB/s).
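
For reference, the numbers above come from invocations roughly like the ones below (the md device names are just placeholders for my arrays):

# per-device throughput in MB/s, refreshed every 5 seconds
iostat -xm 5 md2 md3
# per-process I/O rates, showing only processes actually doing I/O
iotop -o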

Lsof on several backend processes shows that each backend has open only database files (tables and indexes) and the last xlog file. Is there any way to understand why the backends are writing so much data to the $PGDATA/base directory? I have tried to use pg_stat_statements for this, but I haven't found a good way to understand what is happening. Is there a way to see something like "this backend process wrote these pages to disk while performing this query"?
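
For example, I sorted pg_stat_statements by written blocks with roughly the query below ("mydb" is a placeholder); it tells me which statements write the most shared and temp blocks, but not why the on-disk write volume is so high:

psql -d mydb -c "SELECT substring(query, 1, 40) AS query, calls,
                        shared_blks_written, temp_blks_written
                 FROM pg_stat_statements
                 ORDER BY shared_blks_written DESC LIMIT 10;"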

Would be very grateful for any help. Thanks.

--
Vladimir




Re: Debugging writing load

From: Craig Ringer
On 07/24/2014 10:22 PM, Borodin Vladimir wrote:
> Hi all.
>
> I have a database with a quite heavy write load (about 20k tps, 5k of
> which do writes). And I see a lot of write I/O (I mean the amount of
> data, not iops) to this database, much more than I expect. My question
> is: how can I debug why the backend processes do so many writes to the
> $PGDATA/base directory? Below are some details.

I'd be using perf for this - if you set tracepoints on writes and
flushes you can associate them with the processes performing the work.
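
As a rough sketch (exact tracepoint names depend on your kernel build), something like:

# record write() syscalls system-wide for 60 seconds
perf record -e syscalls:sys_enter_write -a sleep 60
# summarize the recorded samples by command and pid
perf report --sort comm,pid

Adding block-layer events such as block:block_rq_issue shows what actually reaches the device rather than just the page cache.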

That said, the bgwriter and checkpointer accumulate dirty buffers from
many different backends; perf can't track those writes back to the
originating backend.

It's also currently impractical to pluck things like the current_user,
statement name, etc from the PostgreSQL process's memory when capturing
perf events, so it's still hard to drill down to "this query is causing
lots of I/O".

> Lsof on several backend processes shows that each backend has open
> only database files (tables and indexes) and the last xlog file. Is
> there any way to understand why the backends are writing so much data
> to the $PGDATA/base directory? I have tried to use pg_stat_statements
> for this, but I haven't found a good way to understand what is
> happening. Is there a way to see something like "this backend process
> wrote these pages to disk while performing this query"?

Pg can't keep track of that itself. At most it can track what buffered
writes it's issued. The OS might write-merge them, or it might not flush
them to disk at all if they're superseded by a later write over the same
range.

You really need the kernel's help, and that's where perf comes in.
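
As a quick first check before breaking out perf, the kernel's per-task I/O accounting (if your kernel has it enabled) already attributes write volume per process; substitute one of your backend pids below:

# write_bytes = bytes this process caused to be sent to the block layer
cat /proc/<backend_pid>/io

That only gives you totals per backend, though, not which relation or query produced them.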

I wrote a little about this topic a while ago:

http://blog.2ndquadrant.com/tracing-postgresql-perf/

but it's really a bit introductory.

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services