On 10/31/2016 08:19 PM, Andre Henry wrote:
> My PG 9.4.5 server runs on Amazon RDS some times of the day we have a
> lot of checkpoints really close (less than 1 minute apart, see logs
> below) and we are trying to tune the DB to minimize the impact of the
> checkpoint or reduce the number of checkpoints.
>
> Server Stats
>
> · Instance Type db.r3.4xl
>
> • 16 vCPUs 122GB of RAM
>
> • PostgreSQL 9.4.5 on x86_64-unknown-linux-gnu, compiled by gcc
> (GCC) 4.8.2 20140120 (Red Hat 4.8.2-16), 64-bit
>
>
>
> Some PG Stats
>
> • Shared Buffers = 31427608kB
>
> • Checkpoint Segments = 64
>
> • Checkpoint completion target = .9
>
> • Rest of the configuration is below
>
>
>
> Things we are doing
>
> • We have a huge table where each row is over 1kB and its very
> busy. We are splitting that into multiple tables especially the one json
> field that making it large.
>
>
>
> Questions
>
> • Each checkpoint log writes out the following checkpoint
> complete: wrote 166481 buffers (4.2%); 0 transaction log file(s) added,
> 0 removed, 64 recycled; write=32.441 s, sync=0.050 s, total=32.550 s;
> sync files=274, longest=0.049 s, average=0.000 s
>
OK, each checkpoint has to write all dirty data from checkpoints. You
have ~170k buffers worth of dirty data, i.e. ~1.3GB.
> • What does buffers mean? How do I find out how much RAM that is
> equivalent to?
>
Buffer holds 8kB of data, which is the "chunk" of data files.
> • Based on my RDS stats I don't think IOPs will help, because I
> don't see any flat lines on my write operations / second graph. Is this
> a good assumption?
>
Not sure what you mean by this. Also, maybe you should talk to AWS if
you're on RDS.
> • What else can we tune to spread out checkpoints?
>
Based on the logs, your checkpoints are triggered by filling WAL. I see
your checkpoints happen every 30 - 40 seconds, and you only have 64
segments.
So to get checkpoints checkpoints triggered by timeout (which I assume
is 5 minutes, because you have not mentioned checkpoint_timeout), you
need to increase checkpoint_segments enough to hold 5 minutes worth of WAL.
That means 300/30 * 64, i.e. roughly 640 segments (it's likely an
overestimate, due to full page writes, but well).
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services