Re: SSI slows down over time - Mailing list pgsql-performance
| From | Ryan Johnson |
|---|---|
| Subject | Re: SSI slows down over time |
| Date | |
| Msg-id | 5342B219.9040001@gmail.com |
| In response to | Re: SSI slows down over time (Tom Lane <tgl@sss.pgh.pa.us>) |
| List | pgsql-performance |
On 06/04/2014 10:55 AM, Tom Lane wrote:
> Ryan Johnson <ryan.johnson@cs.utoronto.ca> writes:
>> I get a strange behavior across repeated runs: each 100-second run is a
>> bit slower than the one preceding it, when run with SSI (SERIALIZABLE).
>> ... So the question: what should I look for to diagnose/triage this problem?
> In the past I've seen behaviors like this that traced to the range of
> "interesting" transaction IDs getting wider as time went on, so that
> more pages of pg_clog were hot, leading to more I/O traffic in the
> clog SLRU buffers. Possibly this is some effect like that.

The effect disappears completely if I run under SI instead of SSI, though. That makes me suspect strongly that the problem lurks in SSI-specific infrastructure. However, I did notice that the SLRU buffer that holds "old" SSI transactions sometimes spikes from hundreds to millions of entries (by annotating the source to ereport a warning whenever the difference between buffer head and tail is at least 10% higher than the previous record). Not sure if that's related, though: I'm pretty sure SSI never scans the SLRU; it's only used for random lookups.

>> I've tried linux perf, but all it says is that lots of time is going to
>> LWLock (but callgraph tracing doesn't work in my not-bleeding-edge
>> kernel).
> You could recompile with -DLWLOCK_STATS to enable gathering stats on
> which LWLocks are heavily contended. That would be a starting point
> for localizing the cause.
Here are the offenders (100-second run, 24 clients, ~2.2ktps):

> lwlock 7 shacq 0 exacq 7002810 blk 896235 spindelay 213
> lwlock 28 shacq 94984166 exacq 3938085 blk 572762 spindelay 163
> lwlock 65 shacq 3347000 exacq 2933440 blk 255927 spindelay 90
> lwlock 79 shacq 1742574 exacq 3009663 blk 216178 spindelay 41
> lwlock 76 shacq 2293386 exacq 2892242 blk 205199 spindelay 70
> lwlock 66 shacq 498909 exacq 2987485 blk 171508 spindelay 48
> lwlock 80 shacq 512107 exacq 3181753 blk 165772 spindelay 43
> lwlock 71 shacq 815733 exacq 3088157 blk 165579 spindelay 48
> lwlock 74 shacq 603321 exacq 3065391 blk 159953 spindelay 56
> lwlock 67 shacq 695465 exacq 2918970 blk 149339 spindelay 28
> lwlock 69 shacq 411203 exacq 3044007 blk 148655 spindelay 34
> lwlock 72 shacq 515260 exacq 2973321 blk 147533 spindelay 43
> lwlock 30 shacq 41628636 exacq 8799 blk 143889 spindelay 186
> lwlock 75 shacq 409472 exacq 2987227 blk 143196 spindelay 38
> lwlock 77 shacq 409401 exacq 2946972 blk 139507 spindelay 34
> lwlock 73 shacq 402544 exacq 2943467 blk 139380 spindelay 43
> lwlock 78 shacq 404220 exacq 2912665 blk 137625 spindelay 21
> lwlock 70 shacq 603643 exacq 2816730 blk 135851 spindelay 37
> lwlock 68 shacq 403533 exacq 2862017 blk 131946 spindelay 30
> lwlock 29 shacq 0 exacq 255302 blk 75838 spindelay 1
> lwlock 0 shacq 0 exacq 561508 blk 12445 spindelay 3
> lwlock 11 shacq 1245499 exacq 219717 blk 5501 spindelay 10
> lwlock 4 shacq 381211 exacq 209146 blk 1273 spindelay 4
> lwlock 3 shacq 16 exacq 209081 blk 522 spindelay 0
> lwlock 8 shacq 0 exacq 137961 blk 50 spindelay 0
> lwlock 2097366 shacq 0 exacq 384586 blk 1 spindelay 0
> lwlock 2097365 shacq 0 exacq 370176 blk 1 spindelay 0
> lwlock 2097359 shacq 0 exacq 363845 blk 1 spindelay 0

The above aggregates the per-lock stats from all processes, filters out locks with fewer than 10000 accesses (shared+exclusive) or with zero blk, then sorts by highest blk first.
According to [1], locks {28, 29, 30} are {SerializableXactHashLock, SerializableFinishedListLock, SerializablePredicateLockListLock}, all SSI-related; locks 65-80 are the sixteen PredicateLockMgrLocks that the post mentions. Looking in lwlock.h, lock 7 (which tops the list) is the WALInsertLock. That lock was *not* mentioned in the pgsql-hackers post.

Re-running the same analysis for SI instead of SSI gives 4.6ktps and a much shorter list:

> lwlock 7 shacq 0 exacq 14050121 blk 3429384 spindelay 347
> lwlock 11 shacq 3133994 exacq 450325 blk 23456 spindelay 29
> lwlock 0 shacq 0 exacq 684775 blk 19158 spindelay 3
> lwlock 4 shacq 780846 exacq 428771 blk 4539 spindelay 6
> lwlock 3 shacq 19 exacq 428705 blk 1147 spindelay 0
> lwlock 59 shacq 0 exacq 125943 blk 203 spindelay 0
> lwlock 8 shacq 0 exacq 287629 blk 116 spindelay 0
> lwlock 2097358 shacq 0 exacq 752361 blk 1 spindelay 0
> lwlock 2097355 shacq 0 exacq 755086 blk 1 spindelay 0
> lwlock 2097352 shacq 0 exacq 760232 blk 1 spindelay 0

However, all of this only confirms that SSI has a lock bottleneck; it doesn't say why the bottleneck gets worse over time.

[1] http://www.postgresql.org/message-id/CA+TgmoYAiSM2jWEndReY5PL0sKbhgg7dbDH6r=oXKYzi9B7KJA@mail.gmail.com

Thoughts?
Ryan