Re: measuring lwlock-related latency spikes - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: measuring lwlock-related latency spikes |
Date | |
Msg-id | CA+TgmoaY-EPMzgtrxGUa4jua5-MX7cgdQDXO-775=uWYZPCmwg@mail.gmail.com Whole thread Raw |
In response to | Re: measuring lwlock-related latency spikes (Simon Riggs <simon@2ndQuadrant.com>) |
Responses |
Re: measuring lwlock-related latency spikes
Re: measuring lwlock-related latency spikes |
List | pgsql-hackers |
On Mon, Apr 2, 2012 at 7:01 AM, Simon Riggs <simon@2ndquadrant.com> wrote: > Do you consider this proof that it can only be I/O? Or do we still > need to find out? I stuck a bunch more debugging instrumentation into the SLRU code. It was fairly clear from the previous round of instrumentation that the problem was that there was a lot of time passing between when SlruInternalWritePage acquires shared->buffer_locks[slotno] and when it releases that lock; I added some additional instrumentation to (a) confirm this and (b) further break down where the time is getting spent. Long story short, when a CLOG-related stall happens, essentially all the time is being spent in this here section of code: /* * If not part of Flush, need to fsync now. We assume this happens * infrequently enough that it's not a performanceissue. */ if (!fdata) { if (ctl->do_fsync && pg_fsync(fd)) { slru_errcause = SLRU_FSYNC_FAILED; slru_errno = errno; close(fd); return false; } if (close(fd)) { slru_errcause = SLRU_CLOSE_FAILED; slru_errno = errno; returnfalse; } } Here's what the debug output looks like: 2012-04-02 15:51:27 UTC [62397] LOG: SlruPhysicalWritePage(11) intervals: 0.005000 0.001000 0.013000 0.000000 0.073000 13162.557000 2012-04-02 15:51:27 UTC [62397] STATEMENT: UPDATE pgbench_accounts SET abalance = abalance + -3060 WHERE aid = 6501332; 2012-04-02 15:51:27 UTC [62430] LOG: lock 104: waited 13162.676 ms at slru.c:311 blocked by slru.c:529 spin 2 2012-04-02 15:51:27 UTC [62430] STATEMENT: UPDATE pgbench_accounts SET abalance = abalance + -3692 WHERE aid = 2516692; 2012-04-02 15:51:27 UTC [62428] LOG: lock 104: waited 13161.409 ms at slru.c:311 blocked by slru.c:529 spin 2 2012-04-02 15:51:27 UTC [62428] STATEMENT: UPDATE pgbench_accounts SET abalance = abalance + 3281 WHERE aid = 24527957; 2012-04-02 15:51:27 UTC [62443] LOG: lock 104: waited 13161.146 ms at slru.c:311 blocked by slru.c:529 spin 2 2012-04-02 15:51:27 UTC [62443] STATEMENT: UPDATE pgbench_accounts SET abalance = abalance + -360 WHERE aid = 6714054; 2012-04-02 15:51:27 UTC [62436] LOG: lock 104: waited 12094.996 ms at slru.c:311 blocked by slru.c:529 spin 2 2012-04-02 15:51:27 UTC [62436] STATEMENT: UPDATE pgbench_accounts SET abalance = abalance + -49 WHERE aid = 4080528; 2012-04-02 15:51:27 UTC [62389] LOG: lock 104: waited 13160.966 ms at slru.c:311 blocked by slru.c:529 spin 2 2012-04-02 15:51:27 UTC [62389] STATEMENT: UPDATE pgbench_accounts SET abalance = abalance + -563 WHERE aid = 21896604; 2012-04-02 15:51:27 UTC [62407] LOG: lock 104: waited 13161.034 ms at slru.c:311 blocked by slru.c:529 spin 2 2012-04-02 15:51:27 UTC [62407] STATEMENT: UPDATE pgbench_accounts SET abalance = abalance + 1437 WHERE aid = 17185681; 2012-04-02 15:51:27 UTC [62432] LOG: lock 104: waited 13160.983 ms at slru.c:311 blocked by slru.c:529 spin 2 2012-04-02 15:51:27 UTC [62432] STATEMENT: UPDATE pgbench_accounts SET abalance = abalance + 4330 WHERE aid = 6289956; 2012-04-02 15:51:27 UTC [62403] LOG: lock 104: waited 11953.875 ms at slru.c:311 blocked by slru.c:529 spin 2 2012-04-02 15:51:27 UTC [62403] STATEMENT: UPDATE pgbench_accounts SET abalance = abalance + -4717 WHERE aid = 18829978; 2012-04-02 15:51:27 UTC [62438] LOG: lock 104: waited 11953.987 ms at slru.c:311 blocked by slru.c:529 spin 2 2012-04-02 15:51:27 UTC [62438] STATEMENT: UPDATE pgbench_accounts SET abalance = abalance + 1361 WHERE aid = 26274208; 2012-04-02 15:51:27 UTC [62400] LOG: lock 104: waited 10471.223 ms at slru.c:311 blocked by slru.c:529 spin 2 2012-04-02 15:51:27 UTC [62400] STATEMENT: UPDATE pgbench_accounts SET abalance = abalance + -2002 WHERE aid = 19209246; 2012-04-02 15:51:27 UTC [62427] LOG: lock 104: waited 10248.041 ms at slru.c:311 blocked by slru.c:529 spin 2 2012-04-02 15:51:27 UTC [62427] STATEMENT: UPDATE pgbench_accounts SET abalance = abalance + -874 WHERE aid = 4042895; 2012-04-02 15:51:27 UTC [62419] LOG: lock 104: waited 13161.085 ms at slru.c:311 blocked by slru.c:529 spin 2 2012-04-02 15:51:27 UTC [62419] STATEMENT: UPDATE pgbench_accounts SET abalance = abalance + -2874 WHERE aid = 11997038; 2012-04-02 15:51:27 UTC [62394] LOG: lock 104: waited 10171.179 ms at slru.c:311 blocked by slru.c:529 spin 2 2012-04-02 15:51:27 UTC [62394] STATEMENT: UPDATE pgbench_accounts SET abalance = abalance + -3855 WHERE aid = 12744804; 2012-04-02 15:51:27 UTC [62410] LOG: lock 104: waited 10247.882 ms at slru.c:311 blocked by slru.c:529 spin 2 2012-04-02 15:51:27 UTC [62410] STATEMENT: UPDATE pgbench_accounts SET abalance = abalance + 3643 WHERE aid = 16152613; 2012-04-02 15:51:27 UTC [62440] LOG: lock 104: waited 10169.646 ms at slru.c:311 blocked by slru.c:529 spin 2 2012-04-02 15:51:27 UTC [62440] STATEMENT: UPDATE pgbench_accounts SET abalance = abalance + 215 WHERE aid = 3276253; 2012-04-02 15:51:27 UTC [62423] LOG: lock 104: waited 10170.909 ms at slru.c:311 blocked by slru.c:529 spin 2 2012-04-02 15:51:27 UTC [62423] STATEMENT: UPDATE pgbench_accounts SET abalance = abalance + 2846 WHERE aid = 10337813; 2012-04-02 15:51:27 UTC [62431] LOG: lock 104: waited 10279.255 ms at slru.c:311 blocked by slru.c:529 spin 2 2012-04-02 15:51:27 UTC [62431] STATEMENT: UPDATE pgbench_accounts SET abalance = abalance + -419 WHERE aid = 1547227; 2012-04-02 15:51:27 UTC [62397] LOG: SlruInternalWritePage(11) intervals: 0.007000 13162.736000 0.000000 0.001000 0.257000 2012-04-02 15:51:27 UTC [62397] STATEMENT: UPDATE pgbench_accounts SET abalance = abalance + -3060 WHERE aid = 6501332; 2012-04-02 15:51:27 UTC [62424] LOG: lock 104: waited 10169.181 ms at slru.c:311 blocked by slru.c:529 spin 2 2012-04-02 15:51:27 UTC [62424] STATEMENT: UPDATE pgbench_accounts SET abalance = abalance + 4771 WHERE aid = 6768873; 2012-04-02 15:51:27 UTC [62426] LOG: lock 104: waited 10169.890 ms at slru.c:311 blocked by slru.c:529 spin 2 2012-04-02 15:51:27 UTC [62426] STATEMENT: UPDATE pgbench_accounts SET abalance = abalance + 3644 WHERE aid = 13548862; 2012-04-02 15:51:27 UTC [62441] LOG: lock 104: waited 10078.669 ms at slru.c:311 blocked by slru.c:529 spin 2 2012-04-02 15:51:27 UTC [62441] STATEMENT: UPDATE pgbench_accounts SET abalance = abalance + -2318 WHERE aid = 3387689; 2012-04-02 15:51:27 UTC [62434] LOG: lock 104: waited 10472.164 ms at slru.c:311 blocked by slru.c:529 spin 2 2012-04-02 15:51:27 UTC [62434] STATEMENT: UPDATE pgbench_accounts SET abalance = abalance + 1884 WHERE aid = 11738348; 2012-04-02 15:51:27 UTC [62442] LOG: lock 104: waited 10472.116 ms at slru.c:311 blocked by slru.c:529 spin 2 2012-04-02 15:51:27 UTC [62442] STATEMENT: UPDATE pgbench_accounts SET abalance = abalance + 1314 WHERE aid = 1871577; 2012-04-02 15:51:27 UTC [62429] LOG: lock 104: waited 12094.703 ms at slru.c:311 blocked by slru.c:529 spin 2 2012-04-02 15:51:27 UTC [62429] STATEMENT: UPDATE pgbench_accounts SET abalance = abalance + 1188 WHERE aid = 21727162; 2012-04-02 15:51:27 UTC [62425] LOG: lock 104: waited 10471.408 ms at slru.c:311 blocked by slru.c:529 spin 2 2012-04-02 15:51:27 UTC [62425] STATEMENT: UPDATE pgbench_accounts SET abalance = abalance + -35 WHERE aid = 16735415; 2012-04-02 15:51:27 UTC [62435] LOG: lock 104: waited 10247.717 ms at slru.c:311 blocked by slru.c:529 spin 2 2012-04-02 15:51:27 UTC [62435] STATEMENT: UPDATE pgbench_accounts SET abalance = abalance + 457 WHERE aid = 20513750; 2012-04-02 15:51:27 UTC [62421] LOG: lock 104: waited 12094.718 ms at slru.c:311 blocked by slru.c:529 spin 2 2012-04-02 15:51:27 UTC [62421] STATEMENT: UPDATE pgbench_accounts SET abalance = abalance + -1989 WHERE aid = 26993577; 2012-04-02 15:51:27 UTC [62433] LOG: lock 104: waited 10471.231 ms at slru.c:311 blocked by slru.c:529 spin 2 2012-04-02 15:51:27 UTC [62433] STATEMENT: UPDATE pgbench_accounts SET abalance = abalance + -4369 WHERE aid = 17652000; 2012-04-02 15:51:27 UTC [62439] LOG: lock 104: waited 10170.579 ms at slru.c:311 blocked by slru.c:529 spin 2 2012-04-02 15:51:27 UTC [62439] STATEMENT: UPDATE pgbench_accounts SET abalance = abalance + -2453 WHERE aid = 27871868; 2012-04-02 15:51:27 UTC [62415] LOG: lock 104: waited 10170.867 ms at slru.c:311 blocked by slru.c:529 spin 2 2012-04-02 15:51:27 UTC [62415] STATEMENT: UPDATE pgbench_accounts SET abalance = abalance + -3110 WHERE aid = 28608274; 2012-04-02 15:51:27 UTC [62422] LOG: lock 104: waited 10082.338 ms at slru.c:311 blocked by slru.c:529 spin 2 2012-04-02 15:51:27 UTC [62422] STATEMENT: UPDATE pgbench_accounts SET abalance = abalance + -3942 WHERE aid = 12669792; 2012-04-02 15:51:27 UTC [62437] LOG: lock 104: waited 11955.005 ms at slru.c:311 blocked by slru.c:529 spin 2 This particular example shows the above chunk of code taking >13s to execute. Within 3s, every other backend piles up behind that, leading to the database getting no work at all done for a good ten seconds. My guess is that what's happening here is that one backend needs to read a page into CLOG, so it calls SlruSelectLRUPage to evict the oldest SLRU page, which is dirty. For some reason, that I/O takes a long time. Then, one by one, other backends comes along and also need to read various SLRU pages, but the oldest SLRU page hasn't changed, so SlruSelectLRUPage keeps returning the exact same page that it returned before, and everybody queues up waiting for that I/O, even though there might be other buffers available that aren't even dirty. I am thinking that SlruSelectLRUPage() should probably do SlruRecentlyUsed() on the selected buffer before calling SlruInternalWritePage, so that the next backend that comes along looking for a buffer doesn't pick the same one. Possibly we should go further and try to avoid replacing dirty buffers in the first place, but sometimes there may be no choice, so doing SlruRecentlyUsed() is still a good idea. I'll do some testing to try to confirm whether this theory is correct and whether the above fix helps. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
pgsql-hackers by date: