From: Simon Riggs
Subject: Re: measuring lwlock-related latency spikes
Msg-id: CA+U5nMJBsGKKo6VmKqKy4gzHkU-XcGs6n_vt7t094xJCucLHZA@mail.gmail.com
In response to: measuring lwlock-related latency spikes  (Robert Haas <robertmhaas@gmail.com>)
Responses: Re: measuring lwlock-related latency spikes  (Robert Haas <robertmhaas@gmail.com>)
List: pgsql-hackers

On Sat, Mar 31, 2012 at 4:41 AM, Robert Haas <robertmhaas@gmail.com> wrote:

> which means, if I'm not
> confused here, that every single lwlock-related stall > 1s happened
> while waiting for a buffer content lock.  Moreover, each event
> affected a different buffer.  I find this result so surprising that I
> have a hard time believing that I haven't screwed something up, so if
> anybody can check over the patch and this analysis and suggest what
> that thing might be, I would appreciate it.

Possible candidates are:

1) Pages on the right-hand side of the primary key index on accounts.
When such a page splits, a new buffer is allocated and the contention
moves to the new buffer. Given so few stalls, I'd say this was the
block one level above the leaf level.

2) Buffer writes hold the content lock in shared mode, so a delayed
I/O during checkpoint on a page that another backend has requested
for write would show up as a wait for a content lock. That might
happen to updates where the checkpoint write occurs between the
search and write portions of the update.
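
As a rough illustration of candidate 1, the following toy Python model
(not PostgreSQL code) shows why contention driven by monotonically
increasing keys would keep landing on a different buffer, and why pages
one level above the leaves would change far less often than the leaves
themselves. The function name, the page capacities, and the
simplification that a freshly split right page starts empty are all
assumptions made for the sketch.

```python
from itertools import count


def simulate_rightmost_growth(n_inserts, leaf_capacity=100, internal_capacity=100):
    """Toy model of an index that only grows on its right-hand edge.

    Returns one list per tree level (level 0 = leaf), holding the buffer ids
    that were, in turn, the rightmost page of that level.
    """
    next_buffer = count(1)             # hypothetical buffer id allocator
    fill = [0]                         # entries on the current rightmost page, per level
    rightmost = [[next(next_buffer)]]  # history of rightmost buffer ids, per level

    def add_entry(level):
        if level == len(fill):
            # The old root just split: the new root starts with two downlinks.
            fill.append(2)
            rightmost.append([next(next_buffer)])
            return
        capacity = leaf_capacity if level == 0 else internal_capacity
        if fill[level] >= capacity:
            # Rightmost page is full: split it. The new right page gets a
            # fresh buffer, and the parent needs a downlink to it.
            rightmost[level].append(next(next_buffer))
            fill[level] = 0            # simplification: new page starts empty
            add_entry(level + 1)
        fill[level] += 1

    for _ in range(n_inserts):
        add_entry(0)
    return rightmost


if __name__ == "__main__":
    for level, buffers in enumerate(simulate_rightmost_growth(100_000)):
        print(f"level {level}: {len(buffers)} distinct rightmost buffers")
```

In this model the hot page at each level is replaced by a new buffer on
every split, and splits one level above the leaves are rarer than leaf
splits by roughly the page fanout.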

The next logical step in measuring lock waits is to track the reason
for the lock wait, not just the lock wait itself.
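
To make that concrete, here is a minimal Python sketch of recording a
reason alongside each long wait. It is not the instrumentation patch
under discussion: the class name, the reason strings, and the idea that
the reason is known at the point of waiting are all assumptions for
illustration. In a real implementation the lock holder would probably
have to publish what it is doing.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class LockWaitLog:
    """Records lock waits longer than a threshold, tagged with a reason."""

    def __init__(self, threshold_s=1.0, clock=time.monotonic):
        self.threshold_s = threshold_s
        self.clock = clock
        self.stalls = []  # (lock_name, reason, waited_seconds)

    @contextmanager
    def waiting(self, lock_name, reason):
        # Wrap the blocking acquire; only waits above the threshold are kept.
        start = self.clock()
        try:
            yield
        finally:
            waited = self.clock() - start
            if waited > self.threshold_s:
                self.stalls.append((lock_name, reason, waited))

    def by_reason(self):
        # Aggregate the recorded stalls: reason -> (count, total seconds).
        summary = defaultdict(lambda: [0, 0.0])
        for _, reason, waited in self.stalls:
            summary[reason][0] += 1
            summary[reason][1] += waited
        return {reason: (n, total) for reason, (n, total) in summary.items()}
```

A caller would wrap the blocking acquire, for example
`with log.waiting("buffer content", "holder in checkpoint write"): ...`,
and then inspect `log.by_reason()` to see which cause dominates the
stalls.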

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

