Re: measuring lwlock-related latency spikes - Mailing list pgsql-hackers

From Simon Riggs
Subject Re: measuring lwlock-related latency spikes
Msg-id CA+U5nMJ+mykGSOLeFrVzgVeAEsnjdgYHT0uVu-SAM0ScUx3PCw@mail.gmail.com
In response to Re: measuring lwlock-related latency spikes  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: measuring lwlock-related latency spikes  (Greg Stark <stark@mit.edu>)
List pgsql-hackers
On Sun, Apr 1, 2012 at 1:34 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Sun, Apr 1, 2012 at 7:07 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> First, we need to determine that it is the clog where this is happening.
>
> I can confirm that based on the LWLockIds.  There were 32 of them
> beginning at lock id 81, and a gdb session confirms that
> ClogCtlData->shared->buffer_locks[0..31] point to exactly that set of
> LWLockIds.
>
>> Also, you're assuming this is an I/O issue. I think it's more likely
>> that this is a lock starvation issue. Shared locks queue jump
>> continually over the exclusive lock, blocking access for long periods.
>
> That is a possible issue in general, but I can't see how it could be
> happening here, because the shared lock is only a mechanism for
> waiting for an I/O to complete.  The backend doing the I/O grabs the
> control lock, sets a flag saying there's an I/O in progress, takes the
> buffer lock in exclusive mode, and releases the control lock.  The
> shared locks are taken when someone notices that the flag is set on a
> buffer they want to access.  So there aren't any shared lockers until
> the buffer is already locked in exclusive mode.  Or at least I don't
> think there are; please correct me if I'm wrong.

Agreed.

Before the exclusive lock holder releases the buffer lock, it must
acquire the control lock in exclusive mode (line 544).

So lock starvation on the control lock would cause a long wait after
each I/O, making it look like an I/O problem.

Anyway, my point is just that it might not be I/O, and we need to find out.
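
To make the distinction concrete, here is a rough toy model in Python, not
the clog code. It follows the sequence described above: take the control
lock, set the I/O flag, take the buffer lock, drop the control lock, do the
I/O, then retake the control lock before releasing the buffer lock. All names
(do_io, hog_control_lock, run) and the timings are made up for illustration.
A single thread holding the control lock stands in for a stream of queue
jumpers, and plain mutexes stand in for LWLocks, so there is no shared mode
here. The point is only that timing the two phases separately tells them
apart, while a waiter on the buffer lock sees just their sum.

```python
import threading
import time


def do_io(io_seconds, control_lock, buffer_lock, state,
          io_started, hog_has_lock, timings):
    """Toy version of the backend that performs the I/O."""
    with control_lock:
        state["io_in_progress"] = True
        buffer_lock.acquire()      # buffer lock taken while holding control lock
    io_started.set()

    t0 = time.monotonic()
    time.sleep(io_seconds)         # the simulated I/O
    hog_has_lock.wait()            # demo only: make sure the hog got in first
    t1 = time.monotonic()

    control_lock.acquire()         # must be retaken before the buffer lock is released
    t2 = time.monotonic()
    state["io_in_progress"] = False
    buffer_lock.release()
    control_lock.release()

    timings["io"] = t1 - t0
    timings["control_wait"] = t2 - t1
    timings["observed"] = t2 - t0  # roughly what a buffer-lock waiter would see


def hog_control_lock(hold_seconds, control_lock, io_started, hog_has_lock):
    """Stand-in for other backends keeping the control lock busy."""
    io_started.wait()
    with control_lock:
        hog_has_lock.set()
        time.sleep(hold_seconds)


def run(io_seconds=0.01, hold_seconds=0.2):
    control_lock = threading.Lock()
    buffer_lock = threading.Lock()
    state = {"io_in_progress": False}
    io_started = threading.Event()
    hog_has_lock = threading.Event()
    timings = {}

    io_thread = threading.Thread(
        target=do_io,
        args=(io_seconds, control_lock, buffer_lock, state,
              io_started, hog_has_lock, timings))
    hog_thread = threading.Thread(
        target=hog_control_lock,
        args=(hold_seconds, control_lock, io_started, hog_has_lock))

    io_thread.start()
    hog_thread.start()
    io_thread.join()
    hog_thread.join()
    return timings


if __name__ == "__main__":
    t = run()
    print("io: %.3fs  control lock wait: %.3fs  observed: %.3fs"
          % (t["io"], t["control_wait"], t["observed"]))
```

With these made-up numbers, most of the observed time is spent waiting for
the control lock rather than in the I/O, yet from the outside it looks like
one long I/O.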

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

