Re: measuring lwlock-related latency spikes - Mailing list pgsql-hackers

From Robert Haas
Subject Re: measuring lwlock-related latency spikes
Date
Msg-id CA+TgmoZAffdS7jS7y8zPTCGTvZx-bB6q3GbLpA66citC=HftYQ@mail.gmail.com
In response to Re: measuring lwlock-related latency spikes  (Simon Riggs <simon@2ndQuadrant.com>)
Responses Re: measuring lwlock-related latency spikes  (Simon Riggs <simon@2ndQuadrant.com>)
List pgsql-hackers
On Sun, Apr 1, 2012 at 7:07 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> First, we need to determine that it is the clog where this is happening.

I can confirm that based on the LWLockIds.  There were 32 of them
beginning at lock id 81, and a gdb session confirms that
ClogCtlData->shared->buffer_locks[0..31] point to exactly that set of
LWLockIds.
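(The gdb check is nothing fancy, by the way.  It amounts to roughly the
following, remembering that ClogCtlData is the SlruCtlData struct itself
rather than a pointer; the printed values are just illustrating what I
saw, not verbatim output:)

    (gdb) p ClogCtlData.shared->buffer_locks[0]
    $1 = 81
    (gdb) p ClogCtlData.shared->buffer_locks[31]
    $2 = 112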

> Also, you're assuming this is an I/O issue. I think it's more likely
> that this is a lock starvation issue. Shared locks queue jump
> continually over the exclusive lock, blocking access for long periods.

That is a possible issue in general, but I can't see how it could be
happening here, because the shared lock is only a mechanism for
waiting for an I/O to complete.  The backend doing the I/O grabs the
control lock, sets a flag saying there's an I/O in progress, takes the
buffer lock in exclusive mode, and releases the control lock.  The
shared locks are taken when someone notices that the flag is set on a
buffer they want to access.  So there aren't any shared lockers until
the buffer is already locked in exclusive mode.  Or at least I don't
think there are; please correct me if I'm wrong.
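In code terms, the dance I'm describing is roughly the following
(paraphrasing slru.c from memory, so the details may be slightly off):

    /* writer side, approximately SlruInternalWritePage() */
    /* caller already holds shared->ControlLock in exclusive mode */
    shared->page_status[slotno] = SLRU_PAGE_WRITE_IN_PROGRESS;
    LWLockAcquire(shared->buffer_locks[slotno], LW_EXCLUSIVE);
    LWLockRelease(shared->ControlLock);
    /* ... physical write of the page happens here ... */
    LWLockAcquire(shared->ControlLock, LW_EXCLUSIVE);
    shared->page_status[slotno] = SLRU_PAGE_VALID;
    LWLockRelease(shared->buffer_locks[slotno]);

    /* waiter side, approximately SimpleLruWaitIO() */
    LWLockRelease(shared->ControlLock);
    LWLockAcquire(shared->buffer_locks[slotno], LW_SHARED);
    LWLockRelease(shared->buffer_locks[slotno]);
    LWLockAcquire(shared->ControlLock, LW_EXCLUSIVE);

The shared acquisition exists purely to sleep until the exclusive holder
finishes the I/O; nothing useful is done while holding it.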

Now... I do think it's possible that this could happen: backend #1
wants to write the buffer, so grabs the lock and writes the buffer.
Meanwhile some waiters pile up.  When the guy doing the I/O finishes,
he releases the lock, releasing all the waiters.  They then have to
wake up and grab the lock, but maybe before they (or some of them) can
do it somebody else starts another I/O on the buffer and they all have
to go back to sleep.  That could allow the wait time to be many times
the I/O time.  If that's the case, we could just make this use
LWLockAcquireOrWait(); the calling code is just going to pick a new
victim buffer anyway, so it's silly to go through additional spinlock
cycles to acquire a lock we don't actually want.
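Concretely, something along these lines in SimpleLruWaitIO(); this is
untested and just meant to illustrate the shape of the change:

    /* waiter side, sketch: wait without necessarily taking the lock */
    LWLockRelease(shared->ControlLock);
    if (LWLockAcquireOrWait(shared->buffer_locks[slotno], LW_SHARED))
    {
        /* lock was free, so we actually got it; just let go of it */
        LWLockRelease(shared->buffer_locks[slotno]);
    }
    else
    {
        /*
         * The lock was held, and LWLockAcquireOrWait() slept until the
         * holder released it without ever transferring the lock to us,
         * so there is nothing to release.  Either way, re-take the
         * control lock and re-examine the buffer state.
         */
    }
    LWLockAcquire(shared->ControlLock, LW_EXCLUSIVE);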

I bet I can add some more instrumentation to get clearer data on what
is happening here.  What I've added so far doesn't seem to be
affecting performance very much.
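(To give a flavor of what I mean by instrumentation: something like
timing the sleep inside LWLockAcquire()'s wait loop with the instr_time
macros and logging anything over a threshold.  This is a hand-written
sketch rather than my actual patch, and the 100 ms cutoff is arbitrary:)

    /* needs #include "portability/instr_time.h" */
    instr_time  wait_start, wait_end;

    INSTR_TIME_SET_CURRENT(wait_start);
    for (;;)
    {
        /* "false" means cannot accept cancel/die interrupt */
        PGSemaphoreLock(&proc->sem, false);
        if (!proc->lwWaiting)
            break;
        extraWaits++;
    }
    INSTR_TIME_SET_CURRENT(wait_end);
    INSTR_TIME_SUBTRACT(wait_end, wait_start);
    if (INSTR_TIME_GET_MICROSEC(wait_end) > 100000)    /* 100 ms */
        elog(LOG, "waited %lu us for LWLock %d",
             (unsigned long) INSTR_TIME_GET_MICROSEC(wait_end),
             (int) lockid);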

> I would guess that is also the case with the index wait, where I would
> guess a near-root block needs an exclusive lock, but is held up by
> continual index tree descents.
>
> My (fairly old) observation is that the shared lock semantics only
> work well when exclusive locks are fairly common. When they are rare,
> the semantics work against us.
>
> We should either 1) implement non-queue-jump semantics for certain
> cases, or 2) put a limit on the number of queue jumps that can occur
> before we let the next x lock proceed instead. (2) sounds better, but
> keeping track might well cause greater overhead.

Maybe, but your point that we should characterize the behavior before
engineering solutions is well-taken, so let's do that.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

