Re: measuring lwlock-related latency spikes - Mailing list pgsql-hackers
From: Robert Haas
Subject: Re: measuring lwlock-related latency spikes
Date:
Msg-id: CA+TgmoZAffdS7jS7y8zPTCGTvZx-bB6q3GbLpA66citC=HftYQ@mail.gmail.com
In response to: Re: measuring lwlock-related latency spikes (Simon Riggs <simon@2ndQuadrant.com>)
Responses: Re: measuring lwlock-related latency spikes
List: pgsql-hackers
On Sun, Apr 1, 2012 at 7:07 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> First, we need to determine that it is the clog where this is happening.

I can confirm that based on the LWLockIds. There were 32 of them beginning at lock id 81, and a gdb session confirms that ClogCtlData->shared->buffer_locks[0..31] point to exactly that set of LWLockIds.

> Also, you're assuming this is an I/O issue. I think its more likely
> that this is a lock starvation issue. Shared locks queue jump
> continually over the exclusive lock, blocking access for long periods.

That is a possible issue in general, but I can't see how it could be happening here, because the shared lock is only a mechanism for waiting for an I/O to complete. The backend doing the I/O grabs the control lock, sets a flag saying there's an I/O in progress, takes the buffer lock in exclusive mode, and releases the control lock. The shared locks are taken when someone notices that the flag is set on a buffer they want to access. So there aren't any shared lockers until the buffer is already locked in exclusive mode. Or at least I don't think there are; please correct me if I'm wrong.

Now... I do think it's possible that this could happen: backend #1 wants to write the buffer, so it grabs the lock and writes the buffer. Meanwhile some waiters pile up. When the backend doing the I/O finishes, it releases the lock, waking all the waiters. They then have to wake up and grab the lock, but perhaps before they (or some of them) can do so, somebody else starts another I/O on the buffer and they all have to go back to sleep. That could allow the wait time to be many times the I/O time. If that's the case, we could just make this use LWLockAcquireOrWait(); the calling code is just going to pick a new victim buffer anyway, so it's silly to go through additional spinlock cycles to acquire a lock we don't want.

I bet I can add some more instrumentation to get clearer data on what is happening here.
What I've added so far doesn't seem to be affecting performance very much.

> I would guess that is also the case with the index wait, where I would
> guess a near-root block needs an exclusive lock, but is held up by
> continual index tree descents.
>
> My (fairly old) observation is that the shared lock semantics only
> work well when exclusive locks are fairly common. When they are rare,
> the semantics work against us.
>
> We should either implement 1) non-queue jump semantics for certain
> cases 2) put a limit on the number of queue jumps that can occur
> before we let the next x lock proceed instead. (2) sounds better, but
> keeping track might well cause greater overhead.

Maybe, but your point that we should characterize the behavior before engineering solutions is well-taken, so let's do that.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company