Re: SKIP LOCKED DATA (work in progress) - Mailing list pgsql-hackers

From Craig Ringer
Subject Re: SKIP LOCKED DATA (work in progress)
Msg-id 53FA9681.8050003@2ndquadrant.com
In response to Re: SKIP LOCKED DATA (work in progress)  (Thomas Munro <munro@ip9.org>)
List pgsql-hackers
On 08/25/2014 09:44 AM, Thomas Munro wrote:
> On 24 August 2014 22:04, Thomas Munro <munro@ip9.org> wrote:
>> On 22 August 2014 23:02, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
>>> Did you consider heap_lock_updated_tuple?  A rationale for saying it
>>> doesn't need to pay attention to the wait policy is: if you're trying to
>>> lock-skip-locked an updated tuple, then you either skip it because its
>>> updater is running, or you return it because it's no longer running; and
>>> if you return it, it is not possible for the updater to be locking the
>>> updated version.  However, what if there's a third transaction that
>>> locked the updated version?  It might be difficult to hit this case or
>>> construct an isolationtester spec file though, since there's a narrow
>>> window you need to race to.
>>
>> Hmm.  I will look into this, thanks.
> 
> While trying to produce the heap_lock_updated_tuple_rec case you
> describe (so far unsuccessfully), I discovered I could make SELECT ...
> FOR UPDATE NOWAIT block indefinitely on unpatched 9.3 in a different
> code path after heap_lock_tuple returns: in another session, UPDATE,
> COMMIT, then UPDATE, all after the first session has taken its
> snapshot but before it tries to lock a given row.  The code in
> EvalPlanQualFetch (reached from the HeapTupleUpdated case in
> ExecLockRow) finishes up waiting for the uncommitted transaction.

I think that's the issue Andres and I patched for 9.3, but I don't know
if it got committed.

I'll need to have a look. A search in the archives for heap_lock_tuple
and nowait might be informative.

> The difficulty of course will be testing all these racy cases reproducibly...

Yep, especially as isolationtester can only really work at the statement
level, and can only handle one blocked connection at a time.

It's possible a helper extension could be used - set up some locks in
shmem, register two sessions for different "test roles" within a given
test to install appropriate hooks, wait until they're both blocked on
the locks the hooks acquire, then release the locks.

That might land up with hook points scattered everywhere, but they could
be some pretty minimal macros at least.
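
To make the idea a bit more concrete, here is a rough sketch of just the
rendezvous logic, with Python threads standing in for backends and
threading.Event objects standing in for the shmem locks. Every name in it
(Rendezvous, hook, wait_until_all_blocked, the "locker" and "updater" roles)
is made up for illustration; nothing like this exists in the tree, and a real
version would be C hooks in the backend rather than Python.

```python
import threading


class Rendezvous:
    """Hypothetical stand-in for the shared-memory locks a helper extension would hold."""

    def __init__(self, roles):
        # One "arrived" flag per registered test role, plus a single release gate.
        self._arrived = {role: threading.Event() for role in roles}
        self._release = threading.Event()

    def hook(self, role):
        """Called at a hook point: announce arrival, then block until released."""
        self._arrived[role].set()
        self._release.wait()

    def wait_until_all_blocked(self, timeout=5.0):
        """Driver side: True once every role has reached its hook point.

        A role is flagged just before it starts waiting on the gate, so
        "blocked" here means "has reached the hook and cannot pass it yet".
        """
        return all(ev.wait(timeout) for ev in self._arrived.values())

    def release(self):
        """Driver side: let every waiting session continue at once."""
        self._release.set()


def run_demo():
    roles = ("locker", "updater")
    rv = Rendezvous(roles)
    log = []
    log_lock = threading.Lock()

    def session(role):
        rv.hook(role)  # stand-in for a hook inside the racy code path
        with log_lock:
            log.append(role)  # the "interesting" work after the hook

    threads = [threading.Thread(target=session, args=(r,)) for r in roles]
    for t in threads:
        t.start()

    all_blocked = rv.wait_until_all_blocked()
    before = list(log)  # nothing can have passed the hook yet
    rv.release()
    for t in threads:
        t.join()
    return all_blocked, before, sorted(log)


if __name__ == "__main__":
    print(run_demo())
```

The driver only releases the gate after both sessions have checked in, which
is the property the test needs; what order they run in afterwards is still up
to the scheduler.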

IMO that's a separate project, though. For this kind of testing I've
tended to just set conditional breakpoints in gdb, wait until they trap,
then once everything's lined up release the breakpoints in both sessions
as simultaneously as possible. Not perfect, but has tended to work.

-- 
Craig Ringer                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


