Home > mailing lists

Re: Race condition in HEAD, possibly due to PGPROC splitup - Mailing list pgsql-hackers

From	Robert Haas
Subject	Re: Race condition in HEAD, possibly due to PGPROC splitup
Date	December 15, 2011 13:13:42
Msg-id	CA+TgmoaaGYRdMLzYfgYmt2iiHxnO8NxERicnB_Shutp+UnDL+Q@mail.gmail.com Whole thread Raw
In response to	Re: Race condition in HEAD, possibly due to PGPROC splitup (Simon Riggs <simon@2ndQuadrant.com>)
Responses	Re: Race condition in HEAD, possibly due to PGPROC splitup (Simon Riggs <simon@2ndQuadrant.com>)
List	pgsql-hackers

Tree view

On Thu, Dec 15, 2011 at 8:19 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> Comments?

I think there is a small bug here:

+             TransactionId xid = pgxact->xid;
+
+             /*
+              * Don't record locks for transactions if we know they have already
+              * issued their WAL record for commit but not yet released lock.
+              * It is still possible that we see locks held by already complete
+              * transactions, if they haven't yet zeroed their xids.
+              */
+             if (!TransactionIdIsValid(xid))
+                 continue;
             accessExclusiveLocks[index].xid = pgxact->xid;             accessExclusiveLocks[index].dbOid =
lock->tag.locktag_field1;

...because you're fetching pgxact->xid, testing whether it's valid,
and then fetching it again.  It could change in the meantime.  You
probably want to change the assignment to read:
             accessExclusiveLocks[index].xid = xid;

Also, we should probably change this pointer to be declared volatile:
             PGXACT       *pgxact = &ProcGlobal->allPgXact[proc->pgprocno];

Otherwise, I think the compiler might get cute and decide to fetch the
xid twice anyway, effectively undoing our attempt to pull it into a
local variable.

I think this comment could be clarified in some way, to make it more
clear that we had a bug at one point, and it was fixed - the "from a
time when they were possible" language wouldn't be entirely clear to
me after the fact:

+  * Zero xids should no longer be possible, but we may be replaying WAL
+  * from a time when they were possible.

It would probably make sense to break out of this loop if a match is found:

!             for (i = 0; i < nxids; i++)
!             {
!                 if (lock->xid == xids[i])
!                     found = true;
!             }

I'm not sure I fully understand the rest of this, but those are the
only things I noticed on a read-through.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

pgsql-hackers by date:

From: Heikki Linnakangas
Date: 15 December 2011, 12:51:57
Subject: Moving more work outside WALInsertLock

From: Andres Freund
Date: 15 December 2011, 13:32:32
Subject: Re: Moving more work outside WALInsertLock

Re: Race condition in HEAD, possibly due to PGPROC splitup - Mailing list pgsql-hackers

Previous

Next