Race conditions in relcache load (again) - Mailing list pgsql-hackers

From Tom Lane
Subject Race conditions in relcache load (again)
Msg-id 2010.1208183154@sss.pgh.pa.us
List pgsql-hackers
A while back we did some significant rejiggering to ensure that no
relcache load would be attempted without holding at least
AccessShareLock on the relation.  (Otherwise, if someone else
is in the process of updating one of the system catalog rows that
define the relation, there's a race condition for SnapshotNow
scans: the new row version might not be committed yet when you scan
it, and if you come to the old row version second, it could be
committed dead by the time you scan it, and then you don't see the
row at all.)
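
The race described above can be sketched as a toy simulation. This is
not PostgreSQL code: TupleVersion, visible_now, scan and the two
boolean "committed" flags are hypothetical names invented for
illustration, and the visibility rule is a deliberately simplified
stand-in for SnapshotNow (each row version is judged at the moment the
scan reaches it). The sketch assumes the new row version happens to be
visited before the old one.

```python
from dataclasses import dataclass


@dataclass
class TupleVersion:
    """Hypothetical stand-in for one row version in a catalog."""
    label: str
    xmin_committed: bool  # the inserting transaction has committed
    xmax_committed: bool  # the updating/deleting transaction has committed


def visible_now(t):
    # Simplified SnapshotNow-style rule: judged at the instant of the call,
    # not against a snapshot taken when the scan started.
    return t.xmin_committed and not t.xmax_committed


def scan(heap, on_step=None):
    """Visit row versions in order, testing each one when it is reached."""
    seen = []
    for i, t in enumerate(heap):
        if visible_now(t):
            seen.append(t.label)
        if on_step is not None:
            on_step(i)  # lets a "concurrent" transaction act between visits
    return seen


def make_heap():
    old = TupleVersion("old", xmin_committed=True, xmax_committed=False)
    new = TupleVersion("new", xmin_committed=False, xmax_committed=False)
    return [new, old], old, new  # assumed order: new version scanned first


# Case 1: the updater commits after the scan passed "new" but before "old".
heap, old, new = make_heap()


def commit_update(i):
    if i == 0:
        new.xmin_committed = True  # new version is now committed
        old.xmax_committed = True  # old version is now committed dead


racy_result = scan(heap, commit_update)

# Case 2: no concurrent commit during the scan.
heap, old, new = make_heap()
quiet_result = scan(heap)

print(racy_result, quiet_result)
```

In the first case the scan sees neither version, which is the "you
don't see the row at all" outcome; in the second it sees the old
version as expected.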

While thinking about Darren Reed's repeat trouble report
http://archives.postgresql.org/pgsql-admin/2008-04/msg00113.php
I realized that we failed to plug all the gaps of this type,
because relcache.c contains *internal* cache load/reload operations
that aren't protected.  In particular the LOAD_CRIT_INDEX macro
calls invoke relcache load on indexes that aren't locked.  So they'd
be at risk from a concurrent REINDEX or similar on those system
indexes.  RelationReloadIndexInfo seems at risk as well.
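
As a rough illustration of the rule being argued for (take a lock on
the relation before loading its cache entry), here is a toy lock table
with only two modes. Everything in it is an assumption made for the
sketch: ToyLockTable, load_with_share_lock and the relation name
"critical_index" are hypothetical, the real lock manager has more
modes and blocks the caller rather than returning a failure, and the
writer is simply assumed to hold an exclusive lock the way a
REINDEX-like operation would.

```python
class ToyLockTable:
    """Hypothetical two-mode lock table: share vs. exclusive."""

    def __init__(self):
        self.share_counts = {}
        self.exclusive = set()

    def try_share(self, rel):
        if rel in self.exclusive:
            return False  # a real backend would wait here instead
        self.share_counts[rel] = self.share_counts.get(rel, 0) + 1
        return True

    def release_share(self, rel):
        self.share_counts[rel] -= 1

    def try_exclusive(self, rel):
        if rel in self.exclusive or self.share_counts.get(rel, 0) > 0:
            return False
        self.exclusive.add(rel)
        return True

    def release_exclusive(self, rel):
        self.exclusive.discard(rel)


def load_with_share_lock(locks, rel, loader):
    """Run the cache load only while holding a share lock on rel."""
    if not locks.try_share(rel):
        return None  # stands in for "must wait for the writer to finish"
    try:
        return loader()
    finally:
        locks.release_share(rel)


locks = ToyLockTable()
rel = "critical_index"  # hypothetical relation name

# A writer holds the exclusive lock: the load is not allowed to proceed.
assert locks.try_exclusive(rel)
during_writer = load_with_share_lock(locks, rel, lambda: "entry")
locks.release_exclusive(rel)

# After the writer finishes, the load runs, and while it runs no writer
# can obtain the exclusive lock.
writer_attempts = []
after_writer = load_with_share_lock(
    locks, rel,
    lambda: writer_attempts.append(locks.try_exclusive(rel)) or "entry",
)
print(during_writer, after_writer, writer_attempts)
```

The point of the sketch is only the ordering: the load never observes
the catalog rows while a conflicting writer is mid-change, and a
writer cannot start changing them while a load is in progress.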

AFAICS this doesn't explain Darren's problem because it would only
be a transient failure at the instant of committing the REINDEX;
and whatever he's being burnt by has persistent effects.  Nonetheless
it sure looks like a bug.  Anyone think it isn't necessary to lock
the target relation here?
        regards, tom lane

