Re: FastPathStrongRelationLocks still has an issue in HEAD - Mailing list pgsql-hackers

From Tom Lane
Subject Re: FastPathStrongRelationLocks still has an issue in HEAD
Date
Msg-id 18337.1396882465@sss.pgh.pa.us
In response to Re: FastPathStrongRelationLocks still has an issue in HEAD  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: FastPathStrongRelationLocks still has an issue in HEAD  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
Robert Haas <robertmhaas@gmail.com> writes:
> On Sun, Apr 6, 2014 at 1:23 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=rover_firefly&dt=2014-04-06%2017%3A04%3A00

> Uggh.  That's unfortunate, but not terribly surprising: I didn't think
> that missing volatile was very likely to be the cause of this.

Yeah.  That was a bug, but evidently it's not the bug we're looking for.

> Have
> we been getting random failures of this type since the fastlock stuff
> went in, and we're only just now noticing?  Or did some recent change
> expose this problem?

Not sure.  I used to rely on the pgbuildfarm-status-green daily digests
to cue me to look at transient buildfarm failures, but that list has been
AWOL for months.  However, I'm pretty sure this has not been happening
ever since 9.2, so yeah, it's at least become more probable in the last
few months.

> I'm a bit suspicious of the patches to
> static-ify stuff, since that might cause the compiler to think it
> could move things across function calls that it hadn't thought
> move-able before, but FastPathStrongLocks references would seem to be
> the obvious candidate for that, and volatile-izing it ought to have
> fixed it.  I would think.

Keep in mind also that prairiedog is running a pretty old gcc (4.0.1 if
memory serves), so I'd not expect it to be doing any crazy optimizations.
I suspect we are looking at some plain old logic bug, but as you say it's
hard to guess where exactly.

> [ LockRefindAndRelease ] lacks an
> Assert(FastPathStrongRelationLocks->count[fasthashcode] > 0).  I think
> we should add one.

Absolutely.
        regards, tom lane
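
For readers following along, here is a minimal sketch of what the proposed assertion guards against. It assumes a simplified per-partition counter array; the names (SketchStrongLocks, sketch_acquire_strong, sketch_release_strong, sketch_count, SKETCH_PARTITIONS) are hypothetical and are not taken from lock.c, and the real code also takes a spinlock around these updates, which is omitted here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the shared strong-lock counters.
 * The partition count is an arbitrary value chosen for illustration. */
#define SKETCH_PARTITIONS 16

typedef struct
{
    uint32_t count[SKETCH_PARTITIONS];
} SketchStrongLocks;

static SketchStrongLocks sketch_locks;

/* Record that a strong lock is held on an object hashing to this partition. */
static void
sketch_acquire_strong(uint32_t hashcode)
{
    sketch_locks.count[hashcode % SKETCH_PARTITIONS]++;
}

/* Release path: check the counter before decrementing it.  Without the
 * check, an unbalanced release silently wraps the unsigned counter around
 * to a huge value instead of failing at the point of the mistake. */
static void
sketch_release_strong(uint32_t hashcode)
{
    uint32_t partition = hashcode % SKETCH_PARTITIONS;

    assert(sketch_locks.count[partition] > 0);
    sketch_locks.count[partition]--;
}

/* Read the current count for the partition this hashcode maps to. */
static uint32_t
sketch_count(uint32_t hashcode)
{
    return sketch_locks.count[hashcode % SKETCH_PARTITIONS];
}
```

With assertions enabled, a release that is not matched by an earlier acquire aborts immediately in sketch_release_strong, which is the kind of early failure the added Assert in LockRefindAndRelease is meant to provide.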


