Re: [PERFORM] Cpu usage 100% on slave. s_lock problem. - Mailing list pgsql-hackers

From Merlin Moncure
Subject Re: [PERFORM] Cpu usage 100% on slave. s_lock problem.
Date
Msg-id CAHyXU0xUw0qwYCe1Fm43=Afh5iNk63BVpS4vAhjxZn9upQ-XjQ@mail.gmail.com
In response to Re: [PERFORM] Cpu usage 100% on slave. s_lock problem.  (Ants Aasma <ants@cybertec.at>)
Responses Re: [PERFORM] Cpu usage 100% on slave. s_lock problem.  (Ants Aasma <ants@cybertec.at>)
List pgsql-hackers
On Wed, Oct 2, 2013 at 9:45 AM, Ants Aasma <ants@cybertec.at> wrote:
> I haven't reviewed the code in as much detail to say if there is an
> actual race here, I tend to think there's probably not, but the
> specific pattern that I had in mind is that with the following actual
> code:

Hm.  I think there *is* a race.  Two or more threads could race to the line:

LocalRecoveryInProgress = xlogctl->SharedRecoveryInProgress;

and simultaneously set the value of LocalRecoveryInProgress to false,
and both call InitXLOGAccess(), which is destructive.  The move
operation is atomic, but I don't think there's any guarantee that the
reads of xlogctl->SharedRecoveryInProgress are ordered between threads
without a lock.
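
To make the pattern concrete, here is a minimal single-process sketch of
the check-then-initialize shape I am describing.  Only the assignment
line is taken from the actual code; the struct, the counter, and the
*Sketch names are stand-ins I made up for illustration, not the real
xlog.c definitions.  It does not reproduce the race; it only marks
where the window would be.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the shared-memory control struct. */
typedef struct
{
    volatile bool SharedRecoveryInProgress;
} XLogCtlSketch;

static XLogCtlSketch xlogctl_storage = { true };
static XLogCtlSketch *xlogctl = &xlogctl_storage;

/* Cached copy of the shared flag (assumed to start out true). */
static bool LocalRecoveryInProgress = true;

/* Counts how many times the "destructive" initialization ran. */
static int init_calls = 0;

/* Hypothetical stand-in for InitXLOGAccess(). */
static void
InitXLOGAccessSketch(void)
{
    init_calls++;
}

static bool
RecoveryInProgressSketch(void)
{
    /* Once the cached flag is false, the shared flag is never reread. */
    if (!LocalRecoveryInProgress)
        return false;

    /* Unlocked read of the shared flag: the line under discussion. */
    LocalRecoveryInProgress = xlogctl->SharedRecoveryInProgress;

    /*
     * Window of concern: if two callers shared the cached flag, both
     * could observe false here and both would run the initialization.
     */
    if (!LocalRecoveryInProgress)
        InitXLOGAccessSketch();

    return LocalRecoveryInProgress;
}
```

With a single caller the initialization runs exactly once, on the first
call that sees the shared flag cleared.  The question is whether that
still holds when the read and the initialization are not covered by a
lock.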

I don't think the memory barrier will fix this.  Do you agree?  If so,
my earlier patch (recovery4) is another take on this problem, and
arguably safer: the unlocked read is in a separate path that does not
engage InitXLOGAccess().

merlin


