Re: [PERFORM] Cpu usage 100% on slave. s_lock problem. - Mailing list pgsql-hackers

From Ants Aasma
Subject Re: [PERFORM] Cpu usage 100% on slave. s_lock problem.
Date
Msg-id CA+CSw_uB5U8Wm1-t_GNHm382y7i+bPd=CasMErTGqN_u+j3KqQ@mail.gmail.com
Whole thread Raw
In response to Re: [PERFORM] Cpu usage 100% on slave. s_lock problem.  (Merlin Moncure <mmoncure@gmail.com>)
Responses Re: [PERFORM] Cpu usage 100% on slave. s_lock problem.  (Merlin Moncure <mmoncure@gmail.com>)
List pgsql-hackers
On Wed, Oct 2, 2013 at 4:39 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
> On Mon, Sep 30, 2013 at 7:51 PM, Ants Aasma <ants@cybertec.at> wrote:
>> So we need a read barrier somewhere *after* reading the flag in
>> RecoveryInProgress() and reading the shared memory structures, and in
>> theory a full barrier if we are going to be writing data.
>
> wow -- thanks for your review and provided detail.  Considering there
> are no examples of barrier instructions to date, I think some of your
> commentary should be included in the in-source documentation.
>
> In this particular case, a read barrier should be sufficient?  By
> 'writing data', do you mean to the xlog control structure?  This
> routine only sets a backend local flag so that should be safe?

I haven't reviewed the code in as much detail to say if there is an
actual race here, I tend to think there's probably not, but the
specific pattern that I had in mind is that with the following actual
code:

Process A:
globalVar = 1;
write_barrier();
recoveryInProgress = false;

Process B:
if (!recoveryInProgress) {   globalVar = 2;   doWork();
}

If process B speculatively executes line 2 before reading the flag for
line 1, then it's possible that the store in process B is executed
before the store in process A, making globalVar move backwards. The
barriers as they are defined don't make this scenario impossible. That
said, I don't know of any hardware that would make speculatively
executed stores visible to non-speculative state, as I said, that
would be completely insane. However currently compilers consider it
completely legal to rewrite the code into the following form:

tmp = globalVar;
globalVar = 2;
if (!recoveryInProgress) {   doWork();
} else {   globalVar = tmp;
}

That still exhibits the same problem. An abstract read barrier would
not be enough here, as this requires a LoadStore barrier. However, the
control dependency is enough for the hardware and PostgreSQL
pg_read_barrier() is a full compiler barrier, so in practice a simple
pg_read_barrier() is enough.

Regards,
Ants Aasma
--
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de



pgsql-hackers by date:

Previous
From: Magnus Hagander
Date:
Subject: Prevent pg_basebackup -Fp -D -?
Next
From: Robert Haas
Date:
Subject: Re: logical changeset generation v6.1