Re: memory barriers (was: Yes, WaitLatch is vulnerable to weak-memory-ordering bugs) - Mailing list pgsql-hackers

From Robert Haas
Subject Re: memory barriers (was: Yes, WaitLatch is vulnerable to weak-memory-ordering bugs)
Date
Msg-id CA+TgmoYRXQeQq6RqhvoKs3LeygsB+cM9zxsnfK_PPwV_K6DMkw@mail.gmail.com
Whole thread Raw
In response to Re: memory barriers (was: Yes, WaitLatch is vulnerable to weak-memory-ordering bugs)  (Jeff Davis <pgsql@j-davis.com>)
Responses Re: memory barriers (was: Yes, WaitLatch is vulnerable to weak-memory-ordering bugs)
List pgsql-hackers
On Thu, Sep 22, 2011 at 5:45 PM, Jeff Davis <pgsql@j-davis.com> wrote:
> +This code turns out to be unsafe, because the writer might increment
> +q->num_items before it finishes storing the new item into the
> appropriate slot.
> +More subtly, the reader might prefetch the contents of the q->items
> array
> +before reading q->num_items.
>
> How would the reader prefetch the contents of the items array, without
> knowing how big it is?

By guessing or (I think) just by having a stale value left over in
some CPU cache.  It's pretty mind-bending, but it's for real.

I didn't, in either the implementation or the documentation, go much
into the difference between dependency barriers and general read
barriers.  We might need to do that at some point, but for a first
version I don't think it's necessary.  But since you asked... as I
understand it, unless you're running on Alpha, you actually don't need
a barrier here at all, because all currently-used CPUs other than
alpha "respect data dependencies", which means that if q->num_items is
used to compute an address to be read from memory, the CPU will ensure
that the read of that address is performed after the read of the value
used to compute the address.  At least that's my understanding.  But
Alpha won't.

So we could try to further distinguish between read barriers where a
data dependency is present and read barriers where no data dependency
is present, and the latter type could be a no-op on all CPUs other
than Alpha.  Or we could even jettison support for Alpha altogether if
we think it's hopelessly obsolete and omit
read-barriers-with-dependency altogether.  I think that would be a bad
idea, though.  First, it's not impossible that some future CPU could
have behavior similar to Alpha, and the likelihood of such a thing is
substantially more because of the fact that the Linux kernel, which
seems to be the gold standard in this area, still supports them.  If
we don't record places where a dependency barrier would be needed and
then need to go find them later, that will be a lot more work, and a
lot more error-prone.  Second, there's a natural pairing between read
barriers and write barriers.  Generally, for every algorithm, each
write barrier on the write side should be matched by a read barrier on
the read side.  So putting them all in will make it easier to verify
code correctness.  Now, if we find down the line that some of those
read barriers are hurting our performance on, say, Itanium, or
PowerPC, then we can certainly consider distinguishing further.  But
for round one I'm voting for not worrying about it.  I think it's
going to be a lot more important to put our energy into (1) adding
barrier implementations for any platforms that aren't included in this
initial patch that we want to support, (2) making sure that all of our
implementations actually work, and (3) making sure that the algorithms
that use them are correct.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


pgsql-hackers by date:

Previous
From: Jeff Davis
Date:
Subject: Re: memory barriers (was: Yes, WaitLatch is vulnerable to weak-memory-ordering bugs)
Next
From: Robert Haas
Date:
Subject: Re: patch: plpgsql - remove unnecessary ccache search when a array variable is updated