Re: Weak-memory specific problem in ResetLatch/WaitLatch (follow-up analysis) - Mailing list pgsql-hackers

From Michael Tautschnig
Subject Re: Weak-memory specific problem in ResetLatch/WaitLatch (follow-up analysis)
Date
Msg-id 20120321180934.GR1837@l04.local
Whole thread Raw
In response to Re: Weak-memory specific problem in ResetLatch/WaitLatch (follow-up analysis)  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
Hi,

[...]
> > Placing a sync (i.e., the strongest Power barrier) accordingly would, however,
> > still be insufficient for the second problem, as it would only fix the
> > reordering of read-read pairs by Worker 1 and the store atomicity issue from
> > Worker 0. But the writes on Worker 0 could still be reordered (problem number
> > 2). One possible fix consists of placing a sync between the two writes on Worker
> > 0, and an address dependency between the two reads on Worker 1. Clearly,
> > however, these are changes that cannot any longer be hidden behind the
> > ResetLatch/WaitLatch interface, but rather go in the code using these.
>
[...]
> However, your example is enough unlike the actual code that the
> conclusion you state following the word "clearly" isn't actually clear
> to me.  According to latch.h, the correct method of using a latch is
> like this:
>
>  * for (;;)
>  * {
>  *         ResetLatch();
>  *         if (work to do)
>  *                 Do Stuff();
>  *         WaitLatch();
>  * }
>
> Meanwhile, anyone who is creating additional work to do should add the
> work to the queue and then set the latch.
>

When writing the above statement, including the "clearly", we were possibly too
much thinking of the above usage hint, which just uses ResetLatch and WaitLatch.
As you say, ...

> So it seems to me that we could potentially fix this by inserting
> barriers at the end of ResetLatch and at the beginning of SetLatch and
> WaitLatch.  Then the latch has to get reset before we check whether
> there's work to do; and we've got to finish checking for work before
> we again try to wait for the latch.  Similarly, any work that was in
> progress before SetLatch was called will be forced to be committed to
> memory before SetLatch does anything else.  Adding that many barriers
> might not be very good for performance but it seems OK from a
> correctness point of view, unless I am missing something, which is
> definitely possible.  I'd appreciate any thoughts you have on this, as
> this is clearly subtle and tricky to get exactly right.
>

... placing another barrier in "SetLatch" could just do the trick. We will apply
our tools to actually prove this and come back with the conclusive answer.

Best,
Michael


pgsql-hackers by date:

Previous
From: lekon
Date:
Subject: how can i see the log..?
Next
From: Ants Aasma
Date:
Subject: pg_upgrade incorrectly equates pg_default and database tablespace