pgsql: Rewrite ConditionVariableBroadcast() to avoid live-lock. - Mailing list pgsql-committers

From Tom Lane
Subject pgsql: Rewrite ConditionVariableBroadcast() to avoid live-lock.
Date
Msg-id E1eXcF2-0003W7-Ia@gemulon.postgresql.org
List pgsql-committers
Rewrite ConditionVariableBroadcast() to avoid live-lock.

The original implementation of ConditionVariableBroadcast was, per its
self-description, "the dumbest way possible".  Thomas Munro found out
it was a bit too dumb.  An awakened process may immediately re-queue
itself if the specific condition it's waiting for is not yet satisfied.
If this happens before ConditionVariableBroadcast is able to see the wait
queue as empty, then ConditionVariableBroadcast will re-awaken the same
process, repeating the cycle.  Given unlucky timing this back-and-forth
can repeat indefinitely; loops lasting thousands of seconds have been
seen in testing.

To fix, add our own process to the end of the wait queue to serve as a
sentinel, and exit the broadcast loop once our process is not there
anymore.  There are various special considerations described in the
comments, the principal disadvantage being that wakers can no longer
be sure whether they awakened a real waiter or just a sentinel.  But in
practice nobody pays attention to the result of ConditionVariableSignal
or ConditionVariableBroadcast anyway, so that problem seems hypothetical.
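
As a rough illustration of the two loops described above, here is a toy,
single-threaded model in C.  It is not the code in condition_variable.c:
the WaitQueue type, the helper names, SENTINEL_ID, and the max_steps cap
are all hypothetical, and the "unlucky timing" is simulated by having every
real waiter re-queue itself as soon as it is woken.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_PROCS   8
#define SENTINEL_ID 0           /* hypothetical id for the broadcaster itself */

/* Toy FIFO wait queue of process ids, stored as a ring buffer. */
typedef struct
{
    int ids[MAX_PROCS];
    int head;
    int count;
} WaitQueue;

static void
queue_push(WaitQueue *q, int id)
{
    assert(q->count < MAX_PROCS);
    q->ids[(q->head + q->count) % MAX_PROCS] = id;
    q->count++;
}

static int
queue_pop(WaitQueue *q)
{
    int id;

    assert(q->count > 0);
    id = q->ids[q->head];
    q->head = (q->head + 1) % MAX_PROCS;
    q->count--;
    return id;
}

static bool
queue_contains(const WaitQueue *q, int id)
{
    for (int i = 0; i < q->count; i++)
        if (q->ids[(q->head + i) % MAX_PROCS] == id)
            return true;
    return false;
}

/*
 * Simulated worst case: a woken waiter finds its condition unsatisfied and
 * re-queues itself at the tail before the broadcaster looks at the queue
 * again.  The sentinel, once woken, simply stays out of the queue.
 */
static void
wake(WaitQueue *q, int id)
{
    if (id != SENTINEL_ID)
        queue_push(q, id);
}

/*
 * "Wake until the queue looks empty."  Returns the number of wakeups, or -1
 * if max_steps is reached first (standing in for the live-lock).
 */
static int
broadcast_naive(WaitQueue *q, int max_steps)
{
    int wakeups = 0;

    while (q->count > 0)
    {
        if (wakeups >= max_steps)
            return -1;
        wake(q, queue_pop(q));
        wakeups++;
    }
    return wakeups;
}

/*
 * Sentinel variant: append our own id, then wake entries until our id is no
 * longer queued.  The returned count includes the sentinel's own wakeup.
 */
static int
broadcast_sentinel(WaitQueue *q)
{
    int wakeups = 0;

    if (q->count == 0)
        return 0;
    queue_push(q, SENTINEL_ID);
    while (queue_contains(q, SENTINEL_ID))
    {
        wake(q, queue_pop(q));
        wakeups++;
    }
    return wakeups;
}
```

With three waiters queued, broadcast_naive() in this model never sees an
empty queue and gives up at max_steps, while broadcast_sentinel() wakes each
waiter once plus the sentinel and then stops, leaving the re-queued waiters
in place.  The count it returns cannot distinguish real waiters from the
sentinel, which mirrors the disadvantage noted above.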

Back-patch to v10 where condition_variable.c was introduced.

Tom Lane and Thomas Munro

Discussion: https://postgr.es/m/CAEepm=0NWKehYw7NDoUSf8juuKOPRnCyY3vuaSvhrEWsOTAa3w@mail.gmail.com

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/aced5a92bf46532466417ab485bc94006cf60d91

Modified Files
--------------
src/backend/storage/lmgr/condition_variable.c | 82 +++++++++++++++++++++++++--
1 file changed, 77 insertions(+), 5 deletions(-)

