pgsql: Report an ERROR if a parallel worker fails to start properly. - Mailing list pgsql-committers

From: Robert Haas
Subject: pgsql: Report an ERROR if a parallel worker fails to start properly.
Msg-id: E1ee1Cz-0001sp-9t@gemulon.postgresql.org
List: pgsql-committers

Report an ERROR if a parallel worker fails to start properly.

Commit 28724fd90d2f85a0573a8107b48abad062a86d83 made the background
worker machinery report BGWH_STOPPED when a worker fails to start,
whether because fork() fails or because the worker is terminated
before startup succeeds.  However, that only helps if the code that
uses the background worker machinery notices the change in status,
and the code in parallel.c did not.

To fix that, do two things.  First, make sure that when a worker
exits, it triggers the leader to read from error queues.  That way, if
a worker which has attached to an error queue exits uncleanly, the
leader is sure to throw some error, either the contents of the
ErrorResponse sent by the worker, or "lost connection to parallel
worker" if it exited without sending one.  To cover the case where
the worker never starts up in the first place or exits before
attaching to the error queue, the ParallelContext now keeps track
of which workers have sent at least one message via the error
queue.  A worker which sends no messages by the time the parallel
operation finishes will be checked to see whether it exited before
attaching to the error queue; if so, a new error message, "parallel
worker failed to initialize", will be reported.  If not, we'll
continue to wait until it either starts up and exits cleanly, starts
up and exits uncleanly, or fails to start, and then take the
appropriate action.
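
The decision made for each worker when the parallel operation finishes
can be sketched roughly as follows.  This is only an illustration of the
logic described above, not the code in parallel.c: the type names, the
field names, and check_worker_at_finish() are all hypothetical, and
WorkerStatus is a simplified stand-in for the real background worker
handle status.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the background worker status. */
typedef enum
{
    WORKER_NOT_YET_STARTED,
    WORKER_STARTED,
    WORKER_STOPPED
} WorkerStatus;

/* Hypothetical per-worker bookkeeping kept by the leader. */
typedef struct
{
    WorkerStatus status;
    bool attached_to_error_queue;   /* did the worker ever attach? */
    bool sent_any_message;          /* at least one message seen on the queue */
} WorkerState;

/* What the leader should do about one worker at finish time. */
typedef enum
{
    ACTION_NONE,                 /* worker is known to have started */
    ACTION_READ_ERROR_QUEUE,     /* ErrorResponse or "lost connection to parallel worker" */
    ACTION_FAILED_TO_INITIALIZE, /* "parallel worker failed to initialize" */
    ACTION_KEEP_WAITING          /* outcome not yet known */
} FinishAction;

FinishAction
check_worker_at_finish(const WorkerState *w)
{
    /* A worker that sent a message certainly attached to its error queue. */
    if (w->sent_any_message)
        return ACTION_NONE;

    /* Still starting or still running: wait until its fate is known. */
    if (w->status != WORKER_STOPPED)
        return ACTION_KEEP_WAITING;

    /* Exited without ever attaching: it never initialized. */
    if (!w->attached_to_error_queue)
        return ACTION_FAILED_TO_INITIALIZE;

    /* Exited after attaching: reading the error queue yields the error. */
    return ACTION_READ_ERROR_QUEUE;
}
```

In this sketch the leader would call check_worker_at_finish() once per
worker and loop while any worker yields ACTION_KEEP_WAITING.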

Patch by me, reviewed by Amit Kapila.

Discussion: http://postgr.es/m/CA+TgmoYnBgXgdTu6wk5YPdWhmgabYc9nY_pFLq=tB=FSLYkD8Q@mail.gmail.com

Branch
------
REL9_6_STABLE

Details
-------
https://git.postgresql.org/pg/commitdiff/2843c01a56eb2116a7cf871d7ed324ed3d5e522e

Modified Files
--------------
src/backend/access/transam/parallel.c | 118 +++++++++++++++++++++++++++++++---
src/include/access/parallel.h         |   1 +
2 files changed, 110 insertions(+), 9 deletions(-)

