pgsql: Fix management of pendingOpsTable in auxiliary processes. - Mailing list pgsql-committers

From Tom Lane
Subject pgsql: Fix management of pendingOpsTable in auxiliary processes.
Date
Msg-id E1Sra4R-0001kt-C3@gemulon.postgresql.org
Responses Re: pgsql: Fix management of pendingOpsTable in auxiliary processes.  (Simon Riggs <simon@2ndQuadrant.com>)
List pgsql-committers
Fix management of pendingOpsTable in auxiliary processes.

mdinit() was misusing IsBootstrapProcessingMode() to decide whether to
create an fsync pending-operations table in the current process.  This led
to creating a table not only in the startup and checkpointer processes as
intended, but also in the bgwriter process, not to mention other auxiliary
processes such as walwriter and walreceiver.  Creation of the table in the
bgwriter is fatal, because it absorbs fsync requests that should have gone
to the checkpointer; instead they just sit in bgwriter local memory and are
never acted on.  So writes performed by the bgwriter were not being fsync'd,
which could result in data loss after an OS crash.  I think there is no
live bug with respect to walwriter and walreceiver because those never
perform any writes of shared buffers; but the potential is there for
future breakage in those processes too.
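
The failure mode described above can be illustrated with a toy model.  This
is only a sketch under stated assumptions: ToyProcess, toy_register_fsync()
and checkpointer_queue are hypothetical names invented for illustration and
are not PostgreSQL's real data structures or API.  It shows only that a
process owning a local table parks requests there instead of forwarding them.

```c
#include <stdbool.h>

/* Toy model: every name here is hypothetical, not PostgreSQL's real API. */
typedef struct
{
    bool has_local_table;   /* does this process own a pending-ops table? */
    int  local_pending;     /* requests parked in process-local memory */
} ToyProcess;

/* Requests that the (modeled) checkpointer will eventually fsync. */
static int checkpointer_queue = 0;

/* Record that a just-written file needs an fsync later. */
static void
toy_register_fsync(ToyProcess *proc)
{
    if (proc->has_local_table)
        proc->local_pending++;      /* absorbed locally */
    else
        checkpointer_queue++;       /* forwarded to the checkpointer */
}
```

In this model a bgwriter with has_local_table set never adds anything to
checkpointer_queue, and nothing in the bgwriter ever drains local_pending,
which is the "never acted on" situation.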

To fix, make AuxiliaryProcessMain() export the current process's
AuxProcType as a global variable, and then make mdinit() test directly for
the types of aux process that should have a pendingOpsTable.  Having done
that, we might as well also get rid of the random bool flags such as
am_walreceiver that some of the aux processes had grown.  (Note that we
could not have fixed the bug by examining those variables in mdinit(),
because it's called from BaseInit() which is run by AuxiliaryProcessMain()
before entering any of the process-type-specific code.)
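
The shape of the fix can be sketched roughly as follows.  This is a
simplified standalone model, not the committed code: the enum members, the
global's name, and the functions mdinit_sketch() and
aux_process_main_sketch() are assumptions made for illustration, and the
sketch ignores non-auxiliary cases such as standalone backends.  The point
is the ordering: the process type is published before the init code that
needs it runs.

```c
#include <stdbool.h>

/* Illustrative stand-in for AuxProcType; members and values are assumed. */
typedef enum
{
    NotAnAuxProcess = -1,
    BootstrapProcess,
    StartupProcess,
    BgWriterProcess,
    CheckpointerProcess,
    WalWriterProcess,
    WalReceiverProcess
} AuxProcType;

/* Global published by the aux-process entry point (name assumed). */
static AuxProcType MyAuxProcType = NotAnAuxProcess;

/* Models whether this process created a pending-operations table. */
static bool pending_ops_table_created = false;

/* Stand-in for mdinit(): test the process type directly. */
static void
mdinit_sketch(void)
{
    pending_ops_table_created =
        (MyAuxProcType == StartupProcess ||
         MyAuxProcType == CheckpointerProcess);
}

/* Stand-in for AuxiliaryProcessMain(): set the type, then do base init. */
static void
aux_process_main_sketch(AuxProcType type)
{
    MyAuxProcType = type;   /* must be set before mdinit_sketch() runs */
    mdinit_sketch();        /* the real mdinit() is reached via BaseInit() */
}
```

Calling aux_process_main_sketch(BgWriterProcess) leaves
pending_ops_table_created false, while StartupProcess and
CheckpointerProcess set it to true.  A per-process flag set later, in the
process-type-specific code, would be too late for this check.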

Back-patch to 9.2, where the problem was introduced by the split-up of
bgwriter and checkpointer processes.  The bogus pendingOpsTable exists
in walwriter and walreceiver processes in earlier branches, but absent
any evidence that it causes actual problems there, I'll leave the older
branches alone.

Branch
------
master

Details
-------
http://git.postgresql.org/pg/commitdiff/4a9c30a8a1d3a786abc4b8d95f0182463f66f919

Modified Files
--------------
src/backend/access/transam/xlog.c     |    2 +-
src/backend/bootstrap/bootstrap.c     |   20 +++++++++------
src/backend/postmaster/bgwriter.c     |   11 +-------
src/backend/postmaster/checkpointer.c |   13 ++++------
src/backend/postmaster/walwriter.c    |    4 +-
src/backend/replication/walreceiver.c |    6 +----
src/backend/storage/ipc/procsignal.c  |    1 -
src/backend/storage/smgr/md.c         |    7 ++---
src/include/bootstrap/bootstrap.h     |   12 ---------
src/include/miscadmin.h               |   42 ++++++++++++++++++++++++++++----
src/include/replication/walreceiver.h |    1 -
11 files changed, 62 insertions(+), 57 deletions(-)

