pgsql: Fix Hot-Standby initialization of clog and subtrans. - Mailing list pgsql-committers

From Heikki Linnakangas
Subject pgsql: Fix Hot-Standby initialization of clog and subtrans.
Date
Msg-id E1VjqKs-0002OY-IH@gemulon.postgresql.org
Whole thread Raw
List pgsql-committers
Fix Hot-Standby initialization of clog and subtrans.

These bugs can cause data loss on standbys started with hot_standby=on at
the moment they start to accept read only queries, by marking committed
transactions as uncommited. The likelihood of such corruptions is small
unless the primary has a high transaction rate.

5a031a5556ff83b8a9646892715d7fef415b83c3 fixed bugs in HS's startup logic
by maintaining less state until at least STANDBY_SNAPSHOT_PENDING state
was reached, missing the fact that both clog and subtrans are written to
before that. This only failed to fail in common cases because the usage
of ExtendCLOG in procarray.c was superflous since clog extensions are
actually WAL logged.

f44eedc3f0f347a856eea8590730769125964597/I then tried to fix the missing
extensions of pg_subtrans due to the former commit's changes - which are
not WAL logged - by performing the extensions when switching to a state
> STANDBY_INITIALIZED and not performing xid assignments before that -
again missing the fact that ExtendCLOG is unneccessary - but screwed up
twice: Once because latestObservedXid wasn't updated anymore in that
state due to the earlier commit and once by having an off-by-one error in
the loop performing extensions. This means that whenever a
CLOG_XACTS_PER_PAGE (32768 with default settings) boundary was crossed
between the start of the checkpoint recovery started from and the first
xl_running_xact record old transactions commit bits in pg_clog could be
overwritten if they started and committed in that window.

Fix this mess by not performing ExtendCLOG() in HS at all anymore since
it's unneeded and evidently dangerous and by performing subtrans
extensions even before reaching STANDBY_SNAPSHOT_PENDING.

Analysis and patch by Andres Freund. Reported by Christophe Pettus.
Backpatch down to 9.0, like the previous commit that caused this.

Branch
------
REL9_0_STABLE

Details
-------
http://git.postgresql.org/pg/commitdiff/b4f697f8ebfa2fdc36bf98c50550b21eb8e82c7d

Modified Files
--------------
src/backend/access/transam/clog.c   |    2 +-
src/backend/storage/ipc/procarray.c |   68 ++++++++++++++++++++---------------
2 files changed, 41 insertions(+), 29 deletions(-)


pgsql-committers by date:

Previous
From: Heikki Linnakangas
Date:
Subject: pgsql: Fix Hot-Standby initialization of clog and subtrans.
Next
From: Tom Lane
Date:
Subject: pgsql: Fix quoting in help messages in uuid-ossp extension scripts.