pgsql: Fix torn-page, unlogged xid and further risks from heap_update() - Mailing list pgsql-committers

From: Andres Freund
Subject: pgsql: Fix torn-page, unlogged xid and further risks from heap_update()
Msg-id: E1bOE3A-00085h-6j@gemulon.postgresql.org
List: pgsql-committers

Fix torn-page, unlogged xid and further risks from heap_update().

When heap_update needs to look for a page for the new tuple version,
because the current one doesn't have sufficient free space, or when
columns have to be processed by the tuple toaster, it has to release the
lock on the old page while doing so. Otherwise there'd be lock ordering
and lock nesting issues.

To prevent concurrent sessions from trying to update / delete / lock the
tuple while the page's content lock is released, the tuple's xmax is set
to the current session's xid.

That unfortunately was done without any WAL logging, thereby violating
the rule that no XIDs may appear on disk without a corresponding WAL
record.  If the database were to crash / fail over while the page-level
lock is released, and some activity led to the page being written out
to disk, the xid could end up being reused, potentially leading to the
row becoming invisible.
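
The failure sequence above can be illustrated with a small toy model.
This is only a sketch under simplifying assumptions, not PostgreSQL
code: every name in it (ToyCluster, toy_assign_xid, and so on) is
hypothetical, the cluster holds a single tuple, xid wraparound is
ignored, and recovery is assumed to restart the xid counter just past
the highest xid that any durable WAL record mentions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical toy model; none of these names exist in PostgreSQL. */
typedef uint32_t ToyXid;

typedef struct ToyCluster
{
    ToyXid next_xid;        /* in-memory xid counter, lost on crash */
    ToyXid max_xid_in_wal;  /* highest xid mentioned by a durable WAL record */
    ToyXid disk_xmax;       /* xmax of one tuple as stored on disk */
} ToyCluster;

/* Hand out a new xid. Nothing is WAL-logged at this point. */
static ToyXid
toy_assign_xid(ToyCluster *c)
{
    return c->next_xid++;
}

/*
 * Stamp the tuple's xmax and let the page reach disk. The WAL only
 * learns about the xid when wal_logged is true.
 */
static void
toy_set_xmax_and_flush(ToyCluster *c, ToyXid xid, bool wal_logged)
{
    assert(xid < c->next_xid);  /* only assigned xids may be stamped */
    if (wal_logged && xid > c->max_xid_in_wal)
        c->max_xid_in_wal = xid;
    c->disk_xmax = xid;
}

/* Crash recovery (assumed rule): restart just past what the WAL knows. */
static void
toy_crash_and_recover(ToyCluster *c)
{
    c->next_xid = c->max_xid_in_wal + 1;
}

/* True if a future transaction can be handed the xid already on disk. */
static bool
toy_disk_xmax_can_be_reused(const ToyCluster *c)
{
    return c->disk_xmax >= c->next_xid;
}
```

In this model, stamping the xmax with wal_logged set to false and then
crashing leaves an xid on disk that the recovered counter will hand out
again. With wal_logged set to true the counter restarts past it.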

There might be additional risks from not having t_ctid point at the
tuple itself, without having set the appropriate lock infomask fields.

To fix, compute the appropriate xmax/infomask combination for locking
the tuple, and perform WAL logging using the existing XLOG_HEAP_LOCK
record. That allows the fix to be backpatched.
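
The order of operations described above can be sketched as follows.
This is a hedged illustration, not the code from the commit: the
TOY_* flags, ToyTuple, ToyLockRecord and both functions are
hypothetical stand-ins, the tuple is assumed to have no previous
locker (the real xmax/infomask computation must also handle existing
lockers), and marking the xmax as "lock only" and pointing ctid at the
tuple itself are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag bits; the real ones live in PostgreSQL's headers. */
#define TOY_XMAX_INVALID   0x0001u
#define TOY_XMAX_LOCK_ONLY 0x0002u
#define TOY_XMAX_EXCL_LOCK 0x0004u
#define TOY_XMAX_BITS \
    (TOY_XMAX_INVALID | TOY_XMAX_LOCK_ONLY | TOY_XMAX_EXCL_LOCK)

#define TOY_WAL_CAPACITY 8

typedef struct ToyTuple
{
    uint32_t xmax;      /* locking / deleting transaction */
    uint16_t infomask;  /* flag bits describing xmax */
    uint32_t self;      /* the tuple's own position */
    uint32_t ctid;      /* position of the newest version */
} ToyTuple;

typedef struct ToyLockRecord
{
    uint32_t tuple;     /* which tuple was locked */
    uint32_t xmax;      /* xid stored in the tuple */
    uint16_t infomask;  /* lock bits stored in the tuple */
} ToyLockRecord;

typedef struct ToyWal
{
    ToyLockRecord records[TOY_WAL_CAPACITY];
    int nrecords;
} ToyWal;

/*
 * Mark the old tuple as locked and log that fact before the caller
 * releases the page lock.
 */
static void
toy_lock_old_tuple(ToyTuple *tup, uint32_t xid, ToyWal *wal)
{
    assert(wal->nrecords < TOY_WAL_CAPACITY);

    /* 1. Compute the xmax/infomask pair (assumes no previous locker). */
    uint16_t lock_infomask = TOY_XMAX_LOCK_ONLY | TOY_XMAX_EXCL_LOCK;

    /* 2. Apply it; the tuple looks locked, not updated. */
    tup->infomask = (uint16_t) (tup->infomask & ~TOY_XMAX_BITS);
    tup->infomask |= lock_infomask;
    tup->xmax = xid;
    tup->ctid = tup->self;      /* assumption: point at the tuple itself */

    /* 3. Log a lock record so the xid never reaches disk unlogged. */
    wal->records[wal->nrecords++] =
        (ToyLockRecord) { tup->self, xid, lock_infomask };
}

/* Replaying the record rebuilds the same locked state after a crash. */
static void
toy_replay_lock(ToyTuple *tup, const ToyLockRecord *rec)
{
    tup->infomask = (uint16_t) (tup->infomask & ~TOY_XMAX_BITS);
    tup->infomask |= rec->infomask;
    tup->xmax = rec->xmax;
    tup->ctid = tup->self;
}
```

Reusing an existing record type for the lock, rather than inventing a
new one, is what keeps such a change compatible with the WAL format of
already released branches.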

This issue has existed for a long time. There appear to have been
partial attempts at preventing these dangers, but they were never fully
implemented, and were removed a long time ago, in
11919160 (cf. HEAP_XMAX_UNLOGGED).

In master / 9.6, there's an additional issue, namely that the
visibilitymap's freeze bit isn't reset at that point yet. Since that's a
new issue, introduced only in a892234f830, that'll be fixed in a
separate commit.

Author: Masahiko Sawada and Andres Freund
Reported-By: Different aspects by Thomas Munro, Noah Misch, and others
Discussion: CAEepm=3fWAbWryVW9swHyLTY4sXVf0xbLvXqOwUoDiNCx9mBjQ@mail.gmail.com
Backpatch: 9.1/all supported versions

Branch
------
master

Details
-------
http://git.postgresql.org/pg/commitdiff/bfa2ab56bb8c512dc8613ee3ff0936568a1c8418

Modified Files
--------------
src/backend/access/heap/heapam.c | 96 ++++++++++++++++++++++++++++++----------
1 file changed, 73 insertions(+), 23 deletions(-)

