Heads-Up: multixact freezing bug - Mailing list pgsql-hackers

From: Andres Freund
Subject: Heads-Up: multixact freezing bug
Msg-id: 20131128152853.GU31748@awork2.anarazel.de
List: pgsql-hackers
Hello,

While investigating corruption in a client's database, we unfortunately
found another data-corrupting bug that is relevant for 9.3+:

Since 9.3, heap_tuple_needs_freeze() and heap_freeze_tuple() don't
correctly handle the xids contained in a multixact. They separately
check the xid and the multixact against their respective cutoffs, but
the xids contained inside a multixact are never checked against the xid
cutoff.
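
To illustrate, a simplified sketch of the shape of the problem (not the
actual code; the cutoff variable names are taken from the 9.3-era
heap_freeze_tuple() signature):

/* Simplified sketch of the current xmax handling during freezing. */
if (tuple->t_infomask & HEAP_XMAX_IS_MULTI)
{
    MultiXactId multi = HeapTupleHeaderGetRawXmax(tuple);

    /* the multixact itself is compared against the mxid cutoff... */
    if (MultiXactIdPrecedes(multi, cutoff_multi))
        freeze_xmax = true;

    /*
     * ...but nothing ever fetches the members via
     * GetMultiXactIdMembers() and compares them against cutoff_xid,
     * so an update xid older than the xid cutoff survives freezing.
     */
}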
That doesn't have too bad consequences for multixacts that only lock,
but it can lead to errors like:
ERROR:  could not access status of transaction 3883960912
DETAIL:  Could not open file "pg_clog/0E78": No such file or directory.
when accessing a tuple. That's because the update xid contained in the
multixact is older than the global datfrozenxid up to which we have
already truncated the clog.

Unfortunately that scenario isn't too unlikely: we use
vacuum_freeze_min_age as the basis for both the xid and the mxid freeze
cutoff. Since in many cases multis are generated at a lower rate than
xids, we often will not have frozen away all mxids containing xids older
than the xid cutoff.
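
To make that concrete (the numbers are made up for illustration): with
vacuum_freeze_min_age = 50 million, nextXid = 200M and nextMultiXactId =
60M, the xid freeze cutoff is 150M and the mxid cutoff is 10M. A
multixact with id 40M is newer than the mxid cutoff and thus left alone,
even if it contains an update xid of 120M, which is older than the xid
cutoff. Once datfrozenxid subsequently advances past 120M and pg_clog is
truncated accordingly, the next visibility check that needs the status
of that update xid fails with exactly the error above.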

For recovering the data there is the lucky behaviour that
HeapTupleSatisfiesVacuum() sets the HEAP_XMAX_INVALID hint bit once an
updating multixact is no longer running. So assuming that a contained
update xid outside ShmemVariableCache's [oldestXid, nextXid) range has
committed will often not cause rows to spuriously disappear.
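
Expressed as code, that assumption amounts to something like this (a
sketch only; the helper name is made up and this code exists nowhere):

/* Sketch: is a multixact's update xid inside the known-xid window? */
static bool
update_xid_is_plausible(TransactionId xid)
{
    return TransactionIdFollowsOrEquals(xid,
                                        ShmemVariableCache->oldestXid) &&
           TransactionIdPrecedes(xid, ShmemVariableCache->nextXid);
}

/*
 * If !update_xid_is_plausible(xid), treating the updater as committed
 * is usually safe: had it aborted, HeapTupleSatisfiesVacuum() would
 * typically have set HEAP_XMAX_INVALID already and the xmax would be
 * ignored, so the remaining cases mostly did commit and rows do not
 * spuriously disappear.
 */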

I am working on a fix for the issue, but it's noticeably less simple
than I initially thought: with the current WAL format, the freezing
logic needs to work the same way during normal processing as during
recovery.

My current thinking is that we need to check whether any member of a
multixact needs freezing. If we find one, we do MultiXactIdIsRunning()
followed by MultiXactIdWait() if !InRecovery. That's pretty unlikely to
ever be necessary, but afaics we cannot guarantee that it is not.
During recovery we do *not* need to do so, since the primary will
already have performed all the necessary waits.
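
Put together, the intended flow would be roughly the following (a sketch
under the assumption of the 9.3 GetMultiXactIdMembers() signature; the
actual freezing and the exact wait call are elided):

/* Sketch of the proposed freeze logic, not a patch. */
if (tuple->t_infomask & HEAP_XMAX_IS_MULTI)
{
    MultiXactId      multi = HeapTupleHeaderGetRawXmax(tuple);
    MultiXactMember *members;
    int              nmembers;
    bool             needs_freeze = false;
    int              i;

    nmembers = GetMultiXactIdMembers(multi, &members, false);

    /* does any member xid precede the xid freeze cutoff? */
    for (i = 0; i < nmembers; i++)
        if (TransactionIdPrecedes(members[i].xid, cutoff_xid))
            needs_freeze = true;

    if (needs_freeze)
    {
        if (!InRecovery && MultiXactIdIsRunning(multi))
        {
            /* wait for the multi; MultiXactIdWait() call elided */
        }

        /* ... now actually freeze/replace the xmax ... */
    }

    if (nmembers > 0)
        pfree(members);
}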

The big problem with that solution is that we need to do a
GetMultiXactIdMembers() during crash recovery, which is pretty damn
ugly. But I *think*, and this is where I would really like some input,
that given the way multixact WAL logging works, that should be safe.

I am not really sure what to do about this. The bug is quite likely to
cause corruption in the field, but the next point release is coming up
way too fast for a nontrivial fix.

Thoughts? Better ideas?

Greetings,

Andres Freund

--
Andres Freund                       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


