Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updatedtuple - Mailing list pgsql-committers
From | Andres Freund |
---|---|
Subject | Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updatedtuple |
Date | |
Msg-id | 20171104131500.w7enyo7olxk7ygca@alap3.anarazel.de Whole thread Raw |
In response to | Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updatedtuple (Peter Geoghegan <pg@bowt.ie>) |
Responses |
Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updatedtuple
Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updatedtuple |
List | pgsql-committers |
On 2017-11-03 12:36:59 -0700, Peter Geoghegan wrote: > Andres Freund <andres@anarazel.de> wrote: > > Here's that patch. I've stared at this some, and Robert did too. Robert > > mentioned that the commit message might need some polish and I'm not > > 100% sure about the error message texts yet. > > The commit message should probably say that the bug involves the > resurrection of previously dead tuples, which is different to there > being duplicates because a constraint is not enforced because HOT chains > are broken (that's a separate, arguably less serious problem). The reason for that is that I hadn't yet quite figured out how the bug I described in the commit message (and the previously committed testcase) would cause that. I was planning to diagnose / experiment more with this and write an email if I couldn't come up with an explanation. The committed test does *not* actually trigger that. The reason I couldn't quite figure out how the problem triggers is that the xmax removing branch in FreezeMultiXactId() can only be reached if the multi is from before the cutoff - which it can't have been for a single vacuum execution to trigger the bug, because that'd require the multi to be running to falsely return RECENTLY_DEAD rather than DEAD (by definition a multi can't be below the cutoff if running). For the problem to occur I think vacuum has to be executed *twice*: The first time through HTSV mistakenly returns RECENTLY_DEAD preventing the tuple from being pruned. That triggers FreezeMultiXactId() to create a *new* multi with dead members. At this point the database already is in a bad state. Then in a second vacuum HTSV returns DEAD, but * Ordinarily, DEAD tuples would have been removedby * heap_page_prune(), but it's possible that the tuple * state changed since heap_page_prune()looked. In * particular an INSERT_IN_PROGRESS tuple could have * changedto DEAD if the inserter aborted. So this * cannot be considered an error condition. * .. if (HeapTupleIsHotUpdated(&tuple) || HeapTupleIsHeapOnly(&tuple)) { nkeep += 1; prevents the tuple from being removed. If now the multi xmax is below the xmin horizon it triggers /* * If the xid is older than the cutoff, it has to have aborted, * otherwisethe tuple would have gotten pruned away. */ if (TransactionIdPrecedes(xid, cutoff_xid)) { if (TransactionIdDidCommit(xid)) elog(ERROR, "can't freeze committed xmax"); *flags|= FRM_INVALIDATE_XMAX; in FreezeMultiXact. Without the new elog, this then causes xmax to be removed, reviving the tuple. The current testcase, and I think the descriptions in the relevant threads, all actually fail to point out the precise way the bug is triggered. As e.g. evidenced that the freeze-the-dead case appears to not cause any failures in !assertion builds even if the bug is present. The good news is that the error checks I added in the patch upthread prevent all of this from happening, even though I'd not yet understood the mechanics fully - it's imnsho pretty clear that we need to be more paranoid in production builds around this. A bunch of users that triggered largely "invisible" corruption (the first vacuum described above) will possibly run into one of these elog()s, but that seems far preferrable to making the corruption a lot worse. I think unfortunately the check + elog() in the HeapTupleIsHeapOnly(&tuple)) { nkeep += 1; /* * If this were to happen for a tuple that actually * neededto be frozen, we'd be in trouble, because * it'd leave a tuple below the relation's xmin * horizon alive. */ if (heap_tuple_needs_freeze(tuple.t_data, FreezeLimit, MultiXactCutoff, buf)) { elog(ERROR, "encountered tuple from before xid cutoff, sleeping"); case needs to go, because it's too coarse - there very well could be lockers or such that need to be removed and where it's fine to do so. The checks the actual freezing code ought to be sufficient however. Greetings, Andres Freund -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
pgsql-committers by date: