Thread: heap_tuple_needs_freeze false positive
Hi, I noticed that heap_tuple_needs_freeze might return true in cases where the Xmax is leftover junk from somebody who set HEAP_XMAX_INVALID in the far past without resetting the Xmax value itself to Invalid. I think this is incorrect usage; the rule, I think, is that one shouldn't even read Xmax at all unless HEAP_XMAX_INVALID is reset. This might cause unnecessary acquisitions of the cleanup lock, if a tuple is deemed freezable when in fact it isn't. Suggested patch attached. I'd backpatch this as far as it applies cleanly. -- Álvaro Herrera <alvherre@alvh.no-ip.org>
Attachment
Alvaro Herrera <alvherre@alvh.no-ip.org> writes: > ! if (!(tuple->t_infomask & HEAP_XMAX_INVALID)) > { > ! if (!(tuple->t_infomask & HEAP_XMAX_IS_MULTI)) How about just one test, if (!(tuple->t_infomask & (HEAP_XMAX_INVALID | HEAP_XMAX_IS_MULTI))) But other than that quibble, yeah, it's a bug. XMAX_INVALID means just that: the xmax is not to be thought valid. regards, tom lane
On Wed, Feb 1, 2012 at 8:01 PM, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > Suggested patch attached. I'd backpatch this as far as it applies > cleanly. This is new code in 9.2, but it's modelled on heap_freeze_tuple(), which is old. I'm not convinced that it's a bug. Suppose that xmax is set but is hinted as invalid. We process the table and advanced relfrozenxid; then, we crash. After recovery, it's possible that the hint bit is gone (after all, setting hint bits isn't WAL-logged). Now we're in big trouble, because the next CLOG lookup on that xmax value might not happen until it's been reused, and we might get a different answer than before. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > I'm not convinced that it's a bug. Suppose that xmax is set but is > hinted as invalid. XMAX_INVALID is not a "hint". When it's set, the contents of the field must be presumed to be garbage. Any code failing to adhere to that rule is broken. > We process the table and advanced relfrozenxid; > then, we crash. After recovery, it's possible that the hint bit is > gone (after all, setting hint bits isn't WAL-logged). Now we're in > big trouble, because the next CLOG lookup on that xmax value might not > happen until it's been reused, and we might get a different answer > than before. I believe we have adequate defenses against that, and even if we did not, that doesn't make the code in question less wrong. regards, tom lane
On Thu, Feb 2, 2012 at 11:27 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> I'm not convinced that it's a bug. Suppose that xmax is set but is >> hinted as invalid. > > XMAX_INVALID is not a "hint". When it's set, the contents of the field > must be presumed to be garbage. Any code failing to adhere to that rule > is broken. > >> We process the table and advanced relfrozenxid; >> then, we crash. After recovery, it's possible that the hint bit is >> gone (after all, setting hint bits isn't WAL-logged). Now we're in >> big trouble, because the next CLOG lookup on that xmax value might not >> happen until it's been reused, and we might get a different answer >> than before. > > I believe we have adequate defenses against that, and even if we did > not, that doesn't make the code in question less wrong. I believe the adequate defense that we have is precisely the logic you are proposing to change. Regardless of whether you want to call XMAX_INVALID a hint or, say, a giant tortoise, I am fairly sure that we don't WAL-log setting it. That means that a bit set before a crash won't necessarily still be set after a crash. But the corresponding relfrozenxid advancement will be WAL-logged, leading to the problem scenario I described. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Thu, Feb 2, 2012 at 12:45 PM, Robert Haas <robertmhaas@gmail.com> wrote: > I believe the adequate defense that we have is precisely the logic you > are proposing to change. Regardless of whether you want to call > XMAX_INVALID a hint or, say, a giant tortoise, I am fairly sure that > we don't WAL-log setting it. That means that a bit set before a crash > won't necessarily still be set after a crash. But the corresponding > relfrozenxid advancement will be WAL-logged, leading to the problem > scenario I described. To put that another way, the problem isn't that we might have code somewhere in the system that ignores HEAP_XMAX_INVALID. The problem is that HEAP_XMAX_INVALID might not still be set on that tuple the next time somebody looks at it, if a database crash intervenes after that bit is set and before it is flushed to disk. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company