Re: heap_lock_updated_tuple_rec can leak a buffer refcount - Mailing list pgsql-hackers

From Tom Lane
Subject Re: heap_lock_updated_tuple_rec can leak a buffer refcount
Date
Msg-id 31560.1520027792@sss.pgh.pa.us
In response to Re: heap_lock_updated_tuple_rec can leak a buffer refcount  (Alexander Kuzmenkov <a.kuzmenkov@postgrespro.ru>)
Responses Re: heap_lock_updated_tuple_rec can leak a buffer refcount  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
Alexander Kuzmenkov <a.kuzmenkov@postgrespro.ru> writes:
> Looks like a leak indeed, the fix seems right.

Yup, it's a leak.  It's hard to hit because you need to be starting
with an update of a tuple in an all-visible page; otherwise we never
pin the vm page so there's nothing to leak.  But if you lobotomize
the test a few lines above so that it always pins the vm page, then
the regression tests (specifically combocid) reveal the leak, and
show that the proposed patch indeed fixes it.

However ... with said lobotomization, the isolation tests trigger an
Assert(BufferIsPinned(buffer)) inside visibilitymap_pin, showing that
there's another bug here too.  That seems to be because at the bottom
of the outer loop, we do

        if (vmbuffer != InvalidBuffer)
            ReleaseBuffer(vmbuffer);

and then loop back around with vmbuffer still not equal to InvalidBuffer.
This causes the next loop iteration's visibilitymap_pin call to think it
needs to release that vmbuffer pin a second time; kaboom.
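
(visibilitymap_pin expects to be handed either InvalidBuffer or a pin we
still hold; from memory, its reuse logic is roughly

        if (BufferIsValid(*buf))
        {
            if (BufferGetBlockNumber(*buf) == mapBlock)
                return;
            ReleaseBuffer(*buf);
        }
        *buf = vm_readbuf(rel, mapBlock, true);

so passing in a vmbuffer we've already released means it inspects, and then
tries to unpin, a buffer we no longer hold.)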

And eyeing this, I see still a third problem: if either of the "goto l4"
jumps occur, we'll loop back to l4 with vmbuffer possibly pinned, and then
if the new page isn't all-visible, we'll just set vmbuffer = InvalidBuffer
and leak the pin that way.  (If it is all-visible, we unpin the old page
correctly in the visibilitymap_pin call, but that can charitably be
described as accidental.)
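
(Concretely, the current top-of-loop code is

  l4:
        if (PageIsAllVisible(BufferGetPage(buf)))
            visibilitymap_pin(rel, block, &vmbuffer);
        else
            vmbuffer = InvalidBuffer;

and that else branch simply forgets about whatever pin we were still holding
when we jumped back to l4.)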

In short, this is pretty darn broken.  We need to treat the vmbuffer
variable honestly as state that may persist across either the outer loop
or the "goto l4" sub-loop.  Furthermore, it's not really cool to use
"vmbuffer == InvalidBuffer" as the indicator of whether we acquired the
vmbuffer pin pre-lock.  To do that, we'd be forced to drop the old pin in
the not-all-visible path, even though we might need it right back again.
Also, remembering that one vm page serves many heap pages, even if we have
a vm pin for the "wrong" page it's conceivable that it'd be the right one
for the next time we actually need it.  So we should use the
visibilitymap_pin API the way it's meant to be used, and hold any vm pin
we've acquired until the very end of the function.
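
Roughly, the calling pattern ought to look like

        Buffer        vmbuffer = InvalidBuffer;

        for (;;)
        {
            ...
            if (PageIsAllVisible(BufferGetPage(buf)))
                visibilitymap_pin(rel, block, &vmbuffer);
            ...
        }

        if (vmbuffer != InvalidBuffer)
            ReleaseBuffer(vmbuffer);

with visibilitymap_pin itself deciding whether to reuse or swap any pin
already held in vmbuffer, and the single ReleaseBuffer at function exit
being the only place we let go of it.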

Hence, I propose the attached patch.  The test lobotomization
(the "if (1) //" change) isn't meant for commit but shows how I tested
the take-the-pin paths.  This passes make check-world with or without
the lobotomization change.

            regards, tom lane

diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index dc762f9..42eac78 100644
*** a/src/backend/access/heap/heapam.c
--- b/src/backend/access/heap/heapam.c
*************** heap_lock_updated_tuple_rec(Relation rel
*** 5677,5682 ****
--- 5677,5683 ----
                  new_xmax;
      TransactionId priorXmax = InvalidTransactionId;
      bool        cleared_all_frozen = false;
+     bool        pinned_desired_page;
      Buffer        vmbuffer = InvalidBuffer;
      BlockNumber block;

*************** heap_lock_updated_tuple_rec(Relation rel
*** 5698,5704 ****
               * chain, and there's no further tuple to lock: return success to
               * caller.
               */
!             return HeapTupleMayBeUpdated;
          }

  l4:
--- 5699,5706 ----
               * chain, and there's no further tuple to lock: return success to
               * caller.
               */
!             result = HeapTupleMayBeUpdated;
!             goto out_unlocked;
          }

  l4:
*************** l4:
*** 5710,5719 ****
           * someone else might be in the middle of changing this, so we'll need
           * to recheck after we have the lock.
           */
!         if (PageIsAllVisible(BufferGetPage(buf)))
              visibilitymap_pin(rel, block, &vmbuffer);
          else
!             vmbuffer = InvalidBuffer;

          LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);

--- 5712,5724 ----
           * someone else might be in the middle of changing this, so we'll need
           * to recheck after we have the lock.
           */
!         if (1) // (PageIsAllVisible(BufferGetPage(buf)))
!         {
              visibilitymap_pin(rel, block, &vmbuffer);
+             pinned_desired_page = true;
+         }
          else
!             pinned_desired_page = false;

          LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);

*************** l4:
*** 5722,5729 ****
           * all visible while we were busy locking the buffer, we'll have to
           * unlock and re-lock, to avoid holding the buffer lock across I/O.
           * That's a bit unfortunate, but hopefully shouldn't happen often.
           */
!         if (vmbuffer == InvalidBuffer && PageIsAllVisible(BufferGetPage(buf)))
          {
              LockBuffer(buf, BUFFER_LOCK_UNLOCK);
              visibilitymap_pin(rel, block, &vmbuffer);
--- 5727,5739 ----
           * all visible while we were busy locking the buffer, we'll have to
           * unlock and re-lock, to avoid holding the buffer lock across I/O.
           * That's a bit unfortunate, but hopefully shouldn't happen often.
+          *
+          * Note: in some paths through this function, we will reach here
+          * holding a pin on a vm page that may or may not be the one matching
+          * this page.  If this page isn't all-visible, we won't use the vm
+          * page, but we hold onto such a pin till the end of the function.
           */
!         if (!pinned_desired_page && PageIsAllVisible(BufferGetPage(buf)))
          {
              LockBuffer(buf, BUFFER_LOCK_UNLOCK);
              visibilitymap_pin(rel, block, &vmbuffer);
*************** l4:
*** 5749,5756 ****
           */
          if (TransactionIdDidAbort(HeapTupleHeaderGetXmin(mytup.t_data)))
          {
!             UnlockReleaseBuffer(buf);
!             return HeapTupleMayBeUpdated;
          }

          old_infomask = mytup.t_data->t_infomask;
--- 5759,5766 ----
           */
          if (TransactionIdDidAbort(HeapTupleHeaderGetXmin(mytup.t_data)))
          {
!             result = HeapTupleMayBeUpdated;
!             goto out_locked;
          }

          old_infomask = mytup.t_data->t_infomask;
*************** next:
*** 5957,5964 ****
          priorXmax = HeapTupleHeaderGetUpdateXid(mytup.t_data);
          ItemPointerCopy(&(mytup.t_data->t_ctid), &tupid);
          UnlockReleaseBuffer(buf);
-         if (vmbuffer != InvalidBuffer)
-             ReleaseBuffer(vmbuffer);
      }

      result = HeapTupleMayBeUpdated;
--- 5967,5972 ----
*************** next:
*** 5966,5976 ****
  out_locked:
      UnlockReleaseBuffer(buf);

      if (vmbuffer != InvalidBuffer)
          ReleaseBuffer(vmbuffer);

      return result;
-
  }

  /*
--- 5974,5984 ----
  out_locked:
      UnlockReleaseBuffer(buf);

+ out_unlocked:
      if (vmbuffer != InvalidBuffer)
          ReleaseBuffer(vmbuffer);

      return result;
  }

  /*
