Thread: pgstattuple locking fix

pgstattuple locking fix

From
ITAGAKI Takahiro
Date:
Here is a trivial fix of locking issue in pgstattuple().
It was locking buffers around PageGetHeapFreeSpace()
in the heap scan loop, but not in the scanning of the tail.
I think we need locks in the tail, too.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center

Attachment

Re: pgstattuple locking fix

From
Tom Lane
Date:
ITAGAKI Takahiro <itagaki.takahiro@oss.ntt.co.jp> writes:
> Here is a trivial fix of locking issue in pgstattuple().

Hmm, is this really a bug, and if so how far back does it go?
I'm thinking that having a pin on the buffer should be enough to
call PageGetHeapFreeSpace.

            regards, tom lane

Re: pgstattuple locking fix

From
ITAGAKI Takahiro
Date:
Tom Lane <tgl@sss.pgh.pa.us> wrote:

> ITAGAKI Takahiro <itagaki.takahiro@oss.ntt.co.jp> writes:
> > Here is a trivial fix of locking issue in pgstattuple().
>
> Hmm, is this really a bug, and if so how far back does it go?
> I'm thinking that having a pin on the buffer should be enough to
> call PageGetHeapFreeSpace.

Hmm... we might use pd_upper and pd_lower at different times,
but the error is ok for pgstattuple purpose.
(It might be dangerous for tuple insertion, though.)
Inconsistent usage of locks is confusable -- remove them, please.

Index: contrib/pgstattuple/pgstattuple.c
===================================================================
--- contrib/pgstattuple/pgstattuple.c    (HEAD)
+++ contrib/pgstattuple/pgstattuple.c    (working copy)
@@ -289,9 +289,7 @@
         while (block <= tupblock)
         {
             buffer = ReadBuffer(rel, block);
-            LockBuffer(buffer, BUFFER_LOCK_SHARE);
             stat.free_space += PageGetHeapFreeSpace((Page) BufferGetPage(buffer));
-            LockBuffer(buffer, BUFFER_LOCK_UNLOCK);
             ReleaseBuffer(buffer);
             block++;
         }

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center



Re: pgstattuple locking fix

From
"Heikki Linnakangas"
Date:
ITAGAKI Takahiro wrote:
> Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
>> ITAGAKI Takahiro <itagaki.takahiro@oss.ntt.co.jp> writes:
>>> Here is a trivial fix of locking issue in pgstattuple().
>> Hmm, is this really a bug, and if so how far back does it go?
>> I'm thinking that having a pin on the buffer should be enough to
>> call PageGetHeapFreeSpace.
>
> Hmm... we might use pd_upper and pd_lower at different times,
> but the error is ok for pgstattuple purpose.
> (It might be dangerous for tuple insertion, though.)
> Inconsistent usage of locks is confusable -- remove them, please.

No I think the original patch was right. You can indeed read
inconsistent values for pd_upper and pd_lower, though the window is very
small. But a bigger issue is that in 8.3 PageGetHeapFreeSpace counts the
used line pointers to see if MaxHeapTuplesPerPage has been reached, and
I'm not convinced you can do that safely without holding a lock.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

Re: pgstattuple locking fix

From
"Heikki Linnakangas"
Date:
Heikki Linnakangas wrote:
> ITAGAKI Takahiro wrote:
>> Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>
>>> ITAGAKI Takahiro <itagaki.takahiro@oss.ntt.co.jp> writes:
>>>> Here is a trivial fix of locking issue in pgstattuple().
>>> Hmm, is this really a bug, and if so how far back does it go?
>>> I'm thinking that having a pin on the buffer should be enough to
>>> call PageGetHeapFreeSpace.
>> Hmm... we might use pd_upper and pd_lower at different times,
>> but the error is ok for pgstattuple purpose.
>> (It might be dangerous for tuple insertion, though.)
>> Inconsistent usage of locks is confusable -- remove them, please.
>
> No I think the original patch was right. You can indeed read
> inconsistent values for pd_upper and pd_lower, though the window is very
> small. But a bigger issue is that in 8.3 PageGetHeapFreeSpace counts the
> used line pointers to see if MaxHeapTuplesPerPage has been reached, and
> I'm not convinced you can do that safely without holding a lock.

On second thought, we do call PageGetHeapFreeSpace without holding a
lock in heap_page_prune_opt as well, so it better be safe. Looking
closer at PageGetHeapFreeSpace, I think it is. The
return value can be bogus, of course.

That's worth noting in the comments:

*** src/backend/storage/page/bufpage.c  21 Sep 2007 21:25:42 -0000      1.75
--- src/backend/storage/page/bufpage.c  22 Oct 2007 10:06:02 -0000
***************
*** 506,511 ****
--- 506,514 ----
   * or dead line pointers it'd be possible to have too many line pointers.
   * To avoid breaking code that assumes MaxHeapTuplesPerPage is a hard
limit
   * on the number of line pointers, we make this extra check.)
+  *
+  * You don't need to hold a lock on the page to call this function,
but if
+  * you don't, the return value should be considered a hint only.
   */
  Size
  PageGetHeapFreeSpace(Page page)

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

Re: pgstattuple locking fix

From
Tom Lane
Date:
"Heikki Linnakangas" <heikki@enterprisedb.com> writes:
> ITAGAKI Takahiro wrote:
>> Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> I'm thinking that having a pin on the buffer should be enough to
>>> call PageGetHeapFreeSpace.
>>
>> Hmm... we might use pd_upper and pd_lower at different times,

> No I think the original patch was right. You can indeed read
> inconsistent values for pd_upper and pd_lower, though the window is very
> small.

Good point.  I'll apply the original patch and also Heikki's suggested
comment.

            regards, tom lane

Re: pgstattuple locking fix

From
Tom Lane
Date:
"Heikki Linnakangas" <heikki@enterprisedb.com> writes:
> On second thought, we do call PageGetHeapFreeSpace without holding a
> lock in heap_page_prune_opt as well, so it better be safe. Looking
> closer at PageGetHeapFreeSpace, I think it is. The
> return value can be bogus, of course.

> That's worth noting in the comments:

On closer look, the comments in heap_page_prune_opt seem to address
the issue already, and I'm not sure why we should comment
PageGetHeapFreeSpace this way but not all the other variants in
bufpage.c.

            regards, tom lane