Thread: Various silliness in heap_getnext and related routines

Various silliness in heap_getnext and related routines

From
Tom Lane
Date:
Robert Bruccoleri (bruc@stone.congen.com) wrote:
> It's not clear to me why the spinlock needs be grabbed at the
> beginning of RelationGetBufferWithBuffer,

I believe you are right: the spinlock doesn't need to be grabbed,
because if a valid buffer is passed in, it must already be pinned
(since the returned buffer is expected to be pinned).  Hence the check
for same-buffer could be done without first grabbing the spinlock.

In fact, closer analysis reveals a lot of silliness here.  There is an
inherent error in RelationGetBufferWithBuffer: if it allocates a fresh
buffer then that buffer gets pinned, but if it returns the passed-in
buffer then no extra pin is added.  Since the caller does not know which
case applies, there must be either a leaked pin or a bogus pin count
decrement later on.  The reason we see no failure is that the only use
of this routine is from heapgettup, and it turns out that the passed
buffer is *always* either invalid or the target buffer.  (I have run the
regression tests with Asserts in place that confirm this deduction.)
What's more, heapgettup conveniently forgets to decrement the pin count
of the buffer it passes, thus cancelling out
RelationGetBufferWithBuffer's failure to increment the pin count.  If
the target page were ever different from the passed buffer, these errors
would not cancel out and we'd have a buffer leak.

I think that we ought to get rid of RelationGetBufferWithBuffer
entirely, since it is really a broken variant of ReleaseAndReadBuffer.
We should add a test to ReleaseAndReadBuffer to short-circuit the work
if the passed buffer is the same as the requested page, and change the
calls in heapgettup to call ReleaseAndReadBuffer instead, which makes
the heapam.c code correct not to do its own decrements of the pin count.
(The handwaving comments in heap_getnext's calls of heapgettup could
then be fixed.)

While I'm looking at this, I can't help noticing that heapam.c goes to a
large amount of trouble to maintain "previous" and "next" tuple pointers
and pins in a seqscan, not only "current".  These pointers are useless
except in the very unusual case where one steps forward and then back
in a sequential scan (for example, "FETCH 1; FETCH BACKWARD 1;" in a
cursor).  It seems to me that this is wrongheaded.  We could simplify
and speed up the normal case by maintaining only a "current" pointer,
which would be well worth the extra work in the forward/back step case.

Comments, objections?
        regards, tom lane


Re: Various silliness in heap_getnext and related routines

From
bruc@stone.congenomics.com (Robert E. Bruccoleri)
Date:
Dear Tom,
> 
> 
> Robert Bruccoleri (bruc@stone.congen.com) wrote:
> > It's not clear to me why the spinlock needs be grabbed at the
> > beginning of RelationGetBufferWithBuffer,
> 
> I believe you are right: the spinlock doesn't need to be grabbed,
> because if a valid buffer is passed in, it must already be pinned
> (since the returned buffer is expected to be pinned).  Hence the check
> for same-buffer could be done without first grabbing the spinlock.

For my immediate problem, would removing the spinlock acquisition
be OK?

Thanks for looking into this problem.

Sincerely,
Bob

+----------------------------------+------------------------------------+
| Robert E. Bruccoleri, Ph.D.      | Phone: 609 737 6383                |
| President, Congenomics, Inc.     | Fax:   609 737 7528                |
| 114 W Franklin Ave, Suite K1,4,5 | email: bruc@acm.org                |
| P.O. Box 314                     | URL:   http://www.congen.com/~bruc |
| Pennington, NJ 08534             |                                    |
+----------------------------------+------------------------------------+


Re: Various silliness in heap_getnext and related routines

From
Tom Lane
Date:
bruc@stone.congenomics.com (Robert E. Bruccoleri) writes:
> For my immediate problem, would removing the spinlock acquisition
> be OK?

It'd be interesting to remove the marked lines:
    bufHdr = &BufferDescriptors[buffer - 1];
-        SpinAcquire(BufMgrLock);    if (bufHdr->tag.blockNum == blockNumber &&
RelFileNodeEquals(bufHdr->tag.rnode,relation->rd_node))    {
 
-            SpinRelease(BufMgrLock);        return buffer;    }
-        return ReadBufferWithBufferLock(relation, blockNumber, true);

and see how that affects your performance issue, if at all.
        regards, tom lane


Re: Various silliness in heap_getnext and related routines

From
bruc@stone.congenomics.com (Robert E. Bruccoleri)
Date:
Dear Tom,
> 
> 
> bruc@stone.congenomics.com (Robert E. Bruccoleri) writes:
> > For my immediate problem, would removing the spinlock acquisition
> > be OK?
> 
> It'd be interesting to remove the marked lines:
> 
>         bufHdr = &BufferDescriptors[buffer - 1];
> -        SpinAcquire(BufMgrLock);
>         if (bufHdr->tag.blockNum == blockNumber &&
>             RelFileNodeEquals(bufHdr->tag.rnode, relation->rd_node))
>         {
> -            SpinRelease(BufMgrLock);
>             return buffer;
>         }
> -        return ReadBufferWithBufferLock(relation, blockNumber, true);
> 
> and see how that affects your performance issue, if at all.
> 

I have made those changes, ran the regression tests, and then issued
eight simultaneous retrieval jobs against the database. Performance is
now greatly improved with all the jobs making quick progress through
the sequential query and completing in very reasonable times.

Thanks very much for your help.

BTW, given the high level of support that you provide to the PostgreSQL
community, it's very accurate to state that support for PostgreSQL
is far superior to that of Oracle, especially for SGI systems.

Sincerely,
Bob

+----------------------------------+------------------------------------+
| Robert E. Bruccoleri, Ph.D.      | Phone: 609 737 6383                |
| President, Congenomics, Inc.     | Fax:   609 737 7528                |
| 114 W Franklin Ave, Suite K1,4,5 | email: bruc@acm.org                |
| P.O. Box 314                     | URL:   http://www.congen.com/~bruc |
| Pennington, NJ 08534             |                                    |
+----------------------------------+------------------------------------+


Re: Various silliness in heap_getnext and related routines

From
Tom Lane
Date:
bruc@stone.congenomics.com (Robert E. Bruccoleri) writes:
> BTW, given the high level of support that you provide to the PostgreSQL
> community, it's very accurate to state that support for PostgreSQL
> is far superior to that of Oracle, especially for SGI systems.

It's all about having the source code available, I think.  After all,
it was you who identified the location of the problem...
        regards, tom lane


Re: Various silliness in heap_getnext and related routines

From
bruc@stone.congenomics.com (Robert E. Bruccoleri)
Date:
Dear Tom,
> 
> 
> bruc@stone.congenomics.com (Robert E. Bruccoleri) writes:
> > BTW, given the high level of support that you provide to the PostgreSQL
> > community, it's very accurate to state that support for PostgreSQL
> > is far superior to that of Oracle, especially for SGI systems.
> 
> It's all about having the source code available, I think.  After all,
> it was you who identified the location of the problem...

Yes, but it's not just having the source code. Although I could see
a potential problem, it was your knowledge of the source code to recommend
a patch that worked immediately, and your willingness to help
that together makes PostgreSQL support so good. The same knowledge and
helpful attitude applies to all the PostgreSQL developers.

Thanks. --Bob

+----------------------------------+------------------------------------+
| Robert E. Bruccoleri, Ph.D.      | Phone: 609 737 6383                |
| President, Congenomics, Inc.     | Fax:   609 737 7528                |
| 114 W Franklin Ave, Suite K1,4,5 | email: bruc@acm.org                |
| P.O. Box 314                     | URL:   http://www.congen.com/~bruc |
| Pennington, NJ 08534             |                                    |
+----------------------------------+------------------------------------+


Re: Re: Various silliness in heap_getnext and related routines

From
"Ross J. Reedstrom"
Date:
On Mon, Jun 11, 2001 at 12:21:56PM -0400, Robert E. Bruccoleri wrote:
> Dear Tom,
> > 
> > 
> > bruc@stone.congenomics.com (Robert E. Bruccoleri) writes:
> > > BTW, given the high level of support that you provide to the PostgreSQL
> > > community, it's very accurate to state that support for PostgreSQL
> > > is far superior to that of Oracle, especially for SGI systems.
> > 
> > It's all about having the source code available, I think.  After all,
> > it was you who identified the location of the problem...
> 
> Yes, but it's not just having the source code. Although I could see
> a potential problem, it was your knowledge of the source code to recommend
> a patch that worked immediately, and your willingness to help
> that together makes PostgreSQL support so good. The same knowledge and
> helpful attitude applies to all the PostgreSQL developers.

Gee, guys, I'm tearing up here. ;-)

sniff - 

Ross



Re: Re: Various silliness in heap_getnext and related routines

From
Bruce Momjian
Date:
> > Yes, but it's not just having the source code. Although I could see
> > a potential problem, it was your knowledge of the source code to recommend
> > a patch that worked immediately, and your willingness to help
> > that together makes PostgreSQL support so good. The same knowledge and
> > helpful attitude applies to all the PostgreSQL developers.
> 
> Gee, guys, I'm tearing up here. ;-)
> 
> sniff - 

Let's have a group hug...
--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026