Thread: make BufferGetBlockNumber() a macro

make BufferGetBlockNumber() a macro

From
Neil Conway
Date:
The attached patch re-implements BufferGetBlockNumber() as a macro,
for performance reasons. It also adds an assertion that should probably
be present.

I ran into this while profiling the hash index creation code (and
trying to determine why it's so slow). I noticed that this simple
function was being called about 2,500,000 times while creating a hash
index on a reasonably sized table (225,000 rows). By implementing this
as a macro, the performance of hash index creation improves by 8% (in
a totally unscientific and probably unreliable benchmark).

Cheers,

Neil

--
Neil Conway <neilconway@rogers.com>
PGP Key ID: DB3C29FC

Attachment

Re: make BufferGetBlockNumber() a macro

From
Tom Lane
Date:
Neil Conway <nconway@klamath.dyndns.org> writes:
> The attached patch re-implements BufferGetBlockNumber() as a macro,
> for performance reasons.

The trouble with this is that it forces storage/buf_internals.h to
become part of bufmgr.h, thereby completely destroying any information
hiding that we had from the division of the two headers --- any client
of bufmgr.h might now access bufmgr internal stuff without noticing
that it was doing wrong.

I am not convinced that BufferGetBlockNumber is a bottleneck anyway;
at least not if you don't have Assert checking enabled.

There are a few places that do things like
    ItemPointerSet(&(tuple->t_self), BufferGetBlockNumber(buffer), offnum);
which I believe results in two calls to BufferGetBlockNumber because of
the way BlockIdSet is coded.  This would be good to clean up, but it's
BlockIdSet's problem not BufferGetBlockNumber's.

> It also adds an assertion that should probably
> be present.

FWIW, BufferIsPinned also checks BufferIsValid; you do not need both.

            regards, tom lane

Re: make BufferGetBlockNumber() a macro

From
Neil Conway
Date:
On Mon, 01 Apr 2002 17:59:06 -0500
"Tom Lane" <tgl@sss.pgh.pa.us> wrote:
> Neil Conway <nconway@klamath.dyndns.org> writes:
> > The attached patch re-implements BufferGetBlockNumber() as a macro,
> > for performance reasons.
>
> The trouble with this is that it forces storage/buf_internals.h to
> become part of bufmgr.h, thereby completely destroying any information
> hiding that we had from the division of the two headers --- any client
> of bufmgr.h might now access bufmgr internal stuff without noticing
> that it was doing wrong.

Yeah, I considered that. IMHO, this isn't that big of a drawback:
C's information hiding is weak enough that you need to rely on the
client programmer to exercise good judgement anyway; for example, there
is absolutely nothing to stop someone from just including
storage/buf_internals.h to begin with. Furthermore, in order to actually
_see_ the declarations in buf_internals.h, a client programmer still
needs to look at it, not bufmgr.h (i.e. the #include only effects
cpp).

If you prefer, I can move the definition of BufferDesc from
storage/buf_internals.h to storage/bufmgr.h; this still destroys
a little of the information hiding of the old code, but if a client
programmer is stupid enough to manipulate a structure that is
marked as internal and isn't packaged with any support functions
or other code for dealing with it, they deserve their code getting
broken.

> I am not convinced that BufferGetBlockNumber is a bottleneck anyway;
> at least not if you don't have Assert checking enabled.

I didn't have assert checking enabled. As for it being a bottleneck,
I'm sure the reason hash index creation is so slow lies elsewhere;
but that doesn't stop this from being a performance improvement.

> There are a few places that do things like
>     ItemPointerSet(&(tuple->t_self), BufferGetBlockNumber(buffer), offnum);
> which I believe results in two calls to BufferGetBlockNumber because of
> the way BlockIdSet is coded.  This would be good to clean up, but it's
> BlockIdSet's problem not BufferGetBlockNumber's.

If BufferGetBlockNumber is inlined (i.e. implemented as either a macro
or a C99 inline function), it becomes just a conditional and an array
access, so there's little need to optimize the usage of it any further.
As it stands, a full function call is quite a lot of overhead,
particularly for something that is called so often (at least for the
situation I was looking at).

> > It also adds an assertion that should probably
> > be present.
>
> FWIW, BufferIsPinned also checks BufferIsValid; you do not need both.

Ah, ok. So BufferIsPinned() is the correct assertion. An updated patch
is attached. Also, I removed an obsolete comment + line of code from
storage/buf.h, and deleted an incoherent comment from storage/bufmgr.h

Cheers,

Neil

--
Neil Conway <neilconway@rogers.com>
PGP Key ID: DB3C29FC

Attachment

Re: make BufferGetBlockNumber() a macro

From
Tom Lane
Date:
Neil Conway <nconway@klamath.dyndns.org> writes:
> C's information hiding is weak enough that you need to rely on the
> client programmer to exercise good judgement anyway;

This is not a good argument for making the hiding even weaker.  I still
object to this patch.

It would be interesting to try to understand *why* BufferGetBlockNumber
is showing up so high in your profile; AFAICS it is not called in any
paths where it'd be a bottleneck --- ie, there is other much more
expensive stuff in the same code paths.  Do you have call counts by
caller?  What exactly was the test case, anyway?

            regards, tom lane

Re: make BufferGetBlockNumber() a macro

From
Neil Conway
Date:
On Mon, 01 Apr 2002 20:50:27 -0500
"Tom Lane" <tgl@sss.pgh.pa.us> wrote:
> Neil Conway <nconway@klamath.dyndns.org> writes:
> > C's information hiding is weak enough that you need to rely on the
> > client programmer to exercise good judgement anyway;
>
> This is not a good argument for making the hiding even weaker.  I still
> object to this patch.

Even if it were modified to just move the BufferDesc structure into
storage/bufmgr.h?

> It would be interesting to try to understand *why* BufferGetBlockNumber
> is showing up so high in your profile;

It's showing up "high" in the sense that it's being called a lot of
times (2.5 million); each call, of course, is fast, and is dominated
by other parts of the profile (I just sent in the patch since I thought
it was an easy way to improve the performance a little bit).

> Do you have call counts by caller?

Yes, I have the full output of gprof. I didn't send it along since it's
~400 KB; if you're interested, let me know.

> What exactly was the test case, anyway?

I created a table with a single TEXT column; I copied the contents of
/usr/share/dict/words into the column, 4 times. I then quit the session
and started a new one in which I created a hash index on the column,
and ran gprof on the resulting profile data.

Cheers,

Neil

--
Neil Conway <neilconway@rogers.com>
PGP Key ID: DB3C29FC

Re: make BufferGetBlockNumber() a macro

From
Neil Conway
Date:
On Mon, 01 Apr 2002 20:50:27 -0500
"Tom Lane" <tgl@sss.pgh.pa.us> wrote:
> Neil Conway <nconway@klamath.dyndns.org> writes:
> > C's information hiding is weak enough that you need to rely on the
> > client programmer to exercise good judgement anyway;
>
> This is not a good argument for making the hiding even weaker.  I still
> object to this patch.

Ok. The attached trivial patch removes some obsolete or incoherent
comments and fixes an assertion in BufferGetBlockNumber. The
macro-ization of BufferGetBlockNumber is _not_ included.

Please apply.

Cheers,

Neil

--
Neil Conway <neilconway@rogers.com>
PGP Key ID: DB3C29FC

Attachment

Re: make BufferGetBlockNumber() a macro

From
Bruce Momjian
Date:
Your patch has been added to the PostgreSQL unapplied patches list at:

    http://candle.pha.pa.us/cgi-bin/pgpatches

I will try to apply it within the next 48 hours.

---------------------------------------------------------------------------


Neil Conway wrote:
> On Mon, 01 Apr 2002 20:50:27 -0500
> "Tom Lane" <tgl@sss.pgh.pa.us> wrote:
> > Neil Conway <nconway@klamath.dyndns.org> writes:
> > > C's information hiding is weak enough that you need to rely on the
> > > client programmer to exercise good judgement anyway;
> >
> > This is not a good argument for making the hiding even weaker.  I still
> > object to this patch.
>
> Ok. The attached trivial patch removes some obsolete or incoherent
> comments and fixes an assertion in BufferGetBlockNumber. The
> macro-ization of BufferGetBlockNumber is _not_ included.
>
> Please apply.
>
> Cheers,
>
> Neil
>
> --
> Neil Conway <neilconway@rogers.com>
> PGP Key ID: DB3C29FC

[ Attachment, skipping... ]

>
> ---------------------------(end of broadcast)---------------------------
> TIP 3: if posting/reading through Usenet, please send an appropriate
> subscribe-nomail command to majordomo@postgresql.org so that your
> message can get through to the mailing list cleanly

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026