Re: Memory prefetching while sequentially fetching from SortTuple array, tuplestore - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Memory prefetching while sequentially fetching from SortTuple array, tuplestore
Date
Msg-id 20150902141225.GB5286@alap3.anarazel.de
Whole thread Raw
In response to Re: Memory prefetching while sequentially fetching from SortTuple array, tuplestore  (Peter Geoghegan <pg@heroku.com>)
Responses Re: Memory prefetching while sequentially fetching from SortTuple array, tuplestore  (Peter Geoghegan <pg@heroku.com>)
List pgsql-hackers
On 2015-07-19 16:34:52 -0700, Peter Geoghegan wrote:
> +# PGAC_C_BUILTIN_PREFETCH
> +# -------------------------
> +# Check if the C compiler understands __builtin_prefetch(),
> +# and define HAVE__BUILTIN_PREFETCH if so.
> +AC_DEFUN([PGAC_C_BUILTIN_PREFETCH],
> +[AC_CACHE_CHECK(for __builtin_prefetch, pgac_cv__builtin_prefetch,
> +[AC_COMPILE_IFELSE([AC_LANG_PROGRAM([],
> +[int i = 0;__builtin_prefetch(&i, 0, 3);])],
> +[pgac_cv__builtin_prefetch=yes],
> +[pgac_cv__builtin_prefetch=no])])
> +if test x"$pgac_cv__builtin_prefetch" = xyes ; then
> +AC_DEFINE(HAVE__BUILTIN_PREFETCH, 1,
> +          [Define to 1 if your compiler understands __builtin_prefetch.])
> +fi])# PGAC_C_BUILTIN_PREFETCH

Hm. Is a compiler test actually test anything reliably here? Won't this
just throw a warning during compile time about an unknown function?

> +/*
> + * Prefetch support -- Support memory prefetching hints on some platforms.
> + *
> + * pg_rfetch() is specialized for the case where an array is accessed
> + * sequentially, and we can prefetch a pointer within the next element (or an
> + * even later element) in order to hide memory latency.  This case involves
> + * prefetching addresses with low temporal locality.  Note that it's rather
> + * difficult to get any kind of speedup using pg_rfetch();  any use of the
> + * intrinsic should be carefully tested.  Also note that it's okay to pass it
> + * an invalid or NULL address, although it's best avoided.
> + */
> +#if defined(USE_MEM_PREFETCH)
> +#define pg_rfetch(addr)        __builtin_prefetch((addr), 0, 0)
> +#endif

What is that name actually supposed to mean? 'rfetch' doesn't seem to be
particularly clear - or is it a meant to be a wordplay combined with the
p?

I don't think you should specify the read/write and locality parameters
if we don't hand-pick them - right now you're saying the memory will
only be read and that it has no temporal locality.

I think there should be a empty fallback definition even if the current
only caller ends up not needing it - not all callers will require it.

> +                    /*
> +                     * Perform memory prefetch of tuple that's three places
> +                     * ahead of current (which is returned to caller).
> +                     * Testing shows that this significantly boosts the
> +                     * performance for TSS_INMEM "forward" callers by
> +                     * hiding memory latency behind their processing of
> +                     * returned tuples.
> +                     */
> +#ifdef USE_MEM_PREFETCH
> +                    if (readptr->current + 3 < state->memtupcount)
> +                        pg_rfetch(state->memtuples[readptr->current + 3]);
> +#endif

Hm. Why do we need a branch here? The target of prefetches don't need to
be valid addresses and adding a bunch of arithmetic and a branch for the
corner case doesn't sound worthwhile to me.


What worries me about adding explicit prefetching is that it's *highly*
platform and even micro-architecture dependent. Why is looking three
elements ahead the right value?

Greetings,

Andres Freund



pgsql-hackers by date:

Previous
From: Bruce Momjian
Date:
Subject: Re: Horizontal scalability/sharding
Next
From: Fujii Masao
Date:
Subject: Re: track_commit_timestamp and COMMIT PREPARED