Re: Memory prefetching while sequentially fetching from SortTuple array, tuplestore - Mailing list pgsql-hackers
From | Andres Freund |
---|---|
Subject | Re: Memory prefetching while sequentially fetching from SortTuple array, tuplestore |
Date | |
Msg-id | 20150902141225.GB5286@alap3.anarazel.de Whole thread Raw |
In response to | Re: Memory prefetching while sequentially fetching from SortTuple array, tuplestore (Peter Geoghegan <pg@heroku.com>) |
Responses |
Re: Memory prefetching while sequentially fetching from
SortTuple array, tuplestore
(Peter Geoghegan <pg@heroku.com>)
|
List | pgsql-hackers |
On 2015-07-19 16:34:52 -0700, Peter Geoghegan wrote: > +# PGAC_C_BUILTIN_PREFETCH > +# ------------------------- > +# Check if the C compiler understands __builtin_prefetch(), > +# and define HAVE__BUILTIN_PREFETCH if so. > +AC_DEFUN([PGAC_C_BUILTIN_PREFETCH], > +[AC_CACHE_CHECK(for __builtin_prefetch, pgac_cv__builtin_prefetch, > +[AC_COMPILE_IFELSE([AC_LANG_PROGRAM([], > +[int i = 0;__builtin_prefetch(&i, 0, 3);])], > +[pgac_cv__builtin_prefetch=yes], > +[pgac_cv__builtin_prefetch=no])]) > +if test x"$pgac_cv__builtin_prefetch" = xyes ; then > +AC_DEFINE(HAVE__BUILTIN_PREFETCH, 1, > + [Define to 1 if your compiler understands __builtin_prefetch.]) > +fi])# PGAC_C_BUILTIN_PREFETCH Hm. Is a compiler test actually test anything reliably here? Won't this just throw a warning during compile time about an unknown function? > +/* > + * Prefetch support -- Support memory prefetching hints on some platforms. > + * > + * pg_rfetch() is specialized for the case where an array is accessed > + * sequentially, and we can prefetch a pointer within the next element (or an > + * even later element) in order to hide memory latency. This case involves > + * prefetching addresses with low temporal locality. Note that it's rather > + * difficult to get any kind of speedup using pg_rfetch(); any use of the > + * intrinsic should be carefully tested. Also note that it's okay to pass it > + * an invalid or NULL address, although it's best avoided. > + */ > +#if defined(USE_MEM_PREFETCH) > +#define pg_rfetch(addr) __builtin_prefetch((addr), 0, 0) > +#endif What is that name actually supposed to mean? 'rfetch' doesn't seem to be particularly clear - or is it a meant to be a wordplay combined with the p? I don't think you should specify the read/write and locality parameters if we don't hand-pick them - right now you're saying the memory will only be read and that it has no temporal locality. I think there should be a empty fallback definition even if the current only caller ends up not needing it - not all callers will require it. > + /* > + * Perform memory prefetch of tuple that's three places > + * ahead of current (which is returned to caller). > + * Testing shows that this significantly boosts the > + * performance for TSS_INMEM "forward" callers by > + * hiding memory latency behind their processing of > + * returned tuples. > + */ > +#ifdef USE_MEM_PREFETCH > + if (readptr->current + 3 < state->memtupcount) > + pg_rfetch(state->memtuples[readptr->current + 3]); > +#endif Hm. Why do we need a branch here? The target of prefetches don't need to be valid addresses and adding a bunch of arithmetic and a branch for the corner case doesn't sound worthwhile to me. What worries me about adding explicit prefetching is that it's *highly* platform and even micro-architecture dependent. Why is looking three elements ahead the right value? Greetings, Andres Freund
pgsql-hackers by date: